Classification

Concepts

MNIST, the dataset of handwritten digit images

Dataset: data (the features of each instance) + target (the label of each instance)

Training Set vs. Test Set. Fit the Machine Learning Model on the Training Set to learn its parameters, then verify and measure it by applying a performance measure (for example a Loss Function) to the Test Set, which the model has not seen during training.
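
A minimal sketch of holding out a test set, assuming a simple random split. The function name train_test_split, the 20% ratio, and the seed are illustrative choices, not something fixed by these notes.

```python
import random

def train_test_split(instances, test_ratio=0.2, seed=42):
    """Shuffle the instance indices and hold out a fraction as the test set."""
    indices = list(range(len(instances)))
    random.Random(seed).shuffle(indices)  # fixed seed so the split is repeatable
    n_test = int(len(instances) * test_ratio)
    test = [instances[i] for i in indices[:n_test]]
    train = [instances[i] for i in indices[n_test:]]
    return train, test

# Hypothetical data: ten instances identified by the numbers 0..9.
data = list(range(10))
train_set, test_set = train_test_split(data)
```

The model would be fitted on train_set only; test_set is set aside until the final measurement.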

Decision function. A function that computes a score for each instance; the classifier assigns the positive class when the score exceeds a threshold and the negative class otherwise.
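
One way to picture this is a score followed by a threshold test. The sketch below assumes a linear score; the weights, bias, and instances are made up for illustration.

```python
def decision_function(weights, bias, features):
    """Return a raw score: weighted sum of the features plus a bias."""
    return sum(w * x for w, x in zip(weights, features)) + bias

def predict(score, threshold=0.0):
    """Assign the positive class (True) when the score exceeds the threshold."""
    return score > threshold

# Hypothetical model and instances, chosen only for illustration.
weights, bias = [2.0, -1.0], -0.5
instances = [[1.0, 0.5], [0.0, 1.0]]

scores = [decision_function(weights, bias, x) for x in instances]
labels = [predict(s) for s in scores]
```

Here the first instance scores 1.0 and is labeled positive; the second scores -1.5 and is labeled negative.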

Binary Classifier: distinguishes between two classes, A or not A.

Multiclass classifier, or multinomial classifier: assigns each instance to one of more than two classes ( A, B, C, ... ).

Performance Measures: Is the classifier good or not?

Cross-Validation, cv

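K-Fold. Split the training set into K groups (folds); train on K-1 folds and validate on the remaining one, repeating K times so that each fold is used once for validation.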
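
A minimal sketch of how K-Fold could partition the training set, assuming contiguous, unshuffled folds. The helper k_fold_indices is hypothetical and only produces index lists; fitting and scoring a model on each split is left out.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, validation_indices) once per fold."""
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        validation = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, validation
        start += size

folds = list(k_fold_indices(n_samples=6, k=3))
# folds[0] is ([2, 3, 4, 5], [0, 1]): train on the last four, validate on the first two.
```

Averaging the K validation scores gives a steadier estimate than a single split.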

Cross-validation (statistics) - Wikipedia

Skewed dataset. When some classes are much more frequent than others. Accuracy is then generally not the preferred performance measure: a classifier that always predicts the most frequent class gets high accuracy without detecting anything.
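
A small worked example of why accuracy misleads here, assuming a made-up dataset with 10 positive and 90 negative instances and a classifier that always answers "negative".

```python
# Hypothetical skewed labels: 10 positives, 90 negatives.
labels = [True] * 10 + [False] * 90

# A "classifier" that ignores its input and always predicts the majority class.
predictions = [False] * len(labels)

correct = sum(p == y for p, y in zip(predictions, labels))
accuracy = correct / len(labels)

# Number of actual positives the classifier managed to find.
found_positives = sum(p and y for p, y in zip(predictions, labels))
```

The accuracy comes out at 0.9 even though not a single positive instance is found.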

Confusion Matrix. The general idea is to count the number of times instances of class A are classified as class B. For a binary classifier the matrix has four cells:

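False positive (FP): a negative instance classified as positive
False negative (FN): a positive instance classified as negative
True positive (TP): a positive instance classified as positive
True negative (TN): a negative instance classified as negative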
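
The four counts can be tallied directly from actual and predicted labels. This is a sketch for the binary case only; the function name confusion_counts and the sample labels are illustrative.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for boolean actual and predicted labels."""
    tp = fp = fn = tn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted and actual:
            tp += 1   # positive, classified as positive
        elif predicted and not actual:
            fp += 1   # negative, classified as positive
        elif actual:
            fn += 1   # positive, classified as negative
        else:
            tn += 1   # negative, classified as negative
    return tp, fp, fn, tn

# Hypothetical labels: 3 actual positives, 5 actual negatives.
y_true = [True, True, True, False, False, False, False, False]
y_pred = [True, True, False, True, False, False, False, False]
tp, fp, fn, tn = confusion_counts(y_true, y_pred)
```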

Precision | Accuracy of the positive predictions.

TP / (TP + FP)

Recall | Sensitivity, or True Positive Rate.

TP / (TP + FN)

True Negative Rate | Specificity, or TNR.

TN / (TN + FP)
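
The three ratios can be computed from the confusion-matrix counts using their standard definitions: precision = TP / (TP + FP), recall = TP / (TP + FN), specificity = TN / (TN + FP). The counts below are invented, and returning 0.0 for an empty denominator is an assumption of this sketch rather than a fixed convention.

```python
def precision(tp, fp):
    """Share of positive predictions that are correct."""
    return tp / (tp + fp) if (tp + fp) else 0.0

def recall(tp, fn):
    """Share of actual positives that are detected (True Positive Rate)."""
    return tp / (tp + fn) if (tp + fn) else 0.0

def specificity(tn, fp):
    """Share of actual negatives that are rejected (True Negative Rate)."""
    return tn / (tn + fp) if (tn + fp) else 0.0

# Hypothetical counts, for illustration only.
tp, fp, fn, tn = 3, 1, 2, 4
p = precision(tp, fp)    # 3 / 4
r = recall(tp, fn)       # 3 / 5
s = specificity(tn, fp)  # 4 / 5
```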

Precision/Recall Trade-off. Increasing precision reduces recall and vice versa. Plot the precision/recall curve, that is precision and recall at each decision threshold, to choose the threshold that best suits the task.
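
The trade-off can be seen by sweeping the threshold over a handful of scores. The scores and labels below are made up, and treating precision as 1.0 when nothing is predicted positive is an assumption of this sketch.

```python
def precision_recall_at(scores, labels, threshold):
    """Return (precision, recall) when predicting positive for score >= threshold."""
    predicted = [s >= threshold for s in scores]
    tp = sum(p and y for p, y in zip(predicted, labels))
    fp = sum(p and not y for p, y in zip(predicted, labels))
    fn = sum((not p) and y for p, y in zip(predicted, labels))
    precision = tp / (tp + fp) if (tp + fp) else 1.0  # assumed convention
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall

# Hypothetical decision scores and actual labels.
scores = [0.1, 0.4, 0.35, 0.8, 0.65, 0.9]
labels = [False, False, True, True, False, True]

# One (threshold, precision, recall) point per threshold.
curve = [(t, *precision_recall_at(scores, labels, t)) for t in (0.3, 0.5, 0.7)]
```

With these numbers, raising the threshold from 0.3 to 0.7 lifts precision from 0.6 to 1.0 while recall drops from 1.0 to 2/3. Plotting such points over many thresholds gives the precision/recall curve.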