Info
The dataset should be downloaded automatically when you load the libraries.
In this exercise, you will experiment with neural networks for multiclass classification of clothing images using PyTorch. The goal is to experiment with multilayer perceptron (MLP) and convolutional neural network (CNN) architectures and, in particular, to see how network topology and optimisation strategies affect performance. Your main tasks will be to:
Additionally, this page revises evaluation metrics for multiclass classification.
This exercise uses the FashionMNIST dataset, which contains images of clothing articles (from Zalando). Each image is a $28\times 28$ pixel grayscale image labelled with one of ten clothing classes. The dataset provides $60{,}000$ training samples and $10{,}000$ test samples. A subset of the images, sorted by class, is shown in Figure 1.
As you will see, the FashionMNIST dataset is sufficiently small for teaching purposes, but it is not a trivial dataset to work with.
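For orientation, a dataset with these properties is commonly loaded through `torchvision`. The sketch below is an assumption about how that might look, not the loading code used by the exercise scripts; the function name `load_fashion_mnist` and the `root="data"` directory are hypothetical.

```python
# Class names in label order (0-9), as published with FashionMNIST.
CLASS_NAMES = [
    "T-shirt/top", "Trouser", "Pullover", "Dress", "Coat",
    "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot",
]


def load_fashion_mnist(root="data"):
    """Return (train_set, test_set); files are downloaded to `root` on first use."""
    # Imported lazily so this module can be read without torchvision installed.
    from torchvision import datasets, transforms

    # Convert each PIL image to a float tensor of shape (1, 28, 28) in [0, 1].
    to_tensor = transforms.ToTensor()
    train_set = datasets.FashionMNIST(root, train=True, download=True, transform=to_tensor)
    test_set = datasets.FashionMNIST(root, train=False, download=True, transform=to_tensor)
    return train_set, test_set
```

Calling `load_fashion_mnist()` returns two indexable datasets; each item is an `(image, label)` pair, and `CLASS_NAMES[label]` gives the readable class name.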
Most of the code is contained in Python scripts. Refer to the docstrings whenever you are in doubt.
The file `networks.py` contains a selection of neural architectures with different topologies. Inspect the predefined networks in the file.
`PyTorchTrainer` in `trainers.py` is the class used for performing training and evaluation. Inspect the source code.
The class `MetricLogger` in `metrics.py` contains methods for calculating the evaluation metrics:

- `reset()` sets the entries of the confusion matrix to zero.
- `log(predicted, target)` adds a log entry to the confusion matrix based on the predicted and target values.

The `one_hot` argument in the constructor is needed since Scikit-learn provides numerical predictions while PyTorch provides one-hot encoded predictions.

This exercise will use evaluation metrics for multiclass classification. The metrics are defined in terms of the confusion matrix $C$. Figure 2 shows the $10\times 10$ confusion matrix $C$ for FashionMNIST using a support vector machine. The true class is given on the x-axis and the predicted class on the y-axis, so $C_{i,j}$ counts the samples of true class $j$ that were predicted as class $i$.
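The bookkeeping behind `reset()` and `log(predicted, target)` can be sketched in a few lines. This is a minimal illustration and not the `MetricLogger` from `metrics.py`: the class name `ConfusionLogger` is hypothetical, targets are assumed to be integer class indices, and with `one_hot=True` each prediction is assumed to be a vector of per-class scores that is reduced to the index of its largest entry.

```python
class ConfusionLogger:
    """Hypothetical stand-in for a confusion-matrix logger (not the course's MetricLogger)."""

    def __init__(self, num_classes, one_hot=True):
        self.num_classes = num_classes
        self.one_hot = one_hot  # True: predictions are per-class score vectors
        self.reset()

    def reset(self):
        # Rows index the predicted class, columns the true class.
        self.matrix = [[0] * self.num_classes for _ in range(self.num_classes)]

    def log(self, predicted, target):
        for pred, true_label in zip(predicted, target):
            if self.one_hot:
                # Reduce a score vector to the index of its largest entry.
                pred = max(range(self.num_classes), key=pred.__getitem__)
            self.matrix[pred][true_label] += 1


# Two samples: scores favouring class 1 (true class 1) and class 0 (true class 2).
logger = ConfusionLogger(num_classes=3, one_hot=True)
logger.log([[0.1, 0.7, 0.2], [0.9, 0.05, 0.05]], [1, 2])
print(logger.matrix)
```

With `one_hot=False` the same `log` call accepts integer predictions directly, which matches the case the text describes for Scikit-learn.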
Below is a description of the evaluation metrics. Accuracy is a single number for the whole classifier, whereas precision and recall for multiclass classification are vectors with one entry for each class $i$.
Accuracy: The ratio of correct predictions to the total number of samples.
$$ \text{accuracy} = \frac{\sum_{i=1}^{10} C_{i,i}}{\sum_{i=1}^{10}\sum_{j=1}^{10} C_{i, j}} $$
Precision: The ratio of correct predictions for class $i$ to the total number of predictions for that class.
$$ \text{precision}_i = \frac{C_{i, i}}{\sum_{j=1}^{10} C_{i, j}} $$
Recall: The ratio of correct predictions for class $i$ to the number of samples belonging to that class.
$$ \text{recall}_i = \frac{C_{i, i}}{\sum_{j=1}^{10} C_{j, i}} $$
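The three formulas translate directly into code. The sketch below uses a small made-up $3\times 3$ matrix instead of the $10\times 10$ FashionMNIST one, with rows as predicted classes and columns as true classes, as in the formulas above. The function names are illustrative, and returning `0.0` for a class with an empty row or column is a choice made here, not something the exercise prescribes.

```python
def accuracy(C):
    """Sum of the diagonal divided by the sum of all entries."""
    correct = sum(C[i][i] for i in range(len(C)))
    total = sum(sum(row) for row in C)
    return correct / total


def precision(C):
    """Per class: diagonal entry over its row sum (all predictions of class i)."""
    result = []
    for i in range(len(C)):
        predicted_as_i = sum(C[i])
        result.append(C[i][i] / predicted_as_i if predicted_as_i else 0.0)
    return result


def recall(C):
    """Per class: diagonal entry over its column sum (all samples of true class i)."""
    result = []
    for i in range(len(C)):
        truly_i = sum(C[j][i] for j in range(len(C)))
        result.append(C[i][i] / truly_i if truly_i else 0.0)
    return result


# Made-up counts for illustration only: rows = predicted, columns = true.
C = [
    [5, 1, 0],
    [2, 6, 1],
    [0, 1, 4],
]
print(accuracy(C))   # one number for the whole classifier
print(precision(C))  # one value per class
print(recall(C))     # one value per class
```

Note that precision and recall differ only in whether the denominator is a row sum or a column sum, so swapping the axis convention of the confusion matrix swaps the two metrics.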