In this experiment, I “invert” a simple two-layer MNIST model to visualize what the final hidden layer representations look like when projected back into the original sample space.
[Note 2017/03/05: At the time of writing this post, I did not know what an autoencoder was.]
Model Setup
This is a fully-connected model with two hidden layers of 100 neurons each. The model also contains inverse weight matrices (w2_inv and w1_inv) that are trained after the fact by minimizing the L1 difference (x_inv_similarity) between the inverse projection of a sample and the original sample.
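In outline, the model looks something like this. This is a minimal sketch in TensorFlow 1.x style; the exact initializers, activations, and learning rates are assumptions, but the shapes (784 → 100 → 100 → 10) and the inverse weights w2_inv and w1_inv match the description above.

```python
import tensorflow as tf

x = tf.placeholder(tf.float32, [None, 784])   # flattened 28x28 images
y = tf.placeholder(tf.float32, [None, 10])    # one-hot labels

# Forward (classifier) weights: 784 -> 100 -> 100 -> 10
w1 = tf.Variable(tf.truncated_normal([784, 100], stddev=0.1))
b1 = tf.Variable(tf.zeros([100]))
w2 = tf.Variable(tf.truncated_normal([100, 100], stddev=0.1))
b2 = tf.Variable(tf.zeros([100]))
w3 = tf.Variable(tf.truncated_normal([100, 10], stddev=0.1))
b3 = tf.Variable(tf.zeros([10]))

h1 = tf.nn.relu(tf.matmul(x, w1) + b1)
h2 = tf.nn.relu(tf.matmul(h1, w2) + b2)   # final hidden representation
logits = tf.matmul(h2, w3) + b3
y_prob = tf.nn.softmax(logits)

# Inverse weights: project the final hidden representation back to pixel space.
w2_inv = tf.Variable(tf.truncated_normal([100, 100], stddev=0.1))
w1_inv = tf.Variable(tf.truncated_normal([100, 784], stddev=0.1))

h1_inv = tf.nn.relu(tf.matmul(h2, w2_inv))
x_inv = tf.nn.sigmoid(tf.matmul(h1_inv, w1_inv))

# L1 difference between the inverse projection and the original sample.
x_inv_similarity = tf.reduce_mean(tf.abs(x_inv - x))

# Separate training ops: one for the classifier, one for the inverse weights.
class_loss = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(labels=y, logits=logits))
train_model = tf.train.AdamOptimizer(1e-3).minimize(
    class_loss, var_list=[w1, b1, w2, b2, w3, b3])
train_inverse = tf.train.AdamOptimizer(1e-3).minimize(
    x_inv_similarity, var_list=[w2_inv, w1_inv])

accuracy = tf.reduce_mean(
    tf.cast(tf.equal(tf.argmax(y_prob, 1), tf.argmax(y, 1)), tf.float32))
```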
Training the Model
First, we train the model, and then we train the inverse operations. The model achieves an accuracy of about 95%. Because we don't want to confuse the inverse training with badly classified samples, we train the inverse operations using only samples that the model itself is confident it has classified correctly. This reduces noise in the inverse projections.
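A sketch of the two training phases, continuing from the model above. The MNIST loader from the TF 1.x tutorials, the batch size, the step counts, and the 0.95 confidence threshold are illustrative assumptions.

```python
from tensorflow.examples.tutorials.mnist import input_data
import numpy as np

mnist = input_data.read_data_sets("MNIST_data", one_hot=True)
sess = tf.Session()
sess.run(tf.global_variables_initializer())

# 1) Train the classifier.
for _ in range(5000):
    batch_x, batch_y = mnist.train.next_batch(100)
    sess.run(train_model, {x: batch_x, y: batch_y})
print("test accuracy:", sess.run(accuracy, {x: mnist.test.images,
                                            y: mnist.test.labels}))

# 2) Train the inverse weights, using only samples the trained model
#    classifies correctly with high confidence.
for _ in range(5000):
    batch_x, batch_y = mnist.train.next_batch(100)
    probs = sess.run(y_prob, {x: batch_x})
    confident = (probs.max(axis=1) > 0.95) & \
                (probs.argmax(axis=1) == batch_y.argmax(axis=1))
    if confident.any():
        sess.run(train_inverse, {x: batch_x[confident]})
```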
Visualizing Inverse Projections
We now show a visual comparison of the first 36 test samples and their inverse projections.
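First we compute the inverse projections of the first 36 test images (a sketch, continuing from the training code above):

```python
# Run the test samples through the network and back out through the inverse.
samples = mnist.test.images[:36]
projections = sess.run(x_inv, {x: samples})
```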
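Then we plot the originals next to their inverse projections. The matplotlib grid below is an assumption about the original plotting code:

```python
import matplotlib.pyplot as plt

def show_grid(images, title):
    # Render a 6x6 grid of 28x28 grayscale images.
    fig, axes = plt.subplots(6, 6, figsize=(6, 6))
    fig.suptitle(title)
    for img, ax in zip(images, axes.flat):
        ax.imshow(img.reshape(28, 28), cmap="gray")
        ax.axis("off")
    plt.show()

show_grid(samples, "Original test samples")
show_grid(projections, "Inverse projections")
```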
I think the most interesting thing about this is how the model completely transforms the misclassified digits. For example, the 9th sample and the 3rd-to-last sample are each transformed into a 6.
It’s also interesting that the inverse projections are somewhat “idealized” versions of each digit. For example, the orientations of the inversely projected 3s and 9s and the stroke width of the inversely projected 0s are now all the same.
Generating Samples
Here we generate samples of digits 1-9 by first optimizing a hidden representation so that the neural network is confident the representation is of a specific class, and then outputting its inverse projection.
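A sketch of the generation step, continuing from the graph above. We optimize one hidden representation per digit, reusing the trained output weights (w3, b3) and the inverse weights; the optimizer, initialization, and step count are illustrative assumptions.

```python
# One trainable hidden representation per digit 1-9, kept non-negative like h2.
h_opt = tf.Variable(tf.random_uniform([9, 100], 0.0, 1.0))
targets = tf.constant(np.eye(10, dtype=np.float32)[1:])  # one-hot for 1..9

h_opt_relu = tf.nn.relu(h_opt)
logits_opt = tf.matmul(h_opt_relu, w3) + b3
probs_opt = tf.nn.softmax(logits_opt)
opt_loss = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(labels=targets, logits=logits_opt))

# Plain gradient descent so only h_opt itself needs (re-)initialization.
train_h = tf.train.GradientDescentOptimizer(0.5).minimize(
    opt_loss, var_list=[h_opt])

# Inverse projection of the optimized representations.
x_opt_inv = tf.nn.sigmoid(
    tf.matmul(tf.nn.relu(tf.matmul(h_opt_relu, w2_inv)), w1_inv))

sess.run(h_opt.initializer)
for _ in range(1000):
    sess.run(train_h)
```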
Here we see that the network is over 99.5% confident that each of these hidden representations is a good representation of its target class. Below that, we see their inverse projections.
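A sketch of printing the per-class confidences and rendering the inverse projections of the generated representations (continuing from the code above):

```python
# Per-class confidence of each optimized representation, then its projection.
probs, imgs = sess.run([probs_opt, x_opt_inv])
print(np.round(probs.max(axis=1), 4))

fig, axes = plt.subplots(3, 3, figsize=(6, 6))
for i, ax in enumerate(axes.flat):
    ax.imshow(imgs[i].reshape(28, 28), cmap="gray")
    ax.set_title(str(i + 1))
    ax.axis("off")
plt.show()
```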
A bit noisy, but it works!
Visualizing Features
We will now show the inverse projection of each of the 100 features of the hidden representation, to get an idea of what the neural network has learned. Unfortunately, the noise is overwhelming, but we can sort of make out shadows of the learned features.
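A sketch of the feature visualization: each row of a 100×100 identity matrix is fed in as a hidden representation, and its inverse projection is plotted in a 10×10 grid (feeding an intermediate tensor such as h2 directly is allowed in TF 1.x).

```python
# One-hot hidden representations, one per feature, pushed through the inverse.
feature_imgs = sess.run(x_inv, {h2: np.eye(100, dtype=np.float32)})

fig, axes = plt.subplots(10, 10, figsize=(10, 10))
for i, ax in enumerate(axes.flat):
    ax.imshow(feature_imgs[i].reshape(28, 28), cmap="gray")
    ax.axis("off")
plt.show()
```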