## My artificial neural networks are not human

In the first two chapter write-ups of Michael Nielsen's book, I've commented a couple of times on the parallels between the neural networks on the computer in front of me and the workings of human brains. Part of this might be well-justified – we do really have lots of neurons in our brains that interact with one another, and the artificial networks I've learned how to code may be correctly abstracting some of those biological processes, and hence correctly recognising handwritten digits.

On the other hand, I think that the terms *neural* and *neuron* prime me to make these comparisons. Would I be so quick to compare the code's performance to my own if it were called "many-matrix non-linear optimisation"? I'm not so sure.

In this post I describe a couple of ways in which the networks I've trained, which correctly classify about 98% of the digits in the MNIST test set, show some decidedly non-human behaviour. I'll tell this story in the approximate chronological order that I worked through it, instead of a more logical order; eventually I meander my way to reproducing some of Nguyen et al.'s paper on this subject.

### PCA

I don't often use principal component analysis, and my understanding of it is weak. But I know that it exists, and I expected that it would be a reasonable way to classify the MNIST digits, thereby giving me a benchmark to compare the neural networks against. The idea is to treat each of the 784 pixels as a random variable, and then calculate the (784 × 784) covariance matrix using the training images. The eigenvectors of this matrix are the principal components of the data, and the hope is that the first "few" (in descending order of the corresponding eigenvalues) PC's explain most of the variance of the sample data.

(Usually you convert the sample data to z-scores before calculating the covariance matrix: for each pixel, you subtract the mean of that pixel over the training images, and then divide by the standard deviation of that pixel. I did the first bit but not the second because I was lazy and R's prcomp function spat at me, complaining that one of the pixels had a standard deviation of zero and refusing to divide by it. Unrelatedly, note that since the covariance matrix is symmetric, the eigenvectors are real and orthogonal.)
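A minimal sketch of the centred-but-unscaled PCA described above. It assumes a small random matrix as a stand-in for the MNIST training images (the sizes and the always-blank first pixel are made up for illustration), and it is not the code used for the pictures in this post.

```r
# Toy stand-in for the training matrix: one row per image, one column per pixel.
# (Real MNIST would be 60000 x 784; these sizes are assumptions to keep it fast.)
set.seed(1)
n_images <- 200
n_pixels <- 16
X <- matrix(runif(n_images * n_pixels), nrow = n_images)
X[, 1] <- 0  # a pixel that is always blank, so its standard deviation is zero

# Subtract each pixel's mean over the images, but do not divide by its sd.
X_centred <- sweep(X, 2, colMeans(X))

# Covariance matrix (n_pixels x n_pixels) and its eigen-decomposition.
C <- cov(X_centred)
eig <- eigen(C, symmetric = TRUE)
pcs <- eig$vectors  # columns are the principal components, largest eigenvalue first

# Cumulative fraction of the variance explained by the first k components.
explained <- cumsum(eig$values) / sum(eig$values)

# The same decomposition via prcomp, centring but not scaling.
p <- prcomp(X, center = TRUE, scale. = FALSE)

# Asking prcomp to scale fails because of the zero-variance pixel.
scaled_attempt <- try(prcomp(X, scale. = TRUE), silent = TRUE)

# Gram matrix of the components: the identity if they are orthonormal.
gram <- crossprod(pcs)
```

Here `p$sdev^2` should agree with `eig$values`, and `gram` should be the identity matrix up to rounding error.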

What do these eigenvectors look like? Well, since the dot product of any pair of them is zero, they usually have some negative entries. But re-scaled so that the entries all lie in [0, 1], the first few eigen-digits are ghostly but recognisable:

I'd like to delve into these principal digits, but that can wait for a separate post (so far I've got to 70% classification accuracy with them). These pictures made me wonder: what would a digit created (somehow) from a neural network look like?

### Neural network digits

The first idea I had was to start with a random input image, and use the back-propagation idea to gradient-descend the input pixels while holding constant the weights and biases of the network. I wasn't sure what would happen, and had three guesses:

- The algorithm wouldn't converge because the pixel values are constrained to lie in [0, 1] and I wasn't going to work out how to handle that properly. Instead I was just going to calculate the gradient as though the pixel values could take any real value, and just truncate to [0, 1] at each step. Alternatively, it wouldn't converge because the back-propagation only works well when it has the huge number of weights and biases to optimise over, instead of the relatively small number of input pixels.
- The algorithm would converge and generate wonderful idealised digits.
- The algorithm would converge and generate images barely distinguishable from random noise.
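The input-space gradient descent described above can be sketched roughly as follows. This assumes a tiny two-layer sigmoid network with random weights in place of the trained MNIST network, a quadratic cost, and the crude truncate-to-[0, 1] step; the names `net`, `input_gradient` and `descend_input` are hypothetical.

```r
sigmoid <- function(z) 1 / (1 + exp(-z))

# Hypothetical stand-in network: random weights, 16 input pixels, 10 outputs.
set.seed(42)
n_in <- 16; n_hidden <- 12; n_out <- 10
net <- list(
  W1 = matrix(rnorm(n_hidden * n_in), n_hidden, n_in),   b1 = rnorm(n_hidden),
  W2 = matrix(rnorm(n_out * n_hidden), n_out, n_hidden), b2 = rnorm(n_out)
)

# Forward pass, keeping the hidden activations for back-propagation.
forward <- function(net, x) {
  a1 <- drop(sigmoid(net$W1 %*% x + net$b1))
  a2 <- drop(sigmoid(net$W2 %*% a1 + net$b2))
  list(a1 = a1, a2 = a2)
}

quadratic_cost <- function(net, x, y) 0.5 * sum((forward(net, x)$a2 - y)^2)

# Back-propagate the error one step further than usual: past the first
# layer of weights, down to the input pixels themselves.
input_gradient <- function(net, x, y) {
  f <- forward(net, x)
  delta2 <- (f$a2 - y) * f$a2 * (1 - f$a2)
  delta1 <- drop(t(net$W2) %*% delta2) * f$a1 * (1 - f$a1)
  drop(t(net$W1) %*% delta1)
}

# Gradient-descend the pixels with the weights and biases held constant,
# truncating to [0, 1] after every step.
descend_input <- function(net, target_digit, steps = 500, eta = 0.5) {
  y <- as.numeric(0:9 == target_digit)  # one-hot wanted output
  x <- runif(ncol(net$W1))              # random starting image
  for (i in seq_len(steps)) {
    x <- x - eta * input_gradient(net, x, y)
    x <- pmin(pmax(x, 0), 1)
  }
  x
}

x_opt <- descend_input(net, target_digit = 7)
```

Nothing here guarantees convergence: the truncation step and the fixed learning rate `eta` are exactly the shortcuts listed in the first guess.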

The middle guess was partly inspired by Google's DeepDream blog post, which showed the neural nets generating, e.g., amusing pictures of dumbbells with arms attached. The last guess was informed by my memory of this Tumblr post, which (I learned after finding the post again) took the pictures from Nguyen et al., arXiv:1412.1897.

Reality was a mix of 1 and 3. Sometimes it didn't converge (to be fair, I didn't put much effort into making it work), and when it did converge, it was difficult to discern much in the resulting pictures. The following are the digits 0 to 9 created by this process. The network classifies nine of them as the desired digit with "probability" from 96% to 99.8%; one of them spectacularly failed to converge, with no output neuron firing past 12%. Which is the odd one out?

The '1' is the one that didn't converge (second from left in the top row). Of the rest, only the 7 stands out to me as plausibly a digit – and even there, it's cheating to say that it worked, because I forgot to invert the pictures, so it should be a light digit on a dark background, when instead it converged the other way.

If I'd read that Google post again, I'd have seen what it says about the approach I tried here: "By itself, that doesn't work very well". Instead I figured that the approach was doomed, and went to the Nguyen et al. paper to see what they had done (I was a bit annoyed at the lack of convergence – not just the '1', but the 96% or 98% cases weren't really close enough to 1 for my liking).

A quick skim of the paper revealed that, while also doing gradient descent, they used a genetic algorithm to generate noisy images that the network gave very confident classifications for. So I spent a couple of minutes on Wikipedia, then coded something that, although probably not properly implemented, is at least genetic-ish. It often returned images which the network classified with "probability" greater than 99.99%, which is the important thing.

(I'm typing this up a few days after generating these images, having since made many changes to my code, and I don't use a proper version control system. It looks to me like I might have started inverting the images so that they should be black ink on a white background when making the '2' and the '9' here. Also the base64 encoding of the '2' image looks like it has made it lighter still? These details aren't important for the main thread of this post though.)

None of these images is clearly recognisable as a digit, though most have some sort of visible structure; you can see a bit of a zero in the '0' image. Generally, though, the backgrounds are incredibly noisy. I thought of an idea: perhaps the backgrounds are noisy because usually the pixels near the edge of the image are plain white, so the network gains no information from those areas while training. As a result, any mishmash of pixel values away from the central area won't change the network's classifications much, and the random initialisation of the images in the genetic-y algorithm means that much of each final image is random noise.^{*} That doesn't explain why the central area is also quite noisy, but there *is* usually some discernible structure near the middle of the pictures above.

^{*}*I want to emphasise that this is totally a post-hoc just-so story, even if it happens to be true. I'd have been just as convinced by the opposite story: almost all the training images had plain white pixels near the edge, therefore the network will only give high-confidence classifications if the background is mostly plain white.*

I figured that if I averaged over many different output images for the same digit, the noisiness would cancel out and I'd see the essence of what the genetic-y algorithm was converging towards. For whatever reason, the '2's seemed to converge a lot faster than the others, so I asked for a thousand of them, and took the average:

That's... not obviously a '2', but you can sort of see it. Interestingly, while the background is the grey average across random noise, there are definitely whiter pixels near the middle of the image, contrasting with the drawn digit itself. (I haven't made the equivalent images for the other digits, because it'd take more computation time than I think is really warranted on this side-alley.)
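The averaging step can be illustrated with synthetic data. This sketch assumes each generated image is uniform noise plus one shared dark stroke; the stroke, the `template` and the `noisy_image` helper are invented for the example and are not real outputs of the genetic-y algorithm.

```r
set.seed(7)
side <- 28

# What the images have in common: a mid-grey background with one dark stroke.
template <- matrix(0.5, side, side)
template[8:20, 14] <- 0.1

# One synthetic "generated" image: uniform noise everywhere, except that the
# stroke pixels are noisy but dark on average.
noisy_image <- function() {
  img <- matrix(runif(side * side), side, side)
  img[8:20, 14] <- runif(13, 0, 0.2)
  img
}

# A thousand images stacked into a 28 x 28 x 1000 array, then averaged
# pixel by pixel over the third dimension.
images <- replicate(1000, noisy_image())
mean_image <- rowMeans(images, dims = 2)
```

The background averages out to grey (about 0.5) while the stroke stays dark, which is the sort of cancellation hoped for above.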

To try to investigate what happens when background pixels get scrambled, I opened up the Ubuntu Paint-equivalent and drew a 2:

Just to check that all was working, I ran it through my network, and it classified it, with greater than 99.9% probability, as a 5. Well OK, maybe my mouse-writing isn't very good. So I drew a 9:

The network basically rejected this image. The highest output neuron was the '3', at about 2%. Now, my mouse-writing's not that bad, so in addition to seeing unrecognisably noisy images classified very strongly as digits, we've now also seen very clear digits either strongly misclassified or not classified.

What I think is going on with these hand-drawn images is that I drew to the edge of the image, instead of keeping a clear border like there is in most of the training images. Indeed, when I drew a smaller 2 (albeit a different style, but that shouldn't be an issue), the network called it correctly with "probability" greater than 99.99%:

I never spent much time investigating my original purpose along this line of thought (changing background pixels). As a single data point, though, the network's output layer fired at the 2 and 3 neurons at 46% and 22% respectively (others negligible) with this image as input:

### More failed images and code

Earlier I truncated a quote from the Google blog post on making images from neural networks. "By itself, that doesn't work very well, but it does if we impose a prior constraint that the image should have similar statistics to natural images, such as neighboring pixels needing to be correlated."

I haven't got this to work very well. My genetic-y algorithm, now informed by the need to get the overall stats right, goes like this:

- Initialise a population of images by drawing pixel values randomly from the approximate desired distribution.
- Calculate the cost function for each image. The cost function is the sum of abs(output - wanted_output), plus a sum of pixel-neighbour terms, comparing pixel-neighbour statistics to the desired values. (These pixel-neighbours are only calculated along rows and columns of the image, which I think explains some artefacts in the images below.)
- Pick the best few images from the current population and randomly generate a new population by breeding them. Pixels for the new images will either be a random selection from the two parents, or a random value drawn from the desired distribution of pixel values.
- Loop.
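The loop above might be sketched like this. It is a toy illustration under several assumptions rather than the code behind the images below: a random single-layer classifier stands in for the trained network, the "desired" pixel distribution and neighbour statistics (`draw_pixels`, `target_stats`) are invented numbers, only lag-1 neighbours are used, and the best few images are carried over unchanged into the next generation.

```r
set.seed(3)
side <- 8                 # toy image size; MNIST images are 28 x 28
n_pix <- side * side

# Hypothetical stand-in for the trained network.
W <- matrix(rnorm(10 * n_pix), 10, n_pix)
b <- rnorm(10)
network_output <- function(img) drop(1 / (1 + exp(-(W %*% as.vector(img) + b))))

# Assumed pixel-value distribution: mostly 0, some 1, a little mid-grey.
draw_pixels <- function(n) {
  sample(c(0, 0.5, 1), n, replace = TRUE, prob = c(0.8, 0.05, 0.15))
}

# Mean absolute difference between adjacent pixels, along rows and columns only.
neighbour_stats <- function(img) {
  c(x = mean(abs(img[, -1] - img[, -ncol(img)])),
    y = mean(abs(img[-1, ] - img[-nrow(img), ])))
}
target_stats <- c(x = 0.1, y = 0.1)  # assumed desired values

# Cost: classification error plus mismatch in the pixel-neighbour statistics.
cost <- function(img, wanted) {
  sum(abs(network_output(img) - wanted)) +
    sum(abs(neighbour_stats(img) - target_stats))
}

# Each child pixel comes from one parent at random, or occasionally from a
# fresh draw from the desired pixel distribution.
breed <- function(p1, p2, mutation_rate = 0.02) {
  child <- ifelse(runif(n_pix) < 0.5, as.vector(p1), as.vector(p2))
  mutate <- runif(n_pix) < mutation_rate
  child[mutate] <- draw_pixels(sum(mutate))
  matrix(child, side, side)
}

evolve <- function(digit, pop_size = 40, n_keep = 5, generations = 60) {
  wanted <- as.numeric(0:9 == digit)
  pop <- lapply(seq_len(pop_size), function(i) matrix(draw_pixels(n_pix), side, side))
  best_cost <- numeric(generations)
  for (g in seq_len(generations)) {
    costs <- vapply(pop, cost, numeric(1), wanted = wanted)
    elite <- pop[order(costs)[seq_len(n_keep)]]  # the best few images
    best_cost[g] <- min(costs)
    children <- lapply(seq_len(pop_size - n_keep), function(i) {
      parents <- sample(n_keep, 2)
      breed(elite[[parents[1]]], elite[[parents[2]]])
    })
    pop <- c(elite, children)  # elite kept, so the best cost cannot get worse
  }
  list(image = elite[[1]], best_cost = best_cost)
}

result <- evolve(2)
```

Carrying the elite over is a design choice of this sketch; it makes `result$best_cost` non-increasing from one generation to the next, at the price of less diversity in the population.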

Here are some 2's, allegedly:

And here is the mean of 1000 such 2's:

That one's actually not so bad if you zoom in. The biggest problem is that the right-hand part of the arc extends vertically too far (presumably caused by me trying to enforce pixel-neighbour correlations irrespective of location), and the horizontal stroke at the bottom also extends too far.

It was an interesting exercise working through this, even though the results are pretty useless. It was really boring to write this up though.

Here's the R code to calculate the histogram and pixel-neighbour stats (the latter are inspired by the variograms of my day job, though I take absolute values rather than squares). The `g_x` and `g_y` are the lists which hold these stats. I haven't tidied this code up, and the histograms are stored as a matrix rather than a list.