Two hidden layers with parametrizable size. Two possible transfer
functions, defaulting to ReLU for now.
Initialize weights and biases randomly. This gives totally random
classifications of course, but at least makes sure that the data
structures and computations work.
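
A minimal sketch of this setup in plain Python, not the actual code from
this change. The names `init_network`, `forward`, `relu` and `sigmoid`,
the weight range, and the choice of sigmoid as the second transfer
function are all assumptions made for illustration.

```python
import math
import random


def relu(x):
    """Rectified linear unit: pass positives through, clamp negatives to 0."""
    return max(0.0, x)


def sigmoid(x):
    """Logistic function, written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def init_layer(n_in, n_out, rng):
    """Return (weights, biases) drawn uniformly from [-1, 1] (assumed range)."""
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(n_in)]
               for _ in range(n_out)]
    biases = [rng.uniform(-1.0, 1.0) for _ in range(n_out)]
    return weights, biases


def init_network(n_inputs, hidden1, hidden2, n_outputs, seed=None):
    """Build input -> hidden1 -> hidden2 -> output with random parameters."""
    rng = random.Random(seed)
    sizes = [n_inputs, hidden1, hidden2, n_outputs]
    return [init_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]


def forward(network, inputs, transfer=relu):
    """Propagate one input vector through every layer.

    The same transfer function is applied to all layers, including the
    output layer; this is a simplification for the sketch.
    """
    activations = inputs
    for weights, biases in network:
        activations = [
            transfer(sum(w * a for w, a in zip(row, activations)) + b)
            for row, b in zip(weights, biases)
        ]
    return activations
```

Calling `forward(init_network(784, 32, 16, 10), pixels)` would return ten
output values for a 784-pixel input; the layer sizes here are only
placeholders.
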
Also add a function that classifies the test images and counts the
correct ones. Without training, about 10% of the samples are expected
to be right by pure chance.
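
The counting step could look roughly like the sketch below. It assumes
ten classes (which the 10% figure implies) and uses a hypothetical
`random_classifier` that returns uniformly random scores as a stand-in
for the untrained network; `count_correct` and `argmax` are likewise
illustrative names, not the ones used in the actual code.

```python
import random


def argmax(values):
    """Index of the largest value (first one wins on ties)."""
    return max(range(len(values)), key=lambda i: values[i])


def count_correct(classify, images, labels):
    """Count images whose highest-scoring output matches the label."""
    return sum(1 for image, label in zip(images, labels)
               if argmax(classify(image)) == label)


# Stand-in for an untrained network: ten random scores per image.
rng = random.Random(0)


def random_classifier(image):
    return [rng.random() for _ in range(10)]


# Dummy test set: tiny blank images with evenly distributed labels.
images = [[0.0] * 4 for _ in range(2000)]
labels = [i % 10 for i in range(2000)]

correct = count_correct(random_classifier, images, labels)
print(f"{correct} of {len(labels)} correct "
      f"({100.0 * correct / len(labels):.1f}%)")
```

With random scores and balanced labels the printed rate should land
near 10%, which makes this a cheap sanity check before any training
code exists.
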