Building a Number Detecting Neural Network

Riya Kumar
5 min read · Mar 19, 2019


Robot doing homework?

What if I could get a robot to do my math homework? It'd have to be able to do a few things: detect the numbers in my homework, detect the operations in each equation, and pick out any variables. So I built a number detector, which recognizes handwritten digits. Using a dataset of handwritten digits that comes with the Keras library, I built a deep neural network model and trained it to detect the numbers in the images!

When building a neural network that can detect handwritten numbers, there are a few layers to both the process and the network. It's called a deep neural network because it has more than one layer. I'll explain the layers later when I get into the code, but first, how does it work?

A basic model for deep neural nets

Since our neural network has multiple layers and neurons, it has multiple weights and biases too. In a simple neural network with just one neuron, each input is multiplied by its weight, the results are summed together with a bias, and the sum goes through an activation function. An activation function takes that sum and returns a value indicating whether, and how strongly, the neuron fires. In a deep neural network there are multiple layers, so each neuron passes the result of its activation function on to the next layer, all the way until the output layer is reached.
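To make that concrete, here's a minimal plain-Python sketch of one neuron and a tiny feedforward pass. Everything in it is an assumption for illustration: the sigmoid activation, the function names, and the made-up weights and inputs are not taken from the model in this post.

```python
import math

def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def neuron(inputs, weights, bias):
    """Multiply each input by its weight, add the bias, then activate."""
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    return sigmoid(total)

def layer(inputs, weight_rows, biases):
    """A layer is several neurons that all see the same inputs."""
    return [neuron(inputs, w, b) for w, b in zip(weight_rows, biases)]

# Made-up numbers, purely for illustration.
inputs = [0.5, 0.1, 0.9]
hidden = layer(inputs, [[0.4, -0.2, 0.1], [-0.3, 0.8, 0.5]], [0.0, 0.1])

# The hidden layer's outputs become the inputs of the next neuron.
output = neuron(hidden, [1.0, -1.0], 0.0)
print(hidden, output)
```

The hidden layer's results feed straight into the output neuron, which is all "feedforward" means here.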

So the code!

I started off by importing TensorFlow, an open-source library that gives us access to a bunch of models and algorithms. I then defined the variable mnist as the dataset we're importing, which is a very basic dataset of handwritten numbers from 0 to 9. Since the images are already formatted to be 28x28, we don't need to resize them and can go right into the next step, which is splitting the data into x_train, y_train and x_test, y_test. x_train holds the pixel values of the images being fed into the neural network, while y_train holds the actual number each image shows, like a 3 or a 9. x_test and y_test are the parts of the data I've set aside to test the model with.

An example of the pixel values, or what x_train would be equal to.

Since the values in x_train range from 0 to 255, it's best for us to normalize the data. That means scaling the data down to a small range, such as 0 to 1 or -1 to 1.

In our case, we can use the normalize function to bring both x_train and x_test to numbers between 0 and 1, so our data starts looking like this:

x_train after being normalized!
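As a rough illustration of what scaling to the 0 to 1 range means, here's a small plain-Python sketch. It assumes simple division by 255 (the largest pixel value); the normalize function used in this post may compute its scaling differently, and `scale_pixels` and `tiny_image` are hypothetical names.

```python
def scale_pixels(image):
    """Divide every pixel by 255 so all values land between 0 and 1."""
    return [[pixel / 255.0 for pixel in row] for row in image]

# A made-up 2x3 "image" standing in for a 28x28 digit.
tiny_image = [[0, 128, 255], [64, 32, 0]]
scaled = scale_pixels(tiny_image)
print(scaled[0])
```

The darkest pixel stays at 0, the brightest becomes 1, and everything else falls in between.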

To start the model off, I used a sequential model, which is essentially a feedforward neural network: data flows from one layer to the next, and there's no going backwards. Each image is currently 28 by 28, and I need it to be 1 by 784 to feed into our neural network. Using the flatten function built into the Keras library, I can do that in one step!
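Here's a plain-Python sketch of what flattening does to a single image. It isn't the Keras layer itself, just the same idea written by hand, and the `flatten` helper is a hypothetical name.

```python
def flatten(image):
    """Turn a list of rows into one long list, row after row."""
    return [pixel for row in image for pixel in row]

# A blank 28x28 image with a single bright pixel at row 3, column 5.
image = [[0.0] * 28 for _ in range(28)]
image[3][5] = 1.0

flat = flatten(image)
print(len(flat))  # prints 784
```

The pixel at row 3, column 5 ends up at position 3 * 28 + 5 = 89 in the flattened list, so no information is lost, only the 2D layout.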

Since the data is now flattened, and there's a basic model set up, I can add a hidden layer to it. I used a Dense layer, which is fully connected: each node is connected to every node in the layer before it and the layer after it. The 128 indicates that it has 128 units, and the activation function is defined as relu, which is a useful default. The last line is the output layer, which has 10 nodes (one for each possible number that the image could be classified as) and uses the softmax activation function. Softmax turns the 10 outputs into probabilities that add up to 1, so instead of a hard yes or no for each digit, we get how likely the image is to be each one.
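The two activation functions mentioned above are short enough to write by hand. This is a plain-Python sketch, not the Keras implementation, and the 10 scores are made up for illustration.

```python
import math

def relu(z):
    """Pass positive values through unchanged and clip negatives to 0."""
    return max(0.0, z)

def softmax(scores):
    """Turn a list of raw scores into probabilities that sum to 1."""
    biggest = max(scores)
    # Subtracting the largest score avoids overflow without changing the result.
    exps = [math.exp(s - biggest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Ten made-up output scores, one per digit 0 to 9.
scores = [0.1, -1.2, 0.3, 0.0, -0.5, 0.2, -2.0, 3.1, 0.4, -0.7]
probs = softmax(scores)
print(round(sum(probs), 6))
print(probs.index(max(probs)))  # prints 7
```

The largest score (position 7 here) gets the largest probability, but every other digit still gets a small share.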

To optimize the model, I used the Adam optimizer, which is a popular and easy-to-use default for neural networks. I chose cross-entropy as the loss function, which suits classification problems. The loss matters because the network is trained to minimize loss rather than to directly maximize accuracy.
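To show why cross-entropy works as a loss, here's a small sketch for a single image. It assumes the label is given as a plain digit and the prediction is a list of 10 probabilities; the exact Keras loss variant used in this post isn't shown, so treat `cross_entropy` as a hypothetical helper.

```python
import math

def cross_entropy(probs, true_label):
    """Loss for one example: the negative log of the probability given to the right answer."""
    return -math.log(probs[true_label])

# A made-up prediction that is 91% sure the image is a 7.
probs = [0.01] * 7 + [0.91] + [0.01] * 2

print(cross_entropy(probs, 7))  # small loss: the model was right and confident
print(cross_entropy(probs, 3))  # large loss: the model gave a 3 only a 1% chance
```

The more probability the network puts on the correct digit, the closer the loss gets to 0, which is exactly what training pushes towards.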

To train the model, I ran the training data through it 3 times, so it went through 3 epochs. With each epoch, the accuracy improved and the loss went down.
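An epoch is just one full pass over the training data. The toy sketch below shows the loss dropping over 3 epochs, under heavy assumptions: it trains a one-weight model with plain gradient descent on four made-up points, not the digit network and not the Adam optimizer.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def mean_loss(w, b, xs, ys):
    """Average cross-entropy loss over the whole dataset."""
    total = 0.0
    for x, y in zip(xs, ys):
        p = sigmoid(w * x + b)
        total += -math.log(p) if y == 1 else -math.log(1.0 - p)
    return total / len(xs)

# Toy data: negative numbers are class 0, positive numbers are class 1.
xs = [-2.0, -1.0, 1.0, 2.0]
ys = [0, 0, 1, 1]

w, b, learning_rate = 0.0, 0.0, 0.5
losses = []
for epoch in range(3):
    # One epoch: look at every example once, then nudge the weight and bias.
    grad_w = sum((sigmoid(w * x + b) - y) * x for x, y in zip(xs, ys)) / len(xs)
    grad_b = sum(sigmoid(w * x + b) - y for x, y in zip(xs, ys)) / len(xs)
    w -= learning_rate * grad_w
    b -= learning_rate * grad_b
    losses.append(mean_loss(w, b, xs, ys))
    print("epoch", epoch + 1, "loss", round(losses[-1], 4))
```

Each pass moves the weight a little further in the direction that lowers the loss, which is the same pattern you see in the training output of a real model.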

But to make sure the model didn't overfit and simply memorize the data it got, I had to test it with out-of-sample data. This is data that wasn't used when training the model, and it shows whether the model picked up patterns (generalized) or simply memorized the training data (overfit). So I used the x_test and y_test sets to test the model.
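Here's a toy sketch of why out-of-sample data matters. The task (label a number 1 if it's even, 0 if it's odd), the two "models", and all the names are made up for illustration and have nothing to do with the digit network itself.

```python
def accuracy(predict, examples):
    """Fraction of examples where the prediction matches the label."""
    correct = sum(1 for x, label in examples if predict(x) == label)
    return correct / len(examples)

train = [(2, 1), (3, 0), (4, 1), (7, 0)]
test = [(5, 0), (6, 1), (8, 1), (9, 0)]  # never seen during "training"

memory = dict(train)

def memorizer(x):
    """Overfit model: looks up the answer it saw, otherwise guesses 0."""
    return memory.get(x, 0)

def rule(x):
    """Generalizing model: it has learned the actual pattern."""
    return 1 if x % 2 == 0 else 0

print(accuracy(memorizer, train), accuracy(memorizer, test))  # prints 1.0 0.5
print(accuracy(rule, train), accuracy(rule, test))            # prints 1.0 1.0
```

Both models look perfect on the training data. Only the test data reveals that one of them memorized instead of learning.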

To wrap up, I printed out the neural network's predictions for the test data, which gives us this:

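It doesn't make a lot of sense, does it? Each prediction is a list of 10 probabilities, one for each digit. To fix that, I put the predictions variable through an argmax function, which returns the position of the highest probability, to give us a more comprehensible result.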
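Here's what argmax does, sketched in plain Python. The probabilities are made up for illustration and the `argmax` helper is written by hand rather than taken from a library.

```python
def argmax(values):
    """Return the position of the largest value in the list."""
    return max(range(len(values)), key=lambda i: values[i])

# A made-up prediction: 10 probabilities, one per digit 0 to 9.
probs = [0.01] * 7 + [0.91] + [0.01] * 2

predicted_digit = argmax(probs)
print(predicted_digit)  # prints 7
```

Because position 0 stands for the digit 0, position 1 for the digit 1, and so on, the position of the biggest probability is the predicted digit.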

And ta-da! The prediction is that the number is 7. Below that, I put in 2 lines of code to display the image that the neural network classified, and it is indeed a 7!

Key Takeaways

Building a number-classifying neural network is a simpler task than it seems, and it can be applied to many things, ranging from a bot that can do your math homework to a text-to-speech bot that can aid those with visual impairments. On top of that, the model used in my neural network can be altered slightly and reused for other image-classifying neural networks, and the possible applications for image classifiers are endless!
