Exploring Neural Networks, the Black Box

         In this blog, I'm going to share some of the insights I gained from the deeplearning.ai specialization on Coursera by Prof. Andrew Ng. The specialization consists of five separate courses: neural networks, hyperparameter tuning, structuring machine learning projects, convolutional neural networks, and sequence models. I am not going to give a blow-by-blow account of the course. Instead, I aim to describe the neural network in terms of its fundamental concepts and how they help to build a better model (as far as I know). At the end, I want to add a hack for exploring the architecture of any pre-built neural network model.


OK, enough with the intro, let's get started.


Why is deep learning such a hype in the 21st century?

In recent times, "deep learning" has created a buzz in the AI industry. There are many points in favor of neural networks performing better than traditional methods. They are getting more hype because they can face industrial problems without being explicitly spoon-fed features: neural networks have the power of extracting features layer by layer, which resembles human intelligence, and they can perform various tasks based on the data given. But neural networks are also prone to issues like overfitting and the need for large amounts of data and processing power. Still, alternative algorithms and methods are being developed day by day, and these problems are being overcome over time.


What is a Neural Network?

The neural network was developed from the idea of resembling the structure and functions of the human brain. I want to start with a pictorial representation for better understanding.

This is how neural networks work to solve various tasks. As the picture above shows, a neural network has the power of prediction based on the input data. We can do things like object detection, letter recognition, speech recognition, language translation, music generation, and a lot more. There is a lot more to learn about the neural network itself (the black box in the middle), which is challenging both to create and to explore.


A common neural network architecture looks like this:
                               
where the first layer is the input layer, where we pass the data into the neural network, and the last one is the output layer, where we get the predicted output. The layers in between the two are hidden layers, where the operations are performed on the data. It is something similar to

                                               x[n]  => Hidden layers => y 

Here x[n] represents the inputs and y is the predicted output.

We all have the question: "Really, what are the hidden layers doing?"

Let's see how a neural network works.

The circles in the picture above are called neurons. Each neuron is associated with some values. Those values are called weights, or trainable parameters, and they are randomly initialized in the network. To get the value of y, various calculations are performed in the neurons.

In short, it will be like 

                                             y = x (inputs) * weights + bias
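
To make this concrete, here is a minimal sketch of that computation for a single neuron using NumPy (the feature values are made up purely for illustration):

    import numpy as np

    # made-up input features x1..x4
    x = np.array([1200.0, 3.0, 5.0, 2.0])

    # trainable parameters, randomly initialized
    weights = np.random.randn(4)
    bias = np.random.randn()

    # the core computation of a neuron: weighted sum of the inputs plus the bias
    y = np.dot(x, weights) + bias
    print(y)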



Coming back to the question of what the hidden layers are actually doing, let's take a use case for a proper explanation: for the problem of predicting a house price (y), the input features will be the square feet of the house (x1), the number of bedrooms (x2), the distance in km from the city (x3), and the nearby showrooms (x4).

The weights play a very major role in the computations. A weight decides how much the input influences the output (in other words, it represents the strength of the connection between two neurons when values are passed between them).

If the weight is near zero, the input feature has hardly any impact on the output. E.g., nearby showrooms do not affect the price of the house.

If the weight is large, the output depends strongly on the input feature. E.g., the number of bedrooms.

If the weight is negative, decreasing the input feature value increases the output value. E.g., when the distance from the city decreases, the price of the house increases.

The bias input is usually fixed at 1 (with its own trainable weight), and it makes sure that even if every other input to a neuron is zero, the neuron can still pass some activation to the next layer.
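
Putting the house-price example together, here is a minimal sketch with hand-picked weights (purely hypothetical numbers, chosen only to illustrate the signs and magnitudes described above):

    import numpy as np

    # hypothetical features: sqft, bedrooms, km from city, nearby showrooms
    x = np.array([1500.0, 3.0, 10.0, 4.0])

    # hand-picked weights for illustration (in a real network these are learned):
    # positive for sqft, large positive for bedrooms,
    # negative for distance, near zero for showrooms
    w = np.array([100.0, 20000.0, -5000.0, 0.01])
    bias = 10000.0

    price = np.dot(x, w) + bias
    print(price)  # 1500*100 + 3*20000 + 10*(-5000) + 4*0.01 + 10000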

Optimization:

        We cannot expect the neural network to give correct predictions right away, because the weights are initialized randomly. To get to the optimal solution, we calculate a loss function, which determines how far the predicted value (y^) is from the original value (y). This is called the loss (error) function. To measure the overall loss, we average the loss over all the outputs, which gives the cost function. While training, we keep updating the weights to bring the predicted values closer to the original values (using the derivatives of the cost function with respect to the weights and bias).
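
As a rough sketch of what this optimization loop looks like in practice, here is gradient descent on a one-weight linear model with a mean squared error cost (the data and learning rate are made up for illustration):

    import numpy as np

    # made-up training data, roughly y = 2x + 1
    x = np.array([1.0, 2.0, 3.0, 4.0])
    y = np.array([3.1, 4.9, 7.2, 9.0])

    w, b = np.random.randn(), np.random.randn()  # random initialization
    lr = 0.01                                    # learning rate

    for epoch in range(1000):
        y_hat = w * x + b                  # predicted values
        cost = np.mean((y_hat - y) ** 2)   # average (squared) loss over all outputs
        dw = np.mean(2 * (y_hat - y) * x)  # derivative of the cost w.r.t. the weight
        db = np.mean(2 * (y_hat - y))      # derivative of the cost w.r.t. the bias
        w -= lr * dw                       # update step: move w toward the optimum
        b -= lr * db

    print(w, b, cost)  # w and b should end up close to 2 and 1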

We can see all these concepts in the picture below.


This is my simple view of neural network fundamentals.

Fine, we will move on to the final, interesting part: how to explore the neural network, the black box.

Exploring the black box:
        For the last several years, I have wondered how god really created humans in the first place, and whether he had any design and dimensions for how to create the bones, parts, joints, etc. Likewise, neural networks make me wonder how things are going on inside them. Then one day I saw an interesting thing on the internet for viewing the architecture of any model file, which answered something I had long wondered about: what is the purpose of creating all those .h5 files?
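
As an aside, if you want a .h5 file of your own to experiment with, here is a minimal sketch, assuming you have Keras/TensorFlow installed, that saves a tiny toy model in that format:

    from tensorflow import keras

    # a tiny toy model, just to produce a .h5 file to inspect
    model = keras.Sequential([
        keras.layers.Dense(8, activation="relu", input_shape=(4,)),
        keras.layers.Dense(1),
    ])
    model.save("tiny_model.h5")  # open this file in Netron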

There is an app called Netron, where we can open the .h5 file (other kinds of similar model files are supported there too). We can open the model file there, as in the picture below.


Then it will give the architecture of the model, like this:



By using this, we can find out which architecture (like VGG16, VGG19, ResNet, etc.) the model was built with.
One more thing: we can see what a kernel is doing by clicking the + icon on the right side and saving it as a NumPy array.



Then, with a Python script, we can view that kernel visually as an image, such as a vertical filter, a horizontal filter, etc.
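
Here is a minimal sketch of such a script (kernel.npy stands in for whatever file name you exported from Netron, and the indexing assumes a typical 4-D convolution kernel); the full code is on my GitHub page, linked below:

    import numpy as np
    import matplotlib.pyplot as plt

    # load the kernel exported from Netron as a NumPy array
    kernel = np.load("kernel.npy")
    print(kernel.shape)  # e.g. (height, width, in_channels, out_channels)

    # show the first 2-D filter as a grayscale image
    plt.imshow(kernel[:, :, 0, 0], cmap="gray")
    plt.title("Kernel as an image")
    plt.show()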



You can see the Python code for that on my GitHub page. Here is the link:
https://github.com/bagavathypriyanavaneethan/Data-science-Python/raw/master/netron%20img.py

OK, it's been a long journey. Thanks for reading patiently. If anyone knows any other way to visualize or see how neural networks work, share your ideas in the comments section. And let me know your thoughts or any concerns regarding this blog.

Regards,
Bagavathy Priya N





