Nearly everyone who is new to the field of AI starts with neural networks, so it seems fitting for me to start with a post about them. This is an intuitive, layman-level introduction to neural nets. It covers only the basics, and we’ll leave the math for later.
I have modeled this post around the questions I had as a beginner. Sometimes the smallest things take up the most time, and that is exactly what happened to me. Coming from a non-STEM background, I had difficulty comprehending how a neural network works, and starting with the math definitely did not help. I then took a step back, broke down the process and described it intuitively before tackling the math. So, here we go…
What are neural networks? What do they do?
An Artificial Neural Network (ANN), often simply called a neural network, is a network of units, also called neurons, that takes an input, performs a computation on it and produces an output. This output can be a category, such as yes/no or 0/1, or it can be a specific predicted value.
What are these inputs? How are they fed to an ANN?
The most commonly used example is the housing price example, where the network is trained to learn from various features and is expected to predict the price of a housing unit based on those features. These features are the input of our Neural Network. If the features are numerical values they are fed as they are, but if they are categorical values they are first encoded and then fed as input. This forms the first layer of our network and is called the input layer.
Note: Encoding and other forms of preprocessing are a separate topic. For now, it’s important to understand and remember that data is preprocessed and that one of the steps of preprocessing is encoding. We will talk about encoding and how it is done once we have a grasp of what an ANN actually is.
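To make the idea of an input layer a little more concrete, here is a minimal Python sketch of turning one housing record into a list of numbers. The feature names, the list of neighborhoods and the choice of one-hot encoding are all assumptions made for illustration. Real preprocessing usually involves more steps than this.

```python
# Hypothetical housing record: two numerical features and one categorical feature.
house = {"area_sqft": 1200.0, "bedrooms": 3.0, "neighborhood": "suburb"}

# Assumed list of possible categories (illustrative only).
NEIGHBORHOODS = ["city", "suburb", "rural"]

def one_hot(value, categories):
    """Return a list with 1.0 at the position of `value` and 0.0 elsewhere."""
    return [1.0 if value == category else 0.0 for category in categories]

def to_input_vector(record):
    """Numerical features pass through as they are; the categorical one is encoded."""
    numeric = [record["area_sqft"], record["bedrooms"]]
    return numeric + one_hot(record["neighborhood"], NEIGHBORHOODS)

input_layer = to_input_vector(house)
print(input_layer)  # [1200.0, 3.0, 0.0, 1.0, 0.0]
```

Each number in the resulting list corresponds to one unit in the input layer, so this record would need an input layer with five units.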
What happens after the input layer? How does the network learn?
After the input layer, a lot of computation happens on the input data, and this is where the “learning” happens. Weights are applied to the input as it passes through more layers, which are called the hidden layers. Once the data reaches the output, the network compares its predicted output, which is the price of the house in our example, to the actual price in the data. If there is a large difference between the prediction and the actual price, the network retraces its steps and adjusts its weights so that its next prediction lands closer to the actual price. Once the network can predict the price of a house accurately, new data that the network has never seen before is fed in to see whether the accuracy is maintained. If it is, our network is ready to predict!
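The predict, compare and adjust loop described above can be sketched in a few lines of Python. This is a deliberately simplified illustration and not a full neural network: it uses a single neuron with one weight and one bias and no hidden layers, the data is made up, and the learning rate and number of passes are assumed values. The weight update shown is plain gradient descent on the squared difference between prediction and actual price.

```python
# Made-up toy data: area (in 1000 sq ft) -> price (in $100k), following price = 2 * area + 1.
areas = [1.0, 2.0, 3.0, 4.0]
prices = [3.0, 5.0, 7.0, 9.0]

weight, bias = 0.0, 0.0  # the values the model has to learn
learning_rate = 0.05     # assumed step size for each adjustment

def predict(area, weight, bias):
    """Forward pass: apply the weight to the input and add the bias."""
    return weight * area + bias

for epoch in range(2000):
    grad_w = grad_b = 0.0
    for area, price in zip(areas, prices):
        # Compare the prediction with the actual price.
        error = predict(area, weight, bias) - price
        # Accumulate how much each parameter contributed to the squared error.
        grad_w += 2 * error * area / len(areas)
        grad_b += 2 * error / len(areas)
    # Adjust the parameters in the direction that reduces the error.
    weight -= learning_rate * grad_w
    bias -= learning_rate * grad_b

print(weight, bias)                # ends up very close to 2 and 1
print(predict(5.0, weight, bias))  # an area not in the training data; close to 11
```

A real network repeats the same idea across many weights and several layers, which is why the adjustment step needs a more systematic procedure than the two hand-written gradients here.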
Some things to keep in mind:
1- The number of hidden layers depends on the problem type and data size; it should be chosen carefully.
2- The process from input layer to output layer is called the feed-forward step.
3- The retracing and adjusting of weights from the output back towards the input layer is called the back-propagation step and is the most important component.
4- The data is divided into three parts: training, validation and testing. The network is trained on the training dataset, the validation dataset is used to check model performance and find discrepancies, and the test set is touched only when we are confident about our model’s performance.
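The three-way split from point 4 can be sketched as follows. The function name split_dataset, the 70/15/15 proportions and the fixed random seed are assumptions chosen for illustration. The right proportions depend on how much data you have.

```python
import random

def split_dataset(rows, train_frac=0.7, val_frac=0.15, seed=0):
    """Shuffle the rows, then cut them into training, validation and test parts."""
    rows = list(rows)                  # copy so the caller's data is not reordered
    random.Random(seed).shuffle(rows)  # fixed seed keeps the split reproducible
    n_train = round(len(rows) * train_frac)
    n_val = round(len(rows) * val_frac)
    train = rows[:n_train]
    val = rows[n_train:n_train + n_val]
    test = rows[n_train + n_val:]      # whatever remains is held back for the end
    return train, val, test

# Stand-in for 100 housing records.
records = list(range(100))
train, val, test = split_dataset(records)
print(len(train), len(val), len(test))  # 70 15 15
```

Shuffling before cutting matters because data is often stored in some order, for example by date or by neighborhood, and an unshuffled split could leave the test set looking very different from the training set.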
Lastly, why bother with a neural net?
A neural network has the ability to process a large amount of data and effectively learn from it. This learning doesn’t only lead to predictions and classifications but also offers solutions to bigger problems by means of a variety of network architectures which can be built upon or tweaked to suit our own needs.
Thank you for your time. All input is welcome, and constructive criticism is deeply appreciated.