Hello Everyone,
Welcome to the 36th edition of my newsletter ML & AI Cupcakes!
In this newsletter, I have listed 15 basic things you should know about neural networks.
Neural networks are inspired by the working of the human brain.
Each neural network is made up of three types of layers: an input layer, one or more hidden layers, and an output layer.
A two-step calculation process happens in each neuron. Step one is calculating the weighted sum of the input features and adding a bias to it. Step two is applying an activation function to the value obtained in step one.
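Here is a minimal NumPy sketch of that two-step calculation for a single neuron; the input values, weights, and bias are made-up numbers purely for illustration:

```python
import numpy as np

x = np.array([0.5, -1.2, 3.0])   # input features (made-up values)
w = np.array([0.4, 0.1, -0.6])   # weights of the neuron
b = 0.2                          # bias

# Step 1: weighted sum of the inputs plus the bias
z = np.dot(w, x) + b

# Step 2: apply an activation function (sigmoid here)
a = 1 / (1 + np.exp(-z))

print(z, a)
```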
The main power of neural networks comes from the use of activation functions. They help neural networks capture non-linear patterns in the data. Without them, neural networks are nothing more than a stack of linear regression models.
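To see that last point concretely, here is a small sketch (with made-up random matrices) showing that two linear layers stacked without an activation in between collapse into a single linear layer:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(4,))        # a toy input vector
W1 = rng.normal(size=(3, 4))     # first "layer" weights
W2 = rng.normal(size=(2, 3))     # second "layer" weights

# Two linear layers stacked without any activation...
out_stacked = W2 @ (W1 @ x)

# ...are equivalent to one combined linear layer
W_combined = W2 @ W1
out_single = W_combined @ x

print(np.allclose(out_stacked, out_single))  # True
```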
Weights and biases are the primary learnable parameters in a neural network. Their optimal values are obtained through an iterative training process.
Neural network training mainly involves four steps: forward pass, loss calculation, backward pass, and weights/biases update. These steps are repeated iteratively until a stopping criterion is met.
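Putting the four steps together, here is a minimal sketch of one such training loop for a single linear neuron with mean squared error; the data, learning rate, and number of epochs are all made-up choices for illustration:

```python
import numpy as np

# Made-up toy data: 8 samples, 2 features, a roughly linear target
rng = np.random.default_rng(1)
X = rng.normal(size=(8, 2))
y = X @ np.array([2.0, -1.0]) + 0.5

w = np.zeros(2)   # weights (learnable)
b = 0.0           # bias (learnable)
lr = 0.1          # learning rate

for epoch in range(200):
    # 1. Forward pass: predictions with the current weights/bias
    y_pred = X @ w + b

    # 2. Loss calculation: mean squared error
    loss = np.mean((y_pred - y) ** 2)

    # 3. Backward pass: gradients of the loss w.r.t. w and b
    grad_w = 2 * X.T @ (y_pred - y) / len(y)
    grad_b = 2 * np.mean(y_pred - y)

    # 4. Update: gradient descent step
    w -= lr * grad_w
    b -= lr * grad_b

print(w, b, loss)
```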
The forward pass produces predictions using the current weights and biases.
Loss calculation is done using the chosen loss function, which measures the difference between the actual and predicted values.
The backward pass involves calculating gradients of the loss with respect to the weights and biases using the chain rule of differentiation.
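To make the chain rule concrete, here is a sketch of the gradient calculation for a single sigmoid neuron with a squared-error loss; all the numbers are made up:

```python
import numpy as np

x, w, b, t = 1.5, 0.8, -0.3, 1.0   # made-up input, weight, bias, target

# Forward pass
z = w * x + b                      # weighted sum plus bias
a = 1 / (1 + np.exp(-z))           # sigmoid activation
loss = (a - t) ** 2                # squared error

# Backward pass via the chain rule:
# dloss/dw = dloss/da * da/dz * dz/dw
dloss_da = 2 * (a - t)
da_dz = a * (1 - a)                # derivative of the sigmoid
dz_dw = x
dloss_dw = dloss_da * da_dz * dz_dw

dloss_db = dloss_da * da_dz * 1    # since dz/db = 1
print(dloss_dw, dloss_db)
```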
The most commonly used optimization algorithms for updating weights during backpropagation are Stochastic Gradient Descent and its variants such as RMSProp, AdaGrad, and Adam.
The learning rate is an important hyperparameter that determines the size of the update to the weights/biases. If it is too high, it may lead to divergence or instability. If it is too low, it may lead to slow convergence.
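A tiny sketch of that effect, using plain gradient descent on the simple function f(w) = w², with made-up learning rates and starting point:

```python
# Minimizing f(w) = w**2 with gradient descent (made-up settings)
def run(lr, steps=20, w=5.0):
    for _ in range(steps):
        grad = 2 * w          # derivative of w**2
        w = w - lr * grad     # gradient-descent update
    return w

print(run(lr=0.01))   # too low: after 20 steps, still far from the minimum at 0
print(run(lr=0.1))    # reasonable: close to 0
print(run(lr=1.1))    # too high: |w| blows up, i.e. divergence
```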
Regularization techniques like L1, L2, and dropout help reduce overfitting in neural networks.
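As a rough sketch of how two of these are typically applied (the penalty strength and dropout rate below are made-up choices, not recommendations):

```python
import numpy as np

rng = np.random.default_rng(2)
w = rng.normal(size=5)           # some layer's weights
data_loss = 0.8                  # pretend loss from the data (made up)

# L2 regularization: add a penalty on large weights to the loss
lam = 0.01                       # regularization strength
total_loss = data_loss + lam * np.sum(w ** 2)

# Dropout: randomly zero out activations during training
a = rng.normal(size=5)           # some layer's activations
keep_prob = 0.8
mask = rng.random(5) < keep_prob
a_dropped = np.where(mask, a / keep_prob, 0.0)  # "inverted dropout" scaling

print(total_loss, a_dropped)
```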
Epochs and iterations are not the same in neural networks. Check out their differences below:
https://kavitagupta.substack.com/p/epochs-and-iterations-in-neural-networks
Neural networks are universal function approximators. This means that, with enough hidden units, they can approximate any continuous function arbitrarily well.
The choice of activation function is crucial. The sigmoid function often leads to the “vanishing gradient” problem, and ReLU can lead to the “dying neurons” problem. Select better alternatives based on the problem at hand.
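One commonly used alternative is Leaky ReLU, which keeps a small slope for negative inputs so neurons don't “die”; a minimal sketch with made-up pre-activation values:

```python
import numpy as np

z = np.array([-2.0, -0.5, 0.0, 1.5])       # made-up pre-activations

relu = np.maximum(0.0, z)                   # gradient is 0 wherever z < 0
leaky_relu = np.where(z > 0, z, 0.01 * z)   # small non-zero slope for z < 0

print(relu, leaky_relu)
```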
What would you like to add to the list?
Writing each newsletter takes a lot of research, time, and effort. I just want to make sure it reaches as many people as possible to help them grow in their AI/ML journey.
It would be great if you could share this newsletter with your network.
Also, please let me know your feedback and suggestions in the comments section. That will help me keep going. Even a “like” on my posts tells me that they are helpful to you.
See you soon!
-Kavita