In machine learning, the goal of optimization is to make better models. This goal is achieved by adjusting model parameters iteratively to minimize errors or maximize performance.
Now, what is meant by making better models?
In simple words, it means reducing the gap between predicted values and actual values. We want our predictions to be as close to the actual values as possible. This helps the model capture meaningful patterns in the data and improves its generalization capability (performance on unseen data).
Gradient Descent is a common name when it comes to optimization algorithms in machine learning. This algorithm always steals the spotlight, but there is another, lesser-known player in this domain: the Newton Raphson method.
Before we dig deeper into the Newton Raphson method, I just wanted to share the link to a detailed post on Gradient Descent. It explains the components and workings of Gradient Descent in a very simple manner. Check it out if you are interested.
Newton Raphson is an iterative numerical method for approximating the roots of a real-valued function. It is widely used when we need to find the roots of higher-degree polynomials (generally of degree greater than 3) or transcendental (non-algebraic) equations.
The idea behind Newton Raphson is simple. Start with an initial guess. Draw the tangent at that point. Note where this tangent cuts the x-axis. This intersection becomes the next point in the iteration, and it will be closer to the actual root. Keep repeating this process until a stopping criterion is met.
Steps for Newton Raphson Method
1). Select an initial value, say x0, using Bolzano's theorem.
2). If f(x0) = 0, then x0 is a root. Otherwise calculate
x1 = x0 - f(x0)/f'(x0), such that f'(x0) ≠ 0
3). If f(x1) = 0, then x1 is a root. Otherwise calculate
x2 = x1 - f(x1)/f'(x1), such that f'(x1) ≠ 0
4). Keep repeating the same process until a stopping criterion is met.
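The steps above can be sketched in a few lines of Python. This is a minimal illustrative helper, not a production routine; the function `newton_raphson` and the example f(x) = x³ - 2 are my own choices for demonstration.

```python
def newton_raphson(f, df, x0, tol=1e-8, max_iter=50):
    """Approximate a root of f, given its derivative df and an initial guess x0."""
    x = x0
    for _ in range(max_iter):
        fx = f(x)
        if abs(fx) < tol:      # stopping criterion: f(x) is close enough to 0
            return x
        dfx = df(x)
        if dfx == 0:           # tangent is horizontal: the method fails here
            raise ZeroDivisionError("f'(x) = 0; try another initial value")
        x = x - fx / dfx       # the update from step 2: x_{n+1} = x_n - f(x_n)/f'(x_n)
    return x

# Example: approximate the cube root of 2 as the root of x^3 - 2
root = newton_raphson(lambda x: x**3 - 2, lambda x: 3 * x**2, x0=1.5)
```

Starting from x0 = 1.5, the iterates converge to the cube root of 2 in just a few steps, which shows the fast (quadratic) convergence of the method near a root.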
The Newton Raphson method fails to converge if, at some point, the derivative of the function is zero while the value of the function at that point is not zero (clear from the formula in step 2). In that case, we may need to start over with another initial value and repeat the process.
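A concrete (made-up) illustration of this failure: f(x) = x² + 1 at x0 = 0 has f'(0) = 0 but f(0) = 1 ≠ 0, so the tangent is horizontal and never crosses the x-axis.

```python
f = lambda x: x**2 + 1   # f(0) = 1, not zero
df = lambda x: 2 * x     # f'(0) = 0: the tangent at x0 = 0 is horizontal
x0 = 0.0

# The update x0 - f(x0)/df(x0) would divide by zero here,
# so we must restart from a different initial value.
step_is_defined = df(x0) != 0
```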
How is Newton Raphson used in machine learning?
Take the example of Logistic Regression.
Logistic Regression is a standard algorithm to solve classification problems. The parameters of the algorithm are found by maximizing the log-likelihood function.
In order to find the critical points (candidates for maxima) of the log-likelihood function, its derivative is set equal to zero. The roots of this equation give the optimal values of the parameters, and the Newton Raphson method can be used to find them.
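Here is a small sketch of that idea using NumPy. This is an illustrative implementation of Newton's method applied to the gradient of the logistic-regression log-likelihood, not how sklearn does it internally; the function name `logistic_newton` and the toy data are assumptions for the example.

```python
import numpy as np

def logistic_newton(X, y, n_iter=25):
    """Fit logistic regression weights by applying Newton's method
    to the gradient of the log-likelihood."""
    w = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ w))   # predicted probabilities
        gradient = X.T @ (y - p)           # gradient of the log-likelihood
        W = np.diag(p * (1 - p))           # weights for the Hessian
        hessian = X.T @ W @ X              # (negative of the) second derivative
        w = w + np.linalg.solve(hessian, gradient)  # Newton step
    return w

# Toy, non-separable data: an intercept column plus one feature
X = np.array([[1.0, x] for x in [-2, -1, -0.5, 0.5, 1, 2]])
y = np.array([0, 0, 1, 0, 1, 1])
w = logistic_newton(X, y)
```

At convergence the gradient is (numerically) zero, which is exactly the "derivative set equal to zero" condition described above. In the statistics literature this scheme is known as iteratively reweighted least squares (IRLS).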
Why is learning about Newton Raphson important?
You may ask why we need to worry about so much mathematical detail when there are already inbuilt libraries like sklearn in Python that implement logistic regression directly.
That is a fair question, but there are situations where knowing the mathematical workings of an algorithm helps.
Suppose you are sitting in a machine learning interview and the interviewer asks you to explain logistic regression from scratch using pen and paper.
If you know how Newton Raphson works, you can use it to maximize the log-likelihood function in logistic regression.
A fundamental understanding of an algorithm always increases your chances of selection in an interview.
So, now you have two optimization algorithms in your pocket to tackle machine learning problems using pen and paper: Gradient Descent and Newton Raphson.
Curious about a specific ML topic? Let me know in comments.
Also, please share your feedback and suggestions. That will help me keep going.
See you next Friday!
-Kavita
โThere is no end to education. It is not that you read a book, pass an examination, and finish with education. The whole of life, from the moment you are born to the moment you die, is a process of learning.โ โ J. Krishnamurti
P.S. Letโs grow our tribe. Know someone who is curious to dive into ML and AI? Share this newsletter with them and invite them to be a part of this exciting learning journey.