Part 2: Lagrange Multipliers & Optimization
website: https://eliteaihub.com/
We can find the maximum or minimum of a multivariable function subject to a constraint using Lagrange multipliers. Understanding Lagrange multipliers is essential for solving constrained optimisation problems, such as the one that arises in SVMs.
Lagrange Multipliers?
A contour plot is a way to visualise a three-dimensional function in two dimensions: for selected constant values of z, lines (called contours) are drawn connecting the (x, y) coordinates where that z value occurs. Let's say we want to find the maximum value of the multivariate function shown below.
The gradient of a function can be visualised as a vector field, with each arrow pointing in the direction in which the function increases. In the image to the left, the arrows point outward: they are perpendicular to the contour lines, and they point in the direction where the function is increasing.
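These two properties of the gradient can be checked numerically. The sketch below uses a hypothetical example function f(x, y) = x² + y² (not taken from the article's figure) and central finite differences: a small step along the gradient increases f, while a step along the contour (perpendicular to the gradient) barely changes it.

```python
# Numerical sketch: the gradient points in the direction of increase and is
# perpendicular to the contour line. f here is a hypothetical example.

def f(x, y):
    return x**2 + y**2

def grad(fn, x, y, h=1e-6):
    """Central finite-difference gradient of fn at (x, y)."""
    gx = (fn(x + h, y) - fn(x - h, y)) / (2 * h)
    gy = (fn(x, y + h) - fn(x, y - h)) / (2 * h)
    return gx, gy

gx, gy = grad(f, 1.0, 2.0)            # analytic gradient at (1, 2) is (2, 4)
norm = (gx**2 + gy**2) ** 0.5
step = 1e-3

# Step along the gradient: f increases.
up = f(1.0 + step * gx / norm, 2.0 + step * gy / norm)
# Step along the contour (rotate the gradient 90 degrees): f is nearly unchanged.
along = f(1.0 + step * (-gy) / norm, 2.0 + step * gx / norm)

assert up > f(1.0, 2.0)
assert abs(along - f(1.0, 2.0)) < abs(up - f(1.0, 2.0))
print("gradient direction increases f; contour direction leaves f flat")
```

The same finite-difference helper works for any smooth f, so you can swap in your own function to build intuition.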
Let's understand this using an example. In the example below, f(x, y) is the function to optimise and g(x, y) is the constraint.
Draw some gradient vectors of the objective function (in blue) and some gradient vectors of the constraint function (in red). We can see that there is only one point where the two vectors point in the same direction: it is the minimum of the objective function under the constraint.
In order to find the minimum of the function, we need to find the points where
∇f(x, y) = λ∇g(x, y)

where λ is called the Lagrange multiplier. Rearranging,
∇f(x, y) − λ∇g(x, y) = 0 (1)

If we define the function

L(x, y, λ) = f(x, y) − λg(x, y) (2)

then its gradient is:

∇L(x, y, λ) = ∇f(x, y) − λ∇g(x, y) (3)
This function L is called the Lagrangian, and solving for the gradient of the Lagrangian (solving ∇L(x, y, λ) = 0) means finding the points where the gradients of f and g are parallel.
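Setting ∇L = 0 turns the constrained problem into a system of equations. The article's f and g appear only in a figure, so the sketch below assumes the common example f(x, y) = x² + y² with constraint g(x, y) = x + y − 1 = 0, which is consistent with the solution x = y = 1/2, λ = 1 quoted in this article. For this example the system happens to be linear, so NumPy can solve it directly:

```python
import numpy as np

# Assumed example (not from the article's figure): f(x, y) = x^2 + y^2,
# g(x, y) = x + y - 1. Setting each partial derivative of
# L(x, y, λ) = f - λ g to zero gives:
#   dL/dx: 2x      - λ = 0
#   dL/dy:      2y - λ = 0
#   dL/dλ: -(x + y - 1) = 0   (this is just the constraint again)
A = np.array([[2.0, 0.0, -1.0],
              [0.0, 2.0, -1.0],
              [1.0, 1.0,  0.0]])
b = np.array([0.0, 0.0, 1.0])

x, y, lam = np.linalg.solve(A, b)
print(x, y, lam)  # 0.5 0.5 1.0
```

Note that dL/dλ = 0 recovers the constraint itself, which is why the Lagrangian packages the objective and the constraint into a single unconstrained system.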
Solving the optimisation problem above, we find the solution:
x=1/2, y=1/2 and λ=1
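As a sanity check, we can verify numerically that the stated solution satisfies ∇f = λ∇g, again assuming the example f(x, y) = x² + y² with constraint g(x, y) = x + y − 1 (which matches the quoted values):

```python
# Verify ∇f = λ∇g at (1/2, 1/2) with λ = 1, for the assumed example
# f(x, y) = x^2 + y^2 and g(x, y) = x + y - 1.

def f(x, y):
    return x**2 + y**2

def g(x, y):
    return x + y - 1

def num_grad(fn, x, y, h=1e-6):
    """Central finite-difference gradient of fn at (x, y)."""
    return ((fn(x + h, y) - fn(x - h, y)) / (2 * h),
            (fn(x, y + h) - fn(x, y - h)) / (2 * h))

fx, fy = num_grad(f, 0.5, 0.5)   # analytic: (2x, 2y) = (1, 1)
gx, gy = num_grad(g, 0.5, 0.5)   # analytic: (1, 1)
lam = 1.0

assert abs(fx - lam * gx) < 1e-6 and abs(fy - lam * gy) < 1e-6
print("∇f = λ∇g holds at (1/2, 1/2) with λ = 1")
```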
Thanks for reading. Stay tuned for the next story, Part 3: SVM optimisation using Lagrange multipliers from scratch in Python.