Logistic Regression: Gradient Descent

Ashna Jain
Dec 22, 2020


Gradient Descent is the iterative process of finding the minimum cost on a Cost Function graph. The process relies on three quantities: the independent variable (x), the weight (w), and the learning rate (α).

Initially, the weight (w) for each independent variable (x) is set to 0. Then, the cost of the function is determined, and, based on the slope of the cost at the current weights, each weight (w) is updated. This process is repeated until the slope is very close to 0, at which point the cost can no longer be meaningfully reduced.

In order to determine how much to update each weight so that the cost of the function is minimized, the derivative of the cost function with respect to each weight is calculated (dL/dwj).
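Each weight is then moved a small step in the direction that lowers the cost. As a minimal sketch of the standard update rule, written with the learning rate α introduced above:

```latex
% One Gradient Descent step, repeated every iteration,
% for each weight w_j (alpha is the learning rate):
w_j := w_j - \alpha \cdot \frac{dL}{dw_j}
```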

Applying Gradient Descent for Training Datasets with Multiple Independent Variables:

Our goal during Gradient Descent is to find the derivative of the Loss function with respect to each weight (w), written dL/dwj and abbreviated ‘dwj’. By the chain rule in calculus, we must also solve for ‘dz’ and ‘da’ in order to calculate ‘dwj’ for each independent variable (x).

Applying the chain rule, we determine that dwj = (xj * dz) and db = dz.
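For concreteness, here is the chain-rule expansion that this shorthand abbreviates. It assumes the standard logistic regression definitions, which the post leaves implicit: z = Σ wj·xj + b, a = σ(z) (the sigmoid activation), and the cross-entropy loss L(a, y) = −[y·log(a) + (1−y)·log(1−a)].

```latex
% Chain rule: how the Loss changes with each weight
\frac{dL}{dw_j} = \frac{dL}{da} \cdot \frac{da}{dz} \cdot \frac{dz}{dw_j}

% The three factors, from the definitions above:
\frac{dL}{da} = -\frac{y}{a} + \frac{1-y}{1-a}, \qquad
\frac{da}{dz} = a\,(1-a), \qquad
\frac{dz}{dw_j} = x_j

% Multiplying the first two factors simplifies neatly:
dz = \frac{dL}{dz} = a - y, \qquad
dw_j = x_j \cdot dz, \qquad
db = dz
```

Notice that ‘dz’ collapses to the prediction error (a − y), which is why the final formulas for ‘dwj’ and ‘db’ are so compact.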

With this information, we can rewrite the derivative of the cost function to calculate how much each weight contributes to the total cost across all m training examples:
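```latex
% Averaging the per-example contributions over m training examples,
% where dz^{(i)} = a^{(i)} - y^{(i)} for the i-th example:
dw_j = \frac{1}{m} \sum_{i=1}^{m} x_j^{(i)} \cdot dz^{(i)}, \qquad
db = \frac{1}{m} \sum_{i=1}^{m} dz^{(i)}
```

As a minimal sketch of the whole procedure in NumPy (the function and variable names are illustrative, and the sigmoid activation and fixed iteration count are assumptions, since the post does not pin them down):

```python
import numpy as np

def sigmoid(z):
    # Squashes z = w.x + b into a probability between 0 and 1
    return 1.0 / (1.0 + np.exp(-z))

def train_logistic_regression(X, y, alpha=0.1, num_iters=1000):
    """Batch gradient descent for logistic regression.

    X: (m, n) array of m examples with n independent variables.
    y: (m,) array of 0/1 labels.
    alpha: learning rate.
    """
    m, n = X.shape
    w = np.zeros(n)  # weights start at 0, as described above
    b = 0.0

    for _ in range(num_iters):
        a = sigmoid(X @ w + b)   # predictions a = sigma(z)
        dz = a - y               # dL/dz for every example
        dw = (X.T @ dz) / m      # dwj = (1/m) * sum of xj * dz
        db = dz.mean()           # db  = (1/m) * sum of dz
        w -= alpha * dw          # update each weight: wj := wj - alpha * dwj
        b -= alpha * db
    return w, b
```

Each pass through the loop is one Gradient Descent step: ‘dw’ measures each weight’s share of the total cost, exactly as in the averaged formula above.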
