Logistic Regression: The Cost Function
What is the Cost Function? The Cost Function measures the performance of a model by quantifying the error between the predicted values and the true values. The bigger the loss, the further your predictions (ŷ) are from the true values (y). In deep learning, you use optimization algorithms such as Gradient Descent to train the model by minimizing the cost.
The Graph of Cost Functions:
In Data Science, the cost function should be minimized, so training drives the model toward the minimum of the graph. The minimum corresponds to the optimal weights (w) and bias (b) used to map the independent variables (x) to predictions.
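As a rough sketch of what that search looks like, the snippet below runs plain Gradient Descent on a one-feature logistic regression model to move (w, b) toward the minimum of the cost surface. The toy data, learning rate, and iteration count are made-up illustration values, and the gradient formulas assume the cross-entropy cost defined later in this section.

```python
import numpy as np

# Toy data: 4 examples, 1 feature (made-up values for illustration only)
x = np.array([0.5, 1.5, 2.0, 3.0])
y = np.array([0.0, 0.0, 1.0, 1.0])

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

w, b = 0.0, 0.0   # start somewhere on the cost surface
alpha = 0.1       # learning rate (assumed value)

for _ in range(1000):
    y_hat = sigmoid(w * x + b)       # predictions for all examples
    dw = np.mean((y_hat - y) * x)    # gradient of the cost w.r.t. w
    db = np.mean(y_hat - y)          # gradient of the cost w.r.t. b
    w -= alpha * dw                  # step downhill toward the minimum
    b -= alpha * db

print(w, b)  # (w, b) near the minimum of the cost surface
```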
A Primitive Cost Function:
A natural first attempt is the squared-error loss, L(y_hat, y) = ½(y_hat - y)². For logistic regression this does not guarantee that the resulting graph will be convex; it can instead produce a cost function with multiple local optima that Gradient Descent can get stuck in. So a different loss is used instead, one that is guaranteed to always produce a convex cost function: L(y_hat, y) = -(y·log(y_hat) + (1 - y)·log(1 - y_hat)).
Why the Loss Function equation makes sense:
- if y = 1: L(y_hat, y) = -log(y_hat) → in this case you want y_hat to be as close to 1 as possible, so that -log(y_hat) is as close to 0 as possible
- if y = 0: L(y_hat, y) = -log(1 - y_hat) → in this case you want y_hat to be as close to 0 as possible, so that -log(1 - y_hat) is as close to 0 as possible
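A small sketch of this behaviour, using the cross-entropy loss written above; the example values of y_hat are arbitrary:

```python
import numpy as np

def loss(y_hat, y):
    # Cross-entropy loss for a single example; reduces to
    # -log(y_hat) when y = 1 and -log(1 - y_hat) when y = 0.
    return -(y * np.log(y_hat) + (1 - y) * np.log(1 - y_hat))

# y = 1: loss shrinks as y_hat approaches 1
print(loss(0.99, 1))   # ~0.01 (good prediction, small loss)
print(loss(0.10, 1))   # ~2.30 (bad prediction, large loss)

# y = 0: loss shrinks as y_hat approaches 0
print(loss(0.01, 0))   # ~0.01
print(loss(0.90, 0))   # ~2.30
```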
The Loss Function vs The Cost Function: the Loss Function measures the error for a single training example, whereas the Cost Function is the average of the losses over the entire training dataset: J(w, b) = (1/m) · Σ L(y_hat(i), y(i)) over all m training examples.
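A sketch of the distinction, reusing the loss function from the previous snippet; the labels and predictions below are made-up illustration values:

```python
import numpy as np

def loss(y_hat, y):
    # Loss: error of a single prediction
    return -(y * np.log(y_hat) + (1 - y) * np.log(1 - y_hat))

def cost(y_hat, y):
    # Cost: average loss over all m training examples
    return np.mean(loss(y_hat, y))

y     = np.array([1.0, 0.0, 1.0, 0.0])   # true labels
y_hat = np.array([0.9, 0.2, 0.7, 0.4])   # model predictions (illustrative)

print(loss(y_hat, y))   # one loss value per training example
print(cost(y_hat, y))   # single number summarizing the whole dataset
```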