Logistic Regression: The Basics
Logistic Regression is a mathematical model used to estimate the probability (y_hat) of an event occurring. It is used on binary data, where the outcome is either 'yes' (1) or 'no' (0).
The sigmoid function ensures that no prediction (y_hat) exceeds 1 or falls below 0.
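The sigmoid described above can be sketched in a few lines (a minimal illustration, not from the original notes):

```python
import numpy as np

def sigmoid(z):
    # Squashes any real number into the open interval (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

# Large negative inputs approach 0, large positive inputs approach 1,
# and an input of exactly 0 maps to 0.5.
```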
Example: Given a picture, what is the probability (y_hat) that it is a picture of a cat?
In Machine Learning, data is represented as training examples (x, y), where x is the input and y is the expected output. The total number of training examples is denoted by the variable 'm'.
To make the data easier for the machine to process, all the x inputs and y outputs are separated and stacked into their own structures.
The x values are stacked column by column into an (nx by m) matrix X, and all the y values are placed in a (1 by m) row vector Y.
- nx represents the total number of independent variables (features) for each training example
- m represents the total number of training examples
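As a concrete sketch of these shapes (the feature values and labels below are made up for illustration):

```python
import numpy as np

# Hypothetical dataset: m = 4 training examples, nx = 3 features each
m, nx = 4, 3
rng = np.random.default_rng(0)

X = rng.random((nx, m))       # inputs stacked as columns: shape (nx, m)
Y = np.array([[1, 0, 1, 0]])  # labels in a row vector: shape (1, m)
```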
Notation for indexing the vectors:
- Superscript x^(i) = the ith training example (indexes horizontally, across columns)
- Subscript x_j = the jth independent variable (indexes vertically, down rows)
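In NumPy terms, the two index directions correspond to selecting a column or a row of X (a small illustrative example, not from the notes):

```python
import numpy as np

X = np.arange(12).reshape(3, 4)  # nx = 3 features, m = 4 examples

x_i = X[:, 1]  # superscript: the 2nd training example (one column)
x_j = X[2, :]  # subscript: the 3rd feature across all examples (one row)
```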
Logistic Regression requires 2 parameters:
- w : an nx dimensional vector
- b : a real number
The Logistic Regression Equation:
z = w^T x + b
y_hat = sigmoid(z) = 1 / (1 + e^(-z))
In the equations above, z represents a linear quantity, and the sigmoid of z is the activation function that squashes the output into the range 0 to 1.
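Putting the pieces together, a forward pass over all m examples at once can be sketched as follows (the dataset and the zero initialization of w and b are assumptions for illustration):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical data: nx = 3 features, m = 4 training examples
nx, m = 3, 4
rng = np.random.default_rng(1)
X = rng.random((nx, m))

w = np.zeros((nx, 1))  # parameter w: an nx-dimensional vector
b = 0.0                # parameter b: a real number

z = w.T @ X + b        # linear quantity, shape (1, m)
y_hat = sigmoid(z)     # one probability per training example
```

With w and b initialized to zero, every z is 0, so every predicted probability starts at 0.5; training would then adjust w and b to move the predictions toward the true labels.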