Primers • Partial Derivative of the Cost Function for Logistic Regression
- The partial derivative of the logistic regression cost function with respect to \(\theta_{j}\) is:
\[\frac{\partial J(\theta)}{\partial \theta_{j}}=\frac{1}{m} \sum_{i=1}^{m}\left(h_{\theta}\left(x^{(i)}\right)-y^{(i)}\right) x_{j}^{(i)}\]
- Let’s begin with the cost function used for logistic regression, which is the average of the log loss across all training examples, as given below:
\[J(\theta)=-\frac{1}{m} \sum_{i=1}^{m}\left[y^{(i)} \log \left(h_{\theta}\left(x^{(i)}\right)\right)+\left(1-y^{(i)}\right) \log \left(1-h_{\theta}\left(x^{(i)}\right)\right)\right]\]
- where the logs are natural logarithms and \(h_{\theta}(x)\) is the sigmoid (logistic) function, defined as:
\[h_{\theta}(x)=\frac{1}{1+e^{-\theta x}}\]
- We use the notation \(\theta x^{(i)}\) as shorthand for the inner product \(\theta^{\top} x^{(i)}=\sum_{j} \theta_{j} x_{j}^{(i)}\).
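- As a quick numerical companion to the definitions above, here is a minimal NumPy sketch (an illustration added for concreteness, not part of the derivation); the names `sigmoid`, `cost`, `X`, `y`, and `theta`, as well as the tiny dataset, are arbitrary choices:

```python
import numpy as np

def sigmoid(z):
    # h_theta(x) = 1 / (1 + exp(-theta x)), applied elementwise to z = X @ theta
    return 1.0 / (1.0 + np.exp(-z))

def cost(theta, X, y):
    # J(theta) = -(1/m) * sum_i [ y_i * log(h_i) + (1 - y_i) * log(1 - h_i) ]
    m = len(y)
    h = sigmoid(X @ theta)
    return -(1.0 / m) * np.sum(y * np.log(h) + (1 - y) * np.log(1 - h))

# Tiny made-up dataset: 4 examples, an intercept column plus 2 features.
X = np.array([[1.0,  0.5,  1.2],
              [1.0, -1.0,  0.3],
              [1.0,  2.0, -0.7],
              [1.0,  0.1,  0.9]])
y = np.array([1.0, 0.0, 1.0, 0.0])
theta = np.array([0.1, -0.2, 0.3])

print(cost(theta, X, y))  # prints the scalar average log loss for this theta
```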
- Since our original cost function is of the form given above, we first need to simplify the two log terms \(\log \left(h_{\theta}\left(x^{(i)}\right)\right)\) and \(\log \left(1-h_{\theta}\left(x^{(i)}\right)\right)\).
- Now,
\[\log \left(h_{\theta}\left(x^{(i)}\right)\right)=\log \left(\frac{1}{1+e^{-\theta x^{(i)}}}\right)=-\log \left(1+e^{-\theta x^{(i)}}\right)\]
and
\[\log \left(1-h_{\theta}\left(x^{(i)}\right)\right)=\log \left(\frac{e^{-\theta x^{(i)}}}{1+e^{-\theta x^{(i)}}}\right)=-\theta x^{(i)}-\log \left(1+e^{-\theta x^{(i)}}\right)\]
- Plugging in the two simplified expressions above into our original cost function, we obtain:
\[J(\theta)=-\frac{1}{m} \sum_{i=1}^{m}\left[-y^{(i)} \log \left(1+e^{-\theta x^{(i)}}\right)+\left(1-y^{(i)}\right)\left(-\theta x^{(i)}-\log \left(1+e^{-\theta x^{(i)}}\right)\right)\right]\]
which can be simplified to:
\[\boxed{J(\theta)=-\frac{1}{m} \sum_{i=1}^{m}\left[y^{(i)} \theta x^{(i)}-\theta x^{(i)}-\log \left(1+e^{-\theta x^{(i)}}\right)\right]=-\frac{1}{m} \sum_{i=1}^{m}\left[y^{(i)} \theta x^{(i)}-\log \left(1+e^{\theta x^{(i)}}\right)\right]}\]
- where the second equality follows from:
\[-\theta x^{(i)}-\log \left(1+e^{-\theta x^{(i)}}\right)=-\left[\log e^{\theta x^{(i)}}+\log \left(1+e^{-\theta x^{(i)}}\right)\right]=-\log \left(e^{\theta x^{(i)}}\left(1+e^{-\theta x^{(i)}}\right)\right)=-\log \left(1+e^{\theta x^{(i)}}\right)\]
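- The boxed simplification is easy to sanity-check numerically. The sketch below (again an illustrative addition with made-up data; all names are arbitrary) evaluates both the original log-loss form and the simplified form \(-\frac{1}{m} \sum_{i}\left[y^{(i)} \theta x^{(i)}-\log \left(1+e^{\theta x^{(i)}}\right)\right]\) and confirms they agree:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))               # 5 made-up examples, 3 features
y = rng.integers(0, 2, size=5).astype(float)
theta = rng.normal(size=3)

z = X @ theta                             # z_i = theta x^(i)
m = len(y)
h = sigmoid(z)

# Original average log loss vs. the simplified (boxed) form.
original = -(1.0 / m) * np.sum(y * np.log(h) + (1 - y) * np.log(1 - h))
simplified = -(1.0 / m) * np.sum(y * z - np.log1p(np.exp(z)))

print(np.isclose(original, simplified))   # True
```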
- Now, all you need to do is compute the partial derivative of the boxed equation above w.r.t. \(\theta_{j}\), using the following two components:
\[\frac{\partial}{\partial \theta_{j}}\, y^{(i)} \theta x^{(i)}=y^{(i)} x_{j}^{(i)}\]
and
\[\frac{\partial}{\partial \theta_{j}} \log \left(1+e^{\theta x^{(i)}}\right)=\frac{x_{j}^{(i)} e^{\theta x^{(i)}}}{1+e^{\theta x^{(i)}}}=x_{j}^{(i)} h_{\theta}\left(x^{(i)}\right)\]
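- The first component is immediate; the second can be double-checked with a centered finite difference, as in the sketch below (an illustrative addition; the example values of `x`, `theta`, and the coordinate `j` are made up):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Check d/d(theta_j) log(1 + exp(theta x)) == x_j * h_theta(x)
# for one made-up example x, parameter vector theta, and coordinate j.
x = np.array([1.0, -0.4, 2.0])
theta = np.array([0.2, 0.7, -0.3])
j, eps = 1, 1e-6

f = lambda t: np.log1p(np.exp(t @ x))           # log(1 + e^{theta x})
t_plus, t_minus = theta.copy(), theta.copy()
t_plus[j] += eps
t_minus[j] -= eps

numeric = (f(t_plus) - f(t_minus)) / (2 * eps)  # centered finite difference
analytic = x[j] * sigmoid(theta @ x)            # x_j * h_theta(x)

print(np.isclose(numeric, analytic))            # True
```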
- Finally, plugging the two components above into the expression for \(\frac{\partial J(\theta)}{\partial \theta_j}\), we obtain the end result:
\[\frac{\partial J(\theta)}{\partial \theta_{j}}=-\frac{1}{m} \sum_{i=1}^{m}\left[y^{(i)} x_{j}^{(i)}-x_{j}^{(i)} h_{\theta}\left(x^{(i)}\right)\right]=\frac{1}{m} \sum_{i=1}^{m}\left(h_{\theta}\left(x^{(i)}\right)-y^{(i)}\right) x_{j}^{(i)}\]
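- As a final check (an illustrative addition, not part of the original derivation), the sketch below compares this analytic gradient with a centered finite-difference approximation of \(J(\theta)\) on made-up data; all function names and values are arbitrary:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cost(theta, X, y):
    # J(theta): average log loss over the m examples.
    m = len(y)
    h = sigmoid(X @ theta)
    return -(1.0 / m) * np.sum(y * np.log(h) + (1 - y) * np.log(1 - h))

def gradient(theta, X, y):
    # dJ/dtheta_j = (1/m) * sum_i (h_theta(x^(i)) - y^(i)) * x_j^(i), all j at once.
    m = len(y)
    return X.T @ (sigmoid(X @ theta) - y) / m

rng = np.random.default_rng(1)
X = rng.normal(size=(20, 4))                  # 20 made-up examples, 4 features
y = (rng.random(20) < 0.5).astype(float)
theta = rng.normal(size=4)

# Centered finite differences, one coordinate at a time.
eps = 1e-6
numeric = np.zeros_like(theta)
for j in range(len(theta)):
    e = np.zeros_like(theta)
    e[j] = eps
    numeric[j] = (cost(theta + e, X, y) - cost(theta - e, X, y)) / (2 * eps)

print(np.allclose(numeric, gradient(theta, X, y)))  # True
```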