CS229: Machine Learning

A distilled compilation of my notes for Stanford's CS229: Machine Learning.
Linear Regression
the supervised learning problem; update rule; probabilistic interpretation; likelihood vs. probability
Locally Weighted Linear Regression
weighted least squares; bandwidth parameter; cost function intuition; parametric vs. non-parametric learning; applications
Newton's Method
Newton's method; update rule; quadratic convergence; Newton's method for vectors
Logistic Regression
the classification problem; motivation for logistic regression; logistic regression algorithm; update rule
The Perceptron
perceptron algorithm; graphical interpretation; update rule
Generalized Linear Models
exponential family; constructing GLMs; case studies: LMS, logistic regression, softmax regression
Gaussian Discriminant Analysis
generative learning algorithms; Gaussian discriminant analysis (GDA); GDA vs. logistic regression
Naive Bayes
naive Bayes; Laplace smoothing
Support Vector Machines
support vector machines; kernels
Learning Theory
data splits; bias-variance trade-off; case of infinite/finite \(\mathcal{H}\); deep double descent
Regularization and Model Selection
cross-validation; feature selection; Bayesian statistics and regularization
Decision Trees
non-linearity; selecting regions; defining a loss function
Ensemble Methods
bagging; bootstrap; boosting; AdaBoost; forward stagewise additive modeling; gradient boosting
Neural Networks
basics; backprop; improving neural network accuracy
Debugging and Error Analysis
debugging ML models (overfitting, underfitting); error analysis
The k-Means Clustering Algorithm
k-means; convergence guarantee
Expectation Maximization Algorithms
mixture of Gaussians (non-EM); expectation maximization
Factor Analysis
the factor analysis model; expectation maximization for the factor analysis model
Principal Component Analysis
PCA algorithm; eigenvectors; eigenvalues
Independent Component Analysis
ambiguities; densities and linear transformations; ICA algorithm
Intro to Reinforcement Learning and Adaptive Control
MDPs; Bellman equation; value and policy iteration; continuous state MDP; value function approximation
Linear Quadratic Regulation, Differential Dynamic Programming and Linear Quadratic Gaussian
finite-horizon MDPs; LQR; from non-linear dynamics to LQR; DDP; LQG
Course Info
Course description:
  • Stanford's CS229 provides a broad introduction to machine learning and statistical pattern recognition. Topics include: supervised learning (generative/discriminative learning, parametric/non-parametric learning, neural networks, support vector machines); unsupervised learning (clustering, dimensionality reduction, kernel methods); learning theory (bias/variance trade-offs, practical advice); reinforcement learning and adaptive control.
  • The course will also discuss recent applications of machine learning, such as robotic control, data mining, autonomous navigation, bioinformatics, speech recognition, and text and web data processing.
The in-line diagrams are taken from the CS229 lecture notes, unless specified otherwise. Reproduced with permission.
If you found our work useful, please cite it as:
  @misc{Chadha2020DistilledNotes,
    author        = {Chadha, Aman},
    title         = {Distilled Notes for Stanford CS229: Machine Learning},
    howpublished  = {\url{https://www.aman.ai}},
    year          = {2020},
    note          = {Accessed: 2020-08-01},
  }

A. Chadha, Distilled Notes for Stanford CS229: Machine Learning, https://www.aman.ai, 2020, Accessed: 2020-08-01.