CS229: Machine Learning

A distilled compilation of my notes for Stanford's CS229: Machine Learning.
Linear Regression
the supervised learning problem; update rule; probabilistic interpretation; likelihood vs. probability
Locally Weighted Linear Regression
weighted least squares; bandwidth parameter; cost function intuition; parametric vs. non-parametric learning; applications
Newton's Method
Newton's method; update rule; quadratic convergence; Newton's method for vectors
Logistic Regression
the classification problem; motivation for logistic regression; logistic regression algorithm; update rule
The Perceptron
perceptron algorithm; graphical interpretation; update rule
Generalized Linear Models
exponential family; constructing GLMs; case studies: LMS, logistic regression, softmax regression
Gaussian Discriminant Analysis
generative learning algorithms; Gaussian discriminant analysis (GDA); GDA vs. logistic regression
Naive Bayes
naive Bayes; Laplace smoothing
Support Vector Machines
support vector machines; kernels
Learning Theory
data splits; bias-variance trade-off; case of infinite/finite \(\mathcal{H}\); deep double descent
Regularization and Model Selection
cross-validation; feature selection; Bayesian statistics and regularization
Decision Trees
non-linearity; selecting regions; defining a loss function
Ensemble Methods
bagging; bootstrap; boosting; AdaBoost; forward stagewise additive modeling; gradient boosting
Neural Networks
basics; backprop; improving neural network accuracy
Debugging and Error Analysis
debugging ML models (overfitting, underfitting); error analysis
The k-Means Clustering Algorithm
k-means; convergence guarantee
Expectation Maximization Algorithms
mixture of Gaussians (non-EM); expectation maximization
Factor Analysis
the factor analysis model; expectation maximization for the factor analysis model
Principal Component Analysis
PCA algorithm; eigenvectors; eigenvalues
Independent Component Analysis
ambiguities; densities and linear transformations; ICA algorithm
Intro to Reinforcement Learning and Adaptive Control
MDPs; Bellman equation; value and policy iteration; continuous state MDP; value function approximation
Linear Quadratic Regulation, Differential Dynamic Programming and Linear Quadratic Gaussian
finite-horizon MDPs; LQR; from non-linear dynamics to LQR; DDP; LQG
Course Info
Course description:
  • Stanford's CS229 provides a broad introduction to machine learning and statistical pattern recognition. Topics include: supervised learning (generative/discriminative learning, parametric/non-parametric learning, neural networks, support vector machines); unsupervised learning (clustering, dimensionality reduction, kernel methods); learning theory (bias/variance trade-offs, practical advice); reinforcement learning and adaptive control.
  • The course will also discuss recent applications of machine learning, such as robotic control, data mining, autonomous navigation, bioinformatics, speech recognition, and text and web data processing.
The in-line diagrams are taken from the CS229 lecture notes, unless specified otherwise. Reproduced with permission.
If you found our work useful, please cite it as:
  @misc{Chadha2020DistilledNotes,
    author        = {Chadha, Aman},
    title         = {Distilled Notes for Stanford CS229: Machine Learning},
    howpublished  = {\url{https://www.aman.ai}},
    year          = {2020},
    note          = {Accessed: 2020-08-01},
  }

A. Chadha, Distilled Notes for Stanford CS229: Machine Learning, https://www.aman.ai, 2020, Accessed: 2020-08-01.