Aman's AI Journal • CS229: Machine Learning

Linear Regression

the supervised learning problem; update rule; probabilistic interpretation; likelihood vs. probability

Locally Weighted Linear Regression

weighted least squares; bandwidth parameter; cost function intuition; parametric learning; applications

Newton's Method

Newton's method; update rule; quadratic convergence; Newton's method for vectors

Logistic Regression

the classification problem; motivation for logistic regression; logistic regression algorithm; update rule

Perceptron

perceptron algorithm; graphical interpretation; update rule

Generalized Linear Models

exponential family; constructing GLMs; case studies: LMS, logistic regression, softmax regression

Gaussian Discriminant Analysis

generative learning algorithms; Gaussian discriminant analysis (GDA); GDA vs. logistic regression

Naive Bayes

naive Bayes; Laplace smoothing

Support Vector Machines

support vector machines; kernels

Learning Theory

data splits; bias-variance trade-off; case of infinite/finite \(\mathcal{H}\); deep double descent

Regularization and Model Selection

cross-validation; feature selection; Bayesian statistics and regularization

Decision Trees

non-linearity; selecting regions; defining a loss function

Ensemble Methods

bagging; bootstrap; boosting; AdaBoost; forward stagewise additive modeling; gradient boosting

Neural Networks

basics; backprop; improving neural network accuracy

Debugging and Error Analysis

debugging ML models (overfitting, underfitting); error analysis

The k-Means Clustering Algorithm

k-means; convergence guarantee

Expectation Maximization Algorithms

mixture of Gaussians (non-EM); expectation maximization

Factor Analysis

the factor analysis model; expectation maximization for the factor analysis model

Principal Component Analysis

PCA algorithm; eigenvectors; eigenvalues

Independent Component Analysis

ambiguities; densities and linear transformations; ICA algorithm

Intro to Reinforcement Learning and Adaptive Control

MDPs; Bellman equation; value and policy iteration; continuous state MDP; value function approximation

Linear Quadratic Regulation, Differential Dynamic Programming and Linear Quadratic Gaussian

finite-horizon MDPs; LQR; from non-linear dynamics to LQR; DDP; LQG

RL Debugging and Diagnostics

Course website

Lectures - Autumn 2018

Full syllabus

Course description:

Stanford's CS229 provides a broad introduction to machine learning and statistical pattern recognition. Topics include: supervised learning (generative/discriminative learning, parametric/non-parametric learning, neural networks, support vector machines); unsupervised learning (clustering, dimensionality reduction, kernel methods); learning theory (bias/variance trade-offs, practical advice); reinforcement learning and adaptive control.
The course will also discuss recent applications of machine learning, such as robotic control, data mining, autonomous navigation, bioinformatics, speech recognition, and text and web data processing.

The in-line diagrams are taken from the CS229 lecture notes, unless specified otherwise. Reproduced with permission.

If you found our work useful, please cite it as:

@misc{Chadha2020DistilledNotesCS229,
  author        = {Chadha, Aman},
  title         = {Distilled Notes for Stanford CS229: Machine Learning},
  howpublished  = {\url{https://www.aman.ai}},
  year          = {2020},
  note          = {Accessed: 2020-08-01},
}

A. Chadha, Distilled Notes for Stanford CS229: Machine Learning, https://www.aman.ai, 2020, Accessed: Aug 1 2020.