Why Stanford’s Machine Learning course CS229 is still one of my favorites

It has been a few years since I first encountered Stanford’s CS229 Machine Learning (ML) course, taught by Professor Andrew Ng, and it remains one of my favorite courses. Last year I watched some of the lectures again to refresh my memory on particular topics. Even though I was already familiar with applied machine learning, I found the course useful. Today I will try to explain why I still think it may be the best free online ML course.

The whole set of 20 CS229 lectures was uploaded to Stanford’s YouTube channel in July 2008. I discovered the course later, in 2009, while searching for CS106A, an online programming course aimed at Software Engineering. At that time I was pursuing a Bachelor’s degree in Computer Science and was looking for ways to supplement the regular courses at my university. The quality of the course material was high, and all the important concepts were explained in math formulas, so it was clear even to people without perfect English. The timing was also good: I had recently finished my calculus courses, so the material was still fresh in my memory. I got an overview of the major ML methods and why they work, and I came away with some ideas on how I could use them for personal projects.

My second encounter with CS229 happened in 2012 under somewhat similar circumstances. I was a student again, also pursuing a degree in Computer Science, only this time it was a Master’s degree in a totally different country. I had already completed all my coursework and was looking for a thesis topic. The topic I selected was a fusion of Software Engineering and Machine Learning, both of which interested me. At that point I had more experience in Software Engineering and was looking for papers, books and courses on ML. However, after getting familiar with the resources I had gathered, I still found CS229 the most efficient way to refresh my ML theory. Sure, it wasn’t advanced, but it was solid and consistent, covering the topics it outlined without rushing the viewer or leaving gaps between concepts. After re-watching the course I could move on to more advanced material on particular topics.

This year a colleague with a background in Software Engineering needed to learn ML theory in order to work on an ML project. I reviewed the available introductory courses and found that CS229 had received a shorter, updated version on Coursera. After skimming through it, I found it similar to the old CS229, although it seems to be a lighter version that covers fewer methods, is less heavy on math and is friendlier to beginners. After my colleague finished it, he was familiar with the common concepts in ML and could start watching the old CS229 to understand some parts better. Don’t get me wrong: the new version isn’t worse, it is just different and has a different target audience. I don’t think it directly competes with CS229, because the content and delivery of the latter are more academic. In the new course the lecturer gives a detailed explanation of every step in the formulas, and also explains things that are supposed to be prerequisites for the course, e.g. Linear Algebra. Such detailed explanation of core methods like Linear and Logistic Regression limits the depth the course can reach. To move on to more complicated models, the author would have to dedicate significantly more time to each advanced topic, following the same approach as with the simpler ones. In its current state, the new course is more compact than CS229, at the cost of covering significantly fewer topics. However, the most essential ones are still there, and it helps total beginners get an idea of ML and see whether they need to continue with CS229.

Interestingly enough, after looking at the available courses, I find that the old CS229 from 2008 is still one of the best for getting into ML. The course and its cousin on Coursera are in high demand, with Machine Learning gaining more attention in the last few years. For example, it seems to have been the most popular course at Stanford in Fall 2013, according to the Forbes article “Why Is Machine Learning (CS 229) The Most Popular Course At Stanford?”. There may be a few reasons why the course is so popular. First, CS229 covers a wide range of ML methods, both in the problems they solve and in their complexity. Starting from simple linear regression, it moves to more advanced SVM-related topics, then to Multivariate Gaussians and Markov Decision Processes and the models based on them (e.g. MDP -> Partially Observable MDPs; Multivariate Gaussians -> Kalman Filter; Kalman Filter and Linear Quadratic Regulator -> Linear Quadratic Gaussian; etc.). Newly introduced concepts build on the previous methods, which gives the course the sense of a complete experience, with consistent content from start to finish.

Another reason for the popularity could be that the course strikes a balance: ML methods are explained in sufficient depth for industrial application or for further research on a particular topic. After this course you have enough knowledge to read papers on the topic or to join a more advanced course focusing on a specific set of ML methods. After obtaining my MSc I got a job in industry, and this course helped me build a solid foundation in ML. With such a foundation I can read scientific papers on ML without feeling overwhelmed by math-heavy theory and ML-specific terminology. As a result, I can apply new methods in practice, using state-of-the-art algorithms and switching between Software Engineer and Researcher roles.

To summarize, if you are interested in learning about Machine Learning and have a STEM background, CS229 from Stanford might be your best option. However, the number of MOOCs, especially in Computer Science, is increasing rapidly. The landscape could be different in a year or two, when more universities start publishing their courses online.