Here is the Colab Notebook used in the class.
This document contains a list of resources to learn theoretical concepts in Machine Learning and to practice writing code online.
Here is a list of courses that are a part of Stanford’s AI Graduate Certificate, along with the resources (Slides / Notes / Lecture Videos etc.) for the courses.
Courses. (Required: CS221 and 3 elective courses from the rest)
- 4 units. CS221, CS224N, CS224U, CS228, CS229, CS230, CS231A, CS231N, AA228
- 3 units. CS157, CS223A, CS234, CS236, CS330
- CS221 Artificial Intelligence: Principles and Techniques Materials
- CS224N Natural Language Processing with Deep Learning YouTube / Other materials
- CS224U Natural Language Understanding Materials
- CS228 Probabilistic Graphical Models: Principles and Techniques Official Notes
- CS229 Machine Learning Coursera / Official Notes
- CS230 Deep Learning Slides & Materials / Coursera
- CS231A Computer Vision: From 3D Reconstruction to Recognition Official Notes
- CS231N Convolutional Neural Networks for Visual Recognition YouTube / Slides & Code
- AA228 Decision Making Under Uncertainty Reading / HW Exercises / Other materials
- CS157 Computational Logic GitHub Notes
- CS223A Introduction to Robotics All Materials
- CS234 Reinforcement Learning Slides & Notes / Lecture Videos
- CS236 Deep Generative Models Notes / Slides & HW
- CS330 Deep Multi-task and Meta Learning Notes, Slides & HW
Written for PyTorch with CUDA compatibility.
Use case: instead of sampling from a multivariate normal distribution (available in torch.distributions.multivariate_normal), one can sample at random from within a specified confidence region of the multivariate Gaussian.
Although it is an approximation, one can obtain a confidence region from the variance along each dimension (the diagonal of the covariance matrix). Suppose we define a 3σ boundary (up to three standard deviations along each dimension is a commonly accepted confidence interval); EllipsoidSampler can then be used to construct such an ellipsoid (with given mean = mu and lengths of axes = axes). The utility then samples points at random from within the ellipsoid (rather than in a random normal fashion).
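The idea above can be sketched in plain Python. This is a minimal illustration and not the EllipsoidSampler code itself: the function name `sample_in_ellipsoid` is hypothetical, `axes` is assumed to hold semi-axis lengths, the ellipsoid is assumed to be axis-aligned, and "random" is assumed to mean uniform over the ellipsoid's volume.

```python
import math
import random


def sample_in_ellipsoid(mu, axes, rng=random):
    """Draw one point uniformly from an axis-aligned ellipsoid.

    mu   -- centre of the ellipsoid (list of floats)
    axes -- semi-axis length along each dimension (assumption)
    """
    d = len(mu)
    # A normalised standard-normal vector is uniform on the unit sphere.
    g = [rng.gauss(0.0, 1.0) for _ in range(d)]
    norm = math.sqrt(sum(v * v for v in g))
    # Radius u**(1/d) makes the point uniform over the unit ball's volume.
    r = rng.random() ** (1.0 / d)
    # Stretch the unit ball along each axis and shift it to the mean.
    return [m + a * r * v / norm for m, a, v in zip(mu, axes, g)]


# Hypothetical example: a 3-sigma ellipsoid built from per-dimension variances.
mu = [1.0, -2.0, 0.5]
variances = [0.25, 1.0, 4.0]
axes = [3.0 * math.sqrt(v) for v in variances]
point = sample_in_ellipsoid(mu, axes, random.Random(0))
```

Because a linear stretch preserves uniformity, sampling uniformly in the unit ball and then scaling by the axes gives a uniform sample inside the ellipsoid. A PyTorch version would follow the same steps with tensor operations.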
We use the Minimax algorithm to predict the optimal next move after every move by the user. This work demonstrates how a complete search by the Minimax algorithm always yields optimal results. To speed up the search, alpha-beta pruning is implemented to prune moves that cannot do better than the moves already explored. We test two methods of game playing with the Minimax algorithm: with and without alpha-beta pruning. With alpha-beta pruning, the first move is computed 14x faster.
The code is publicly available on GitHub.
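As a rough illustration of the two search variants described above, here is a minimal sketch on a toy game tree. It is not the project's code: the tree (nested lists with numeric leaves), the function names, and the leaf counter are all assumptions made for the example.

```python
import math


def minimax(node, maximizing, counter):
    """Plain minimax over a nested-list tree; leaves are numbers."""
    if not isinstance(node, list):
        counter[0] += 1  # count every leaf that gets evaluated
        return node
    values = [minimax(child, not maximizing, counter) for child in node]
    return max(values) if maximizing else min(values)


def alphabeta(node, maximizing, alpha, beta, counter):
    """Minimax with alpha-beta pruning; returns the same value as minimax."""
    if not isinstance(node, list):
        counter[0] += 1
        return node
    if maximizing:
        best = -math.inf
        for child in node:
            best = max(best, alphabeta(child, False, alpha, beta, counter))
            alpha = max(alpha, best)
            if alpha >= beta:
                break  # the minimizer above will never allow this branch
        return best
    best = math.inf
    for child in node:
        best = min(best, alphabeta(child, True, alpha, beta, counter))
        beta = min(beta, best)
        if alpha >= beta:
            break  # the maximizer above already has a better option
    return best


# Hypothetical two-ply tree: the root maximizes, its children minimize.
tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
plain_count, pruned_count = [0], [0]
plain_value = minimax(tree, True, plain_count)
pruned_value = alphabeta(tree, True, -math.inf, math.inf, pruned_count)
```

Both searches return the same value for the root, but the pruned search evaluates fewer leaves. The saving on a real game depends on the branching factor and on move ordering.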
In this work, we study the application of Bayesian networks to probabilistic inference. We consider a hypothetical real-world scenario in which we answer queries about various events (health problems, accidents, etc.) caused by factors such as air pollution and bad road conditions.
Each event/factor is modeled as a random variable with a given probability distribution (supplied as input). A variable dependence graph is constructed, and Bayes' rule is applied on the Markov blanket of the query variables to reduce the computational effort. Detailed documentation can be found in the code.
The code is publicly available on GitHub.
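To make the Markov-blanket idea concrete, the sketch below answers one query in two ways: by summing the full joint distribution, and by using only the query variable and its child. The four-variable network and every probability in it are hypothetical values chosen for illustration; they are not taken from the project.

```python
from itertools import product

# Hypothetical network: Pollution -> HealthProblem, BadRoads -> Accident.
P_POLLUTION = 0.3
P_BAD_ROADS = 0.2
P_HEALTH = {True: 0.4, False: 0.1}      # P(health problem | pollution)
P_ACCIDENT = {True: 0.25, False: 0.05}  # P(accident | bad roads)


def bern(p, value):
    """Probability that a Bernoulli(p) variable takes the given value."""
    return p if value else 1.0 - p


def joint(pollution, roads, health, accident):
    """Joint probability, factorised along the network's edges."""
    return (bern(P_POLLUTION, pollution) * bern(P_BAD_ROADS, roads)
            * bern(P_HEALTH[pollution], health)
            * bern(P_ACCIDENT[roads], accident))


def posterior_full():
    """P(pollution | health problem) by summing over the full joint."""
    bools = [True, False]
    num = sum(joint(True, r, True, a) for r, a in product(bools, repeat=2))
    den = sum(joint(p, r, True, a) for p, r, a in product(bools, repeat=3))
    return num / den


def posterior_local():
    """Same query using only Pollution and its Markov blanket."""
    num = P_POLLUTION * P_HEALTH[True]
    den = num + (1.0 - P_POLLUTION) * P_HEALTH[False]
    return num / den
```

In this toy network the Markov blanket of Pollution is just HealthProblem, so the road and accident variables can be ignored without changing the answer. The local computation touches two terms instead of eight.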
This program visualizes the learning process of a perceptron. For simplicity, the perceptron learns the identity function (the line y = x). We give it a two-dimensional input <x, y> and classify each point as lying below or above the line (binary classification). The weights of the perceptron are updated whenever a misclassification occurs. Over several examples, the perceptron learns the identity mapping.
The code is publicly available on GitHub.
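The update rule described above can be sketched without the visualization. This is a minimal sketch under stated assumptions, not the project's code: points are drawn from [-1, 1] x [-1, 1], points closer than a hypothetical `margin` to the line are discarded so that training is guaranteed to converge, and no bias term is used because the line y = x passes through the origin.

```python
import random


def make_points(n, margin=0.1, seed=0):
    """Random points labelled +1 above the line y = x and -1 below it."""
    rng = random.Random(seed)
    points = []
    while len(points) < n:
        x, y = rng.uniform(-1, 1), rng.uniform(-1, 1)
        if abs(y - x) >= margin:  # skip points too close to the line
            points.append(((x, y), 1 if y > x else -1))
    return points


def train_perceptron(points, max_epochs=1000):
    """Classic perceptron: adjust the weights only on a misclassification."""
    w = [0.0, 0.0]
    for _ in range(max_epochs):
        mistakes = 0
        for (x, y), label in points:
            if label * (w[0] * x + w[1] * y) <= 0:  # wrong side or on the boundary
                w[0] += label * x
                w[1] += label * y
                mistakes += 1
        if mistakes == 0:  # a full pass without errors: training is done
            break
    return w


points = make_points(200)
weights = train_perceptron(points)
```

After training, the sign of `weights[0] * x + weights[1] * y` says whether a point lies above or below the line. The learned boundary separates the training points, though it need not coincide exactly with y = x.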
The Curse of Dimensionality, introduced by Bellman, refers to the explosive growth of space with the number of dimensions and its resulting effects, such as an exponential increase in computational effort, a large waste of space, and poor visualization capabilities. A higher number of dimensions theoretically allows more information to be stored, but in practice it rarely helps because of the greater likelihood of noise and redundancy in real-world data. In this article, the effects of high dimensionality are studied through various experiments, and possible solutions to counter or mitigate such effects are proposed. The source code of the experiments is publicly available on GitHub.
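One way to see the "waste of space" effect is to compare the volume of a ball with that of the cube that encloses it. The sketch below is an illustration chosen for this summary and is not necessarily one of the article's experiments; the function name is hypothetical.

```python
import math


def ball_to_cube_ratio(d):
    """Fraction of the unit cube filled by its inscribed ball in d dimensions."""
    radius = 0.5
    # Volume of a d-ball: pi^(d/2) / Gamma(d/2 + 1) * r^d; the cube has volume 1.
    return math.pi ** (d / 2) / math.gamma(d / 2 + 1) * radius ** d


for d in (1, 2, 3, 5, 10, 20):
    print(f"d={d:2d}  ratio={ball_to_cube_ratio(d):.6f}")
```

The ratio falls quickly towards zero as the dimension grows, so almost all of the cube's volume ends up in its corners. Points sampled uniformly from the cube therefore rarely land near the centre.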
Turing Machines are mathematical abstractions of computing systems. Suppose you are given an input string over a given alphabet and you want to do some computation on it; given enough time and memory, a Turing Machine can do the job for you. The only caveat is that the language you want to compute upon should be “computable”. What this means is that there are certain languages that are not computable in reasonable time by a Turing Machine. Such languages are called “hard”, because it is hard to write an algorithm to compute them in reasonable time. This is a very abstract version of what a Turing Machine is. Let us explore it in brief.
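To make the abstraction concrete, here is a small simulator sketch for a single-tape machine. The rule table, state names, and tape symbols are assumptions made for this example and do not come from the article. The machine decides whether the input has the form 0^n 1^n by repeatedly marking one 0 and one matching 1.

```python
BLANK = "_"

# (state, symbol read) -> (symbol to write, head move, next state)
RULES = {
    ("q0", "0"): ("X", +1, "q1"),         # mark a 0, go find a 1
    ("q0", "Y"): ("Y", +1, "q3"),         # no 0s left: check the rest
    ("q0", BLANK): (BLANK, +1, "accept"),  # empty input is accepted
    ("q1", "0"): ("0", +1, "q1"),
    ("q1", "Y"): ("Y", +1, "q1"),
    ("q1", "1"): ("Y", -1, "q2"),         # mark the matching 1
    ("q2", "0"): ("0", -1, "q2"),
    ("q2", "Y"): ("Y", -1, "q2"),
    ("q2", "X"): ("X", +1, "q0"),         # back at the last marked 0
    ("q3", "Y"): ("Y", +1, "q3"),
    ("q3", BLANK): (BLANK, +1, "accept"),
}


def run(tape_input, rules=RULES, start="q0", max_steps=10_000):
    """Simulate the machine; return True on accept, False on reject."""
    tape = dict(enumerate(tape_input))  # sparse tape: position -> symbol
    state, head = start, 0
    for _ in range(max_steps):
        symbol = tape.get(head, BLANK)
        rule = rules.get((state, symbol))
        if rule is None:
            return False  # no applicable rule: halt and reject
        write, move, state = rule
        tape[head] = write
        head += move
        if state == "accept":
            return True
    raise RuntimeError("step limit reached without halting")
```

For example, `run("0011")` accepts while `run("001")` rejects. The step limit is a practical guard in the simulator only; an actual Turing Machine has no such bound.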