The catalog says, "Survey of recent research on learning in artificial intelligent systems. Topics include learning based on examples, instructions, analogy, discovery, experimentation, observation, problem solving and explanation. The cognitive aspects of learning will also be studied."
Actually, we will focus on statistical machine learning and reinforcement learning methods that have had a major impact on the machine learning field over the past decade or so. We will also attend to the problem of connecting these representations to the symbolic knowledge representation methods that have been at the core of Artificial Intelligence.
The course textbooks are:
The content of the course will be organized in two parallel tracks, Theory and Practice, that will run throughout the semester.
In the Theory track, we will work our way through much of the Bishop text, and then through parts of Sutton and Barto. There will be homework problems, and an exam or two. The majority of lecture time will be spent on the Theory track.
In the Practice track, we will implement learning algorithms in MATLAB and apply them to large bodies of data. Machine Learning is applicable to a huge range of problems, but this semester we will focus on problems of learning from visual data. We will spend some lecture time on the Practice track, but much of this will be done by you outside of class. The following two excellent papers will be assigned early in the course, so reading these would be a good way to get a head start.
A: In the Bishop textbook, read sections 1.1, 1.2, and 1.3 carefully. This should take a fair amount of care, time, and effort. You can't skim this kind of book. But if you can't follow it, even after spending a lot of time and effort on it, then you're probably not ready for this course.
A: Yes, in MATLAB.
Each class meeting will be important. Don't miss any of them.
In each 80-minute class, the plan is to spend:
Here is the plan for the lecture topics during the semester.
|Date||Lecture topic||Read before class||Exercises|
|1/13||Curve fitting and probability theory||Bishop 1.1-1.3||1.1,1.2,1.11|
|1/15||Decision theory and information theory||Bishop 1.4-1.6||1.35,1.37,1.41|
|1/22||Distributions: binary, multinomial, conjugate priors||Bishop 2.1-2.2||2.6,2.7*|
|1/27||Distributions: Gaussians||Bishop 2.3-2.5||2.36,2.39*|
|1/29||Linear models for regression (1)||Bishop 3.1-3.2||3.4*|
|2/3||Linear models for regression (2)||Bishop 3.3-3.5||3.8*|
|2/5||Linear models for classification (1)||Bishop 4.1-4.2||4.5,4.6*|
|2/10||Linear models for classification (2)||Bishop 4.3-4.5||4.12,4.13,4.14*|
|2/12||Kernel methods: radial basis functions||Bishop 6.1-6.3|
|??||Gaussian processes||Bishop 6.4|
|2/17||Support vector machines||Bishop 7.1|
|2/21 - 3/1||Winter break|
|3/3||Bayesian networks||Bishop 8.1-8.2|
|3/5||Markov random fields||Bishop 8.3|
|3/10||Inference in graphical models||Bishop 8.4|
|3/12||Mixture models and EM (1)||Bishop 9.1-9.2|
|3/17||Mixture models and EM (2)||Bishop 9.3-9.4|
|3/19||Sampling methods||Bishop 11.1|
|3/24||Monte Carlo algorithms||Bishop 11.2-11.6|
|3/27||Principal component analysis||Bishop 12.1-12.2|
|3/31||Nonlinear PCA||Bishop 12.3-12.4|
|4/2||Hidden Markov models||Bishop 13.1-13.2|
|4/7||Linear dynamical systems||Bishop 13.3|
|4/9||Reinforcement learning (1)||Sutton-Barto 1; 2-2.2; 3-3.3, 3.6, 3.7; 10|
|4/14||Reinforcement learning (2)||Sutton-Barto 3.8-end; 4; 5|
|4/16||Reinforcement learning (3)||Sutton-Barto 6, 7, except 6.6, 6.7, 7.7, 7.8, 7.10|
|4/21||Reinforcement learning (4)||Sutton-Barto 8; 7.8; 9-9.2, 9.8; 10|
|4/30||Final exam||(1:30-3:30 pm)|
Bishop emphasizes the importance of solving the problems at the end of each chapter in order to understand the material in the readings.
We will organize the class into groups of 2-4 students each.
For each class, there will be a small number (1-3) of homework problems (to be announced) from the reading assignment. Each person should do the problems, and write up the solutions (in pencil or blue or black ink). Another member of the same group will correct the solution (in red ink). Both sign the homework paper, review it, correct it as needed, and hand the marked-up assignment in. (You get credit for handing this in. We'll do spot checks for bogosity.)
For each class, one of the groups will be responsible for presenting the solution to a specified problem to the class.
During the first class, we will select groups whose members are able to meet between classes to discuss the material and to grade each other's homework. We will also identify which groups are presenting in the next few classes.
When a fixed camera views a static scene, each pixel always has the same value. Typical real scenes are mostly static, with dynamic foreground objects occasionally obscuring the static background. In this assignment, you will treat the dynamic foreground as noise, and attempt to recover a useful model of the static background.
In the simplest approach, you will model each pixel with its most common (modal) value. To handle slightly more complex scenes, for example with moving tree branches, you will learn a distribution in color space for each pixel.
Result: A clean image of the background, with foreground "noise" eliminated. Do this both with the video provided, and with one you collect with your own camera.
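To make the simplest approach concrete, here is a minimal sketch of a modal background model. It is written in plain Python rather than the MATLAB you will use for the assignment, and it assumes each frame is a small nested list of pixel values; the function name `modal_background` and the toy data are illustrative only.

```python
from collections import Counter

def modal_background(frames):
    """Return the per-pixel most common value across a list of frames.

    Each frame is a list of rows, each row a list of hashable pixel values
    (for example grayscale ints or (r, g, b) tuples). All frames are assumed
    to share one shape.
    """
    height, width = len(frames[0]), len(frames[0][0])
    background = []
    for y in range(height):
        row = []
        for x in range(width):
            # Count how often each value appears at this pixel over time.
            counts = Counter(frame[y][x] for frame in frames)
            row.append(counts.most_common(1)[0][0])
        background.append(row)
    return background

# Toy 2x2 video: a "foreground" value 9 crosses the top-left pixel in one frame.
frames = [
    [[1, 2], [3, 4]],
    [[9, 2], [3, 4]],
    [[1, 2], [3, 4]],
]
background = modal_background(frames)
```

On real video, sensor noise means a pixel rarely repeats exactly the same value, so you would probably quantize the values into bins before taking the mode.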
A pixel belongs to the foreground if it is not "explained" by the background model. Once you have a good background model, you can check each pixel in the video stream, to see whether it is explained by the background, or whether it is part of the foreground.
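One possible reading of "explained," sketched below in plain Python (not the course's MATLAB), is that a pixel lies within a few standard deviations of a per-pixel Gaussian fitted to background frames. The Gaussian model, the threshold `k`, and the floor `min_std` are assumptions made for illustration, not requirements of the assignment.

```python
from statistics import mean, pstdev

def fit_pixel_gaussians(frames):
    """Per-pixel (mean, std) of grayscale values over the training frames."""
    height, width = len(frames[0]), len(frames[0][0])
    model = []
    for y in range(height):
        row = []
        for x in range(width):
            values = [frame[y][x] for frame in frames]
            row.append((mean(values), pstdev(values)))
        model.append(row)
    return model

def foreground_mask(frame, model, k=3.0, min_std=2.0):
    """Mark pixels more than k standard deviations from the background mean.

    min_std is a hypothetical floor that keeps perfectly static pixels
    (std == 0) from flagging every tiny fluctuation as foreground.
    """
    return [
        [abs(value - mu) > k * max(sigma, min_std)
         for value, (mu, sigma) in zip(frame_row, model_row)]
        for frame_row, model_row in zip(frame, model)
    ]

# Three nearly identical training frames, then a new frame in which the
# bottom-right pixel has changed sharply.
training = [
    [[10, 50], [100, 200]],
    [[12, 50], [98, 200]],
    [[11, 50], [102, 200]],
]
model = fit_pixel_gaussians(training)
mask = foreground_mask([[11, 50], [100, 90]], model)
```

For color video, the same idea applies with a distribution in color space per pixel, as described for Assignment 1.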
Cluster and track the unexplained pixels, to define individual foreground objects. Note that these "objects" will not persist, but will merge and split according to how well the clustering process is doing.
Result: The static background model, plus dynamic foreground object descriptions, each with the history of its own time-varying position and extent in the image.
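As a starting point for the clustering step, the sketch below groups unexplained pixels into 4-connected blobs and reports a bounding box for each. It is plain Python rather than MATLAB, and connected components are only one assumed choice of clustering method; tracking the boxes from frame to frame is not shown.

```python
from collections import deque

def connected_components(mask):
    """Group True pixels into 4-connected blobs; return a bounding box per blob.

    Each box is (min_row, min_col, max_row, max_col), inclusive.
    """
    height, width = len(mask), len(mask[0])
    seen = [[False] * width for _ in range(height)]
    boxes = []
    for y in range(height):
        for x in range(width):
            if not mask[y][x] or seen[y][x]:
                continue
            # Breadth-first flood fill from this unvisited foreground pixel.
            seen[y][x] = True
            queue = deque([(y, x)])
            y0, x0, y1, x1 = y, x, y, x
            while queue:
                cy, cx = queue.popleft()
                y0, x0 = min(y0, cy), min(x0, cx)
                y1, x1 = max(y1, cy), max(x1, cx)
                for ny, nx in ((cy - 1, cx), (cy + 1, cx),
                               (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            boxes.append((y0, x0, y1, x1))
    return boxes

# Two separate blobs of unexplained pixels in a 3x4 mask.
mask = [
    [True,  True,  False, False],
    [False, True,  False, True],
    [False, False, False, True],
]
boxes = connected_components(mask)
```

Because blobs are recomputed in every frame, one physical object can appear as two boxes in one frame and a single box in the next, which is the merging and splitting mentioned above.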
The pixel-based method above fails for a hand-held video camera, since the pose (position and orientation) of the camera changes continuously. To build a background model successfully, we must identify the camera pose trajectory as well as the model itself.
We will simplify the problem by assuming that everything visible is distant (eliminating parallax) and that the camera operator is attempting to hold the camera fixed (small deviations from constant).
Fortunately, we can exploit the small motions of the camera to create a "super-resolution" model of the background: higher resolution in the model than in any individual image. This exploits the redundancy across the multiple frames of the video, but depends on the frames not being exactly aligned with one another. Can you use this to read distant license plates (on parked cars)?
Result: A clean, super-resolution image of the background, taken from a hand-held video stream.
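To see why misalignment between frames helps, consider a one-dimensional "shift-and-add" sketch in plain Python (not MATLAB). It assumes the sub-pixel shift of each frame is already known and is a whole number of fine-grid samples, and it treats each pixel as a point sample; estimating the shifts from the video is the hard part of the assignment and is not shown. The function name and parameters are illustrative.

```python
def shift_and_add(frames, offsets, factor):
    """Interleave 1-D low-resolution frames onto a grid `factor` times finer.

    offsets[i] is the (assumed known) shift of frame i, in fine-grid samples,
    with 0 <= offsets[i] < factor. Fine-grid cells hit by several frames are
    averaged; cells hit by no frame are left as None.
    """
    length = len(frames[0]) * factor
    sums = [0.0] * length
    counts = [0] * length
    for frame, offset in zip(frames, offsets):
        for i, value in enumerate(frame):
            sums[i * factor + offset] += value
            counts[i * factor + offset] += 1
    return [s / c if c else None for s, c in zip(sums, counts)]

# A fine-grained signal observed by two coarse frames half a pixel apart.
truth = [0, 1, 2, 3, 4, 5]
frame_a = truth[0::2]   # samples at fine positions 0, 2, 4
frame_b = truth[1::2]   # samples at fine positions 1, 3, 5
recovered = shift_and_add([frame_a, frame_b], offsets=[0, 1], factor=2)

# Two perfectly aligned frames fill only every other fine-grid cell.
aligned = shift_and_add([frame_a, frame_a], offsets=[0, 0], factor=2)
```

The second call shows the dependence on misalignment: frames with identical offsets reduce noise by averaging but leave gaps in the finer grid.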
Foreground objects are, by definition, moving against the static background. Individuating and tracking them (Assignment 2) somewhat stabilizes them in the tracker frame of reference. Now apply the methods you developed for Assignment 3 to further stabilize the image and build super-resolution models of the foreground objects.
Note that we are still treating objects as two-dimensional regions in the image, so this method will only work when three-dimensional objects present the same two-dimensional image as they move. The general problem, including creating a suitable three-dimensional model of the object and determining the object's pose, is much harder, and is not part of this assignment.
The example video we provide will include lateral traffic as moving objects. Try more difficult examples, to see what happens. The case of a vehicle moving toward or away from the camera is an interesting intermediate case, where the two-dimensional image varies by scale.
Result: Clean, super-resolution images of each foreground object, extracted from a hand-held video stream.
The grades will be determined as follows:
|Programming assignments (4)||40%|