EECS 442: Computer Vision (Succeeding in EECS 442)
Why write this?
EECS 442 is a fairly demanding course in terms of project-based homework to
start with. It is often one of the first courses in which people experience
fairly large leeway in how the assignments are done, less guidance in implementation,
and a great deal more self-teaching. These issues don't go away when you're dealing with
programming beyond the classroom.
After finding myself teaching various things in office hours, I've decided to
put them in one document, in the hope of providing guidance and pointers that
will make your life easier at any time (and you can get my first line of advice
at whatever hour you happen to be doing things). Some of the advice here is obvious to
some people, so skip the sections that are obvious to you.
The guiding principle for everything other than dealing with incomplete specifications
is that you should think of your work for this class as something
like a science experiment. The phenomenon you are studying
is the fact that your program produces bad output, runs too slow, or crashes. So,
don't panic. Instead, take two deep breaths and then tackle things as analytically
as possible.
Python tools & linear algebra resources
You should invest ten minutes each in figuring out how to get the following tools
to work at the level of just calling them when you're stuck or need an output. You don't
need to be an expert, but having a handle on these will make your life easier.
pdb:
To make a breakpoint, insert import pdb; pdb.set_trace()
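To run a script and launch a debugger when there's an error, python -m ipdb -c continue NAME.py (ipdb is a separately installed package; python -m pdb -c continue NAME.py does the same with the standard library)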
matplotlib
imagemagick (available on most systems): e.g., convert input.png -resize 800x800 output.png
A very good linear algebra review and reference from Zico Kolter (and Chuong Do) here. You can safely ignore:
determinants, the Hessian, gradients/Hessians for quadratic and linear functions, gradients of the determinant, and eigenvalues as optimization.
The Matrix Cookbook here. It's organized thematically and often the identities are very useful when you're stuck.
Meta-points
Here are a few things that may improve your results:
Come to class. You're adults and this is a 400-level class. I don't take attendance and I also record.
You can experience this class by watching back-to-back computer vision lectures at midnight. It's not a good way to learn
and will cost you time in the long run.
Stay on top of things. This stuff builds fairly cumulatively. If you look away for a while, your experience may look
like this. If you skip a few critical classes and don't keep up, you may miss an entire abstraction that's
used throughout a block of classes.
Success is time x effectiveness. You can put in lots of ineffective hours and get nothing done. Invest 30 minutes at the beginning
of the semester to get comfortable with using pdb, matplotlib, and an effective workflow, and this will pay off.
Work and study in teams. If you're already doing well, then explaining to
people is really the best way to cement your understanding. If you're not doing well, then
you can find people who will explain things.
Start early on assignments
If you work for 4 hours and hit a brick wall, then call it a night and spend 2 hours the next day, the solution is often obvious. If you don't have
a next day, then you can't rely on this. In general, spending 4+2 hours on something is far better than spending 6 hours on it.
Your implementation may be slow
If you're not sure whether it's needed, include it. Asking “is it necessary”
and waiting for an answer on piazza often takes as much time as including it.
Submit something for everything. We can't give partial credit for blank pages, but if you write something about how you'd tackle the
problem, these are the easiest points to give.
Things to invest in
These take a little bit of time to do, but will pay for themselves
many times over.
A debugger. At a minimum, you should be ok with print-style debugging.
Ideally you should invest the time to learn pdb (hint: just insert import pdb; pdb.set_trace() wherever you want to stop the program and get a
python prompt) or some editor with a built-in debugger (if your editor shows
line numbers, try clicking them).
A way of looking at things. Figure out how to use matplotlib (or Jupyter notebooks) to see what things look like. If you're trying to look at an image
by printing it out in a terminal, you're going to suffer needlessly.
An effective workflow. You want a workflow that ensures you can quickly write something and test it. If you have to transfer your files somewhere
each time you change something, you will waste time. If you are doing something on a remote machine,
invest in learning vi/vim, emacs, or something else that works in a terminal.
These reduce the amount of time that you spend waiting to get answers. Contrast these two options:
Sticking pdb.set_trace() before the function call that causes your program to crash, and trying changes (argument types, orders, etc) 5 times before it works;
Updating the code, scping/copypasting it, then running it from scratch 5 times until you get the thing to run.
Which runs faster? This can make orders of magnitude difference in debugging time.
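As a minimal sketch of the "way of looking at things" point above, assuming numpy and matplotlib are installed: the helper below writes an array to an image file with a colorbar and its shape and dtype in the title, so you can inspect it instead of printing numbers. The helper name save_debug_view, the file name debug_ramp.png, and the synthetic ramp image are all made up for illustration.

```python
import os

import numpy as np
import matplotlib

matplotlib.use("Agg")  # assumption: no display is available (e.g., a remote machine)
import matplotlib.pyplot as plt


def save_debug_view(image, path, title=""):
    """Save a grayscale view of a 2D array to an image file for inspection."""
    fig, ax = plt.subplots(figsize=(4, 4))
    shown = ax.imshow(image, cmap="gray")
    fig.colorbar(shown, ax=ax)  # the colorbar exposes the actual value range
    ax.set_title(f"{title} shape={image.shape} dtype={image.dtype}")
    fig.savefig(path, bbox_inches="tight")
    plt.close(fig)  # free the figure so repeated calls don't pile up
    return path


# Synthetic stand-in for a real image: a horizontal brightness ramp.
ramp = np.tile(np.linspace(0.0, 1.0, 64), (64, 1))
saved = save_debug_view(ramp, "debug_ramp.png", title="ramp")
print(saved, os.path.getsize(saved), "bytes")
```

On a machine with a display (or in a notebook) you would likely call plt.show() instead of saving to a file.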
Dealing with incomplete specifications, ambiguity
Most serious programming intrinsically involves incomplete specifications:
if it was a complete specification, you'd ask a computer to do this.
Unlike research or launching a startup, we're asking you to do something that we're certain is feasible. This is
incredibly useful information. As an analogy, consider taking a math exam without a calculator: if you get a horrible
expression, you know you're on the wrong track.
Incomplete specifications are difficult for two reasons:
You have to figure out how to write a fairly large, unconstrained program (see below);
You have to determine what parameters to use (see below).
They're somewhat orthogonal and you should tackle them independently.
Writing large, fairly unconstrained programs
Typically we give you some degree of flexibility in terms of how things are
implemented. If you're used to more structured problems, this can be difficult.
Here are some strategies:
Break things into conceptually simple functions that do one thing and do only
one thing. It's really easy to write a huge mess of code. This code tends to be
easy to write and brutally difficult to debug. Give yourself a budget of maybe 30 lines per
function and don't exceed this unless you absolutely have to. Be sure you can
test each of the functions. Breaking things into conceptually simple components dramatically accelerates your
debugging. If things are broken into pieces, you can
verify each component and the calling code, which is much easier than verifying
all of the code stuck together since the functions prevent interactions (apart from
global variables) between the pieces of code.
Save intermediate results if it's a long-running computation. Use either
np.save or pickle in Python, or use jupyter (that said, beware of jupyter
caching things you don't intend! This is another bug waiting to happen). This caching
reduces your time-to-find-bug.
Here's an example:
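import os
import pickle
import numpy as np

def stage1(args):
    ...
    return results

def stage2(args):
    ...
    return results

def stage3(args):
    ...
    return results

if __name__ == "__main__":
    cache = True  # set to False to force every stage to be recomputed
    inputs = readSomething()  # placeholder for your own loading code

    # stage 1: reuse the saved result if it exists and caching is on
    if not os.path.exists("output1.pck") or not cache:
        output1 = stage1(inputs)
        with open("output1.pck", "wb") as f:
            pickle.dump(output1, f)
    else:
        with open("output1.pck", "rb") as f:
            output1 = pickle.load(f)

    # stage 2: same pattern, fed by the stage 1 result
    if not os.path.exists("output2.pck") or not cache:
        output2 = stage2(output1)
        with open("output2.pck", "wb") as f:
            pickle.dump(output2, f)
    else:
        with open("output2.pck", "rb") as f:
            output2 = pickle.load(f)

    # stage 3: the part being debugged, so it always runs
    output3 = stage3(output2)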
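If the load-or-compute pattern repeats for many stages, it can be folded into one helper. This is only a sketch: the name cached, its use_cache flag, and the toy slow_square stage are hypothetical, and it assumes every stage result can be pickled.

```python
import os
import pickle
import tempfile


def cached(path, compute, *args, use_cache=True):
    """Return the result stored at `path`, or compute it and store it there."""
    if use_cache and os.path.exists(path):
        with open(path, "rb") as f:
            return pickle.load(f)
    result = compute(*args)
    with open(path, "wb") as f:
        pickle.dump(result, f)
    return result


calls = []  # records how many times the expensive stage actually ran


def slow_square(x):
    """Toy stand-in for a long-running stage."""
    calls.append(x)
    return x * x


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "square.pck")
    first = cached(path, slow_square, 12)   # computed and written to disk
    second = cached(path, slow_square, 12)  # loaded from disk, stage not rerun
```

The same caveat as with jupyter applies: if you change a stage, delete its file or pass use_cache=False, or you will keep loading the stale result.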
Dealing with programs that crash
So your program crashes! The TAs and I have typically implemented each of the assignments ourselves, but
possibly with other packages or languages. We also don't walk around with the
documentation to python, numpy, and opencv in our heads. We usually have more experience
than you debugging, but we are often seeing an error for the first time ourselves. Moreover,
we didn't write the code that caused the error. Thus, even if you started programming in the past few years, you have a serious head-start on us.
Every time you have an error:
Isolate the line of code that you've written that causes the error. This is in the
traceback in python.
Read the error message. Even if it's cryptic: the person who wrote the error message
thought it was potentially helpful. For instance, if it says something is wrong with the
type of the argument, see if changing the type makes the error go away. If it says
one of three things is potentially wrong, verify that your arguments satisfy all three
requirements. If you attach a debugger right before the line that causes the program to
crash, you can play with the arguments to the function that cause it to crash and dramatically
reduce your debug time.
Google it, copy pasting the message. This is important! You don't want "python program crashes",
you want "python numpy TypeError: data type not understood".
Again, think of it as an investigation: there is probably a github issue, stack overflow question, website called "things I hate about
numpy", or twitter post that contains your answer. Your job is to find it. Think of the challenge as generating the query that will find you the answer you need.
Attach a debugger right before the line and print out every single variable and look at them. Is anything sketchy?
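For the "print out every single variable" step, a small summary helper makes sketchy values easier to spot than raw printouts. This is a sketch assuming numpy; the helper name describe and the example variables are made up for illustration.

```python
import numpy as np


def describe(name, value):
    """Return a one-line summary: type, plus shape/dtype/range for arrays."""
    if isinstance(value, np.ndarray):
        summary = f"{name}: ndarray shape={value.shape} dtype={value.dtype}"
        if value.size > 0 and np.issubdtype(value.dtype, np.number):
            summary += f" min={value.min()} max={value.max()}"
        return summary
    return f"{name}: {type(value).__name__} value={value!r}"


image = np.zeros((4, 6, 3), dtype=np.uint8)
scale = "2"  # a string where a number was intended: the kind of thing to spot
for name, value in [("image", image), ("scale", scale)]:
    print(describe(name, value))
```

At a pdb prompt you can call the same helper on whichever arguments are about to be passed to the crashing line.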
Common things that I spotted in office hours or have been told as gotchas:
Wrong type
Wrong size of matrix
Nondeterministic functions being assumed to be deterministic (e.g., os.listdir)
Cached variables or a variable being used from further up in the code
Transposed results (looking at row instead of column and vice-versa)
Modifying things in place vs sending a copy back
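Several of these gotchas can be reproduced in a few lines, which helps in recognizing them later. The snippet below is an illustrative sketch assuming numpy; the arrays are toy values, not anything from the assignments.

```python
import os

import numpy as np

# Wrong type: uint8 arithmetic wraps around instead of going past 255.
pixels = np.array([250, 251], dtype=np.uint8)
wrapped = pixels + pixels                 # still uint8, so values wrap modulo 256
widened = pixels.astype(np.float64) * 2   # convert first to get the real sums

# Wrong size: a (3,) vector plus a (3, 1) column broadcasts to a (3, 3) matrix.
v = np.arange(3)
col = v.reshape(3, 1)
outer_by_accident = v + col

# In place vs copy: slicing returns a view, so writing to it changes the original.
original = np.zeros(4)
view = original[:2]
view[:] = 1          # original is modified through the view
safe = original.copy()
safe[:] = 7          # original is not touched by this

# Transposed results: arrays are indexed [row, column], i.e. [y, x] for images.
image = np.zeros((2, 5))   # 2 rows (height), 5 columns (width)
height, width = image.shape

# Nondeterministic order: os.listdir makes no ordering promise, so sort it.
names = sorted(os.listdir("."))
```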
Dealing with programs that don't do what you want (i.e., incorrect output)
So the program you wrote produces incorrect results. Your job now is to (a) identify where
in the program the incorrect outputs come from and (b) identify why. Searching
for why throughout the program is really hard; searching for where is often easier.
Here are a few strategies:
Assume libraries are correct. You should (largely) assume that numpy, python, etc. are correctly implemented. HOWEVER, you may be simply calling
something incorrectly: you may have the order or types of the arguments incorrect.
What parts of the program don't work? If your program can't be broken into
small pieces, then you can't test each one. If you don't know what the input and output
of each function is, then you'll have more success debugging if you identify this.
Make fake input data that you know the answer to and that the program should work on.
Typically you're working with data that you don't fully understand and a specification you're
still working to satisfy. So start with something really simple so you get rid of one variable of uncertainty. If you're trying to:
line up images: take the same image and use np.roll to offset them
classify things: make a small version of the dataset where each "class" is just a color like
blue, red, green.
find correspondences in images: use np.roll to make an unaligned image
fit a transformation between points: generate a set of random points
and their corresponding points for some translation (or some affine transformation you pick)
fit a regression model: generate some random NxF data matrix X, a random Fx1 vector w, and an Nx1 noise vector e,
and set y = Xw + e. Do you get w back?
fit a classification model: generate positive feature vectors according to a normal distribution centered at one value in all dimensions (say, +1) and
negatives according to a normal distribution centered at a different value in all dimensions (say, -1).
find something involving epipolar geometry: generate an image that is rolled to the right along the x axis.
fit something on top of something extracted from the image: hand-mark what you want (e.g., correspondences) and try
I'll add more as I suggest more
Print the program state. If you've broken your code into simple functions, at every step, print out everything and see if it looks reasonable. If you have
concerns or nagging feelings, go back.
Once you know where, identify why. Once you've identified the parts of the program that are likely ok and the ones that aren't, start trying to identify why the broken part
isn't working. Go line-by-line and try to explain what each line does verbally
(no, really).
If you're really stuck, you think that you conceptually understand what the function is supposed to do, but there's some bug that won't go away,
consider rewriting it from scratch.
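The fake-input strategy above can be sketched for two of the listed cases: regression with a known weight vector, and alignment with a known np.roll offset. Everything here is an assumption made for illustration: the sizes, the noise level, the use of np.linalg.lstsq as the fitting code, and the brute-force find_shift function, which stands in for whatever alignment code you are actually testing.

```python
import numpy as np

rng = np.random.default_rng(0)

# Regression: build y = X w + noise with a known w, then check the fit recovers it.
N, F = 200, 5
X = rng.normal(size=(N, F))
w_true = rng.normal(size=(F, 1))
noise = 0.01 * rng.normal(size=(N, 1))
y = X @ w_true + noise
w_fit, *_ = np.linalg.lstsq(X, y, rcond=None)
print("max error in w:", np.abs(w_fit - w_true).max())


def find_shift(reference, moved, max_shift=8):
    """Brute-force the (dy, dx) roll that best maps reference onto moved."""
    best, best_err = None, np.inf
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            candidate = np.roll(reference, (dy, dx), axis=(0, 1))
            err = np.sum((candidate - moved) ** 2)
            if err < best_err:
                best, best_err = (dy, dx), err
    return best


# Alignment: roll an image by a known offset and check the offset is recovered.
image = rng.random((32, 32))
known_shift = (3, -5)
moved = np.roll(image, known_shift, axis=(0, 1))
print("recovered shift:", find_shift(image, moved))
```

If your code fails on data like this, where you know the answer, the bug is in the code and not in your understanding of the real data.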
Dealing with programs that have knobs on the sides (i.e., what parameters do I use)
Often there are parameters that are left as design decisions. This is because there really aren't particular numbers that work
or are mathematically elegant.
What are default parameters? If you're expected to use something, read the documentation for parameters. If you use the
parameters that are specified as default values or are suggested in the documentation, you probably won't have many issues.
If you're implementing something, look at the manual for something that has already implemented it. Does it provide a knob to turn for
the parameter you're curious about? What value does it set by default?
To see effects, try orders of magnitude. If you want to know if twiddling the knob on a parameter will make a difference
in the output, don't try changing things by a factor of 20%. Be bold! Multiply the parameter by 10 or 100! The worst case is
that the program crashes or behaves unexpectedly; however, its unexpected behavior may tell you about what that knob does.
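A sweep over orders of magnitude takes only a few lines. In this sketch, count_edges and its threshold are a hypothetical stand-in for whatever parameterized step you are tuning, and the random image is synthetic.

```python
import numpy as np


def count_edges(image, threshold):
    """Hypothetical step: count pixels whose horizontal gradient exceeds threshold."""
    gradient = np.abs(np.diff(image, axis=1))
    return int((gradient > threshold).sum())


rng = np.random.default_rng(0)
image = rng.random((64, 64))

# Sweep the knob by factors of 10, not by 20%.
thresholds = 10.0 ** np.arange(-3, 2)   # 0.001, 0.01, 0.1, 1, 10
counts = [count_edges(image, t) for t in thresholds]
for t, c in zip(thresholds, counts):
    print(f"threshold={t:g} -> {c} edge pixels")
```

Once you see which decade the output changes in, you can search more finely inside that range.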
Dealing with slow programs
Invest in a profiler (python comes with one) or put a bunch of print statements. Don't prematurely optimize anything.
It's almost always faster if you can call some function in numpy or scipy to do your
work. It'll call code that was written in Fortran or C and is well-optimized.
In vision with python (or anything that's not compiled), avoid for loops. Given a 1000x1000
image, the for loop inner body executes 1 million times. The overhead on this loop is
not insubstantial.
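To make the points above concrete, the sketch below profiles a per-pixel loop and an equivalent numpy expression with the standard-library cProfile module. The function names and the brightening operation are made up for illustration; the actual timings depend on your machine, so none are quoted here.

```python
import cProfile
import io
import pstats

import numpy as np


def brighten_loop(image, gain):
    """Per-pixel Python loop: the inner body runs once per pixel."""
    out = np.empty_like(image)
    for r in range(image.shape[0]):
        for c in range(image.shape[1]):
            out[r, c] = min(image[r, c] * gain, 1.0)
    return out


def brighten_vectorized(image, gain):
    """Same computation as one numpy expression over the whole array."""
    return np.minimum(image * gain, 1.0)


def profile_call(func, *args):
    """Run func under cProfile and return (result, text report)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    stats.sort_stats("cumulative").print_stats(5)  # top 5 entries by cumulative time
    return result, stream.getvalue()


image = np.random.default_rng(0).random((200, 200))
slow, slow_report = profile_call(brighten_loop, image, 1.5)
fast, fast_report = profile_call(brighten_vectorized, image, 1.5)
print(slow_report)
print(fast_report)
```

For a whole script, python -m cProfile -s cumulative NAME.py gives the same kind of report without changing any code.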
Tricks:
Remember data structures and algorithms: iterating through lists is SLOW. Looking up in a hash table / dictionary is fast.
Use masks to modify things conditionally inside a matrix. For instance, if X is a matrix, X[X<0] = 0 sets all the negative entries
to zero.
Store things as a numpy array whenever possible. This forces you to try to write things with
numpy rather than with a list and will guide you to writing reasonably fast code.
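The three tricks above look like this in practice. This is a small sketch assuming numpy; the labels and points are toy data.

```python
import numpy as np

# Masks: modify entries conditionally without writing a loop.
X = np.array([[-1.0, 2.0], [3.0, -4.0]])
X[X < 0] = 0   # negative entries become zero, in place

# Lookups: searching a list scans every element; a dict hashes the key instead.
labels = ["cat", "dog", "bird"]
label_to_index = {name: i for i, name in enumerate(labels)}
dog_index = label_to_index["dog"]   # constant time on average

# Arrays over lists: convert once, then let numpy do the arithmetic.
points = np.asarray([[0, 0], [3, 4], [6, 8]], dtype=float)
distances = np.linalg.norm(points, axis=1)   # one Euclidean norm per row
```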
See any of these: 1 2 3