Tom Olson
Texas Instruments


Computer vision is about to go mainstream. Over the next decade I believe that computer vision systems will become commonplace, and that vision technology will be applied across a broad range of business and consumer products. For the next decade (at least!), however, computer vision systems will continue to be specialized to particular domains, and will require tailoring by experts to the tasks they are to perform. Thus there will be strong industry demand for computer vision engineers - for people who understand vision technology and know how to apply it to real-world problems.

In order to train students for industrial careers, we need to understand the differences between industrial and academic values. Computer vision has historically been an academic discipline governed by academic values. A piece of work is considered good if it is publishable. To be publishable, it must:
a) be novel: it must describe a previously unknown technique, or provide new information about a known technique or problem;
b) be based on a sound theory: there must be a coherent story about why the new technique should work and/or behave the way it does; and
c) be reasonably general: the theory upon which it is based should apply to a broad or otherwise interesting class of scenes.

The technique or theory should also, of course, _work_ in some sense. Specifically, publishable papers should give evidence (e.g. by experiment) that the underlying theory holds up under the acid test of application.

Industry (including industrial research) is governed by a rather different set of values. The 'gold standard' of quality is not publishability, but marketability: a piece of work is good if it is worth money to someone. To be marketable, it must:
a) be useful: it must satisfy some need; and
b) work reliably in some specified domain.

In addition, it is a Good Thing if the work is novel enough to be protectable under patent law.

These two sets of values aren't as different as they might appear. I doubt that you can have reliable systems without sound theoretical underpinnings, so both camps value intellectual rigor and reasoning from principled models of the problem. The difference in a nutshell is that industry cares less about novelty, and much, much more about how well your algorithm works. Industrial computer vision systems must really, truly, work.

So what does this say about what we should teach? I assume that we can't teach ill-defined attributes such as creativity - or if we can, we should do it in first-year courses that everyone takes. I don't see any need to have students take business or economics courses either. What students will need in industry is the ability to:
a) analyze a domain and problem,
b) find or develop applicable techniques, and
c) use them to build a solution.

I claim that most computer vision curricula already do a good job of teaching problem analysis. We do less well at helping students find applicable techniques - some very useful material is often skipped, and new students often lack the practical experience needed to tell them what will really work in a given situation. We do a very bad job of preparing students to build vision systems, and it is here that I think we should concentrate.

Guidelines for teaching computer vision:

1) Teach what works. The field of computer vision is subject to fads, and curricula and textbooks tend to follow along. This results in students whose education quickly becomes dated, and who know a great deal about techniques that don't work. We should teach what works, even if it isn't very glamorous. Pattern recognition and clustering techniques are an example: they aren't terribly exciting, but they work very well in their domains. Every student should know about them and know when to use them.
2) Teach implementation. Writing computer vision programs isn't easy. Debugging them (or even telling whether they contain a bug) is harder. Conventional programming courses don't really give students the tools they need to tackle these problems. Plan to spend at least a little time talking about how to write and debug vision programs. If your students come from non-CS backgrounds, make them take some large-scale programming courses. They'll need them.
3) Provide _lots_ of practical experience. In a real compiler course, you write a small compiler. In a serious OS course, you write a toy OS (or modify a real one). The theory is that until you've massaged the code, you don't really understand the algorithms or what is involved in translating them into systems. I claim that the same applies to computer vision. Yes, we can't do this as elegantly as the compiler people. But our students need the experience of putting together vision systems that perform non-trivial tasks.
4) Provide good tools. If we're going to have students build serious applications, we have to give them adequate tools. Otherwise they'll spend the whole semester beating their heads against low-level C bugs, and will never have the experience of seeing their systems run correctly. Yes, everyone should have the experience of writing a convolution loop once; but nobody should have the experience twice. At the least, provide a solid library of low-level IP and display routines (e.g. HIPS, Vista, Khoros). Try to provide higher level structures and operations as well, e.g. line, curve and region-finding routines.
5) Stress evaluation. Treat computer vision as an experimental science. After students have built a system (and convinced themselves that it performs the computation it's supposed to), have them do a serious evaluation of how well it works. Never let them turn in 'a program'; demand a writeup that discusses the theory behind the computation, the implementation itself, the testing strategy, and the results. They should be able to tell you the conditions under which their system should succeed or fail.
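
As an illustration of guidelines 2, 4 and 5, here is a minimal sketch in Python of the convolution loop a student might write once, together with a check that helps tell whether it contains a bug. The names `convolve2d` and `impulse_image` are hypothetical, chosen for this example only; nothing here is taken from HIPS, Vista, or Khoros. The sketch assumes images and kernels are plain lists of lists and that no border padding is wanted (the output shrinks by the kernel size minus one).

```python
def convolve2d(image, kernel):
    """Naive 2-D convolution without padding ('valid' output region only).

    `image` and `kernel` are rectangular lists of lists of numbers.
    """
    ih, iw = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    oh, ow = ih - kh + 1, iw - kw + 1
    if oh <= 0 or ow <= 0:
        raise ValueError("kernel is larger than the image")

    out = [[0.0] * ow for _ in range(oh)]
    for y in range(oh):
        for x in range(ow):
            acc = 0.0
            for j in range(kh):
                for i in range(kw):
                    # Convolution (as opposed to correlation) flips the
                    # kernel along both axes.
                    acc += image[y + j][x + i] * kernel[kh - 1 - j][kw - 1 - i]
            out[y][x] = acc
    return out


def impulse_image(size):
    """Square image of zeros with a single 1 at the centre (size must be odd)."""
    img = [[0] * size for _ in range(size)]
    img[size // 2][size // 2] = 1
    return img


if __name__ == "__main__":
    # Deliberately asymmetric kernel: a missing flip or swapped index
    # would show up as a mirrored or transposed result.
    kernel = [[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]]
    response = convolve2d(impulse_image(5), kernel)
    assert response == kernel, "impulse response should reproduce the kernel"
    print("impulse test passed")
```

The impulse test is one example of stating in advance what the system should do and then checking it: convolving a single bright pixel with a kernel must return the kernel itself, so an asymmetric kernel exposes flipped or transposed indexing that a symmetric smoothing kernel would hide.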
