Vic Nalwa
Bell Laboratories


What is currently lacking most in the education of students in computer vision, I believe, is a solid grounding in the following two topics:

1. Image Formation

1.1 Geometry

How are images formed in a pinhole camera?
Why are lenses used and what do they do?
What are the parameters of a lens?
What are the limitations of a lens?

What geometric approximations to image formation are possible and when?
What are perspective, orthographic, and parallel projections?
When do orthographic and parallel projections approximate perspective projection?
What are vanishing points and vanishing lines?
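
The question of when a parallel-ray projection approximates perspective projection can be explored numerically. What follows is a minimal Python sketch, not a definitive treatment. It assumes a pinhole at the origin looking along the Z axis, a focal length f, and a "scaled orthographic" stand-in in which every point is scaled by f divided by the mean depth of the object. The function names and the test object (a unit cube) are hypothetical choices made only for illustration.

```python
def perspective(point, f=1.0):
    """Pinhole (perspective) projection of a 3-D point (X, Y, Z), Z > 0."""
    X, Y, Z = point
    return (f * X / Z, f * Y / Z)


def scaled_orthographic(point, z_ref, f=1.0):
    """Orthographic projection followed by a uniform scale f / z_ref.

    Assumption: z_ref is a single reference depth shared by all points.
    """
    X, Y, _ = point
    s = f / z_ref
    return (s * X, s * Y)


def max_relative_discrepancy(points, f=1.0):
    """Largest gap between the two projections, relative to image size."""
    z_ref = sum(p[2] for p in points) / len(points)  # mean depth
    worst = 0.0
    size = 0.0
    for p in points:
        px, py = perspective(p, f)
        ox, oy = scaled_orthographic(p, z_ref, f)
        worst = max(worst, abs(px - ox), abs(py - oy))
        size = max(size, abs(ox), abs(oy))
    return worst / size


def unit_cube_at(distance):
    """Corners of a unit cube centered on the optical axis at a given depth."""
    half = (-0.5, 0.5)
    return [(x, y, distance + z) for x in half for y in half for z in half]


near = max_relative_discrepancy(unit_cube_at(2.0))
far = max_relative_discrepancy(unit_cube_at(100.0))
print(f"cube at depth 2:   relative discrepancy {near:.4f}")
print(f"cube at depth 100: relative discrepancy {far:.4f}")
```

Under these assumptions the discrepancy shrinks as the depth extent of the object becomes small relative to its distance from the camera, which is the usual condition under which the approximation is considered acceptable.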

1.2 Radiometry

What is the relationship between the brightness and color of a scene and those of its image?
How do we perceive color?
How do we measure color?

1.3 Calibration

How do we calibrate and represent the geometry of image formation?
How do we calibrate and represent the relationship between the scene and image brightness and color?
What are homogeneous coordinates and how are they useful?

2. Image Processing

2.1 Sensing

How is an optical image converted into an electrical digital image, both monochrome and color?

2.2 Sampling

What is the Sampling Theorem?
What is the relationship between an analog image and its discrete counterpart?
What is aliasing and how do we recognize it?
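
Aliasing can be made concrete with a one-dimensional example. The sketch below is only an illustration, assuming a sampling rate of 10 samples per second and two cosines at 1 Hz and 9 Hz. The function name and the specific frequencies are hypothetical choices. Because 9 Hz lies above half the sampling rate, its samples coincide with those of the 1 Hz cosine.

```python
import math


def sample_cosine(freq_hz, rate_hz, n_samples):
    """Samples of cos(2*pi*freq*t) taken at t = n / rate_hz."""
    return [math.cos(2.0 * math.pi * freq_hz * n / rate_hz)
            for n in range(n_samples)]


rate = 10.0                            # samples per second (assumed)
low = sample_cosine(1.0, rate, 20)     # below the Nyquist limit of 5 Hz
high = sample_cosine(9.0, rate, 20)    # above the Nyquist limit

# The two sample sequences agree up to floating-point rounding, so the
# 9 Hz signal is indistinguishable from (aliases to) the 1 Hz signal.
max_gap = max(abs(a - b) for a, b in zip(low, high))
print(f"largest difference between the two sample sets: {max_gap:.2e}")
```

The same effect in images appears when fine periodic detail is sampled too coarsely and shows up as a spurious coarse pattern.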

2.3 Quantization

How do we obtain a digital image from a discrete image?
What is false contouring and how do we recognize it?

2.4 Noise

What is noise?
How do we characterize noise and other random variables?
How do we measure the effect of an image operation on noise?
What is the effect of differentiation on noise?
What is the effect of integration on noise?
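
One way to approach the last three questions is through the variance of the output of a linear filter. The following is a minimal sketch under a stated assumption: the noise is zero-mean, uncorrelated from sample to sample, and of equal variance everywhere. In that case a filter with weights w multiplies the noise variance by the sum of the squared weights. The function names, the two-tap difference used as a stand-in for differentiation, and the five-tap average used as a stand-in for integration are all hypothetical choices for illustration.

```python
import random
import statistics


def noise_gain(weights):
    """Factor by which a linear filter scales the variance of
    zero-mean, uncorrelated, equal-variance noise."""
    return sum(w * w for w in weights)


def filter_valid(signal, weights):
    """Weighted sliding sum over positions where the window fits.

    This is correlation rather than convolution (no kernel flip);
    the output variance is the same either way.
    """
    k = len(weights)
    return [sum(w * signal[i + j] for j, w in enumerate(weights))
            for i in range(len(signal) - k + 1)]


difference = [-1.0, 1.0]   # crude discrete derivative
average = [0.2] * 5        # crude local integration (mean of 5 samples)

rng = random.Random(0)     # fixed seed so the run is repeatable
noise = [rng.gauss(0.0, 1.0) for _ in range(20000)]  # unit-variance noise

measured_diff = statistics.pvariance(filter_valid(noise, difference))
measured_avg = statistics.pvariance(filter_valid(noise, average))

print(f"difference: predicted {noise_gain(difference):.2f}, "
      f"measured {measured_diff:.2f}")
print(f"average:    predicted {noise_gain(average):.2f}, "
      f"measured {measured_avg:.2f}")
```

Under the stated assumption, differencing doubles the noise variance while averaging five samples reduces it to one fifth, which is one reason smoothing usually accompanies differentiation in edge detection.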

2.5 Linear Shift-Invariance and Convolution

What is a linear shift-invariant operation?
What is convolution?
What is the Fourier Transform and how is it useful?

Other topics that students of computer vision must be familiar with include the following:

3. Edge Detection and Image Segmentation

3.1 Edgel Detection

3.2 Edgel Aggregation and Edge Description

3.3 Image Segmentation


4. Line-Drawing Interpretation

4.1 Polygonal Planar Surfaces

4.2 Nonplanar Surfaces


5. Shading

5.1 Quantitative Shape Recovery

5.2 Qualitative Shape Recovery


6. Texture

6.1 Discrimination

6.2 Shape Recovery


7. Stereo

7.1 Theoretical Basis

7.2 Correspondence Establishment


8. Motion

8.1 Motion-Field Estimation

8.2 Motion-Field Analysis


My two favorite books on computer vision are these:

1. A Guided Tour of Computer Vision, V. S. Nalwa, Addison-Wesley, 1993.

2. Robot Vision, B. K. P. Horn, MIT Press, 1986.

Although both Horn and I, through our books, attempt to cover Topics 1 and 2, I believe much better coverage is possible than either book provides. With regard to Topics 3 through 8, I make a stab at covering them in my book. Horn does a good job of discussing the topics he covers -- particularly through questions at the end of each chapter -- but his coverage is limited. My book, on the other hand, provides more uniform coverage but has no questions to offer. Another distinction between the two books is the relatively greater emphasis of Horn's book on mathematical formulations, and the relatively greater emphasis of mine on the elucidation of concepts through figures.


Three non-computer-vision books I refer to often are these:

1. Optics, E. Hecht, Addison-Wesley.

2. Pattern Classification and Scene Analysis, R. O. Duda and P. E. Hart, Wiley.

3. Digital Image Processing, W. K. Pratt, Wiley.

The first provides a lucid introduction to optics and the second to pattern recognition. The third is a standard reference in image processing, but like other books in image processing, it is mired in mathematical notation.


It is my belief -- ingrained in me during my undergraduate education at IIT, Kanpur, and reinforced ever since -- that only when a student has understood the fundamentals of a discipline well can the student understand, appreciate, and apply the particulars of that discipline successfully. It follows that, in educating students in computer vision, we would seem well advised to pay at least as much attention to Topics 1 and 2 as we pay to other topics. Unfortunately, however, the typical duration of a computer-vision course does not permit this, and, further, the diversity of the backgrounds of students who typically enroll in computer-vision courses makes it unrealistic to expect or require prior exposure of all students to basic camera optics and image processing. This state of affairs probably can be rectified with regard to students pursuing M.S. and Ph.D. degrees in computer vision, perhaps by offering these students courses in computer vision different from those we offer to others. You will notice here that I am focusing only on how we may provide students with an understanding of computer vision, and not on how we might train such students to implement, effectively and efficiently, computer-vision algorithms they might devise.


Given that I am from "industry," I guess I must add a few words on what skill set is useful in industry.

It is desirable that fresh graduates have had at least some prior exposure to accomplishing a (preferably useful) task that requires capturing images using one or more (preferably color) cameras from which the images are transferred to a computer (preferably PC) on which they are processed to accomplish the sought task using a program (preferably in C) that is invoked through a user-friendly interface.

Such exposure would stand students in good stead not only in finding employment in industry, but also in starting their own industries. The latter is quite desirable in the self-interest of both the students and the field. As is widely recognized of late, the continued vibrancy of research in computer vision depends to no small extent on its ability to add value to society through the development of its research into technology.


Finally, on a personal note, lest I leave you with the wrong impression, I must confess that my exposure to the fundamentals of image formation both during and before I received my Ph.D. was meager, and that, until recently, I had never used a video camera or a PC. My programming skills remain weak, and I am still discovering the beauty of basic optics.

