Learning Viewpoint Invariant Face Representations from Visual Experience by Temporal Association

Marian Stewart Bartlett and Terrence J. Sejnowski

In press: H. Wechsler, P.J. Phillips, V. Bruce, S. Fogelman-Soulie, T. Huang (Eds.), Face Recognition: From Theory to Applications, NATO ASI Series F. Springer-Verlag.


In natural visual experience, different views of an object or face tend to appear in close temporal proximity. A set of simulations is presented that demonstrates how viewpoint-invariant representations of faces can be developed from visual experience by capturing the temporal relationships among the input patterns. The simulations explored the interaction of temporal smoothing of activity signals with Hebbian learning (Foldiak, 1991) in both a feed-forward system and a recurrent system. The recurrent system was a generalization of a Hopfield network with a lowpass temporal filter on all unit activities. Following training on sequences of gray-level images of faces as they changed pose, multiple views of a given face fell into the same basin of attraction, and the system acquired representations of faces that were approximately viewpoint invariant.
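
The interaction between temporal smoothing and Hebbian learning described above can be illustrated with a minimal sketch for a single unit. It assumes an exponential lowpass filter on the unit's activity and a weight update that moves the weights toward the current input in proportion to the smoothed activity, which is one common form of trace rule. The function names, the `decay` and `rate` parameters, and the toy inputs are hypothetical and chosen for illustration; this is not the authors' implementation.

```python
def lowpass_trace(activities, decay=0.5):
    """Exponentially smooth a sequence of scalar unit activities over time.

    decay is an assumed smoothing parameter in [0, 1); larger values make
    the trace remember earlier activity for longer.
    """
    trace = 0.0
    smoothed = []
    for y in activities:
        trace = decay * trace + (1.0 - decay) * y
        smoothed.append(trace)
    return smoothed


def trace_hebbian_step(weights, inputs, trace, rate=0.1):
    """Move each weight toward its input, scaled by the smoothed activity."""
    return [w + rate * trace * (x - w) for w, x in zip(weights, inputs)]


def train_on_sequence(weights, frames, activities, decay=0.5, rate=0.1):
    """Apply the trace-modulated Hebbian update over one temporal sequence.

    frames is a list of input vectors (e.g. successive views of one face);
    activities holds the unit's instantaneous response to each frame.
    """
    traces = lowpass_trace(activities, decay)
    for x, tr in zip(frames, traces):
        weights = trace_hebbian_step(weights, x, tr, rate)
    return weights


# Toy sequence: the unit responds only to the first "view", but the second
# view follows closely in time, so the lingering trace still strengthens
# the weight for the second view's input component.
w = train_on_sequence(
    weights=[0.0, 0.0],
    frames=[[1.0, 0.0], [0.0, 1.0]],
    activities=[1.0, 0.0],
    decay=0.5,
    rate=0.5,
)
print(w)
```

In this toy run the weight on the second input component becomes positive even though the unit's instantaneous activity was zero on that frame. This is the sense in which temporally adjacent views can come to share a representation under this kind of rule.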