Learning Viewpoint Invariant Representations of Faces in an
Attractor Network
Marian Stewart Bartlett and Terrence J. Sejnowski
Paper presented at the 1st Annual Meeting of the University of California
Multicampus Research Group in Vision Modeling, UC Irvine, October 16,
1996.
Abstract
In natural visual experience, different views of an object or face tend to
appear in close temporal proximity as an animal manipulates the object or
navigates around it, or as a face changes expression or pose. One way to
learn to recognize objects despite changes in viewpoint is to associate
patterns that occur close together in time. Capturing the temporal
relationships among patterns makes it possible to associate different
views of an object automatically, without requiring three-dimensional
structural descriptions. We present a set of simulations demonstrating how
viewpoint-invariant representations can be developed from visual
experience through unsupervised learning that captures these temporal
relationships among the input patterns.
We explored two mechanisms for developing viewpoint-invariant
representations of gray-level images of faces:

1. Competitive Hebbian learning of feedforward connections with a lowpass
temporal filter on the activity of the post-synaptic unit (Bartlett &
Sejnowski, 1996; Földiák, 1991). A sketch of this trace rule appears
after this list.

2. An attractor network that combines Hebbian learning with a lowpass
temporal filter on unit activities. When the input patterns to an
attractor network are passed through a lowpass temporal filter, a basic
Hebbian weight update rule takes a form related to that of Griniasty,
Tsodyks & Amit (1993), which associates temporally proximal input
patterns into common basins of attraction; see the second sketch below.

We implement these two mechanisms in a model with both feedforward and
lateral components.
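
To make the first mechanism concrete, the following is a minimal sketch,
in Python/NumPy, of competitive Hebbian learning with a lowpass temporal
filter (trace) on post-synaptic activity, in the spirit of Földiák (1991).
The winner-take-all competition, the filter constant lam, the learning
rate eta, and the weight normalization are illustrative assumptions, not
the exact formulation used in these simulations.

    import numpy as np

    def trace_rule_step(W, x, y_trace, lam=0.5, eta=0.01):
        # W:        (n_out, n_in) feedforward weights
        # x:        (n_in,) current input pattern
        # y_trace:  (n_out,) lowpass-filtered post-synaptic activity
        # lam, eta: trace constant and learning rate (assumed values)

        # Feedforward activation with winner-take-all competition
        # (one simple form of competitive learning; an assumption here).
        a = W @ x
        y = np.zeros_like(a)
        y[np.argmax(a)] = 1.0

        # Lowpass temporal filter (trace) on post-synaptic activity:
        # y_trace(t) = (1 - lam) * y(t) + lam * y_trace(t - 1)
        y_trace = (1.0 - lam) * y + lam * y_trace

        # Hebbian update driven by the trace, so views that occur close
        # together in time strengthen the same output unit's weights.
        W = W + eta * np.outer(y_trace, x)
        # Normalize weight vectors to keep them bounded.
        W = W / np.linalg.norm(W, axis=1, keepdims=True)
        return W, y_trace

Because the trace decays slowly, an output unit that wins for one pose of
a face continues to receive Hebbian credit for the poses that follow it
in the sequence.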
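
The second mechanism can be sketched in the same style. In the Griniasty,
Tsodyks & Amit (1993) form, the lateral weight matrix contains, besides
the usual Hebbian autoassociative term, cross terms linking each pattern
to its temporal neighbors. The parameter a, the +/-1 pattern coding, and
the synchronous recall dynamics below are assumptions for illustration.

    import numpy as np

    def gta_weights(patterns, a=0.4):
        # patterns: (P, N) array of +/-1 patterns; row mu is the pattern
        # seen at time step mu (e.g., successive poses of one face).
        # a: strength of the cross-temporal terms (assumed value).
        patterns = np.asarray(patterns, dtype=float)
        P, N = patterns.shape
        # Standard Hebbian autoassociative term.
        J = patterns.T @ patterns
        # Cross terms linking each pattern to its temporal neighbors;
        # these merge temporally proximal views into one basin.
        for mu in range(P - 1):
            J += a * (np.outer(patterns[mu + 1], patterns[mu])
                      + np.outer(patterns[mu], patterns[mu + 1]))
        J = J / N
        np.fill_diagonal(J, 0.0)  # no self-connections
        return J

    def recall(J, state, steps=20):
        # Iterate the network to a fixed point (synchronous updates).
        for _ in range(steps):
            state = np.sign(J @ state)
            state[state == 0] = 1  # arbitrary tie-break
        return state

For suitable values of a, the attractor reached from one pattern is
correlated with that pattern's neighbors in the training sequence, which
is the sense in which temporally proximal views fall into one basin.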
Following training on sequences of gray-level images of faces as they
change pose, multiple views of a given face fall into the same basin of
attraction, and the system acquires representations of faces that are
largely independent of pose.