ICA Projects at CNL


Te-Won Lee, Michael Lewicki, Tony Bell (at Interval Research), Terry Sejnowski

ICA algorithms

  • Information Maximization approach (Tony Bell)

  • Blind separation of recordings in a real environment (Te-Won Lee)
    The ICA formulation can be extended to separate mixtures of convolved and time-delayed sources. The goal is to extract the sources from a mixture recorded in a real environment such as an office or conference room. Read the papers and listen to some audio demos.

  • Extended Infomax Algorithm (Te-Won Lee, Mark Girolami)
    This algorithm can blindly separate mixed signals with sub- and super-Gaussian source distributions. It uses a simple learning rule first derived by Girolami (1997), who chose negentropy as a projection pursuit index. Here we use a general stability analysis to switch between sub- and super-Gaussian regimes. The algorithm can separate a variety of source distributions and is effective at separating artifacts such as eye blinks and line noise from EEG recordings. See our paper.

  • ICA Mixture Model (Te-Won Lee, Michael Lewicki)
    An extension of ICA using EM for unsupervised classification. The algorithm finds the independent sources and the mixing matrix for each class, and also computes the class membership probability for each data point. This method is well suited to modeling structure in high-dimensional data and has many potential applications. See our papers.

  • Overcomplete ICA (Michael Lewicki)
    In an overcomplete basis, the number of basis vectors is greater than the dimensionality of the input. The representation of an input is then not a unique combination of basis vectors. However, overcomplete representations are more robust in the presence of noise, are sparser, and have greater flexibility in matching structure in the data.
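The Extended Infomax item above describes a rule that switches between sub- and super-Gaussian regimes. The following is a minimal sketch in Python with NumPy, assuming the commonly cited form of the update, dW = (I - K tanh(u) u^T - u u^T) W, with the switch k_i = sign(E[sech^2(u_i)] E[u_i^2] - E[u_i tanh(u_i)]). The function name, the full-batch updates, the step size, and the iteration count are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def extended_infomax(x, n_iter=1000, lr=0.1):
    """Sketch of an extended-infomax style ICA (hypothetical helper).

    x: array of shape (n_channels, n_samples).
    Returns an unmixing matrix W; W @ (x - mean) estimates the sources
    up to permutation and scaling.
    """
    n, T = x.shape
    x = x - x.mean(axis=1, keepdims=True)

    # Whiten ("sphere") the data: uncorrelated channels with unit variance.
    d, E = np.linalg.eigh(x @ x.T / T)
    sphere = E @ np.diag(d ** -0.5) @ E.T
    z = sphere @ x

    W = np.eye(n)
    for _ in range(n_iter):
        u = W @ z
        t = np.tanh(u)
        # Stability-based switch: +1 selects the super-Gaussian regime,
        # -1 the sub-Gaussian regime, separately for each output.
        k = np.sign(np.mean(1 - t**2, axis=1) * np.mean(u**2, axis=1)
                    - np.mean(t * u, axis=1))
        # Natural-gradient step, averaged over all samples.
        grad = np.eye(n) - (k[:, None] * t) @ u.T / T - u @ u.T / T
        W = W + lr * grad @ W
    return W @ sphere
```

Given a mixture `x = A @ s` of one Laplacian and one uniform source, `extended_infomax(x) @ A` should be close to a scaled permutation matrix, which is the usual check that separation succeeded.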

  • Scott Makeig, Tzyy-Ping Jung, Martin McKeown, Colin Humphries, Terry Sejnowski

    Applications of ICA to electrophysiological data

    Definitions of terms -- EEG, MEG, ERP, ERF

    Electromagnetic fields associated with brain processes and recorded outside the head produce electroencephalographic (EEG) and magnetoencephalographic (MEG) data. Averages of EEG epochs time-locked to a set of experimental events of interest are called event-related potentials (ERPs). Similar magnetic averages are known as event-related fields or ERFs.
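The definition of an ERP above amounts to cutting fixed-length epochs around each event and averaging them. A minimal sketch in Python with NumPy follows; the function name `erp_average`, the argument names, and the (channels, samples) array layout are assumptions made for illustration.

```python
import numpy as np

def erp_average(eeg, event_samples, pre, post):
    """Average EEG epochs time-locked to events (hypothetical helper).

    eeg: array of shape (n_channels, n_samples).
    event_samples: sample indices at which the events occurred.
    pre, post: number of samples to keep before and after each event.
    Returns an array of shape (n_channels, pre + post).
    """
    n_samples = eeg.shape[1]
    # Keep only events whose full epoch lies inside the recording.
    epochs = [eeg[:, s - pre:s + post] for s in event_samples
              if s - pre >= 0 and s + post <= n_samples]
    if not epochs:
        raise ValueError("no complete epochs available")
    # Activity time-locked to the events survives the average;
    # unrelated activity tends to cancel.
    return np.mean(epochs, axis=0)
```

The same averaging applied to MEG channels would give the corresponding ERF.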

    Suitability for ICA decomposition

    In the EEG/MEG frequency range (roughly 0.1-100 Hz) the mixing of brain fields at the scalp electrodes is essentially linear. Although the skull attenuates EEG signals strongly and "smears" (low-pass filters) them spatially, this does not affect the linear relation between potential in the brain and potential at the scalp. Fields propagate to the sensors (electrodes or SQUID coils) through volume conduction without significant delays. This makes EEG and MEG data suited to linear decomposition via ICA. A number of "frequently asked questions" about the application of ICA to averaged or spontaneous EEG/MEG data are answered in Frequently Asked Questions about ICA applied to EEG/MEG data.
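The linearity and absence of delays described above correspond to the standard instantaneous mixing model. In the notation assumed here (not taken from the original page), x(t) is the vector of sensor signals, s(t) the vector of source activations, A the unknown mixing matrix, and W the unmixing matrix that ICA learns:

```latex
\mathbf{x}(t) = \mathbf{A}\,\mathbf{s}(t),
\qquad
\mathbf{u}(t) = \mathbf{W}\,\mathbf{x}(t)
```

The components u(t) estimate the sources only up to permutation and scaling.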

    First Applications

    The ICA algorithm of Bell & Sejnowski was first applied to EEG and ERP data in Makeig S, Bell AJ, Jung T-P, and Sejnowski TJ, "Independent component analysis of electroencephalographic data." Advances in Neural Information Processing Systems 8, 145-151, 1996. This paper demonstrated the successful decomposition of 14-channel ERP data consisting of only 624 data points. Further details have now been published in a PNAS paper on ICA applied to ERP data. Preprint HTML and PostScript versions of this paper are also available for review and download from this site.


    A Matlab toolbox for EEG/MEG analysis using ICA is also available for download. The toolbox consists of scripts for ICA decomposition and plotting of results, together with general-purpose EEG plotting and computational routines. A demo script (icademo) illustrates application of the ICA routines to both synthetic and actual ERP data.


    View summary of recent changes to the toolbox.


    Bibliography of publications on biomedical applications of ICA

    Martin McKeown, Tzyy-Ping Jung, Scott Makeig, Terry Sejnowski

    ICA applied to functional Magnetic Resonance Imaging (fMRI) data analysis

    fMRI data

    fMRI data is a complicated mixture of different sources of variability: cardiac and respiratory pulsations, subtle head movements, task-related activity changes and machine noise. Changes related to the performance of psychomotor tasks may constitute as little as 10-15% of the variance of the Blood Oxygen Level Dependent (BOLD) contrast signal in a 1.5T magnet, so extracting the small task-related changes from the measured signal is difficult.

    ICA decomposition of fMRI data

    ICA, in the manner applied to ERP and EEG (see above), is inappropriate for fMRI analysis because the number of "channels" (i.e. voxels) greatly exceeds the number of time points in a typical fMRI experiment. In 1997, it was first proposed to look for spatially independent patterns of activity in fMRI data [ref]. This assumes that the spatial distributions associated with each of the above sources of variability are independent, and that the contributions from each spatial pattern sum linearly to represent the data. The time courses associated with the different spatial patterns can potentially be correlated, allowing for the detection of spatial patterns whose time courses are transiently task-related (TTR) as well as consistently task-related (CTR). The criterion of spatial independence appears to be a powerful way to separate task-related activations from other sources of variability making up the BOLD signal, as explained in Frequently Asked Questions about ICA applied to fMRI data.
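The spatial formulation described above can be sketched by treating each scan (time point) as one mixture and the voxels as samples, so that the data factor as time courses times spatially independent maps. The example below is a minimal Python/NumPy illustration under stated assumptions: it uses a symmetric FastICA fixed-point iteration with a tanh nonlinearity as a stand-in for the infomax algorithm, and the function name `spatial_ica`, the PCA reduction step, and the iteration count are hypothetical choices.

```python
import numpy as np

def spatial_ica(X, n_components, n_iter=200, seed=0):
    """Spatial ICA sketch (hypothetical helper, FastICA stand-in).

    X: array of shape (n_timepoints, n_voxels).
    Returns (maps, time_courses) with shapes
    (n_components, n_voxels) and (n_timepoints, n_components).
    """
    rng = np.random.default_rng(seed)
    n_vox = X.shape[1]
    # Each row (scan) is a mixture; remove its mean over voxels.
    Xc = X - X.mean(axis=1, keepdims=True)

    # PCA reduction and whitening in one step via the SVD.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    Z = Vt[:n_components] * np.sqrt(n_vox)  # rows have unit variance

    # Symmetric FastICA iteration starting from a random orthogonal matrix.
    W = np.linalg.qr(rng.standard_normal((n_components, n_components)))[0]
    for _ in range(n_iter):
        G = np.tanh(W @ Z)
        W = G @ Z.T / n_vox - np.diag(np.mean(1 - G**2, axis=1)) @ W
        # Symmetric decorrelation: project W back onto orthogonal matrices.
        Uw, _, Vw = np.linalg.svd(W)
        W = Uw @ Vw

    maps = W @ Z                             # spatially independent patterns
    time_courses = Xc @ np.linalg.pinv(maps)  # one time course per map
    return maps, time_courses
```

Nothing in this factorization forces the columns of `time_courses` to be uncorrelated, which is what lets correlated (for example transiently and consistently task-related) time courses coexist.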

    Further details have been published in a PNAS paper on ICA applied to fMRI data (which can be downloaded) and a Human Brain Mapping paper.



    Marni Bartlett, Terry Sejnowski

    Face recognition using ICA

    In a task such as face recognition, much of the important information may be contained in the high-order relationships among the image pixels. Some success has been attained using data-driven face representations based on principal component analysis, such as "Eigenfaces" (Turk & Pentland, 1991) and "Holons" (Cottrell & Metcalfe, 1991). Principal component analysis (PCA) is based on the second-order statistics of the image set, and does not address high-order statistical dependencies such as the relationships among three or more pixels. Independent component analysis (ICA) is a generalization of PCA which separates the high-order moments of the input in addition to the second-order moments. We developed image representations based on the independent components of the face images and compared them to a PCA representation for face recognition.

    ICA was performed on the face images under two different architectures. The first architecture provided a set of statistically independent basis images for the faces that can be viewed as a set of independent facial features. These ICA basis images were spatially local, unlike the PCA basis vectors. The representation consisted of the coefficients for the linear combination of basis images that comprised each face image. The second architecture produced independent coding variables (coefficients). This provided a factorial face code, in which the probability of any combination of features can be obtained from the product of their individual probabilities. The distributions of these coefficients were sparse and highly kurtotic. Classification was performed using nearest neighbor, with similarity measured as the cosine of the angle between representation vectors. Both ICA representations were superior to the PCA representation for recognizing faces across sessions, changes in expression, and changes in pose.
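The classification step described above, nearest neighbor with cosine similarity, can be sketched as follows in Python with NumPy. The function and argument names are illustrative assumptions; the rows of `train` and `probe` stand for whatever representation vectors (ICA or PCA coefficients) are being compared.

```python
import numpy as np

def cosine_nearest_neighbor(train, labels, probe):
    """Label each probe vector with the label of its most similar
    training vector, using cosine similarity (hypothetical helper).

    train: array (n_train, n_features); labels: sequence of n_train labels.
    probe: array (n_probe, n_features).
    """
    train = np.asarray(train, dtype=float)
    probe = np.asarray(probe, dtype=float)
    # Normalize to unit length so the dot product equals the cosine.
    train_n = train / np.linalg.norm(train, axis=1, keepdims=True)
    probe_n = probe / np.linalg.norm(probe, axis=1, keepdims=True)
    sims = probe_n @ train_n.T              # (n_probe, n_train) cosines
    return np.asarray(labels)[np.argmax(sims, axis=1)]
```

Because the vectors are normalized, the match depends only on the direction of each representation vector and not on its overall magnitude.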

    Papers on face image analysis using ICA by Marian Stewart Bartlett.

    Michael Gray, Terry Sejnowski

    Lip-reading using ICA

    What is the appropriate spatial scale for image representation? In the primate visual system, receptive fields are small at early stages of processing (area V1), and larger at late stages of processing (areas MT, IT). In the current work, we explore the efficiency of local and global image representations on an automatic visual speech recognition task using an HMM as the recognition system. We compare local and global principal component and independent component image representations for the task. Local representations consistently and significantly outperformed global representations in terms of generalization to new speakers.

    Gray, M.S., Movellan, J.R., and Sejnowski, T. J. (1997). A comparison of local versus global image decompositions for visual speechreading. Proceedings of the 4th Annual Joint Symposium on Neural Computation, Pasadena, CA, May 17, 1997.