Indicative features of an fMRI data set can be evaluated with methods from random matrix theory (RMT). RMT considers ensembles of matrices and yields statements about the properties of the typical eigensystems in these ensembles. The authors study the particular ensemble of random correlation matrices, which can serve as a noise model for the empirical data and thus allows the selection of data features that differ significantly from the noisy background. In this sense RMT can be understood as offering a systematic approach to surrogate data. Interestingly, the noise characteristics themselves also change between experimental conditions, as revealed by higher-order statistics available from RMT. The authors illustrate the RMT-based approach on an exemplary data set, distinguishing a visuomotor task from a resting condition. In addition, it is shown for these data that the degree of sparseness and of localization can be evaluated rigorously, provided that the data are sufficiently well described by the pairwise cross-correlations.
The aim of art is to represent not the outward appearance
of things, but their inward significance. — Aristotle
To reveal features of interest in empirical data there is often no other option than a comparison with the corresponding quantities in surrogate data, that is, shuffled, boosted, randomized or otherwise rearranged data of the same kind. Surrogate data (Theiler, Eubank, Longtin, Galdrikian, & Farmer, 1992) provide a contrast or baseline against which relevant data features are compared, but how to generate surrogate data that provide the desired contrast remains a matter of ongoing debate. Shuffling the data may introduce a level of randomness against which any feature appears significant. In addition, in a high-dimensional problem the surrogate data may become sparse and thus no longer sufficiently representative of the underlying distribution. By reference to random matrices we suggest a more systematic framework for providing baselines for data features of potential interest. This framework does not necessarily discriminate artifacts from intrinsic features. It will, however, systematically reduce the data space so that other methods can later be invoked to analyze the data further. It will also provide sensitive means for distinguishing the various scenarios under which seemingly similar data were obtained. If a prediction from Random Matrix Theory (RMT) exists for a certain quantity, then the difference between two data sets can be rated relative to their respective distances from the theoretical value. Of particular interest, furthermore, is that RMT provides descriptions of spatial properties of the data. These can be used to discriminate active from non-active brain voxels, which is an essential step in the analysis of fMRI data. Thus, suggestive data properties such as sparseness and localization of features can also be expressed by quantities that are meaningful in the theory of random matrices.
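The shuffling-based baseline discussed above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' procedure: the data are synthetic stand-ins rather than fMRI recordings, the test statistic (the largest eigenvalue of the correlation matrix) is chosen only for illustration, and the function names `largest_corr_eigenvalue`, `shuffle_surrogate` and `surrogate_baseline` are hypothetical.

```python
import numpy as np

def largest_corr_eigenvalue(data):
    """Largest eigenvalue of the channel-by-channel correlation matrix.

    data has shape (n_channels, n_timepoints).
    """
    corr = np.corrcoef(data)
    return np.linalg.eigvalsh(corr)[-1]  # eigvalsh returns ascending order

def shuffle_surrogate(data, rng):
    """Permute each time series independently in time.

    This keeps every channel's amplitude distribution but destroys
    cross-correlations (and temporal structure) between channels.
    """
    return np.stack([rng.permutation(row) for row in data])

def surrogate_baseline(data, n_surrogates=200, seed=0):
    """Statistic evaluated on n_surrogates shuffled copies of the data."""
    rng = np.random.default_rng(seed)
    return np.array([
        largest_corr_eigenvalue(shuffle_surrogate(data, rng))
        for _ in range(n_surrogates)
    ])

# Synthetic stand-in for real recordings (an assumption, not fMRI data):
# 20 noisy channels, 8 of which share a common signal.
rng = np.random.default_rng(1)
n_channels, n_timepoints = 20, 300
shared = rng.standard_normal(n_timepoints)
data = rng.standard_normal((n_channels, n_timepoints))
data[:8] += shared

observed = largest_corr_eigenvalue(data)
baseline = surrogate_baseline(data)

# One-sided empirical p-value with the usual +1 correction
p_value = (1 + np.sum(baseline >= observed)) / (1 + baseline.size)
```

The sketch also makes the limitation raised in the text concrete: the baseline is only as good as the finite number of surrogates drawn, and independent shuffling removes all structure at once, so almost any correlated feature will stand out against it.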
Random matrix theory studies ensembles of matrices. An ensemble is a probability distribution over a set of matrices. In the limit of high dimensions the matrices of an ensemble are similar in the sense that they share certain properties regardless of the dynamical principles or the interactions underlying the system. In this sense the properties of the ensemble are universal. A random matrix approach to data processing therefore does not merely study a few sets of surrogate data, but allows us in principle to compare the given data set with the set of all possible surrogate samples.
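As a hedged illustration of comparing data with an ensemble prediction rather than with a handful of surrogates, the sketch below uses the Marchenko-Pastur law, a standard RMT result for correlation matrices of independent, unit-variance noise. The choice of this particular law is an assumption made here for illustration; the text above does not name it. For N variables observed at T time points with q = N/T at most 1, the eigenvalues of a pure-noise correlation matrix asymptotically fall between (1 - sqrt(q))^2 and (1 + sqrt(q))^2. The function names and the synthetic data are hypothetical.

```python
import numpy as np

def marchenko_pastur_bounds(n_vars, n_samples):
    """Asymptotic edges of the noise eigenvalue bulk (unit variance, q <= 1)."""
    q = n_vars / n_samples
    return (1 - np.sqrt(q)) ** 2, (1 + np.sqrt(q)) ** 2

def eigenvalues_above_noise(data):
    """Eigenvalues of the correlation matrix that exceed the upper MP edge.

    data has shape (n_vars, n_samples). Finite-size fluctuations can push
    a pure-noise eigenvalue slightly past the edge, so marginal outliers
    should be treated with care.
    """
    n_vars, n_samples = data.shape
    eigvals = np.linalg.eigvalsh(np.corrcoef(data))
    _, upper = marchenko_pastur_bounds(n_vars, n_samples)
    return eigvals[eigvals > upper], upper

# Synthetic example (an assumption, not real fMRI data)
rng = np.random.default_rng(0)
n_vars, n_samples = 50, 2000
noise = rng.standard_normal((n_vars, n_samples))

signal = noise.copy()
signal[:10] += rng.standard_normal(n_samples)  # 10 variables share a source

noise_eigs = np.linalg.eigvalsh(np.corrcoef(noise))
signal_outliers, upper = eigenvalues_above_noise(signal)
```

Here the theoretical edge plays the role of the baseline, so no surrogate samples need to be generated. Eigenvalues well above `upper` are candidates for features that differ from the noisy background.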
Key Terms in this Chapter
Universality: In statistical mechanics, the observation that a large class of systems share properties that are independent of their dynamics.
Independent Component Analysis (ICA): A computational method for separating statistically independent sources that are linearly mixed.
Shannon Entropy: Also called information entropy, a measure of the uncertainty associated with a random variable. It quantifies the average amount of information conveyed per message.
Model Order Selection: The proper selection of the number of effective features underlying the data.
Principal Component Analysis (PCA): A linear orthogonal transformation that maps the data to a new coordinate system whose axes point along the directions of maximal variance of the multivariate data. The objectives of PCA are 1) to identify meaningful underlying variables and 2) to possibly reduce the dimensionality of the data set.
Functional Magnetic Resonance Imaging (fMRI): A noninvasive neuroimaging technique that studies neural activity based on metabolic changes in the brain while the subject is stimulated or performs a task.
Random Matrix Theory (RMT): Concerned with the statistical properties of the eigenvalues and eigenvectors of large matrices drawn from various ensembles, which are determined by certain symmetries (for example, real symmetric or Hermitian matrices).
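Two of the terms above, Shannon entropy and PCA, have compact computational forms. The following is a minimal sketch under stated assumptions: entropy is computed for a discrete distribution, H = -sum_i p_i log p_i, and PCA is done by eigendecomposition of the sample covariance matrix. The function names `shannon_entropy` and `pca` are hypothetical and not taken from the chapter.

```python
import numpy as np

def shannon_entropy(p, base=2.0):
    """Entropy of a discrete distribution (in bits for base=2).

    p is normalized to sum to 1; zero-probability outcomes contribute
    nothing, following the convention 0 * log 0 = 0.
    """
    p = np.asarray(p, dtype=float)
    p = p / p.sum()
    nz = p[p > 0]
    return float(-np.sum(nz * np.log(nz)) / np.log(base))

def pca(data, n_components):
    """PCA via eigendecomposition of the sample covariance matrix.

    data has shape (n_samples, n_features). Returns the projected data
    and the variances along the retained components, largest first.
    """
    centered = data - data.mean(axis=0)
    cov = np.cov(centered, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)  # ascending eigenvalues
    order = np.argsort(eigvals)[::-1][:n_components]
    return centered @ eigvecs[:, order], eigvals[order]

# Usage on synthetic data (an assumption, for illustration only)
rng = np.random.default_rng(0)
samples = rng.standard_normal((200, 3)) * np.array([3.0, 1.0, 0.1])
scores, variances = pca(samples, n_components=2)
uniform_bits = shannon_entropy([0.25, 0.25, 0.25, 0.25])
```

Choosing `n_components` is an instance of the model order selection problem defined above; the sketch leaves that choice to the caller.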