Partial Least Squares (PLS)
Authors: Golia Shafiei
This tutorial explains the concept of Partial Least Squares (PLS) analysis in general, which was first introduced to the neuroimaging community in 1996 (McIntosh et al., 1996). In addition, we illustrate how to run the PLS process on sample data in Brainstorm.
PLS is a free toolbox that is available at Baycrest (https://www.rotman-baycrest.on.ca/index.php?section=84). The PLS code is written entirely in MATLAB (Mathworks Inc) and can be downloaded from https://www.rotman-baycrest.on.ca/index.php?section=345. To cite the PLS Toolbox, please see the “References” section of this tutorial.
Partial Least Squares (PLS) analysis is a multivariate statistical technique that is used to find the relationship between two blocks of variables. PLS has various applications and types (Krishnan et al., 2011); however, the focus of this tutorial is on Mean-Centered PLS analysis, which is a common type of PLS when working with neuroimaging data. In this type of PLS analysis, one data block is neural activity (e.g. MEG measurements/source data here) while the other one is the experiment design (e.g. different groups/conditions).
PLS analysis is based on extracting the common information between the two data blocks by finding a correlation matrix and linear combinations of variables in both data blocks that have maximum covariance with one another. In the example provided here, we find a contrast between different conditions as well as patterns of brain activity that maximally covary with that specific contrast.
For this purpose, we take the neural activity as one data block, matrix X, where the rows of matrix X are observations (participants/trials) nested in conditions or groups, and the columns of X are variables arranged so that time points are nested within sources. The other data block, matrix Y, is a dummy-coded matrix that describes the experimental design (different groups or conditions) (Krishnan et al., 2011).
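As a rough illustration of this layout, the following Python sketch shows how a column of X could map to a (source, time) pair when time is nested within sources, and what a dummy-coded design matrix Y could look like. It is not part of Brainstorm or the PLS Toolbox; the sizes, labels and function names are hypothetical and chosen only for the example.

```python
# Hypothetical sizes and labels, chosen only for illustration.
n_sources = 3
n_times = 4
conditions = ["A", "A", "B", "B", "C", "C"]  # one label per row of X


def column_index(source, time):
    """Column of X holding `source` at `time` (time nested within sources)."""
    return source * n_times + time


def column_to_source_time(col):
    """Inverse mapping: column index -> (source, time)."""
    return divmod(col, n_times)


# Dummy-coded design matrix Y: one column per condition,
# with a 1 where the row (observation) belongs to that condition.
levels = sorted(set(conditions))
Y = [[1 if c == level else 0 for level in levels] for c in conditions]
```

With this convention, all time points of the first source occupy the first `n_times` columns, followed by all time points of the second source, and so on.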
PLS analysis first calculates a mean-centered matrix using matrices X and Y. Then, singular value decomposition (SVD) is applied to the mean-centered matrix. The outcome of PLS analysis is a set of latent variables, which are linear combinations of the initial variables of the two data blocks that maximally covary with the resulting contrasts (Krishnan et al., 2011; Misic et al., 2016).
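The two steps above can be sketched in a few lines of Python with NumPy. This is a minimal illustration under stated assumptions, not the PLS Toolbox implementation: it assumes the mean-centered matrix is built from condition-wise means with the mean across conditions removed from each column, and the function name `mean_centered_pls` and the simulated data are hypothetical.

```python
import numpy as np


def mean_centered_pls(X, labels):
    """Sketch of mean-centered PLS (illustrative, not the toolbox code).

    X      : (n_observations, n_variables) data matrix
    labels : condition label for each row of X
    """
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    levels = np.unique(labels)
    # Condition-wise means: one row per condition.
    M = np.vstack([X[labels == lv].mean(axis=0) for lv in levels])
    # Remove the mean across conditions from every column.
    M_centered = M - M.mean(axis=0, keepdims=True)
    # SVD: columns of U are contrasts across conditions, columns of V are
    # weights over the original variables, s holds the singular values.
    U, s, Vt = np.linalg.svd(M_centered, full_matrices=False)
    return levels, U, s, Vt.T


# Simulated data: 3 conditions, 10 observations each, 8 variables.
rng = np.random.default_rng(0)
labels = np.repeat(["A", "B", "C"], 10)
X = rng.normal(size=(30, 8))
X[labels == "C", :4] += 2.0  # condition C differs on the first four variables

levels, U, s, V = mean_centered_pls(X, labels)
# Projection of each observation onto the first latent variable.
brain_scores = X @ V[:, 0]
```

Because the column means have been removed, a design with k conditions yields at most k - 1 latent variables with a non-zero singular value.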
Finally, the statistical significance of a latent variable is assessed with a p-value calculated from a permutation test. In addition, bootstrapping is used to assess the reliability of each original variable (e.g. a source at a time point) that contributes to the latent variable, and bootstrap ratios are calculated for each original variable for this purpose. More specifically, each latent variable consists of a singular value that describes the effect size, as well as a set of singular vectors, or weights, that define the contribution of each initial variable to the latent variable. The ratio of each weight to its standard error estimated from bootstrapping is called the bootstrap ratio. Therefore, the larger the magnitude of a bootstrap ratio, the larger the weight (i.e. contribution to the latent variable) and the smaller the standard error (i.e. higher stability) (McIntosh and Lobaugh, 2004; Misic et al., 2016). A bootstrap ratio can be interpreted like a z-score if the bootstrap distribution is approximately normal (Efron and Tibshirani, 1986).
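To make these two quantities concrete, here is a small Python sketch. It assumes the permuted singular values and the bootstrap estimates of a weight have already been computed elsewhere; the function names and all numbers are made up for illustration and do not come from the PLS Toolbox.

```python
import statistics


def permutation_p_value(observed_sv, permuted_svs):
    """Fraction of permuted singular values at least as large as the observed one."""
    exceed = sum(1 for sv in permuted_svs if sv >= observed_sv)
    return exceed / len(permuted_svs)


def bootstrap_ratio(weight, bootstrap_weights):
    """Original weight divided by the standard error estimated from bootstrap samples."""
    standard_error = statistics.stdev(bootstrap_weights)
    return weight / standard_error


# Made-up singular values from 10 permutations of the condition labels.
observed = 5.0
permuted = [1.2, 2.5, 0.8, 5.3, 3.1, 2.2, 1.9, 4.0, 2.7, 1.1]
p = permutation_p_value(observed, permuted)  # 1 of 10 values is >= 5.0 -> 0.1

# Made-up bootstrap estimates of one weight (e.g. one source at one time point).
weight = 0.6
boot = [0.5, 0.7, 0.55, 0.65, 0.6]
bsr = bootstrap_ratio(weight, boot)
```

In practice, far more permutations and bootstrap samples are used than in this toy example, and some implementations add one to both the numerator and the denominator of the p-value so that it can never be exactly zero.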
This section explained PLS analysis in general terms. However, this tutorial assumes that users are already familiar with the basics of PLS analysis. If PLS is new to you, or if you want to read more about PLS and its applications in detail, please refer to the articles listed in the “References” section.
Download and installation
In order to run the PLS process in Brainstorm, the PLS Toolbox must be downloaded from the Baycrest website (see the download link above) and added to the MATLAB path.
Data, Pre-Processing and Source Analysis
The data processed here is the same dataset that is used in MEG visual tutorial: Single subject and MEG visual tutorial: Group analysis. This dataset consists of simultaneous MEG/EEG recordings of 19 subjects performing a simple visual task on a large number of famous, unfamiliar and scrambled faces. A detailed presentation of the experiment is available in the MEG visual tutorial: Single Subject.
You can follow this tutorial after processing the data as illustrated in MEG visual tutorial: Single Subject. Then:
After you have computed all the averages across subjects, continue with Section 7: filter the signals below 32 Hz and extract time as explained there. However, when filtering the sources, do not normalize the source values with respect to baseline (i.e. do not compute z-scores).
- Data is now ready for PLS analysis.
Input: the input is the channel data from two conditions (e.g. condA and condB) across time. The number of samples per condition should be the same for condA and condB, and each condition should contain at least two samples.
Output: the output is a decoding curve across time, showing your decoding accuracy (decoding condA vs. condB) at time point 't'.
Classifier: Two methods are offered for the classification of MEG recordings across time: support vector machine (SVM) and linear discriminant analysis (LDA).
In the context of this tutorial, we have two condition types: faces and scenes. We want to decode faces vs. scenes using 306 MEG channels. In the data, the faces are named condition ‘201’ and the scenes are named condition ‘203’.
Go to the Download page of this website, and download the file: sample_decoding.zip
- Unzip it in a folder that is not in any of the Brainstorm folders (program folder or database folder). It is really important that you always keep your original data files in a separate folder: the program folder can be deleted when updating the software, and the contents of the database folder are supposed to be manipulated only by the program itself.
- Start Brainstorm (Matlab scripts or stand-alone version).
Select the menu File > Create new protocol. Name it "TutorialDecoding" and select the options:
Import the recordings
- Go to the "functional data" view (sorted by subjects).
Right-click on the TutorialDecoding folder > New subject > Subject01
Leave the default options you defined for the protocol.
- We will not pay much attention to MEG/MRI registration because we are not going to compute any source models: the decoding is done on the sensor data.
- Select only two events: 201 (faces) and 203 (scenes)
- Epoch time: [-100, 1000] ms
- Remove DC offset: Time range: [-100, 0] ms
- Do not create separate folders for each event type
You will get a message saying "some epochs are shorter than the others". Answer yes.
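The "Remove DC offset" option above amounts to subtracting the mean of the baseline window from each channel of each epoch. A minimal Python sketch of that idea follows; it is not Brainstorm's code, the function name and sample values are hypothetical, and including both end points of the baseline window is an assumption.

```python
def remove_dc_offset(signal, times_ms, baseline=(-100.0, 0.0)):
    """Subtract the mean of the baseline window from every sample of one channel."""
    start, stop = baseline
    # Assumption: both end points of the baseline window are included.
    baseline_values = [v for v, t in zip(signal, times_ms) if start <= t <= stop]
    offset = sum(baseline_values) / len(baseline_values)
    return [v - offset for v in signal]


# Toy epoch for one channel, sampled every 50 ms (made-up values).
times_ms = [-100, -50, 0, 50, 100]
signal = [2.0, 4.0, 3.0, 10.0, 12.0]
corrected = remove_dc_offset(signal, times_ms)  # baseline mean is 3.0
```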
Select the Process2 tab at the bottom of the Brainstorm window.
Drag and drop 40 files from group 201 to the left (Files A).
Drag and drop 40 files from group 203 to the right (Files B).
Cross-validation is a model validation technique for assessing how the results of our decoding analysis will generalize to an independent data set.
Low-pass cutoff frequency: If set, it will apply a low-pass filter to all the input recordings.
Matlab SVM/LDA: Requires Matlab's Statistics and Machine Learning Toolbox.
These methods do a k-fold stratified cross-validation for you, meaning that each fold will contain the same proportions of the two types of class labels (option "Number of folds").
LibSVM: Requires the LibSVM toolbox (download and add to your path).
The LibSVM cross-validation may be faster but will not be stratified.
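To show what "stratified" means here, the following Python sketch assigns trials to folds so that each fold keeps the same proportion of the two class labels. It only illustrates the idea and is not the Matlab or LibSVM implementation; the function name and the seed are hypothetical.

```python
import random


def stratified_folds(labels, n_folds, seed=0):
    """Assign each sample index to a fold so every fold keeps the class proportions."""
    rng = random.Random(seed)
    folds = [[] for _ in range(n_folds)]
    for cls in sorted(set(labels)):
        indices = [i for i, lab in enumerate(labels) if lab == cls]
        rng.shuffle(indices)
        # Deal the shuffled indices of this class round-robin across the folds.
        for position, index in enumerate(indices):
            folds[position % n_folds].append(index)
    return folds


# 40 trials of condition 201 (faces) and 40 trials of condition 203 (scenes).
labels = ["201"] * 40 + ["203"] * 40
folds = stratified_folds(labels, n_folds=5)
```

With 5 folds, each fold contains 8 face trials and 8 scene trials; in each round one fold is held out for testing and the other four are used for training.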
This is an iterative procedure. The training and test data for the SVM/LDA classifier are selected in each iteration by randomly permuting the samples and grouping them into bins of size n (you can select the trial bin size). In each iteration, two samples (one from each condition) are left out for testing. The rest of the data are used to train the classifier.
Trial bin size: If greater than 1, the training data will be randomly grouped into bins of the size you determine here. The samples within each bin are then averaged (we refer to this as sub-averaging); the classifier is then trained using the averaged samples. For example, if you have 40 faces and 40 scenes, and you set the trial bin size to 5, then for each condition you will have 8 bins, each containing 5 samples. Seven bins from each condition will be used for training, and the two left-out bins (one face bin, one scene bin) will be used for testing the classifier performance.
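The sub-averaging example above (40 trials per condition, trial bin size 5) can be sketched in Python as follows. This is an illustration of one iteration only, not the Brainstorm process: the function name, the three-value "trials" and the seeds are hypothetical, and dropping leftover trials when the count is not a multiple of the bin size is an assumption.

```python
import random


def sub_average(trials, bin_size, seed=0):
    """Randomly group trials into bins of `bin_size` and average within each bin."""
    rng = random.Random(seed)
    order = list(range(len(trials)))
    rng.shuffle(order)
    n_bins = len(trials) // bin_size  # assumption: leftover trials are dropped
    averaged = []
    for b in range(n_bins):
        members = [trials[i] for i in order[b * bin_size:(b + 1) * bin_size]]
        n_features = len(members[0])
        averaged.append(
            [sum(m[f] for m in members) / bin_size for f in range(n_features)]
        )
    return averaged


# 40 trials per condition, each with 3 made-up channel values.
faces = [[float(i), 1.0, 0.0] for i in range(40)]
scenes = [[float(i), 0.0, 1.0] for i in range(40)]
face_bins = sub_average(faces, bin_size=5, seed=1)
scene_bins = sub_average(scenes, bin_size=5, seed=2)

# Leave one bin per condition out for testing; train on the remaining seven.
test_set = [face_bins[0], scene_bins[0]]
train_set = face_bins[1:] + scene_bins[1:]
```

Repeating this with a new random permutation in each iteration, and averaging the resulting accuracies, gives the kind of iterative procedure described above.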
This work was supported by the McGovern Institute Neurotechnology Program to PIs: Aude Oliva and Dimitrios Pantazis. http://mcgovern.mit.edu/technology/neurotechnology-program
Khaligh-Razavi SM, Bainbridge W, Pantazis D, Oliva A (2016)
From what we perceive to what we remember: Characterizing representational dynamics of visual memorability. bioRxiv, 049700.
Cichy RM, Pantazis D, Oliva A (2014)
Resolving human object recognition in space and time, Nature Neuroscience, 17:455–462.