Differences between revisions 20 and 108 (spanning 88 versions)

Tutorial 27: Workflows

[TUTORIAL UNDER DEVELOPMENT: NOT READY FOR PUBLIC USE]

Authors: Francois Tadel, Elizabeth Bock, Dimitrios Pantazis, Richard Leahy, Sylvain Baillet

This page provides general recommendations for your event-related analysis. It is not directly related to the auditory dataset, but provides guidelines you should consider for any MEG/EEG experiment.
We do not yet provide standard analysis pipelines for resting-state or steady-state recordings, but we will add a few examples soon in the section Other analysis scenarios of the tutorials page.

Contents

What is your question?
Common pre-processing pipeline
EEG recordings
MEG recordings
Constrained cortical sources
Unconstrained cortical sources [???]
Regions of interest (scouts) [???]
1. Statistics: Single subject [???]
2. Statistics: Group analysis, within subject [???]
Time-frequency maps [???]
Workflow: Current problems [TODO]

What is your question?

The most appropriate analysis pipeline for your data depends on the question you are trying to answer. Before defining the main steps of your analysis, you should be able to state clearly the question you want to answer with your recordings.

What dimension?

MEG/EEG recordings
Cortical sources
- Individual anatomy or template
- Constrained (one value per vertex) or unconstrained (three values per grid point)
- Full cortex or regions of interest
Frequency or time-frequency maps

What kind of experiment?

Single subject: Contrast two experimental conditions across trials, for one single subject.
- Files A: Single trials for condition A.
- Files B: Single trials for condition B.
Group analysis, within subject: Contrast two conditions A and B measured for each subject.
- Files A: Subject-level averages for condition A (all the subjects).
- Files B: Subject-level averages for condition B (all the subjects).
Group analysis, between subjects: Contrast two groups of subjects for one condition.
- Files A: Subject-level averages for group #1 (G1).
- Files B: Subject-level averages for group #2 (G2).

What level of precision?

Difference of averages
Statistically significant differences between conditions or groups

What statistical test?

A = B
- Tests the null hypothesis H0:(A=B) against the alternative hypothesis H1:(A≠B)
- Correct detection: Identify correctly where and when the conditions are different.
- Ambiguous sign: We cannot say which condition is stronger.
Power(A) = Power(B)
- Tests the null hypothesis H0:(Power(A)=Power(B)) against the alternative hypothesis H1:(Power(A)≠Power(B))
- Incorrect detection: Not sensitive to the cases where A and B have opposite signs.
- Meaningful sign: We can identify correctly which condition has a stronger response.
- Power(x) = |x|², where |x| represents the modulus of the values:
  - Absolute value for scalar values (recordings, constrained sources, time-frequency)
  - Norm of the three orientations for unconstrained sources.
Multiple comparisons: FDR is a good choice for correcting p-values for multiple comparisons.
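
As an illustration of the multiple-comparisons correction mentioned above, here is a minimal sketch of an FDR correction. It assumes the Benjamini-Hochberg procedure, which is the most common FDR method; the text does not say which variant is meant. The function name fdr_bh, the threshold q and the p-values are hypothetical.

```python
def fdr_bh(p_values, q=0.05):
    """Benjamini-Hochberg procedure: return one boolean per test (True = significant)."""
    m = len(p_values)
    # Sort the tests by p-value while remembering their original positions.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k such that p_(k) <= (k / m) * q.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            k_max = rank
    # Every test ranked at or below k_max is declared significant.
    significant = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= k_max:
            significant[idx] = True
    return significant


# Hypothetical p-values, e.g. one per sensor or per vertex.
p = [0.001, 0.008, 0.039, 0.041, 0.27, 0.60]
mask = fdr_bh(p, q=0.05)
```

In practice there is one p-value per sensor, vertex, time sample or frequency bin, so the list passed to such a function is much longer than in this toy case.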

Design considerations

Use within-subject designs whenever possible (i.e. collect two conditions A and B for each subject), then contrast data at the subject level before comparing data between subjects.
Such designs are not only statistically optimal, but also ameliorate the between-subject sign ambiguities as contrasts can be constructed within each subject.

Common pre-processing pipeline

Most event-related studies can start with the pipeline we've introduced in these tutorials.

Import the anatomy of the subject (or use a template for all the subjects).
Access the recordings:
- Link the continuous recordings to the Brainstorm database.
- Prepare the channel file: co-register sensors and MRI, edit type and name of channels.
- Edit the event markers: fix the delays of the triggers, mark additional events.
Pre-process the signals:
- Evaluate the quality of the recordings with a power spectral density plot (PSD).
- Apply frequency filters (low-pass, high-pass, notch).
- Identify bad channels and bad segments.
- Correct for artifacts with SSP or ICA.
Import the recordings in the database: epochs around some markers of interest.

How many trials to include?

Single subject: Include all the good trials (unless you have a very low number of trials).
See the averaging tutorial.
Group analysis: Use a similar number of trials for all the subjects (they do not need to be strictly equal), and reject the subjects for which we have far fewer good trials.

EEG recordings

Average

Average the epochs across acquisition runs: OK.
Average the epochs across subjects: OK.
Electrodes are in the same standard positions for all the subjects (e.g. 10-20).
Never use an absolute value for averaging or contrasting sensor-level data.

Statistics: Single subject

A = B: Parametric or non-parametric t-test, independent, two-tailed.

Statistics: Group analysis, within subject

A = B
- First-level statistic: For each subject, sensor average for conditions A and B.
- Second-level statistic: Parametric or non-parametric t-test, paired, two-tailed.
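
The paired second-level test above can be sketched as follows for one sensor and one time sample. This is only an illustration of the parametric paired t statistic; the function name paired_t and the subject averages are hypothetical, and the p-value would still have to be read from a Student t distribution with the returned degrees of freedom.

```python
import math
import statistics


def paired_t(a, b):
    """Paired t statistic and degrees of freedom for subject-level averages a and b."""
    # One difference per subject: this is what makes the test "paired".
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    mean_d = statistics.mean(diffs)
    sd_d = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    t = mean_d / (sd_d / math.sqrt(n))
    return t, n - 1


# Hypothetical subject averages at one sensor and one time sample.
avg_a = [2.1, 1.8, 2.5, 2.0, 2.6]  # condition A, five subjects
avg_b = [1.6, 1.5, 2.1, 1.9, 2.0]  # condition B, same five subjects
t_stat, dof = paired_t(avg_a, avg_b)
```

An independent test, as used for single subjects or between-subject designs, would instead compare the two group means without forming per-subject differences.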

Statistics: Group analysis, between subjects

A = B
- First-level statistic: For each subject, sensor average for the conditions to test.
- Second-level statistic: Parametric/non-parametric t-test, independent, two-tailed.

MEG recordings

Average

Average the epochs within each acquisition run: OK.
Average across runs: Not advised because the head of the subject may move between runs.
Average across subjects: Strongly discouraged because head shapes vary while the sensors are fixed. A given sensor does not correspond to the same brain region in different subjects.
Tolerance for data exploration: Averaging across runs and subjects can be useful for identifying time points and sensors with interesting effects but should be avoided for formal analysis.
Note for Elekta/MaxFilter users: You can align all acquisition runs to a reference run; this allows direct channel comparisons and averaging across runs. Not recommended across subjects.
Never use an absolute value for averaging or contrasting sensor-level data.

Statistics: Single subject

A = B: Parametric or non-parametric t-test, independent, two-tailed.

Statistics: Group analysis

Not recommended with MEG recordings: do your analysis in source space.

Constrained cortical sources

Average: Single subject

Sensor average: Compute one sensor-level average per acquisition run and per condition.
Sources: Estimate sources for each average (constrained, no normalization).
Source average: Average the source-level run averages to get one subject average.
Compute a weighted average to balance for different numbers of trials across runs.
Normalize the subject min-norm averages: Z-score wrt baseline (no absolute value).
Justification: The amplitude range of current densities may vary between subjects because of anatomical or experimental differences. This normalization helps bring the different subjects into the same range of values.
Low-pass filter your evoked responses (optional).
If you filter your data, do it after the noise normalization so the variance is not underestimated.
Do not rectify the cortical maps, but display them as absolute values if needed.
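
The source averaging and normalization steps above can be sketched for a single vertex as follows. This is a minimal illustration, not Brainstorm code: the function names, the number of trials and the values are hypothetical, and the Z-score is assumed to use the sample standard deviation of the baseline.

```python
import statistics


def weighted_average(run_averages, n_trials):
    """Average run-level time series, weighting each run by its number of trials."""
    total = sum(n_trials)
    n_samples = len(run_averages[0])
    return [
        sum(w * run[t] for run, w in zip(run_averages, n_trials)) / total
        for t in range(n_samples)
    ]


def zscore_baseline(signal, baseline_idx):
    """Z-score a time series with respect to its baseline samples (sign is kept)."""
    base = [signal[t] for t in baseline_idx]
    mu = statistics.mean(base)
    sigma = statistics.stdev(base)  # assumption: sample standard deviation
    return [(x - mu) / sigma for x in signal]


# Hypothetical source time series at one vertex for two runs.
run1 = [0.0, 3.0, -3.0, 6.0, 9.0]   # average of 100 trials
run2 = [3.0, 0.0, -3.0, 12.0, 3.0]  # average of 50 trials
subject_avg = weighted_average([run1, run2], [100, 50])
# The first three samples are taken as the pre-stimulus baseline.
z = zscore_baseline(subject_avg, [0, 1, 2])
```

The weights make the result identical to averaging all 150 trials together, and no absolute value is applied at any point.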

Average: Group analysis

Subject averages: Compute within-subject averages for all the subjects, as described above.
Rectify the cortical maps (apply an absolute value).
Justification: Cortical maps have ambiguous signs across subjects: reconstructed sources depend heavily on the orientation of true cortical sources. Given the folding patterns of individual cortical anatomies vary considerably, cortical maps have subject-specific amplitude and sign ambiguities. This is true even if a standard anatomy is used for reconstruction.
Project the individual source maps on a template (only when using the individual brains).
For more details, see tutorial: Group analysis: Subject coregistration.
Group average: Compute grand averages of all the subjects.
Do not use a weighted average: all the subjects should have the same weight in this average.
Smooth spatially the source maps (optional).
You can smooth after step #3 (projection) when computing non-parametric statistics with the subject averages. For a simple group average, smoothing before or after computing the average is equivalent.

Difference of averages: Within subject

Sensor average: Compute one sensor-level average per acquisition run and condition.
Sources: Estimate sources for each average (constrained, no normalization).
Source average: Average the source-level run averages to get one subject average.
Subject difference: Compute the difference between conditions for each subject #i: (Ai-Bi)
Normalize the difference: Z-score wrt baseline (no absolute value): Z(Ai-Bi)
Low-pass filter the difference (optional)
Rectify the difference (apply an absolute value): |Z(Ai-Bi)|
Project the individual difference on a template (only when using the individual brains).
Group average: Compute grand averages of all the subjects: avg(|Z(Ai-Bi)|).
Smooth spatially the source maps (optional).
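
The chain avg(|Z(Ai-Bi)|) described above can be sketched for one vertex as follows. This is an illustration under simplifying assumptions: the function names and values are hypothetical, the optional filtering and smoothing are skipped, and the two subjects are assumed to be already on the same template vertex.

```python
import statistics


def zscore_baseline(signal, n_baseline):
    """Z-score a time series with respect to its first n_baseline samples."""
    base = signal[:n_baseline]
    mu = statistics.mean(base)
    sigma = statistics.stdev(base)
    return [(x - mu) / sigma for x in signal]


def rectified_normalized_difference(a, b, n_baseline):
    """Compute |Z(A - B)| for one subject: difference, then Z-score, then rectify."""
    diff = [x - y for x, y in zip(a, b)]
    return [abs(v) for v in zscore_baseline(diff, n_baseline)]


def group_average(maps):
    """Unweighted average across subjects: every subject has the same weight."""
    n = len(maps)
    return [sum(m[t] for m in maps) / n for t in range(len(maps[0]))]


# Two hypothetical subjects whose sources have opposite signs at this vertex.
a1, b1 = [1.0, -1.0, 0.0, 5.0], [0.0, 0.0, 0.0, 1.0]
a2, b2 = [-1.0, 1.0, 0.0, -5.0], [0.0, 0.0, 0.0, -1.0]
per_subject = [
    rectified_normalized_difference(a1, b1, 3),
    rectified_normalized_difference(a2, b2, 3),
]
grand = group_average(per_subject)
```

Without the rectification, the two subjects would cancel each other in the group average, which is the between-subject sign ambiguity that this procedure works around.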

Difference of averages: Between subjects

Grand averages: Compute averages for groups #1 and #2 as in Average:Group analysis.
Difference: Compute the difference between group-level averages: avg(|G1|)-avg(|G2|)
Limitations: Because we rectify the source maps before computing the difference, we lose the ability to detect differences between values of equal magnitude and opposite signs. We cannot keep the sign either, because we are averaging across subjects. Therefore, many effects are not detected correctly.
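
A toy numerical case makes this limitation concrete. The values below are hypothetical source amplitudes at one vertex and one time sample, one per subject.

```python
def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


# Hypothetical values: same magnitudes in both groups, opposite signs.
g1 = [2.0, 2.5, 1.5]
g2 = [-2.0, -2.5, -1.5]

# Difference of the signed group averages.
signed_difference = mean(g1) - mean(g2)
# Difference of the rectified group averages: avg(|G1|) - avg(|G2|).
rectified_difference = mean([abs(v) for v in g1]) - mean([abs(v) for v in g2])
```

The signed difference is 4.0 while the rectified difference is 0.0, so the effect disappears once the maps are rectified. The signed version is shown only for comparison: across subjects the sign itself is ambiguous, so it cannot be used as is.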

Statistics: Single subject

A = B:
- Compute source maps for each trial (constrained, no normalization).
- Parametric or non-parametric two-sample t-test, two-tailed.
  Identifies correctly where and when the conditions are different (sign not meaningful).
- Directionality: Additional step to know which condition has higher values.
  Compute the difference of rectified averages: |avg(Ai)|-|avg(Bi)|
  Combine the significance level (t-test) with the direction (difference): See details.
Power(A) = 0: Parametric [???]
- Power(s(t)) = sum_trials(s(t)^2)
- Power_baseline = mean_{t<0}(sum_trials(s_base(t)^2))
- F(large,large) = Power(s(t)) / Power_baseline
- A more popular statistic is the relative change: (Power(s(t)) - Power_baseline) / Power_baseline
  No parametric statistic is available for this measure.
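
The two power measures above can be sketched as follows for one source. This is a minimal illustration of the formulas only, with hypothetical function names and trial values; it does not compute any p-value.

```python
def power_over_trials(trials):
    """Sum of squared values across trials, for each time sample."""
    n_samples = len(trials[0])
    return [sum(trial[t] ** 2 for trial in trials) for t in range(n_samples)]


def baseline_power(power, baseline_idx):
    """Mean of the power over the baseline samples (t < 0)."""
    return sum(power[t] for t in baseline_idx) / len(baseline_idx)


# Two hypothetical trials; the first two samples are the baseline.
trials = [
    [1.0, -1.0, 3.0, 2.0],
    [-1.0, 1.0, 1.0, 4.0],
]
power = power_over_trials(trials)
p_base = baseline_power(power, [0, 1])
# Ratio to baseline, as in the F statistic above.
ratio = [p / p_base for p in power]
# Relative change with respect to baseline.
relative = [(p - p_base) / p_base for p in power]
```

The relative change is simply the ratio minus one, so both measures rank the time samples in the same order.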

Statistics: Group analysis, within subject [???]

Power(A-B) = 0: Parametric
- First-level statistic: Rectified difference of normalized averages.
  Proceed as in Difference of averages: Within subjects, but stop before the group average (after step #8). You obtain one measure |Ai-Bi| per subject, test these values against zero.
- Second-level statistic: Parametric Chi2-test.
  Power = sum(|Ai-Bi|²), i=1..nSubj ~ Chi2(nSubj)
A = B: Parametric or non-parametric
- Parametric or non-parametric t-test, two-tailed.
- Not recommended because of the sign issue between subjects
|A| = |B|: Non-parametric
- First-level statistic: Rectified and normalized subject averages.
  Proceed as in Average: Group analysis to obtain two averages per subject: |Ai| and |Bi|.
- Second-level statistic: Non-parametric two-sample t-test, paired, two-tailed.
- Not recommended because it does not consider the sign difference within a subject.
Power(A) = 0: Parametric [???]
- First-level statistic: Rectified and normalized subject averages.
  Proceed as in Average: Group analysis to obtain one average per subject: |Ai|.
- Second-level statistic: Parametric Chi2-test.
  PowerA = sum(Ai²), i=1..nA ~ Chi2(nA)
- This tests if the power has increased from baseline.
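
The Chi2 statistics above can be sketched as follows. This assumes that each subject contributes one Z-score that is independent and follows a standard normal distribution under H0, which this section itself still marks as an open question. The helper chi2_sf and the values are hypothetical; the survival function uses the closed forms that exist for integer degrees of freedom.

```python
import math


def chi2_sf(x, df):
    """Survival function P(X > x) of a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{i=0}^{df/2-1} (x/2)^i / i!
        term, total = 1.0, 1.0
        for i in range(1, df // 2):
            term *= half / i
            total += term
        return math.exp(-half) * total
    # Odd df: erfc(sqrt(x/2)) + exp(-x/2) * sum_{i=1}^{(df-1)/2} (x/2)^(i-1/2) / Gamma(i+1/2)
    total = 0.0
    for i in range(1, (df - 1) // 2 + 1):
        total += half ** (i - 0.5) / math.gamma(i + 0.5)
    return math.erfc(math.sqrt(half)) + math.exp(-half) * total


# Hypothetical Z-scored values at one vertex and one time sample, one per subject.
z_values = [2.0, -1.5, 2.5, 1.0]
power = sum(z ** 2 for z in z_values)
p_value = chi2_sf(power, len(z_values))
```

Squaring the values makes their sign irrelevant, so the same statistic is obtained from the rectified values |Ai| or |Ai-Bi|.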

Statistics: Group analysis, between subjects [???]

|G1| = |G2|: Non-parametric
- First-level statistic: Rectified and normalized subject averages.
  Proceed as in Average: Group analysis to obtain one average per subject.
- Second-level statistic: Non-parametric two-sample t-test, independent, two-tailed.
Power(G1) = Power(G2): Parametric [???]
- First-level statistic: Rectified and normalized subject averages.
  Proceed as in Average: Group analysis to obtain one average per subject: |Ai|.
- Second-level statistic: Parametric F-test.
  PowerG1 = sum(Ai²), i=1..n1 ~ Chi2(n1)
  PowerG2 = sum(Aj²), j=1..n2 ~ Chi2(n2)
  F(n1,n2) = (PowerG1/n1) / (PowerG2/n2)

Unconstrained cortical sources [???]

Three values for each grid point, corresponding to the three dipole orientations (X,Y,Z).
We want only one statistic and one p-value per grid point in output.

Average: Single subject [???]

Sensor average: Compute one sensor-level average per acquisition run and per condition.
Sources: Estimate sources for each average (unconstrained, no normalization).
Source average: Average the source-level run averages to get one subject average.
Low-pass filter your evoked responses (optional).
Normalize the subject min-norm averages: Z-score wrt baseline (no absolute value).
[???] HOW TO NORMALIZE UNCONSTRAINED MAPS WRT BASELINE?

Average: Group analysis [???]

Subject averages: Compute within-subject averages for all the subjects, as described above.
Flatten the cortical map: compute the norm of the three orientations at each grid point.
Project the individual source maps on a template (only when using the individual brains).
Group average: Compute grand averages of all the subjects.
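
The flattening and group average steps above can be sketched as follows for one time sample. The function name and values are hypothetical, and both subjects are assumed to be already projected on the same template grid.

```python
import math


def flatten(xyz_map):
    """Norm of the three orientations (x, y, z) at each grid point."""
    return [math.sqrt(x * x + y * y + z * z) for x, y, z in xyz_map]


# Hypothetical unconstrained maps: two grid points, one (x, y, z) triplet each.
subject1 = [(3.0, 4.0, 0.0), (0.0, 0.0, 2.0)]
subject2 = [(-3.0, -4.0, 0.0), (1.0, 2.0, 2.0)]

flat_maps = [flatten(s) for s in (subject1, subject2)]
# Unweighted grand average across subjects, one value per grid point.
grand = [sum(vals) / len(vals) for vals in zip(*flat_maps)]
```

The norm plays the same role as the absolute value for constrained sources: the first grid point has opposite orientations in the two subjects but the same flattened value.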

Difference of averages: Within subject [???]

Subject averages: Compute within-subject averages for conditions A and B, as described above.
Subject difference: Compute the difference between conditions for each subject (A-B).
Flatten the cortical map: compute the norm of the three orientations at each grid point.
Project the individual difference on a template.
Group average: Compute grand averages of all the subjects: average_subjects(|Ai-Bi|).

Difference of averages: Between subjects [???]

Subject averages: Compute within-subject averages for conditions A and B, as described above.
Grand averages: Compute the group-level averages for groups #1 and #2 as described in "Average: Group analysis"
Difference: Compute the difference between group-level averages: avg(|G1|)-avg(|G2|)
Limitations: Because we rectify the source maps before computing the difference, we lose the ability to detect differences between values of equal magnitude and opposite signs. We cannot keep the sign either, because we are averaging across subjects. Therefore, many effects are not detected correctly.

Statistics: Single subject [???]

Sources: Compute source maps for each trial (unconstrained, no normalization)
Statistics: Compare all the trials of condition A vs all the trials of condition B.
|A| = |B|
- Non-parametric tests only, independent, test norm, two-tailed.
- Indicates which condition corresponds to a stronger brain response (for a known effect).

Statistics: Group analysis, within subject [???]

|A - B| = 0 : Non-parametric
1. Rectified differences: Proceed as described in Difference of averages: Between subjects, but stop before the computation of the grand averages (#6) and compute a test instead.
  You obtain one |A_i-B_i| value for each subject, test these values against zero.
2. Non-parametric one-sample test, one-tailed.
3. Indicates when and where there is a significant effect (but not in which direction).
|A| = |B|: Non-parametric
1. Subject averages: Compute within-subject averages for A and B, as described above.
  You obtain two averages per subject (A_i and B_i).
2. Non-parametric two-sample test, paired, test absolute values, two-tailed.
3. Indicates which condition corresponds to a stronger brain response (for a known effect).
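
One possible form of the non-parametric paired test on |A| = |B| is a sign-flip permutation test, sketched below for one grid point and one time sample. This is an assumption: the text does not specify which non-parametric test is used. The function name and values are hypothetical, and the inputs are taken to be the norms of the subject averages, already computed.

```python
import itertools


def paired_permutation_test(norm_a, norm_b):
    """Two-tailed paired permutation test using exhaustive sign flips."""
    diffs = [x - y for x, y in zip(norm_a, norm_b)]
    n = len(diffs)
    observed = sum(diffs) / n
    count = 0
    total = 0
    # Under H0 the two conditions are exchangeable within each subject,
    # which amounts to flipping the sign of each paired difference.
    for signs in itertools.product((1, -1), repeat=n):
        stat = sum(s * d for s, d in zip(signs, diffs)) / n
        # Small tolerance so that permutations equal to the observed value count.
        if abs(stat) >= abs(observed) - 1e-12:
            count += 1
        total += 1
    return observed, count / total


# Hypothetical norms of the subject averages, six subjects.
norm_a = [3.0, 2.5, 4.0, 3.5, 2.0, 3.0]
norm_b = [1.0, 1.5, 2.0, 1.5, 1.0, 2.5]
observed, p_value = paired_permutation_test(norm_a, norm_b)
```

Enumerating all sign flips is only practical for a small number of subjects; with larger groups, a random subset of permutations is normally drawn instead.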

Statistics: Group analysis, between subjects [???]

|A| = |B|
- Subject averages: Compute within-subject averages for A and B, as described above.
  You obtain two averages per subject (A_i and B_i).
- Non-parametric two-sample test, independent, test absolute values, two-tailed.
- Indicates which condition corresponds to a stronger brain response (for a known effect).

Regions of interest (scouts) [???]

Statistics: Single subject [???]

Even within-subject cortical maps have sign ambiguities. MEG has limited spatial resolution, and sources in opposing sulcal/gyral areas are reconstructed with inverted signs (constrained orientations only). Averaging activity in cortical regions of interest (scouts) would thus lead to signal cancellation. To avoid this, Brainstorm uses algorithms to manipulate the sign of individual sources before averaging within a cortical region. Unfortunately, this introduces an amplitude and sign ambiguity in the time course when summarizing scout activity.
As a result, perform any within-subject average/contrast of interest before computing an average scout time series.
Then treat the result as constrained or unconstrained source maps.

Statistics: Group analysis, within subject [???]

Comparing scout time series between subjects is tricky because there is no way to avoid the sign ambiguity across subjects, so there are no clear recommendations. Rectifying before comparing scout time series between subjects may or may not be a good idea, depending on the case. A good understanding of the data (multiple inspections across channels/sources/subjects) can offer hints as to whether rectifying the scout time series is appropriate. Using unconstrained cortical maps to create the scout time series can ameliorate ambiguity concerns.

Time-frequency maps [???]

Average: Single subject [???]

Time-frequency maps: Compute time-frequency maps for each trial.
- Apply the default measure: magnitude for Hilbert transform, power for Morlet wavelets.
- Do not normalize the source maps: no Z-score or ERS/ERD.
- The values are all strictly positive, there is no sign ambiguity as for recordings or sources.
Average all the time-frequency maps together, for each condition separately.
- If you are averaging time-frequency maps computed on sensor-level data, the same limitations apply as for averaging sensor level data (see sections about MEG and EEG recordings above).

Average: Group analysis [???]

Subject averages: Compute within-subject averages for all the subjects, as described above.
Normalize: [???] Zscore, ERD/ERS, or FieldTrip?
Justification: The amplitude range of current densities may vary between subjects because of anatomical or experimental differences. This normalization helps bring the different subjects into the same range of values.
Group average: Compute grand averages of all the subjects.
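
The normalization to apply here is still an open question in this section. As an illustration of one of the candidates listed above, here is a minimal sketch of an ERS/ERD normalization, assuming the common definition as a percent deviation from the mean baseline power. The function name and values are hypothetical.

```python
def ers_erd(power, baseline_idx):
    """Percent deviation from the mean baseline power, for one frequency bin."""
    mu = sum(power[t] for t in baseline_idx) / len(baseline_idx)
    return [(p - mu) / mu * 100.0 for p in power]


# Hypothetical power time series for one frequency bin; first two samples = baseline.
tf_row = [2.0, 2.0, 3.0, 1.0]
normalized = ers_erd(tf_row, [0, 1])
```

Positive values indicate a power increase with respect to baseline (synchronization) and negative values a decrease (desynchronization).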

Difference of averages [???]

Group average: Compute the averages for conditions A and B as in Average: Group analysis.
Difference: Compute the difference between group-level averages: avg(A)-avg(B).

Statistics: Single subject [???]

Time-frequency maps: Compute time-frequency maps for each trial.
- Apply the default measure: magnitude for Hilbert transform, power for Morlet wavelets.
- Do not normalize the source maps: no Z-score or ERS/ERD.
- The values are all strictly positive, there is no sign ambiguity as for recordings or sources.
Statistics: Compare all the trials of condition A vs all the trials of condition B.
A = B [???]
- Parametric or non-parametric t-test, independent, two-tailed. [???]
- Indicates both where there is a significant effect and what is its direction (no sign ambiguity).

Statistics: Group analysis, within subject [???]

A = B [???]
1. Subject averages: Compute within-subject averages for all subjects, as described above.
2. Parametric or non-parametric t-test, independent, two-tailed. [???]
3. Indicates both where there is a significant effect and what is its direction (no sign ambiguity).

Advanced

Workflow: Current problems [TODO]

The following inconsistencies are still present in the documentation. We are actively working on these issues and will update this tutorial as soon as we find solutions.

[Group analysis] Unconstrained sources: How to normalize wrt baseline with a Z-score?
- Zscore(A): Normalizes each orientation separately, we cannot take the norm of it after.
- Zscore(|A|): Gets rid of the signs, forbids the option of a signed test H0:(Norm(A-B)=0)
- See also the tutorial: Source estimation
- We need a way to normalize across the three orientations at the same time.
[Single subject] Unconstrained sources: How to compare two conditions with multiple trials?
- |A|-|B|: Cannot detect correctly the difference.
- |A-B|: Cannot be computed because the trials are not paired.
- We need a test for the three orientations at the same time.
[Group analysis] Unconstrained sources: Can we use parametric tests?
Time-frequency maps:
- Can we use parametric tests for (A-B=0) ? Does (A-B) ~ normal distribution?
- Do we need to normalize the time-frequency maps when testing across subjects?
- If yes, how to normalize the time-frequency maps? (Z-score, ERS/ERD, divide by std)

-  ⇤ ← Revision 20 as of 2016-02-08 19:38:07 → 
  Size: 13364
  Editor: FrancoisTadel
  Comment:
+   ← Revision 108 as of 2016-05-20 20:12:12 → ⇥
  Size: 23555
  Editor: FrancoisTadel
  Comment:
-Deletions are marked like this.
+Additions are marked like this.
 Line 6:
-This page provides some general recommendations for your group analysis. It is not directly related with the auditory dataset, but provides guidelines that have to be considered for any MEG/EEG experiment.

<<TableOfContents(2,2)>>
+This  page provides some general recommendations for your event-related  analysis. It is not directly related with the auditory dataset, but  provides guidelines you should consider for any MEG/EEG experiment. <<BR>>We  do not provide standard analysis pipelines for resting or steady state  recordings yet, but we will add a few examples soon in the section [[http://neuroimage.usc.edu/brainstorm/Tutorials#Other_analysis_scenarios|Other analysis scenarios]] of the tutorials page.

<<TableOfContents(3,2)>>
 Line 11:
-The most appropriate analysis pipeline for your data depends on the question you are trying to answer.

What is the objective you have with your data?

 * Contrast two experimental conditions across trials, for one single subject
 * Contrast two experimental conditions across multiple subjects
 * Contrast two groups of subjects for one given experimental condition

What are the dimensions you want to explore?
+The most appropriate analysis pipeline for your data depends on the question you are trying to answer. Before defining what are the main steps of your analysis, you should be able to state clearly the question you want to answer with your recordings.

==== What dimension? ====
-Line 23:
+Line 16:
- * Time-frequency dimensions

What level of precisions you want to get?

 * Averages
+  * Individual anatomy or template
  * Constrained (one value per vertex) or unconstrained (three values per grid point)
  * Full cortex or regions of interests
 * Frequency or time-frequency maps

==== What kind of experiment? ====
 * '''Single subject''': Contrast two experimental conditions across trials, for one single subject.
  * Files A: Single trials for condition A.
  * Files B: Single trials for condition B.
 * '''Group analysis, within subject''': Contrast two conditions A and B measured for each subject.
  * Files A: Subject-level averages for condition A (all the subjects).
  * Files B: Subject-level averages for condition B (all the subjects).
 * '''Group analysis, between subjects''': Contrast two groups of subjects for one condition.
  * Files A: Subject-level averages for group #1 (G1).
  * Files B: Subject-level averages for group #2 (G2).

==== What level of precision? ====
-Line 29:
+Line 34:
- * Identify statistically significant differences

'''[TODO: WHEN TO USE WHAT]'''

== Important physical limitations and implications ==
Recommendations for averaging/constrasting different types of data.

==== MEG sensor data ====
 * MEG channels are not aligned across subjects (or sessions) because the physical position of channels varies with respect to the head. <<BR>>As a result, '''do not contrast/average MEG channel data across subjects or sessions'''.
 * However, even though this is not recommended for formal analysis, it can be extremely useful for data exploration. Most of channel patterns are spatially smooth and averaging across subjects will probably highlight interesting effects, and suggest time points and sensors with experimental effects. Examples include auditory/language signals (auditory cortices align reasonably well), attention effects (parietal/occipital alpha is fairly consistent across subjects) and most other perceptional/cognitive processes.
 * Note for maxfilter users: A good practice is to align all within-subject data to a reference fif file (align all sessions to a reference session). This will allow direct channel comparisons within-subject. Aligning data across subjects is not recommended since it can introduce large data distortions (though sometimes it may work well).
 * This does not apply to EEG because it uses standard channel configurations (e.g. 10-20).

==== Cortical maps ====
 * Cortical maps have ambiguous signs across subjects: reconstructed sources depend heavily on the orientation of true cortical sources. Given the folding patterns of individual cortical anatomies vary considerably, cortical maps have subject-specific amplitude and sign ambiguities (e.g. positive vs. negative sources). This is true even if a standard anatomy is used for reconstruction.
 * As a result, to average/contrast cortical maps:
  * '''Across subjects: Rectify the cortical maps''' (absolute values)
  * '''Within subject: Do not rectify the cortical maps'''

==== Regions of interest (scouts) ====
 * Even within-subject cortical maps have sign ambiguities. MEG has limited spatial resolution and sources in opposing sulcal/gyral areas are reconstructed with inverted signs (constrained orientations only). Averaging activity in cortical regions of interest (scouts) would thus lead to signal cancelation. To avoid this brainstorm uses algorithms to manipulate the sign of individual sources before averaging within a cortical region. Unfortunately, this introduces an amplitude and sign ambiguity in the time course when summarizing scout activity.
+ * Statistically significant differences between conditions or groups

==== What statistical test? ====
 * '''A = B'''
  * Tests the null hypothesis H0:(A=B) against the alternative hypothesis H1:(A<<HTML(&#8800;)>>B)
  * Correct detection: Identify correctly '''where and when''' the conditions are different.
  * Ambiguous sign: We cannot say which condition is stronger.
 * '''Power(A) = Power(B)'''
  * Tests the  null hypothesis H0:(Power(A)=Power(B)) against the alternative hypothesis H1:(Power(A)<<HTML(&#8800;)>>Power(B))
  * Incorrect detection: Not sensitive to the cases where A and B have opposite signs.
  * Meaningful sign: We can identify correctly which condition has a '''stronger response'''.
  * Power(x) = |x|<<HTML(<SUP>2</SUP>)>>, where |x| represents the modulus of the values: <<BR>> - Absolute value for scalar values (recordings, constrained sources, time-frequency) <<BR>> - Norm of the three orientations for unconstrained sources.
 * '''Multiple comparisons''': FDR is a good choice for correcting p-values for multiple comparisons.

==== Design considerations ====
 * Use  within-subject designs whenever possible (i.e. collect two conditions A  and B for each subject), then contrast data at the subject level before  comparing data between subjects.
 * Such designs are not only  statistically optimal, but also ameliorate the between-subject sign  ambiguities as contrasts can be constructed within each subject.

== Common pre-processing pipeline ==
Most event-related studies can start with the pipeline we've introduced in these tutorials.

 1. Import the anatomy of the subject (or use a template for all the subjects).
 1. Access the recordings:
  * Link the continuous recordings to the Brainstorm database.
  * Prepare the channel file: co-register sensors and MRI, edit type and name of channels.
  * Edit the event markers: fix the delays of the triggers, mark additional events.
 1. Pre-process the signals:
  * Evaluate the quality of the recordings with a power spectral density plot (PSD).
  * Apply frequency filters (low-pass, high-pass, notch).
  * Identify bad channels and bad segments.
  * Correct for artifacts with SSP or ICA.
 1. Import the recordings in the database: epochs around some markers of interest.

==== How many trials to include? ====
 * '''Single subject''': Include all the good trials (unless you have a very low number of trials). <<BR>>See the [[http://neuroimage.usc.edu/brainstorm/Tutorials/Averaging#Number_of_trials|averaging tutorial]].
 * '''Group analysis''':  Use a similar numbers of trials for all the subjects (no need to be  strictly equal), reject the subjects for which we have much less good  trials.

== EEG recordings ==
=== Average ===
 * Average the epochs across acquisition runs: OK.
 * Average the epochs across subjects: OK.
 * Electrodes are in the same standard positions for all the subjects (e.g. 10-20).
 * Never use an absolute value for averaging or contrasting sensor-level data.

=== Statistics: Single subject ===
 * '''A ='''''' B''': Parametric or non-parametric t-test, '''independent''', two-tailed.

=== Statistics: Group analysis, within subject ===
 * '''A ='''''' B'''
  * '''First-level statistic''': For each subject, sensor average for conditions A and B.
  * '''Second-level statistic''': Parametric or non-parametric t-test, '''paired''', two-tailed.

=== Statistics: Group analysis, between subjects ===
 * '''A ='''''' B'''
  * '''First-level statistic''': For each subject, sensor average for the conditions to test.
  * '''Second-level statistic''': Parametric/non-parametric t-test, '''independent''', two-tailed.

== MEG recordings ==
=== Average ===
 * Average the epochs within each acquisition runs: OK.
 * Average across runs: Not advised because the head of the subject may move between runs.
 * Average across subjects: Strongly discouraged because the shape of the heads vary but the sensors are fixed. One sensor does not correspond to the same brain region for different subjects.
 * Tolerance for data exploration: Averaging across runs and subjects can be useful for identifying time points and sensors with interesting effects but should be avoided for formal analysis.
 * Note for Elekta/MaxFilter users: You can align all acquisition run to a reference run, this will allow direct channel comparisons and averaging across runs. Not recommended across subjects.
 * Never use an absolute value for averaging or contrasting sensor-level data.

=== Statistics: Single subject ===
 * '''A ='''''' B''': Parametric or non-parametric t-test, '''independent''', two-tailed.

=== Statistics: Group analysis ===
 * Not recommended with MEG recordings: do your analysis in source space.

== Constrained cortical sources ==
=== Average: Single subject ===
 1. '''Sensor average''': Compute one sensor-level average''' '''per acquisition run and per condition.
 1. '''Sources''': Estimate sources for each average (constrained, no normalization).
 1. '''Source average''': Average the source-level run averages to get one subject average.<<BR>>Compute a weighted average to balance for different numbers of trials across runs.
 1. '''Normalize '''the subject min-norm averages: Z-score wrt baseline (no absolute value).<<BR>>Justification: The amplitude range of current densities may vary between subjects because of anatomical or experimental differences. This normalization helps bringing the different subjects to the same range of values.

 1. '''Low-pass filter''' your evoked responses (optional). <<BR>>If you filter your data, do it after the noise normalization so the variance is not underestimated.
 1. '''Do not rectify the cortical maps''', but display them as absolute values if needed.

=== Average: Group analysis ===
 1. '''Subject averages''': Compute within-subject averages for all the subjects, as described above.
 1. '''Rectify''' the cortical maps (apply an absolute value). <<BR>>Justification: Cortical maps have ambiguous signs across subjects: reconstructed sources depend heavily on the orientation of true cortical sources. Given that the folding patterns of individual cortical anatomies vary considerably, cortical maps have subject-specific amplitude and sign ambiguities. This is true even if a standard anatomy is used for reconstruction.
 1. '''Project '''the individual source maps on a template (only when using the individual brains). <<BR>> For more details, see tutorial: [[Tutorials/CoregisterSubjects|Group analysis: Subject coregistration]].
 1. '''Group average''': Compute grand averages of all the subjects.<<BR>>Do __not__ use a weighted average: all the subjects should have the same weight in this average.

 1. '''Smooth '''spatially the source maps (optional).<<BR>>You can smooth after step #3 for computing non-parametric statistics with the subject averages. For a simple group average, it is equivalent to smooth before or after computing the average.
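
A minimal sketch of why the maps are rectified before the unweighted group average. The three subject maps are invented, and `grand_average` is a hypothetical helper, not a Brainstorm function.

```python
def grand_average(subject_maps, rectify=True):
    """Unweighted average across subjects, one value per vertex."""
    n_subjects = len(subject_maps)
    n_vertices = len(subject_maps[0])
    return [
        sum(abs(m[v]) if rectify else m[v] for m in subject_maps) / n_subjects
        for v in range(n_vertices)
    ]

# Hypothetical normalized maps (two vertices) for three subjects.
# The second subject has the same response with the opposite sign.
subjects = [[3.0, -1.0], [-3.0, 1.0], [3.0, 1.0]]

signed = grand_average(subjects, rectify=False)   # opposite signs partly cancel
rectified = grand_average(subjects)               # amplitudes are preserved
```

Every subject contributes with the same weight, as recommended above. The signed average shrinks toward zero only because of the sign flip in one subject.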

=== Difference of averages: Within subject ===
 1. '''Sensor average''': Compute one sensor-level average per acquisition run and condition.
 1. '''Sources''': Estimate sources for each average (constrained, no normalization).
 1. '''Source average''': Average the source-level session averages to get one subject average.
 1. '''Subject difference''': Compute the difference between conditions for each subject #i: (Ai-Bi)
 1. '''Normalize '''the difference: Z-score wrt baseline (no absolute value): Z(Ai-Bi)
 1. '''Low-pass filter''' the difference (optional)
 1. '''Rectify''' the difference (apply an absolute value): |Z(Ai-Bi)|
 1. '''Project '''the individual difference on a template (only when using the individual brains).
 1. '''Group average''': Compute grand averages of all the subjects: avg(|Z(Ai-Bi)|).

 1. '''Smooth '''spatially the source maps (optional).

=== Difference of averages: Between subjects ===
 1. '''Grand averages''': Compute averages for groups #1 and #2 as in ''Average: Group analysis''.

 1. '''Difference''': Compute the difference between group-level averages: avg(|G1|)-avg(|G2|)
 1. '''Limitations''': Because we rectify the source maps before computing the difference, we lose the ability to detect the differences between equal values of opposite signs. And we cannot keep the sign because we are averaging across subjects. Therefore, many effects are not detected correctly.
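
The limitation described above can be illustrated with made-up values at a single vertex. The group values and the helper name `mean_abs` are assumptions for this sketch.

```python
def mean_abs(values):
    """Average of the rectified (absolute) values."""
    return sum(abs(v) for v in values) / len(values)

# Hypothetical subject-level values at one vertex:
# same amplitudes in both groups, opposite signs.
group1 = [2.0, 2.5, 1.5]
group2 = [-2.0, -2.5, -1.5]

rectified_difference = mean_abs(group1) - mean_abs(group2)
signed_difference = sum(group1) / len(group1) - sum(group2) / len(group2)
```

Here `rectified_difference` is zero although the two groups clearly differ, while `signed_difference` is not. The signed difference is shown only for contrast: it is not usable across subjects because of the sign ambiguity.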

=== Statistics: Single subject ===
 * '''A = B''':
  * Compute source maps for each trial (constrained, no normalization).
  * Parametric or non-parametric two-sample t-test, two-tailed.<<BR>>Identifies correctly '''where and when''' the conditions are different (sign not meaningful).
  * '''Directionality''': Additional step to know which condition has higher values.<<BR>>Compute the difference of rectified averages: |avg(Ai)|-|avg(Bi)|<<BR>>Combine the significance level (t-test) with the direction (difference): [[http://neuroimage.usc.edu/brainstorm/Tutorials/Statistics#Directionality:_Difference_of_absolute_values|See details]].

 * '''Power(A) = 0''': Parametric '''[???]'''
  * Power(s(t)) = sum_trials (s(t)^2);
  * Power_baseline = mean_{t<0} (sum_trials(s_base(t)^2));
  * F(large,large) = Power(s(t)) / Power_baseline;
  * But a more popular statistic is: (Power(s(t)) - Power_baseline) / Power_baseline<<BR>>No parametric statistic is available for it.<<BR>>
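
The '''A = B''' test and its directionality step can be sketched for one vertex and one time sample as follows. The unequal-variance (Welch) form of the t statistic is an assumption made here, the trial values are invented, and the conversion of the statistic to a p-value is left out.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(a, b):
    """Two-sample t statistic, unequal-variance (Welch) form."""
    return (mean(a) - mean(b)) / sqrt(variance(a) / len(a) + variance(b) / len(b))

def direction(a, b):
    """Difference of rectified averages: |avg(A)| - |avg(B)|."""
    return abs(mean(a)) - abs(mean(b))

# Hypothetical single-trial source values
trials_a = [-5.0, -6.0, -4.0, -5.0]
trials_b = [1.0, 2.0, 0.0, 1.0]

t_value = welch_t(trials_a, trials_b)      # negative: the sign is not meaningful
stronger = direction(trials_a, trials_b)   # positive: A has the larger amplitude
```

The t statistic only says that the two conditions differ. The sign of `stronger` then indicates which condition has the higher amplitude.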

=== Statistics: Group analysis, within subject [???] ===
 * '''Power(A-B) = 0''': Parametric
  * '''First-level statistic''': Rectified difference of normalized averages. <<BR>>Proceed as in ''Difference of averages: Within subject'', but stop before the group average (after step #8). You obtain one measure '''|Ai-Bi|''' per subject; test these values against zero.
  * '''Second-level statistic''': Parametric Chi2-test. <<BR>>Power = sum(|Ai-Bi|^2^), i=1..nSubj ~ Chi2(nSubj)

 * '''A = B''': Parametric or non-parametric
  * Parametric or non-parametric t-test, two-tailed.
  * Not recommended because of the sign issue between subjects

 * '''|A| = |B|''': Non-parametric
  * '''First-level statistic''': Rectified and normalized subject averages. <<BR>>Proceed as in ''Average: Group analysis'' to obtain two averages per subject: |Ai| and |Bi|.
  * '''Second-level statistic''':  Non-parametric two-sample t-test, '''paired''', two-tailed.
  * Not recommended because it does not consider the sign difference within a subject.
 * '''Power(A) = 0''': Parametric  '''[???]'''
  * '''First-level statistic''': Rectified and normalized subject averages. <<BR>>Proceed as in ''Average: Group analysis'' to obtain one average per subject: |Ai|.
  * '''Second-level statistic''': Parametric Chi2-test. <<BR>>PowerA = sum(Ai^2^), i=1..nA ~ Chi2(nA)
  * This tests if the power has increased from baseline.
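
A sketch of the Chi2 power test listed above, under the assumption that the normalized subject values are independent and follow a standard normal distribution under the null hypothesis. The closed form used for the tail probability is only valid for an even number of subjects. The function name and the values are hypothetical.

```python
from math import exp

def chi2_sf_even(x, dof):
    """P(X > x) for a chi-square variable, closed form for even dof only."""
    if dof <= 0 or dof % 2:
        raise ValueError("this closed form needs a positive even dof")
    term, total = 1.0, 1.0
    for k in range(1, dof // 2):
        term *= (x / 2) / k       # builds (x/2)^k / k!
        total += term
    return exp(-x / 2) * total

# Hypothetical Z-scored values, one per subject, at one vertex and time sample
z_values = [2.1, -1.8, 2.5, 1.9]

power = sum(z * z for z in z_values)          # the sign disappears when squaring
p_value = chi2_sf_even(power, len(z_values))
```

Because the values are squared, rectifying them first does not change the statistic.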

=== Statistics: Group analysis, between subjects [???] ===
 * '''|G1| = |G2|''': Non-parametric
  * '''First-level statistic''': Rectified and normalized subject averages. <<BR>>Proceed as in ''Average: Group analysis'' to obtain one average per subject.
  * '''Second-level statistic''':  Non-parametric two-sample t-test, '''independent''', two-tailed.

 * '''Power(G1) = Power(G2)''': Parametric  '''[???]'''
  * '''First-level statistic''': Rectified and normalized subject averages. <<BR>>Proceed as in ''Average: Group analysis'' to obtain one average per subject: |Ai|.
  * '''Second-level statistic''': Parametric F-test. <<BR>>PowerG1 = sum(Ai^2^), i=1..n1 ~ Chi2(n1)<<BR>>PowerG2 = sum(Aj^2^), j=1..n2 ~ Chi2(n2)<<BR>>F(n1,n2) = PowerG1 / PowerG2

== Unconstrained cortical sources  [???] ==
Three values for each grid point, corresponding to the three dipole orientations (X,Y,Z). <<BR>>We want only one statistic and one p-value per grid point as output.

=== Average: Single subject [???] ===
 1. '''Sensor average''': Compute one sensor-level average per acquisition run and per condition.
 1. '''Sources''': Estimate sources for each average (unconstrained, no normalization).
 1. '''Source average''': Average the source-level run averages to get one subject average.
 1. '''Low-pass filter''' your evoked responses (optional).
 1. '''Normalize '''the subject min-norm averages: Z-score wrt baseline (no absolute value).<<BR>>'''[???]''' HOW TO NORMALIZE UNCONSTRAINED MAPS WRT BASELINE?

=== Average: Group analysis [???] ===
 1. '''Subject averages''': Compute within-subject averages for all the subjects, as described above.
 1. '''Flatten''' the cortical map: compute the norm of the three orientations at each grid point.
 1. '''Project '''the individual source maps on a template (only when using the individual brains).

 1. '''Group average''': Compute grand averages of all the subjects.
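
The flattening step above reduces the three orientations to one value per grid point. A minimal sketch, assuming the map is stored as a list of (x, y, z) triplets at one time sample; the data layout and the function name are assumptions.

```python
from math import sqrt

def flatten(unconstrained_map):
    """Euclidean norm of the three orientations at each grid point."""
    return [sqrt(x * x + y * y + z * z) for x, y, z in unconstrained_map]

# Hypothetical unconstrained map with three grid points
subject_map = [(3.0, 4.0, 0.0), (-1.0, 2.0, -2.0), (0.0, 0.0, 0.0)]

flat = flatten(subject_map)   # one non-negative value per grid point
```

The result is never negative, which is why the flattened maps behave like rectified maps in the following steps.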

=== Difference of averages: Within subject [???] ===
 1. '''Subject averages''': Compute within-subject averages for conditions A and B, as described above.
 1. '''Subject difference''': Compute the difference between conditions for each subject (A-B).
 1. '''Flatten''' the cortical map: compute the norm of the three orientations at each grid point.
 1. '''Project '''the individual difference on a template.
 1. '''Group average''': Compute grand averages of all the subjects: average_subjects(|Ai-Bi|).

=== Difference of averages: Between subjects [???] ===
 1. '''Subject averages''': Compute within-subject averages for conditions A and B, as described above.
 1. '''Grand averages''': Compute the group-level averages for groups #1 and #2 as described in "Average: Group analysis"
 1. '''Difference''': Compute the difference between group-level averages: avg(|G1|)-avg(|G2|)
 1. '''Limitations''': Because we rectify the source maps before computing the difference, we lose the ability to detect the differences between equal values of opposite signs. And we cannot keep the sign because we are averaging across subjects. Therefore, many effects are not detected correctly.

=== Statistics: Single subject [???] ===
 1. '''Sources''': Compute source maps for each trial (unconstrained, no normalization)
 1. '''Statistics''': Compare all the trials of condition A vs all the trials of condition B.
 1. '''|A| = |B|'''
  * '''Non-parametric''' tests only, '''independent''', test norm, two-tailed.
  * Indicates which condition corresponds to a stronger brain response (for a known effect).

=== Statistics: Group analysis, within subject [???] ===
 * '''|A - B| = 0 ''': Non-parametric
  1. '''Rectified differences''': Proceed as described in ''Difference of averages: Within subject'', but stop before the computation of the grand averages (#5) and compute a test instead.<<BR>>You obtain one |A<<HTML(<SUB>)>>i<<HTML(</SUB>)>>-B<<HTML(<SUB>)>>i<<HTML(</SUB>)>>| value for each subject; test these values against zero.
  1. '''Non-parametric''' one-sample test, one-tailed.
  1. Indicates when and where there is a significant effect (but not in which direction).
 * '''|A| = |B|''': Non-parametric
  1. '''Subject averages''': Compute within-subject averages for A and B, as described above.<<BR>>You obtain two averages per subject (A<<HTML(<SUB>)>>i<<HTML(</SUB>)>> and B<<HTML(<SUB>)>>i<<HTML(</SUB>)>>).
  1. '''Non-parametric''' two-sample test, '''paired''', test absolute values, two-tailed.
  1. Indicates which condition corresponds to a stronger brain response (for a known effect).
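
One common way to build a non-parametric paired test is an exact sign-flip permutation test on the within-subject differences. The sketch below applies it to the norms of A and B at one grid point. It illustrates the general technique with invented values and is not a description of the Brainstorm implementation.

```python
from itertools import product

def paired_sign_flip_test(a, b):
    """Exact two-tailed sign-flip permutation p-value for paired samples."""
    diffs = [x - y for x, y in zip(a, b)]
    observed = abs(sum(diffs))
    count = total = 0
    # Under H0 the sign of each within-subject difference is arbitrary,
    # so every combination of sign flips is equally likely.
    for signs in product((1, -1), repeat=len(diffs)):
        stat = abs(sum(s * d for s, d in zip(signs, diffs)))
        count += stat >= observed - 1e-12
        total += 1
    return count / total

# Hypothetical norms at one grid point, one pair of values per subject
norm_a = [5.0, 4.2, 6.1, 5.5, 4.8, 5.9]
norm_b = [3.0, 3.9, 4.0, 4.1, 3.5, 4.4]

p_value = paired_sign_flip_test(norm_a, norm_b)
```

The enumeration visits 2^n sign combinations, so it is only practical for a small number of subjects. A random subset of flips is the usual alternative for larger groups.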

=== Statistics: Group analysis, between subjects [???] ===
 * '''|A| = |B|'''
  * '''Subject averages''': Compute within-subject averages for A and B, as described above.<<BR>>You obtain two averages per subject (A<<HTML(<SUB>)>>i<<HTML(</SUB>)>> and B<<HTML(<SUB>)>>i<<HTML(</SUB>)>>).
  * '''Non-parametric''' two-sample test, '''independent''', test absolute values, two-tailed.
  * Indicates which condition corresponds to a stronger brain response (for a known effect).
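
For an independent (unpaired) non-parametric test, a common construction is to shuffle the group labels. The sketch below is a Monte Carlo version using the difference of means as the statistic. The number of permutations, the seed, the statistic and the values are assumptions for illustration.

```python
import random

def independent_permutation_test(a, b, n_perm=2000, seed=0):
    """Two-tailed Monte Carlo permutation p-value for two independent samples."""
    rng = random.Random(seed)

    def stat(x, y):
        return abs(sum(x) / len(x) - sum(y) / len(y))

    observed = stat(a, b)
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)   # reassign the values to the two groups at random
        if stat(pooled[:len(a)], pooled[len(a):]) >= observed - 1e-12:
            hits += 1
    # The +1 terms count the observed labelling as one of the permutations.
    return (hits + 1) / (n_perm + 1)

# Hypothetical rectified subject averages at one grid point
values_1 = [5.1, 4.8, 6.0, 5.5, 5.2]
values_2 = [3.0, 3.4, 2.9, 3.8, 3.1]

p_value = independent_permutation_test(values_1, values_2)
```

The inputs are already absolute values or norms, so the test only compares amplitudes between the two sets of files.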

== Regions of interest (scouts) [???] ==
=== Statistics: Single subject [???] ===
 * Even within-subject cortical maps have sign ambiguities. MEG has limited spatial resolution and sources in opposing sulcal/gyral areas are reconstructed with inverted signs (constrained orientations only). Averaging activity in cortical regions of interest (scouts) would thus lead to signal cancellation. To avoid this, Brainstorm uses algorithms to manipulate the sign of individual sources before averaging within a cortical region. Unfortunately, this introduces an amplitude and sign ambiguity in the time course when summarizing scout activity.
-Line 51:
+Line 239:
-==== Design considerations ====
 * Use within-subject designs whenever possible (i.e. collect two conditions A and B for each subject). Such designs are not only statistically optimal, but also ameliorate the between-subject sign ambiguities as contrasts can be constructed within each subject.
 * Contrast/average data within subject before comparing data between subjects.

== Summary of the analysis ==
==== Workflow within-subject (for single trial analysis) ====
 1. Compute source map for each trial (constrained/unconstrained, no normalization)
 1. Estimate differences between two conditions A/B for which you have multiple trials

==== Workflow within-subject (for group analysis) ====
 1. Compute sensor '''average '''per acquisition session => Session-level average for each condition
 1. Compute '''source map''' for each session average (constrained or unconstrained, no normalization)
 1. '''Average '''source maps across sessions => Subject-level average for each condition
 1. '''Low-pass filter''' < 40Hz for evoked responses (optional)
 1. '''Normalize '''the subject min-norm averages: Z-score vs. baseline
 1. '''Absolute value''' or norm for display

==== Workflow group analysis ====
 1. '''Project '''the individual source maps on a template                           (no absolute value)
 1. Constrained sources: '''Smooth '''spatially the sources                       (no absolute value)
 1. Compute grand averages or other group-level statistics                 (signed or absolute)

== Within-subject statistics ==
For a single subject, test for significant differences between two experimental conditions:

 * Compare the '''single trials''' corresponding to each condition.
 * In most cases, you '''do not need to normalize''' the data.
 * Use '''independent tests'''.
 * For help with the implications of testing the '''relative or absolute values''', see: [[Tutorials/Difference|Difference]].

==== Sensor recordings ====
 * Not advised for MEG with multiple sessions, correct for EEG.
 * '''A vs B''':
  * Never use an absolute value for testing recordings.
  * '''Parametric''' or '''non-parametric''' tests, independent, two-tailed, FDR-corrected.
  * Correct effect size, ambiguous sign.

==== Constrained source maps ====
 * One value per vertex.
 * Use the non-normalized minimum norm maps for all the trials (current density maps, no Z-score).
 * '''A vs B''':
  * Null hypothesis H0: (A=B).
  * '''Parametric''' or '''non-parametric''' tests, independent, two-tailed, FDR-corrected.
  * Correct effect size, ambiguous sign.
 * '''|A| vs |B|''':
  * Null hypothesis H0: (|A|=|B|).
  * '''Non-parametric''' tests only, independent, two-tailed, FDR-corrected.
  * Incorrect effect size, meaningful sign.
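
The tests above are described as FDR-corrected. Assuming this refers to the Benjamini-Hochberg procedure, which is the usual choice and is an assumption here rather than something stated on this page, the correction over a set of p-values (one per vertex and time sample) can be sketched as follows. The function name and the p-values are hypothetical.

```python
def fdr_bh(p_values, q=0.05):
    """Benjamini-Hochberg: return True where the null hypothesis is rejected."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    # Find the largest rank k such that p_(k) <= q * k / m
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= q * rank / m:
            cutoff_rank = rank
    rejected = [False] * m
    for i in order[:cutoff_rank]:
        rejected[i] = True
    return rejected

# Hypothetical uncorrected p-values for five vertices
p_vals = [0.001, 0.008, 0.039, 0.041, 0.6]
significant = fdr_bh(p_vals)
```

In this example the third and fourth p-values are below 0.05 but do not survive the correction.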

==== Unconstrained source maps ====
 * Three values per vertex.
 * Use the non-normalized minimum norm maps for all the trials (current density maps, no Z-score).
 * We need to test the '''norm '''of the three orientations instead of testing the orientations separately.
 * '''Norm(A) vs. Norm(B)''':
  * Null hypothesis H0: (|A|=|B|).
  * '''Non-parametric''' tests only, independent, two-tailed, FDR-corrected.
  * Incorrect effect size, meaningful sign.

==== Regions of interest (scouts) ====
 * Average/contrast cortical maps before summarizing scout activity.
-Line 114:
+Line 241:
-==== Time-frequency maps ====
 * Test the non-normalized time-frequency maps for all the trials (no Z-score or ERS/ERD).
 * The values tested are power or magnitudes, all positive, so (A=B) and (|A|=|B|) are equivalent.
 * '''|A| vs |B|''':
  * Null hypothesis H0: (|A|=|B|)
  * '''Non-parametric''' tests only, independent, two-tailed, FDR-corrected.
  * Correct effect size, meaningful sign.

== Between-subject statistics [TODO] ==
==== Subject averages ====
You need first to process the data separately for each subject:

 1. Compute the '''subject-level averages''', using the '''same number of trials''' for each subject.<<BR>> Sources: Average the non-normalized minimum norm maps (current density maps, no Z-score).

 1. Sources and time-frequency: '''Normalize '''the data to bring the different subjects to the same range of values (Z-score normalization with respect to a baseline - never apply an absolute value here).

 1. Sources computed on individual brains: '''Project '''the individual source maps on a template (see the [[Tutorials/CoregisterSubjects|coregistration tutorial]]). Not needed if the sources were estimated directly on the template anatomy. <<BR>>Note: We evaluated the alternative order (project the sources and then normalize): it doesn't seem to be making a significant difference. It's more practical then to normalize at the subject level before projecting the sources on the template, so that we have normalized maps to look at for each subject in the database.

 1. Constrained sources: '''Smooth '''spatially the sources, to make sure the brain responses are aligned. '''Problem''': This is only possible after applying an absolute value. Smoothing relative (signed) values does not make sense, as the positive and negative signals on the two sides of a sulcus would cancel out. [TODO]

==== Group statistic ====
Two group analysis scenarios are possible:

 * '''One condition''' recorded for multiple subjects, comparison between '''two groups of subjects''':
  * Files A: Averages for group of subjects #1.
  * Files B: Averages for group of subjects #2.
  * Use '''independent tests''': Exactly the same options as for the single subject (described above)

 * '''Two conditions''' recorded for multiple subjects, comparison across '''all subjects''':
  * Files A: All subjects, average for condition A.
  * Files B: All subjects, average for condition B.
  * Use '''paired tests''' (= dependent tests), special cases listed below.

==== Paired tests ====
 * '''Sensor recordings''': (not recommended in MEG)
  * '''(A-B=0)''': Parametric or non-parametric tests, two-tailed, FDR-corrected.
 * '''Constrained source maps''' (one value per vertex):
  * '''(A-B=0)''':  Parametric or non-parametric tests, two-tailed, FDR-corrected ('''sign issue?''').
  * '''(|A|-|B|=0)''': Non-parametric tests, two-tailed, FDR-corrected.
  * '''(|A-B| = 0)''': ???

 * '''Unconstrained source maps''' (three values per vertex):
  * '''(Norm(A-B)=0)''': Non-parametric tests, __'''one-tailed'''__ (non-negative statistic), FDR-corrected.
  * '''(Norm(A)-Norm(B)=0)''': Non-parametric tests, two-tailed, FDR-corrected.
 * '''Time-frequency maps''':
  * '''(|A|-|B|=0)''': Non-parametric tests, two-tailed, FDR-corrected.
 * '''Regions of interest''' (scouts):
  * Comparison of scout time series between subjects is tricky because there is no way to avoid sign ambiguity for different subjects. Thus there are no clear recommendations. Rectifying before comparing scout time series between subjects can be a good idea or not depending on different cases. Having a good understanding of the data (multiple inspections across channels/sources/subjects) can offer hints whether rectifying the scout time series is a good idea. Using unconstrained cortical maps to create the scout time series can ameliorate ambiguity concerns.

 * For help with '''relative/absolute options''', read the previous tutorial: [[Tutorials/Difference|Difference]].

==== Averages ====
 * In order to compute grand averages (across subjects), you should '''rectify''' your source maps before averaging. Averaging the absolute values of the subject-level averages will help avoid possible cancellation effects due to anatomical differences between subjects.
 * If you have two conditions A and B to contrast, first compute the difference within-subject (A-B), then average the rectified differences: average_subjects(|Ai-Bi|).
+=== Statistics: Group analysis, within subject [???] ===
 * Comparison of scout time series between subjects is tricky because there is no way to avoid sign ambiguity for different subjects. Thus there are no clear recommendations. Rectifying before comparing scout time series between subjects can be a good idea or not depending on different cases. Having a good understanding of the data (multiple inspections across channels/sources/subjects) can offer hints whether rectifying the scout time series is a good idea. Using unconstrained cortical maps to create the scout time series can ameliorate ambiguity concerns.

== Time-frequency maps [???] ==
=== Average: Single subject [???] ===
 1. '''Time-frequency maps''': Compute time-frequency maps for each trial.
  * Apply the default measure: magnitude for Hilbert transform, power for Morlet wavelets.
  * Do not normalize the source maps: no Z-score or ERS/ERD.
  * The values are all strictly positive, there is no sign ambiguity as for recordings or sources.
 1. '''Average''' all the time-frequency maps together, for each condition separately.
  * If you are averaging time-frequency maps computed on sensor-level data, the same limitations apply as for averaging sensor level data (see sections about MEG and EEG recordings above).

=== Average: Group analysis [???] ===
 1. '''Subject averages''': Compute within-subject averages for all the subjects, as described above.
 1. '''Normalize''': '''[???]''' Z-score, ERS/ERD, or FieldTrip?<<BR>>Justification: The amplitude range of current densities may vary between subjects because of anatomical or experimental differences. This normalization helps bring the different subjects to the same range of values.

 1. '''Group average''': Compute grand averages of all the subjects.

=== Difference of averages [???] ===
 1. '''Group average''': Compute the averages for conditions A and B as in ''Average: Group analysis''.
 1. '''Difference''': Compute the difference between group-level averages: avg(A)-avg(B).

=== Statistics: Single subject [???] ===
 1. '''Time-frequency maps''': Compute time-frequency maps for each trial.
  * Apply the default measure: magnitude for Hilbert transform, power for Morlet wavelets.
  * Do not normalize the source maps: no Z-score or ERS/ERD.
  * The values are all strictly positive, there is no sign ambiguity as for recordings or sources.

 1. '''Statistics''': Compare all the trials of condition A vs all the trials of condition B.

 1. '''A = B''' '''[???]'''
  * '''Parametric''' or '''non-parametric''' t-test, '''independent''', two-tailed. '''[???]'''
  * Indicates both where there is a significant effect and what is its direction (no sign ambiguity).

=== Statistics: Group analysis, within subject [???] ===
 * '''A = B''' '''[???]'''
  1. '''Subject averages''': Compute within-subject averages for all subjects, as described above.
  1. '''Parametric''' or '''non-parametric''' t-test, '''independent''', two-tailed. '''[???]'''
  1. Indicates both where there is a significant effect and what is its direction (no sign ambiguity).
-Line 172:
+Line 284:
-The  following inconsistencies are still present in the documentation. We  are actively working on these issues and will update this tutorial as  soon as we found solutions.

 * [Group analysis] Unconstrained sources: How to compute a Z-score?
  * Zscore(A): Normalizes each orientation separately, which doesn't make much sense.
  * Zscore(Norm(A)): Gets rid of the signs, forbids the option of a signed test H0:(Norm(A-B)=0)
+The following inconsistencies are still present in the documentation. We are actively working on these issues and will update this tutorial as soon as we find solutions.

 * [Group analysis] Unconstrained sources: How to normalize wrt baseline with a Z-score?
  * Zscore(A): Normalizes each orientation separately, we cannot take the norm of it after.
  * Zscore(|A|): Gets rid of the signs, forbids the option of a signed test H0:(Norm(A-B)=0)
-Line 178:
+Line 290:
-  * We would need a way to normalize across the three orientations are the same time.
 * [Group analysis] Constrained sources: How do we smooth?
  * Group analysis benefits a lot from smoothing the source maps before computing statistics.
  * However, this requires applying an absolute value first. How should we proceed?
+  * We need a way to normalize across the three orientations at the same time.
-Line 183:
+Line 292:
-  * Norm(A)-Norm(B): Cannot detect correctly the differences
  * (A-B): We test individually each orientation, which doesn't make much sense.
  * We would need a test for the three orientations at once.
 * [Group analysis] Rectify source maps?
  * Recommended in Dimitrios' guidelines, which is inconsistent with the rest of the page.
+  * |A|-|B|: Cannot detect correctly the difference.
  * |A-B|: Cannot be computed because the trials are not paired.
  * We need a test for the three orientations at the same time.
 * [Group analysis] Unconstrained sources: Can we use parametric tests?
 * Time-frequency maps:
  * Can we use parametric tests for (A-B=0) ? Does (A-B) ~ normal distribution?
  * Do we need to normalize the time-frequency maps when testing across subjects?
  * If yes, how to normalize the time-frequency maps? (Z-score, ERS/ERD, divide by std)
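
To make the open question above concrete, here is a sketch of the three candidate normalizations applied to one time-frequency power time series. The formulas are the common textbook definitions (Z-score, percent change from baseline for ERS/ERD, division by the baseline standard deviation) and are assumptions here, as are the function name and the example values.

```python
from statistics import mean, stdev

def normalize(power, n_baseline, method):
    """Normalize a power time series with respect to its baseline samples."""
    base = power[:n_baseline]
    mu, sigma = mean(base), stdev(base)
    if method == "zscore":
        return [(x - mu) / sigma for x in power]
    if method == "ersd":          # percent change from the baseline mean
        return [100.0 * (x - mu) / mu for x in power]
    if method == "divide_std":
        return [x / sigma for x in power]
    raise ValueError("unknown method: " + method)

# Hypothetical power values at one frequency; the first four samples are baseline
power = [2.0, 4.0, 3.0, 3.0, 6.0]

z = normalize(power, 4, "zscore")
ersd = normalize(power, 4, "ersd")
scaled = normalize(power, 4, "divide_std")
```

The Z-score and the ERS/ERD values can be negative, whereas dividing by the standard deviation keeps the values positive.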
