= Tutorial 25: Statistics =
'''[TUTORIAL UNDER DEVELOPMENT: NOT READY FOR PUBLIC USE] '''

''Authors: Francois Tadel, Dimitrios Pantazis, Elizabeth Bock, Sylvain Baillet''

Until now we have been computing measures of brain activity in the time or time-frequency domain. We were able to see clear effects or slight tendencies, but these observations always depended on an arbitrary amplitude threshold or on the configuration of the colormap. With appropriate statistical tests, we can go beyond these empirical observations and assess in a more formal way which effects are significant.

We are typically interested in comparing different groups of samples. We want to know what is significantly different in the brain responses for two experimental conditions or two groups of subjects. So we will essentially be estimating differences and testing whether these differences are significantly different from zero.

== Difference deviant-standard ==
In this auditory oddball experiment, we can test for significant differences between the brain responses to the deviant beeps and to the standard beeps, time sample by time sample. Before running complicated statistical tests that may take weeks of computation, you can start by checking what the difference of the average responses looks like. If in this difference you observe obvious effects that are clearly not what you are expecting, it's not worth moving forward with finer analysis: either the data is not clean enough or your initial hypothesis is wrong.

We are going to use the Process2 tab, at the bottom of the Brainstorm figure. It works exactly like the Process1 tab but with two lists of input files, referred to as FilesA (left) and FilesB (right).

 * In Process2, drag and drop the '''non-normalized deviant average''' on the left (FilesA) and the non-normalized '''standard average''' on the right (FilesB).
 * Run the process "'''Difference > Difference A-B'''".<
>Select the option "'''Use absolute values'''", which will convert the unconstrained source maps (three dipoles at each cortex location) into a flat cortical map by taking the norm of the three dipole orientations before computing the difference.<
><
> {{attachment:diff_process.gif||height="311",width="478"}}
 * Rename the new file to "Deviant-Standard", double-click on it and explore it in time. <
><
> {{attachment:diff_contact.gif||height="307",width="638"}}
 * We are looking at the difference (Deviant-Standard), so positive/red regions indicate higher activity levels for the deviant beeps, and negative/blue regions indicate higher activity for the standard beeps.
 * '''Before 50ms''': The motor activity in the deviant is probably due to the previous stimuli.
 * '''P50''': Maybe a stronger response in the primary auditory cortex for the standard condition.
 * '''MMN''' (125ms): Stronger response for the deviant (left temporal, inferior frontal).
 * '''150ms''': Stronger response in the auditory system for the standard condition.
 * '''175ms''': Stronger response in the motor regions for the standard condition (motor inhibition).
 * '''After 200ms''': Stronger response in the deviant condition.

== Difference of means ==
Another process can compute the average and the difference at the same time. We are going to compute the difference of all the trials from both runs at the sensor level. This is usually not recommended because the subject might have moved between the runs. Averaging the recordings across runs is not accurate but can give a good first approximation, in order to make sure we are on the right track.

 * In Process2, select all the '''deviant trials''' (Files A) and all the '''standard trials''' (Files B).
 * Run the process "'''Test > Difference of means'''", select the option "'''Arithmetic average'''". <
><
> {{attachment:diff_mean_process.gif||height="341",width="419"}}
 * Rename the new file to "Deviant-Standard", double-click on it to display it. The difference deviant-standard no longer shows the early responses (P50, P100) but emphasizes the difference in the later processes (MMN/P200 and P300). <
><
> {{attachment:diff_mean_ts.gif||height="164",width="608"}}

== Parametric vs. non-parametric statistics [TODO] ==
Using a t-test instead of the difference of the two averages, we can reproduce similar results but with a significance level attached to each value.

Assumptions / advantages for each approach

== Parametric Student's t-test [TODO] ==
 * In the Process2 tab, select the following files:
  * Files A: All the deviant trials, with the '''[Process sources]''' button selected.
  * Files B: All the standard trials, with the '''[Process sources]''' button selected.
 * Run the process "'''Test > Student's t-test'''", Equal variance, '''Absolute value of average'''.<
>This option will convert the unconstrained source maps (three dipoles at each cortex location) into a flat cortical map by taking the norm of the three dipole orientations before computing the difference. <
><
> {{attachment:ttest_process.gif||height="516",width="477"}}
 * Double-click on the new file to display it. With the new tab "Stat" you can control the p-value threshold and the correction you want to apply for multiple comparisons. <
><
> {{attachment:ttest_file.gif||height="243",width="492"}}
 * Set the options in the Stat tab: p-value threshold: '''0.05''', Multiple comparisons: '''Uncorrected'''. <
>What we see in this figure are the t-values corresponding to p-values under the threshold. We can make observations similar to those made with the difference of means, but without the arbitrary amplitude threshold (this slider is now disabled in the Surface tab). If at a given time point a vertex is red in this view, the mean of the deviant condition is significantly higher than the mean of the standard condition (p<0.05).<
><
> {{attachment:ttest_contact.gif||height="303",width="512"}}
 * This approach considers each time sample and each surface vertex separately. This means that we have done Nvertices*Ntime = 15002*361 = 5415722 t-tests. The threshold at p<0.05 controls correctly for false positives at one point but not for the entire cortex. We need to correct the p-values for '''multiple comparisons'''. The logic of the two types of corrections available in the Stat tab (FDR and Bonferroni) is explained in [[http://scan.oxfordjournals.org/content/4/4/417.full|Bennett et al (2009)]].
 * Select the correction for multiple comparisons "'''False discovery rate (FDR)'''". You will see that far fewer elements survive this new threshold. In the Matlab command window, you can see the average corrected p-threshold, which replaces the original p-threshold (0.05) for each vertex:<
>BST> Average corrected p-threshold: 0.000315138 (FDR, Ntests=5415722)
 * From the Scout tab, you can also plot the scouts time series and get in this way a summary of what is happening in your regions of interest. Positive peaks indicate the latencies when '''at least one vertex''' of the scout has a value that is significantly higher in the deviant condition. The values that are shown are the averaged t-values in the scout. <
><
> {{attachment:ttest_scouts.gif||height="169",width="620"}}
 * '''[TODO]''': Fix the test that is applied; this parametric t-test is probably not suited to the norm of the three orientations in unconstrained models.

== FieldTrip: Non-parametric cluster-based statistic [TODO] ==
We can call some of the FieldTrip functions from the Brainstorm environment. For this, you first need to [[http://www.fieldtriptoolbox.org/download|install the FieldTrip toolbox]] on your computer and [[http://www.fieldtriptoolbox.org/faq/should_i_add_fieldtrip_with_all_subdirectories_to_my_matlab_path|add it to your Matlab path]].

For a complete description of non-parametric cluster-based statistics in FieldTrip, read the following article: [[http://www.sciencedirect.com/science/article/pii/S0165027007001707|Maris & Oostenveld (2007)]]. For an introduction to the method, watch this video: [[https://www.youtube.com/watch?v=vOSfabsDUNg|Non-parametric cluster-based statistical testing of MEG/EEG data]].

'''[TODO] : Links to the options description on the FieldTrip website'''.

Permutation-based non-parametric statistics are more flexible and do not require any assumption about the distribution of the data, but on the other hand they are a lot more complicated to process. Calling FieldTrip's function ft_sourcestatistics requires a lot more memory because all the data has to be loaded at once, and a lot more computation time because the same test is repeated many times.

Running this function in the same way as the parametric t-test previously (full cortex, all the trials and all the time points) would require 45000*461*361*8/1024^3 = '''about 56 GB of memory''' just to load the data. This is impossible on most computers, so we have to give up at least one dimension and run the test only for one time sample or one region of interest.

== FieldTrip: Process options [TODO] ==
Screen captures for the two processes:

Description of the process options:

The options available here match the options passed to the function ft_sourcestatistics.
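As a rough check of the memory estimate given earlier for running ft_sourcestatistics on the full cortex, the sketch below evaluates the same product in Python and compares it with a test restricted to a single time sample. The interpretation of each factor (about 45000 source signals for 15002 vertices with three orientations, 461 trials, 361 time samples, 8 bytes per double-precision value) and the helper name `memory_gib` are assumptions made for illustration; they are not part of Brainstorm or FieldTrip.

```python
# Factors copied from the estimate in the text; what each one stands for is an assumption.
N_SOURCE_SIGNALS = 45000  # assumed: ~15002 vertices x 3 dipole orientations
N_TRIALS = 461            # assumed: deviant + standard trials
N_TIME = 361              # time samples per trial
BYTES_PER_VALUE = 8       # assumed: double-precision storage


def memory_gib(n_signals, n_trials, n_time, bytes_per_value=BYTES_PER_VALUE):
    """Return the size in GiB of a dense n_signals x n_trials x n_time array."""
    return n_signals * n_trials * n_time * bytes_per_value / 1024 ** 3


# Full cortex, all trials, all time points.
full = memory_gib(N_SOURCE_SIGNALS, N_TRIALS, N_TIME)

# Same data restricted to a single time sample.
one_sample = memory_gib(N_SOURCE_SIGNALS, N_TRIALS, 1)

print(f"Full data:       {full:.1f} GiB")
print(f"One time sample: {one_sample:.2f} GiB")
```

Evaluated as written, the full product comes to roughly 55.8 GiB, while a single time sample needs well under 1 GiB. This is why dropping one dimension (time or space) makes the permutation test tractable on an ordinary computer.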
== FieldTrip: Example 1 [TODO] ==
We will run this FieldTrip function first on the scouts time series and then on a short time window.

 * Keep the same selection in Process2: all the deviant trials in FilesA, all the standard trials in FilesB.
 * Run process: '''Test > FieldTrip: ft_sourcestatistics''', select the options as illustrated below.<
><
> {{attachment:ft_process_scouts.gif||height="680",width="348"}}
 * Double-click on the new file to display it: <
><
>

== FieldTrip: Example 2 [TODO] ==
short time window

== Export to SPM ==
An alternative to running the statistical tests in Brainstorm is to export all the data and compute the tests with an external program (R, Matlab, SPM, etc). Multiple menus exist to export files to external file formats (right-click on a file > File > Export to file).

Two tutorials explain how to export data specifically to SPM:

 * [[http://neuroimage.usc.edu/brainstorm/ExportSpm8|Export source maps to SPM8]] (volume)
 * [[http://neuroimage.usc.edu/brainstorm/ExportSpm12|Export source maps to SPM12]] (surface)

== On the hard drive [TODO] ==
Right-click on one of the first TF files we computed > File > '''View file contents'''.

== References ==
 * Maris E, Oostenveld R, [[http://www.sciencedirect.com/science/article/pii/S0165027007001707|Nonparametric statistical testing of EEG- and MEG-data]] <
>J Neurosci Methods (2007), 164(1):177-90.
 * Pantazis D, Nichols TE, Baillet S, Leahy RM. [[http://www.sciencedirect.com/science/article/pii/S1053811904005671|A comparison of random field theory and permutation methods for the statistical analysis of MEG data]], Neuroimage (2005), 25(2):383-94.
 * Bennett CM, Wolford GL, Miller MB, [[http://scan.oxfordjournals.org/content/4/4/417.full|The principled control of false positives in neuroimaging]] <
> Soc Cogn Affect Neurosci (2009), 4(4):417-422.
 * FieldTrip video: Non-parametric cluster-based statistical testing of MEG/EEG data: <
> https://www.youtube.com/watch?v=vOSfabsDUNg

== Additional discussions on the forum ==
 * Forum: Multiple comparisons: http://neuroimage.usc.edu/forums/showthread.php?1297
 * Forum: Cluster neighborhoods: [[http://neuroimage.usc.edu/forums/showthread.php?2132-Fieldtrip-statistics|http://neuroimage.usc.edu/forums/showthread.php?2132]]

== Delete all your experiments ==
Before moving to the next tutorial, '''delete''' all the statistic results you computed in this tutorial. It will make the database structure less confusing for the following tutorials.