Statistical differences between montages

Hi,

I am working on cortical auditory evoked potentials in newborns with Brainstorm.
My EEGs were recorded with a 128-electrode cap.
We run paired t-tests for two montages: we use "Apply montage" and then compare the t-test results for the 32-electrode and the 128-electrode montages.
But we do not find the same results when we compare them.
Do you know why?
The pre-processing only contains notch filtering (50 Hz), no other filter.

Thanks a lot

Margaux

Hi Margaux,

I'm not sure I understand what your question is.
If you obtain different results before and after applying a re-reference montage to the data, it makes sense to obtain a different topography of significant electrodes.
If you obtain different results depending only on the number of electrodes, this has to be related to the correction for multiple comparisons. Uncorrected results should be the same (all t-values are computed independently for each time point and each electrode), but if you have more electrodes, you run more tests simultaneously (= multiple comparisons) and the level of correction (FDR or Bonferroni) will be stricter.
https://neuroimage.usc.edu/brainstorm/Tutorials/Statistics#Correction_for_multiple_comparisons
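
To illustrate the point above, here is a minimal Python sketch with simulated data (not Brainstorm code; the channel counts and effect size are made up): each paired t-test is computed independently per electrode, so the uncorrected p-values of the common channels are identical, while the Bonferroni threshold alpha/N becomes stricter as the number of electrodes N grows.

```python
# Sketch with simulated data: uncorrected paired t-tests do not depend
# on how many electrodes are tested, only the corrected threshold does.
import numpy as np
from scipy.stats import ttest_rel

rng = np.random.default_rng(0)
n_trials, n_chan = 20, 128
cond_a = rng.standard_normal((n_trials, n_chan))
cond_b = rng.standard_normal((n_trials, n_chan)) + 0.3  # small simulated effect

# Paired t-test on all 128 channels, then on a 32-channel subset
_, p_128 = ttest_rel(cond_a, cond_b, axis=0)
subset = np.arange(32)
_, p_32 = ttest_rel(cond_a[:, subset], cond_b[:, subset], axis=0)

# Uncorrected p-values are identical for the common channels...
assert np.allclose(p_128[subset], p_32)

# ...but the Bonferroni-corrected threshold is 4x stricter with 128 tests
alpha = 0.05
print(f"Bonferroni threshold, 128 channels: {alpha / 128:.6f}")
print(f"Bonferroni threshold,  32 channels: {alpha / 32:.6f}")
```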

Francois

Hi François,

Let me help Margaux clarify. She has 128-channel EEG recordings that she re-referenced to the average of the two mastoids. She computed statistics to compare different conditions on the 128-channel data. We thought it would be easier to start by visualizing the results in a subset of the channels, so she created a 32-channel montage by selecting the electrodes of interest, using the same mastoid reference as before. She re-computed the statistics with the same procedure, except she used the process “apply montage” before the t-test. As a sanity check, she compared the results in the 32-channel montage to the corresponding channels in the original 128-channel results, of course without applying any correction for multiple comparisons. We expected, as you said, that uncorrected results should be the same, but they are not. Any clue why?

Dorothée

Because one has a re-referencing montage applied and not the other.
You're not testing the same values, why would you expect to get the same results?

Sorry if I am missing something, but why would we be testing different values?

In fact we are trying to test the same values. We did check and the signals obtained using the re-referenced 32-channel montage are exactly the same as the corresponding signals in the re-referenced 128-channel montage, as we had expected. The differences appear only after the stat process.

In fact, there is a re-referencing montage in both cases. The only difference is that in the first montage we kept all 128 channels, whereas in the second one we kept only 32 of the 128. Sorry if we were unclear about this.

I'm sorry, this is confusing, you wrote previously:

We thought it would be easier to start by visualizing the results in a subset of the channels, so she created a 32-channel montage by selecting the electrodes of interest, using the same mastoid reference as before. She re-computed the statistics with the same procedure, except she used the process “apply montage” before the t-test.

This implies that the first test was performed using the original recordings, and the second one on re-referenced data.

If indeed you tested exactly the same dataset processed exactly in the same way, the channels common to both cases should give you exactly the same uncorrected p-values, as each test is done independently. If they appear different, make sure this is not because of manipulation errors in one of the two, or simply because of differences in colormaps or other display-related configuration options.

If you can replicate this same observation, then there may be a bug somewhere. In that case, I would need an example dataset so I can reproduce the erroneous behavior on my end and fix the code accordingly. Please prepare one subject with only the test datasets, in four folders: condition A with 32 channels, condition B with 32 channels, condition A with 128 channels, condition B with 128 channels. Keep the channel files and the minimum number of trials necessary to reproduce the error (two should be enough - just set the p-value threshold to 1 in order to see something).
Then right-click on the subject > File > Export subject, upload the zip file somewhere and post the download link here.

Thanks

PS: calling this process "Apply montage" should not be necessary: if you have set your montage as a linear operator (https://neuroimage.usc.edu/brainstorm/Tutorials/Epilepsy#Average_reference) and have imported the epochs to the database, your montage is already applied.
In most cases in EEG, you should not use the process "Apply montage". So maybe there is some confusion in your pre-processing pipeline.

There are some use cases for this process when converting common-reference intracranial recordings to bipolar montages (https://neuroimage.usc.edu/brainstorm/Tutorials/Epileptogenicity#Bipolar_montage), but this is most likely not your case here.

I apologize, my initial post was indeed confusing. Both tests were performed on re-referenced data.

We will make sure again we’re able to replicate this issue and will get back to you with a sparse dataset.

Dorothée

Is there a way we can create a customized re-referencing linear operator, other than average reference? We want to use mastoid references so as to be able to compare our results in the sensor space with other previously reported results. As you said yourself, we expect different stat results depending on the reference used.

Also, one related question: if we use the montage feature to display the stat results only on a subset of channels, will the threshold correction for multiple comparisons be updated based on the number of displayed channels? I expected not, and this was a further motivation to create and “apply” a montage with only 32 channels.

Dorothée

Is there a way we can create a customized re-referencing linear operator, other than average reference? We want to use mastoid references so as to be able to compare our results in the sensor space with other previously reported results.

Yes, select the process "Re-reference EEG" and read the instructions in the process options. Is there anything that is not clear?
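
For intuition, a linked-mastoid re-reference is just a linear operator applied to the data. The sketch below uses simulated data and hypothetical mastoid channel indices (it is not Brainstorm code): each channel gets the average of the two mastoid channels subtracted.

```python
# Sketch of a linked-mastoid re-reference expressed as a linear operator.
# Mastoid indices below are made up for illustration.
import numpy as np

n_chan, n_time = 128, 1000
data = np.random.default_rng(1).standard_normal((n_chan, n_time))
mastoids = [64, 95]  # hypothetical indices for M1/M2

# Build W such that the re-referenced data is W @ data:
# identity, minus 0.5 in each of the two mastoid columns
W = np.eye(n_chan)
W[:, mastoids] -= 0.5
reref = W @ data

# Equivalent direct form: subtract the mean of the two mastoid signals
assert np.allclose(reref, data - data[mastoids].mean(axis=0))
```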

Also, one related question: if we use the montage feature to display the stat results only on a subset of channels, will the threshold correction for multiple comparisons be updated based on the number of displayed channels?

No, indeed: the subselection of sensors is applied on top of the FDR correction, which is computed using all the electrodes. Note that adjusting the number of input channels in order to manually tune your significance levels is not a good statistical practice.

I expected not, and this was a further motivation to create and “apply” a montage with only 32 channels.

I see. I never tried using this process only to remove channels; maybe there is a hidden bug somewhere in there.
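
The point about FDR above can be illustrated with a small Python sketch (a textbook Benjamini-Hochberg recipe with made-up p-values, not Brainstorm's actual implementation): correcting over all channels and then subselecting for display gives different adjusted values than correcting over the subset alone, which is why the display montage does not recompute the correction.

```python
# Sketch: Benjamini-Hochberg adjusted p-values (textbook recipe).
# The example p-values are made up for illustration.
import numpy as np

def bh_adjust(p):
    """Return Benjamini-Hochberg adjusted p-values."""
    p = np.asarray(p, dtype=float)
    n = p.size
    order = np.argsort(p)
    ranked = p[order] * n / np.arange(1, n + 1)
    # enforce monotonicity, working back from the largest p-value
    adj = np.minimum.accumulate(ranked[::-1])[::-1]
    out = np.empty(n)
    out[order] = np.clip(adj, 0.0, 1.0)
    return out

p_all = np.array([0.001, 0.01, 0.02, 0.04, 0.2, 0.5, 0.8, 0.9])
subset = slice(0, 4)

# Correcting over all channels, then subselecting for display...
on_all = bh_adjust(p_all)[subset]
# ...differs from correcting over the displayed subset alone
on_subset = bh_adjust(p_all[subset])
assert not np.allclose(on_all, on_subset)
```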

Got it, thank you Francois!

Hello Francois,

Here is a subset of our data. Could you try to find the cause of the difference we see in our statistics between 32 and 128 electrodes?

https://www.dropbox.com/sh/en3c3uze4kcmax1/AAC6uOKLvb2IlbprjZ962n0la?dl=0

Thanks a lot

Margaux

First: start by updating Brainstorm, as your database structure is outdated
(but it was a good occasion to find a few more bugs when importing old protocols :slight_smile:)
Then reload your entire database, because there are some weird issues in there (missing references to @intra folders...)

Then, you have one obvious difference between your 128-channel and 32-channel files: the list of bad channels. You marked bad channels in the 32-channel files, but not in the 128-channel files.

Also, you should make a better effort to standardize the two versions if you want to be able to track the differences: start by setting the same channel names in both. The way it is now, it is very difficult to navigate.
