Classification (Decoding) Error

Sean_Trott · February 28, 2017, 9:53pm

Hello,

Apologies if this has been posted before, but I'm having some trouble using the decoding / classification process. I followed the tutorial from this page (http://neuroimage.usc.edu/brainstorm/Tutorials/Decoding) successfully with no issues, but I've run into an error when trying to use this on my own dataset.

A few other possibly relevant notes:

The data seems fine, but it did undergo several conversion processes from another file format to get it Brainstorm ready (it's now in the ".set" file format, which is also used in EEGLAB); I don't currently have names for the channels, nor is the data calibrated/normalized, not sure if that matters.
I haven't run any sort of artifact detection/cleaning yet - this was purely a preliminary move to see what the classifier output.

I'm getting the error when I run either cross-validation or permutation. I have 36 samples from each condition. Any idea what this is from?

Let me know if you need more information!
Thanks,
Sean Trott

Francois · March 3, 2017, 9:44am

Hi Sean,

I have forwarded your message to the authors of this method; hopefully they will reply directly on the forum.

What was the original file format? Do you think this is something that could be added to Brainstorm?
This is good, the classifiers should be able to discriminate the conditions without pre-processing.

Francois

Sean_Trott · March 7, 2017, 3:43pm

Thank you, that would be really helpful! I’ve actually made some progress – by not applying a frequency filter during the classifier, I avoid the error.

Re: the questions:

The original file format was specific to one of the labs here at UC San Diego, and thus is probably not worth adding to Brainstorm. The lab has some methods in MATLAB for converting the files to “.set” files.
Yep, the classifier currently isn’t performing that well, but it’s working at least (between 70-80% peak accuracy for discriminating between two conditions). But I’m hoping that once I clean the data more, it will help the classifier.

I did actually have a couple of other questions, which would also be great to ask the paper authors (I’m thinking of the Cichy et al, 2014 paper).

Currently, I’ve just been comparing individual events from two conditions, e.g. Condition A vs. Condition B. But there are many nested factors (e.g. Condition A1 vs. Condition A2). In the Cichy (2014) paper, they make a 92x92 matrix of SVM classifier accuracy at each time point. This allows them to compare different levels of category discrimination, e.g. “faces vs. bodies” and “animate vs. inanimate”.
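As a rough illustration of how such a pairwise matrix can be read at different category levels, the sketch below averages the pairwise accuracies within and between categories at a single time point. The 4x4 matrix, its values, and the category labels are all made up for the example, and this is one common way to summarize such a matrix, not necessarily the exact procedure used in the paper.

```matlab
% Hypothetical 4x4 pairwise decoding accuracy matrix (%) at one time point.
% Entry (i,j) is the accuracy for stimulus i vs. stimulus j; the diagonal is unused.
acc = [NaN  60  80  82;
        60 NaN  78  84;
        80  78 NaN  58;
        82  84  58 NaN];

% Hypothetical category label per stimulus (e.g. 1 = face, 2 = body).
label = [1 1 2 2];

% True where both stimuli of a pair belong to the same category.
sameCat = bsxfun(@eq, label(:), label(:)');

% Use the upper triangle only, so that each pair is counted once.
isUpper = triu(true(size(acc)), 1);

withinAcc  = mean(acc(isUpper &  sameCat));   % e.g. face vs. face, body vs. body
betweenAcc = mean(acc(isUpper & ~sameCat));   % e.g. face vs. body

fprintf('Within-category: %.1f%%, between-category: %.1f%%\n', withinAcc, betweenAcc);
```

Repeating this at every time point would give one within-category and one between-category time series.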

Do you know whether Brainstorm has any sorts of contingencies for this built-in? Alternatively, is there a way to use the existing libraries in Brainstorm for a MATLAB script that would do this?

Thank you!

Francois · March 8, 2017, 12:14am

Hello,

The structure of the Brainstorm database currently doesn’t allow the classification of files in multiple categories.
To run the analysis at different levels of classification, you need to write your own scripts; the interface cannot help you much with that.
You can save the epochs corresponding to all the conditions in the same folder, with the comment of the files indicating which image it is. Your script can then select the files based on their names (using the selection processes in the File category), and possibly group them by more general categories. It’s not too complicated, but you have to be familiar with Matlab, or at least with programming in general.
If you need more help with the methodology, try to contact the authors of this article directly (R Cichy & D Pantazis).
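The grouping step described above can be sketched in plain Matlab, without any Brainstorm calls. The file comments and the `name_NN` naming scheme below are hypothetical; in a real script the comments would come from the files selected in the database.

```matlab
% Hypothetical file comments, one per epoch, naming the stimulus that was shown.
comments = {'face_01', 'face_02', 'body_01', 'body_02', 'face_01', 'body_01'};

% Fine level: one class per distinct stimulus name.
[stimNames, ~, stimIdx] = unique(comments);

% Coarse level: strip the trailing '_NN' to obtain the general category.
categories = regexprep(comments, '_\d+$', '');
[catNames, ~, catIdx] = unique(categories);

% Enumerate every pair of stimuli to classify against each other.
pairs = nchoosek(1:numel(stimNames), 2);
for k = 1:size(pairs, 1)
    iA = find(stimIdx == pairs(k, 1));   % epochs of the first stimulus
    iB = find(stimIdx == pairs(k, 2));   % epochs of the second stimulus
    fprintf('%s vs. %s: %d vs. %d epochs\n', ...
        stimNames{pairs(k, 1)}, stimNames{pairs(k, 2)}, numel(iA), numel(iB));
end
```

The index lists `iA` and `iB` (or the coarser `catIdx`) would then decide which files are passed to the classifier as the two conditions.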

You can find some help with the file selection and the scripting in these tutorials and scripts:
http://neuroimage.usc.edu/brainstorm/Tutorials/PipelineEditor#Automatic_script_generation
http://neuroimage.usc.edu/brainstorm/Tutorials/Scripting

https://github.com/brainstorm-tools/brainstorm3/blob/master/toolbox/script/tutorial_visual_group.m#L131

% (The excerpt begins partway through the preceding call.)
%     ... 'keepevents', 1);

% Process: Average: By trial group (grand average)
sAvgGroup = bst_process('CallProcess', 'process_average', sAvgSubj, [], ...
    'avgtype',    7, ...  % By trial group (grand average)
    'avg_func',   1, ...  % Arithmetic average:  mean(x)
    'weighted',   0, ...
    'keepevents', 0);

% ===== SUBJECT AVERAGES: SOURCES (EEG) =====
% Process: Select source files in: */*/EEG
sAvgRunSrcEeg = bst_process('CallProcess', 'process_select_files_results', [], [], 'tag', 'EEG');
% Process: Weighted Average: By trial group (subject average) - EEG
sAvgSubjSrcEeg = bst_process('CallProcess', 'process_average', sAvgRunSrcEeg, [], ...
    'avgtype',         6, ...  % By trial group (subject average)
    'avg_func',        1, ...  % Arithmetic average:  mean(x)
    'weighted',        1, ...
    'scalenormalized', 0);
% Process: Add tag: EEG
sAvgSubjSrcEeg = bst_process('CallProcess', 'process_add_tag', sAvgSubjSrcEeg, [], ...
    'tag',    'EEG', ...
    'output', 1);  % Add to comment

Cheers,
Francois

skhaligh · March 14, 2017, 10:51am

Hi Sean,

This is an error in the filtfilt function, not in the decoding toolbox itself. This is the function used in Brainstorm for filtering (e.g. low-pass filtering), so that is where you get the error. The error screen suggests that one of your data points has fewer than 300 channels, whereas I think the function expects you to have 300 channels here. So one solution is to go to your data folder (inside your brainstorm_db) for this subject, see which of your trials have fewer than 300 channels, and remove them (you can do this by looking at variable F).

Best,
Seyed

skhaligh · March 14, 2017, 11:18am

Hi Sean,

This is an error from the filtfilt function, not from the decoding toolbox itself. That is the function that filters the data (e.g. low-pass filtering).
The error suggests that you have 300 channels, and the number of timepoints apparently cannot be less than the number of channels. So one or more of your trials have fewer than 300 timepoints. I think that is why the error says you should have more than 300 samples. One solution is to go to your data (in brainstorm_db) for this subject and remove the trials whose variable F has fewer than 300 timepoints; otherwise, you could skip the filtering step.
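A quick way to find the offending trials is to compare the number of timepoints in each trial against the required minimum. The sketch below uses made-up trial matrices in place of the F variable of each trial file, and it assumes the threshold of 300 samples quoted above; the actual minimum depends on the filter.

```matlab
% Hypothetical trials: each cell is a [channels x timepoints] matrix,
% standing in for the F variable stored in each trial file.
trials = {zeros(4, 500), zeros(4, 250), zeros(4, 301), zeros(4, 300)};

% Assumed threshold: the error reportedly asks for MORE than 300 samples.
minSamples = 300;

nTime   = cellfun(@(F) size(F, 2), trials);   % timepoints per trial
isShort = nTime <= minSamples;                % trials the filter would reject
keptTrials = trials(~isShort);

fprintf('Dropping %d of %d trials (too few timepoints).\n', nnz(isShort), numel(trials));
```

Here the second and fourth trials are flagged, because they have 250 and exactly 300 timepoints.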

Hope this helps,
Seyed

Sean_Trott · March 14, 2017, 3:10pm

Thanks! I got past the error by skipping the filtering.

Sean_Trott · March 14, 2017, 3:13pm

Cool, this is exactly what I needed. Is there also a code database or API documentation for the functions I need to call? E.g. on the GitHub?

Also, I had a methodological question concerning the classifier. Not sure if this is the right venue, but one option that occurred to me was to collapse events across subjects, rather than looking at each subject individually. However, it seems like this would quite possibly decrease classifier performance, since I imagine there will be some individual differences. Do you know whether there is any literature on this I could read relating to the factors affecting effectiveness of the classifier? (As well as the mechanism.)

Thanks again,
Sean

Francois · March 15, 2017, 1:28am

Quote from Sean Trott: "Cool, this is exactly what I needed. Is there also a code database or API documentation for the functions I need to call? E.g. on the GitHub?"
There is no formal technical documentation. The only thing that is correctly documented is the plugin API: http://neuroimage.usc.edu/brainstorm/Tutorials/TutUserProcess

The code is well documented; look for help directly in the Matlab scripts.
The code is on github (https://github.com/brainstorm-tools/brainstorm3), but if you have downloaded it, you already have a full copy of it.

For script examples, you can refer to all the examples in the section “Other analysis scenarios” on the tutorial page. At the end, each page has an equivalent script for what is done graphically in the tutorial.
All the scripts are in brainstorm3/toolbox/script/tutorial_*.m

Quote from Sean Trott: "Also, I had a methodological question concerning the classifier. Not sure if this is the right venue, but one option that occurred to me was to collapse events across subjects, rather than looking at each subject individually. However, it seems like this would quite possibly decrease classifier performance, since I imagine there will be some individual differences. Do you know whether there is any literature on this I could read relating to the factors affecting effectiveness of the classifier? (As well as the mechanism.)"

This is a question for Seyed.

Francois

Sean_Trott · March 16, 2017, 4:20pm

Hi,

I had a methodological question concerning the classifier. Not sure if this is the right venue, but one option that occurred to me was to collapse events across subjects, rather than looking at each subject individually. However, it seems like this would quite possibly decrease classifier performance, since I imagine there will be some individual differences. Do you know whether there is any literature on this I could read relating to the factors affecting effectiveness of the classifier? (As well as the mechanism.)

pantazis · March 17, 2017, 8:07am

Hi Sean,

I don't know literature explicitly discussing this issue. But you should definitely be running classifiers separately per subject because the sensor patterns are very different for each individual. All published articles use this approach, for fMRI, MEG, and EEG. Thus, you should be computing a decoding time series per subject, then evaluate the subject-average decoding time series for statistical significance (we always use permutation tests across subjects).
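To make the group-level step concrete, here is a minimal sketch of a permutation test across subjects on per-subject decoding time series. It assumes a one-sided sign-flip test against a 50% chance level and uses simulated accuracies; the exact test, the number of permutations, and any correction for multiple comparisons over time are choices the thread does not specify.

```matlab
rng(0);                                  % reproducible simulated data
nSubj = 10; nTime = 50; chance = 50;

% Hypothetical per-subject decoding accuracy (%), [subjects x timepoints].
acc = chance + 5 * randn(nSubj, nTime);
acc(:, 21:end) = acc(:, 21:end) + 10;    % simulated effect from timepoint 21 on

d   = acc - chance;                      % deviation from chance, per subject
obs = mean(d, 1);                        % observed subject-average time series

nPerm = 1000;
count = zeros(1, nTime);
for p = 1:nPerm
    % Under the null hypothesis the sign of each subject's deviation is arbitrary,
    % so flip the sign of whole subjects at random.
    signs = 2 * (rand(nSubj, 1) > 0.5) - 1;
    permMean = mean(bsxfun(@times, d, signs), 1);
    count = count + (permMean >= obs);
end

% One-sided p-value per timepoint (uncorrected for multiple comparisons).
pval = (count + 1) / (nPerm + 1);
```

Each subject contributes one decoding time series, and significance is assessed only on the subject average, which matches the per-subject approach described above.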

The approach is intuitively described in our articles, for example:
http://www.nature.com/neuro/journal/v17/n3/abs/nn.3635.html

The classifier is based on the libsvm implementation, and they have a guide for beginners:
https://www.csie.ntu.edu.tw/~cjlin/libsvm/

Best,
Dimitrios

Sean_Trott · March 21, 2017, 2:53pm

This is perfect, thank you for all the help.