Running Brainstorm script using parfor

Hello,

I am trying to run a MATLAB script that uses Brainstorm functions, and I would like to use a 'parfor' loop for parallel processing. However, every time I try I get the same error, so I guess I am doing something wrong. Below is a simplified version of my script:

addpath('/xx/brainstorm3')
brainstorm
parpool()

participant = {'2262'};
session = {'A' 'B' 'C' 'D'};

for p = 1:length(participant)
    parfor s = 1:length(session)

        sFiles = {...
            ['2262/@raw2262' session{s} '_Click_run1_mc_tsss_AMICA_matlab_high/data_0raw_2262' session{s} '_Click_run1_mc_tsss_AMICA_matlab_high.mat'], ...
            ['2262/@raw2262' session{s} '_Click_run2_mc_tsss_AMICA_matlab_high/data_0raw_2262' session{s} '_Click_run2_mc_tsss_AMICA_matlab_high.mat']};

        sFiles = bst_process('CallProcess', 'process_import_data_event', sFiles, [], ...
            'subjectname',  participant{p}, ...
            'condition',    '', ...
        ...%    'datafile',     RawFiles, ...
            'eventname',    ['EEG_MLR_11' session{s}], ...
            'timewindow',   [], ...
            'epochtime',    [-0.05, 0.1], ...
            'createcond',   1, ...
            'ignoreshort',  1, ...
            'channelalign', 0, ...
            'usectfcomp',   0, ...
            'usessp',       1, ...
            'freq',         [], ...
            'baseline',     []);

    end
end

It works fine if I use 'for' instead of 'parfor' in the inner loop. With 'parfor', I get the following error, even though Brainstorm has been started:

Error using bst_process>CallProcess (line 2104)
Please start Brainstorm before calling bst_process().
Error in bst_process (line 36)
eval(macro_method);

Any help would be really appreciated,

Many thanks,

Fran

Hi Fran,

Sadly, Brainstorm was not designed to run in a parallel setting. Calling any Brainstorm function requires access to global data that lives in the MATLAB session where Brainstorm was started and is not accessible from inside a parfor worker, which is why bst_process reports that Brainstorm has not been started. Your best bet is either to enable parallel processing in the options of the process you want to run (very few processes support this), or to write a script in which each worker starts its own instance of Brainstorm (see Running scripts on a cluster).
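To make the second option a bit more concrete, here is a rough, untested sketch of a parfor loop in which each worker starts its own Brainstorm instance before doing any work. The path is the placeholder from the original script, and the subject list is hypothetical. I am assuming that brainstorm('status') reports whether Brainstorm is running in the current process and that brainstorm('server') starts it without a GUI; please check both against the Brainstorm scripting documentation before relying on them.

```matlab
% Sketch only: one headless Brainstorm instance per parallel worker.
bstDir   = '/xx/brainstorm3';   % placeholder path from the original script
subjects = {'2262'};            % hypothetical list, add further subjects here

addpath(bstDir);                % make the Brainstorm functions visible on the client
if isempty(gcp('nocreate'))     % start a pool only if none is running
    parpool();
end

parfor iSubj = 1:numel(subjects)
    % Each worker is a separate MATLAB process with its own path and globals
    if ~exist('brainstorm', 'file')
        addpath(bstDir);
    end
    % Assumption: 'status' returns true when Brainstorm runs in this process
    if ~brainstorm('status')
        brainstorm('server');   % assumption: starts Brainstorm without a GUI
    end
    fprintf('Worker ready for subject %s\n', subjects{iSubj});
    % Place the bst_process('CallProcess', ...) calls for this subject here
end
```

Keep in mind that all workers would then share the same Brainstorm database, so this is only a starting point and not something I can promise will be stable.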

Note that parallel processing is only beneficial when there is a lot of CPU computation and few I/O operations. I don't think the import process fits that mould. I doubt you would save much time, as your bottleneck will be writing files to the disk, which is not parallelized.

I hope this helps!
Martin

Hello Martin,

Using your suggested option of creating a separate Brainstorm instance for each thread, I guess I could use parfor to run several participants at a time (not necessarily for epoching, but for filtering, for instance). Would that be possible? I'm looking for a better solution than having several MATLAB sessions open at the same time, each running separate subjects.

Many thanks again,

Fran

Having multiple Brainstorm instances running on the same database with the same user can lead to many issues. Brainstorm is not (yet) robust to multiple simultaneous users. If you really want to try something like this, you have to expect some instabilities in the database structure. Reload the database before and after each processing step to prevent them from getting worse.

Note that many MATLAB functions are already multi-threaded. Run the filtering of one file in Brainstorm, open your resource monitor and look at the usage of the multiple processors. Running multiple MATLAB workers in parallel (or multiple multi-threaded MATLAB functions) does not necessarily increase the overall execution speed. It fills up the memory (the whole workspace has to be duplicated for each worker) and could even prevent MATLAB from running each instance efficiently.
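If you want a quick number instead of watching the resource monitor, something like the following sketch compares the same computation with MATLAB's default thread count and with a single thread. The matrix product is only a hypothetical stand-in workload, not the Brainstorm filter itself, so treat the result as a rough indication.

```matlab
% Sketch: how much does MATLAB's built-in multithreading already help?
x = randn(2000);                  % hypothetical stand-in workload

nDefault = maxNumCompThreads;     % current maximum number of computational threads

tic; yMulti = x * x; tMulti = toc;        % timed with the default thread count

maxNumCompThreads(1);             % restrict MATLAB to a single computational thread
tic; ySingle = x * x; tSingle = toc;      % same work, single-threaded
maxNumCompThreads(nDefault);      % restore the original setting

fprintf('Threads: %d, multi: %.3f s, single: %.3f s\n', nDefault, tMulti, tSingle);
```

If the multi-threaded timing is already much faster than the single-threaded one, the cores are largely in use and adding parfor workers on top is unlikely to help much.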

Hi Francois,
The issue is that with more than 400 participants, the whole processing step would take more than 400 hours in total, which means I would have to wait for about 20 days.
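For reference, the arithmetic behind that estimate can be sketched as follows. It assumes roughly one hour per participant (implied by 400 participants and 400 hours) and a hypothetical pool of 4 workers with ideal scaling, which the I/O and memory caveats above suggest is optimistic.

```matlab
% Rough estimate only; all numbers except the 400 h total are assumptions.
nSubjects       = 400;    % "more than 400 participants"
hoursPerSubject = 1;      % implied by ~400 hours in total
nWorkers        = 4;      % hypothetical pool size

totalHours = nSubjects * hoursPerSubject;
daysSerial = totalHours / 24;         % one subject after another
daysIdeal  = daysSerial / nWorkers;   % best case, perfect scaling

fprintf('Serial: %.1f days, ideal with %d workers: %.1f days\n', ...
    daysSerial, nWorkers, daysIdeal);
```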