How to write your own process
Brainstorm offers a flexible plug-in structure. All the operations available when using the Process1 and Process2 tabs, which means most of the Brainstorm features, are in fact written as plug-ins.
This tutorial looks long and complicated, but don't let it scare you. Putting your code in a process is not so difficult. The first part is an exhaustive reference manual that details all the possible options; you don't need to understand it completely. The second part explains how to copy an existing process and modify it to do what you want.
Process folders
A Brainstorm plug-in, or "process", is a single Matlab .m script that is automatically identified and added to the menus in the pipeline editor. Two folders are parsed for plug-ins:
brainstorm3/toolbox/process/functions: Brainstorm "official" processes, included in the main distribution of the software
$HOME/.brainstorm/process: User processes folder, to develop new processes or overwrite some default function
If you write a valid process function and place it in one of those folders, it will become automatically available in the pipeline editor menus, when you use the Process1 or Process2 tabs.
Send it to another Brainstorm user and your code will be automatically available in the other person's Brainstorm interface. It is a very efficient way of exchanging code without the nightmare of figuring out what the inputs of the functions are (units of the values, dimensions of the matrices, etc.).
Structure of the process scripts
Sub-functions
A process function must be named "process_...m" and located in one of the two process folders in order to be recognized by the software. Let's call our example function "process_test.m". It contains at least 4 functions:
process_test(): The first line of the script must declare a function with the same name as the .m script. It contains only a call to the Brainstorm script macro_methodcall. This allows us to call the sub-functions of the process_test.m script from outside, using the syntax: process_test('FunctionName', arguments)
GetDescription(): Returns a structure that describes the process: name, category, accepted inputs, options, etc. This function is called when Brainstorm parses the process folders to find all the valid processes. It tells the pipeline editor how the process has to be integrated in the interface.
FormatComment(): Returns a string that identifies the process in the interface. In the pipeline editor window, when the process is selected or when its options are modified, this function is called to update the process description line. Most processes simply return the field sProcess.Comment; some others add a few options to the description (example: Pre-process > Band-pass filter, or Average > Average files).
Run(): Function called when the process is executed, either from the interface (after clicking on the Run button of the pipeline editor) or from a Matlab script (call to bst_process('CallProcess', 'process_test', ...)). While the first three functions are descriptive, this one really does something. It receives the files placed in the Process1 or Process2 boxes, does its job and returns the output of the computation to Brainstorm.
You are free to add as many sub-functions as needed to the process file. If your process needs some sub-functions to run, it is preferable to copy the full code directly into the "process_test.m" file, rather than leaving it in separate functions. This avoids spreading sub-functions everywhere, which later get lost or forgotten in the distribution when the process is deleted. It might be uncomfortable at the beginning if you are not used to working with scripts of over 100 lines, but you'll get used to it: the Matlab code editor offers many solutions to make long scripts easy to edit (cells, code folding...). It makes your process easier to maintain and to exchange with other users, which is important in the long run.
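As a rough illustration of the four mandatory functions, a minimal process_test.m could look like the sketch below. The values of the description fields (comment, file tag, sub-menu, index) are placeholders chosen for this example, and the Run() body simply reuses the absolute-value operation of process_absolute.m. The file only works inside Brainstorm, since macro_methodcall is a Brainstorm script.

```matlab
function varargout = process_test( varargin )
% PROCESS_TEST: Example process, returns the absolute value of its input.
% The only statement of the main function is the call to macro_methodcall,
% which dispatches calls such as process_test('GetDescription').
macro_methodcall;
end

%% ===== GET DESCRIPTION =====
function sProcess = GetDescription() %#ok<DEFNU>
    % Placeholder values, chosen for this example only
    sProcess.Comment     = 'Test process';
    sProcess.FileTag     = '| test';
    sProcess.Category    = 'Filter';
    sProcess.SubGroup    = 'Pre-process';
    sProcess.Index       = 1000;
    sProcess.InputTypes  = {'data', 'results'};
    sProcess.OutputTypes = {'data', 'results'};
    sProcess.nInputs     = 1;
    sProcess.nMinFiles   = 1;
end

%% ===== FORMAT COMMENT =====
function Comment = FormatComment(sProcess) %#ok<DEFNU>
    % Simplest case: return the comment defined in the description
    Comment = sProcess.Comment;
end

%% ===== RUN =====
function sInput = Run(sProcess, sInput) %#ok<DEFNU>
    % Same operation as in process_absolute.m
    sInput.A = abs(sInput.A);
end
```

Saved as process_test.m in the user process folder, a file of this shape should be picked up the next time the process folders are parsed.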
Optional function: Compute()
Some processes can be designed to be called both from the Brainstorm context, as a plug-in, and directly from the Matlab command line or a script, independently of the Brainstorm database and plug-in system.
In this case, we can leave what is specific to the Brainstorm structure in the Run() function, and move the real computation to additional sub-functions. We recommend that you respect the following convention: name the main external sub-function Compute().
Example: Z-score
Let's take the example of the process "Standardize > Z-score (static)", which is described in the function process_zscore.m. The function Run() reads and tests the options defined by the user and then calls Compute(), which is responsible for calculating the z-score normalization.
    function sInput = Run(sProcess, sInput)
        % Get inputs
        iBaseline = panel_time('GetTimeIndices', sInput.TimeVector, sProcess.options.baseline.Value{1});
        [...]
        % Compute zscore
        sInput.A = Compute(sInput.A, iBaseline);
        [...]
    end
The function Compute() calls another function ComputeStat():
    function A = Compute(A, iBaseline)
        % Calculate mean and standard deviation
        [meanBaseline, stdBaseline] = ComputeStat(A(:, iBaseline,:));
        % Compute zscore
        A = bst_bsxfun(@minus, A, meanBaseline);
        A = bst_bsxfun(@rdivide, A, stdBaseline);
    end

    function [meanBaseline, stdBaseline] = ComputeStat(A)
        % Compute baseline statistics
        stdBaseline = std(A, 0, 2);
        meanBaseline = mean(A, 2);
        % Remove null variance values
        stdBaseline(stdBaseline == 0) = 1e-12;
    end
This mechanism allows us to access the z-score function at different levels. We can call it as a Brainstorm process that takes Brainstorm structures as input (this is usually not done manually, but by the pipeline editor or by bst_process):
    sInput = process_zscore('Run', sProcess, sInput);
Or as regular functions that take standard Matlab matrices as input:
    % Generate some random signal
    F = rand(1,500);
    ind = 1:100;
    % Normalize the signal
    F = process_zscore('Compute', F, ind);
    % Or just calculate its average and standard deviation
    [Favg, Fstd] = process_zscore('ComputeStat', F);
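To see what Compute() and ComputeStat() do to the numbers, the same arithmetic can be reproduced without Brainstorm. The sketch below is a plain Matlab script on a small hand-made matrix. It uses the built-in bsxfun in place of bst_bsxfun, which is an assumption made only so the script runs on its own.

```matlab
% Plain-Matlab reproduction of the z-score computation (no Brainstorm needed).
% Assumption: bsxfun is used here as a stand-in for bst_bsxfun.
F = [1 2 3 4 5 6; 2 4 6 8 10 12];   % 2 signals x 6 time samples
iBaseline = 1:3;                     % baseline = first three samples

% Baseline statistics, computed along the time dimension (as in ComputeStat)
meanBaseline = mean(F(:, iBaseline), 2);
stdBaseline  = std(F(:, iBaseline), 0, 2);
stdBaseline(stdBaseline == 0) = 1e-12;

% Z-score: subtract the baseline mean, divide by the baseline std (as in Compute)
Z = bsxfun(@rdivide, bsxfun(@minus, F, meanBaseline), stdBaseline);

% Over the baseline, each signal now has zero mean and unit standard deviation
disp(mean(Z(:, iBaseline), 2));
disp(std(Z(:, iBaseline), 0, 2));
```

Both signals differ only by a scale factor, so they end up with identical z-scores.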
Process description
The function GetDescription() creates a structure sProcess that documents the process: its name, the way it is supposed to be used in the interface and all the options it needs. It contains the following fields:
Comment: String that represents the process in the "Add process" menus of the pipeline editor window.
FileTag: String that is added to the description of the output files, in the case of "Filter" processes. In the example of the Z-score process, FileTag='| zscore'. If you apply the Z-score process on a file named "MN: MEG", the file created by the process is named "MN: MEG | zscore". This file tag is also added to the file name.
Category: String that defines how the process is supposed to behave. The possible values are defined in the next section: 'Filter', 'File', 'Custom'...
SubGroup: Sub-menu in which you want the process to appear in the menus of the pipeline editor. It can be an existing category (e.g. 'Pre-processing', 'Standardize', etc.) or a new category.
Index: Integer value that indicates a relative position in the "Add process" menus. For example, if your process sets Index=411, it is displayed after the Z-score process (Index=410). Two processes can have the same index; in this case, the one displayed first is the one that is read first in the process folder. If you set the Index to zero, the process is ignored and not displayed in the menus of the pipeline editor.
isSeparator: Display a separator bar after the process in the pipeline editor menus.
InputTypes: Cell array of strings that represents the list of possible input types ('raw', 'data', 'results', 'timefreq', 'matrix'). This information is used to determine if a process is available in a specific interface configuration. For example: in the Process1 tab, if the "Process sources" button is selected, only the processes that have 'results' in their InputTypes list will be marked as available. All the others will be greyed out.
OutputTypes: Cell array of strings with the same dimension as InputTypes. It defines, for each input type, the type of the output files. For example: a process that has InputTypes={'data','results'} and OutputTypes={'timefreq','timefreq'} transforms recordings into time-frequency objects, and sources into time-frequency objects. If instead OutputTypes={'data','results'}, the type of the new files is the same as the type of the input files.
nInputs: Integer, defines the number of inputs of the process. If nInputs=1, the process appears in the Process1 tab. If nInputs=2, the process appears in the Process2 tab.
nMinFiles: Integer, minimum number of files required by the process to run. Can be 1 or more.
processDim: For the "Filter" processes, defines along which dimension the process is allowed to split the input data matrix while processing it, if it is too big to be processed at once. Possible values:
- 1: Split in blocks of signals (example: Band-pass filter or Sinusoid removal)
- 2: Split in time blocks (example: EEG average reference, apply SSP)
- Empty: Does not allow the process to split the input data matrix (default)
isSourceAbsolute: For the processes that accept source maps in input (type 'results'), this option defines if we want to process the real source values or their absolute values. The possible values are:
- -1: Never process the absolute values of the sources
- 0: Offer it as an option, but disabled by default
- 1: Offer it as an option, enabled by default
- 2: Always process the absolute values of the sources
isPaired: Applies only to the processes with two inputs (Process2), defines if the process needs pairs of files in input. If isPaired=1, the first file in FilesA and the first file in FilesB are processed together, and the second files are processed together, and so on. For example, it is the case for the paired t-test.
options: List of options that are offered to the user in the pipeline editor window, and used by the FormatComment() and Run() functions. This variable is a structure, where each field represents an option.
Not all the fields have to be defined in the function GetDescription(). The missing ones will be set to their default values, as defined in db_template('ProcessDesc').
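To make the fields above more concrete, here is a sketch of the description of a hypothetical two-input process. It is written as a standalone script that only builds the structure, so it can be run without Brainstorm. In a real process these assignments would sit inside GetDescription(). Every value is a placeholder chosen for illustration.

```matlab
% Sketch of a description for a hypothetical two-input (Process2) process.
% All values are placeholders chosen for illustration.
sProcess = struct();
sProcess.Comment     = 'Paired difference (example)';
sProcess.Category    = 'Custom';
sProcess.SubGroup    = 'Test';
sProcess.Index       = 1001;
sProcess.isSeparator = 0;
sProcess.InputTypes  = {'data', 'results'};
sProcess.OutputTypes = {'data', 'results'};   % same type in output as in input
sProcess.nInputs     = 2;                     % appears in the Process2 tab
sProcess.nMinFiles   = 1;
sProcess.isPaired    = 1;                     % first file of FilesA goes with first file of FilesB
sProcess.isSourceAbsolute = -1;               % never use absolute source values

% Sanity check: one output type per input type
assert(numel(sProcess.InputTypes) == numel(sProcess.OutputTypes));
```

The fields that are left out (processDim, options, etc.) would take their default values.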
Definition of the options
Options structure
The field sProcess.options describes the list of options that are displayed in the pipeline editor window when the process is selected. It is a structure with one field per option. If we have an option named "overwrite", it is described in the structure sProcess.options.overwrite. Every option is a structure with the following fields:
Type: String, defines the type of option (checkbox, text field, etc.)
Comment: String, describes the option in the pipeline editor window (some option types ignore it)
Value: Default value for this option. The type of this variable depends on the Type field of the option
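The three fields can be illustrated with a short standalone script. The "overwrite" option name comes from the text above. The second option, its comment strings and the default values are assumptions made for the example (a checkbox is assumed to store 0 or 1, a text field a string).

```matlab
% Sketch: defining two options in the description structure.
% Comment strings and default values are illustrative assumptions.
sProcess = struct();

% Option "overwrite": checkbox, unchecked by default
sProcess.options.overwrite.Comment = 'Overwrite input files';
sProcess.options.overwrite.Type    = 'checkbox';
sProcess.options.overwrite.Value   = 0;

% Option "label": text field with a default string (hypothetical option)
sProcess.options.label.Comment = 'Label: ';
sProcess.options.label.Type    = 'text';
sProcess.options.label.Value   = 'test';

% List the options as one line each: name (type): comment
optNames = fieldnames(sProcess.options);
for i = 1:numel(optNames)
    opt = sProcess.options.(optNames{i});
    fprintf('%s (%s): %s\n', optNames{i}, opt.Type, opt.Comment);
end
```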
User preferences
Note that the default values defined in sProcess.options are usually displayed only once. When the user modifies the option, the new value is saved in the user preferences and offered as the default the next time the process is selected in the pipeline editor.
If you modify the Value field in your process function, the default offered when you select the process in the pipeline editor may not change accordingly. This means that another default has been saved in the user preferences. To reset all the options to their real default values (as defined in the process functions), you can use the menu Reset options in the Pipeline menu of the pipeline editor window.
Option types
- 'checkbox'
- 'radio'
- 'combobox'
- 'range', 'timewindow', 'baseline', 'poststim' {[start,stop], units, precision}
- 'value' {value, units, precision}
- 'label'
- 'text'
- 'textarea'
- 'groupbands'
- {'cluster', 'cluster_confirm'}
- 'channelname'
- 'subjectname'
- 'atlas'
- {'filename', 'datafile'}
- 'editpref'
- 'separator'
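For the option types whose value is a cell array, the numeric part is the first cell, which is why the Z-score example reads sProcess.options.baseline.Value{1}. The sketch below illustrates this for a 'baseline' and a 'value' option. The option name "cutoff", the units and the precision values are assumptions. The index lookup with find() is only a plain-Matlab stand-in for panel_time('GetTimeIndices', ...), so that the script runs without Brainstorm.

```matlab
% Sketch: cell-array option values and how Run() could read them.
% Units, precision and the "cutoff" option are illustrative assumptions.
sProcess = struct();

% 'baseline' option: {[start,stop], units, precision}
sProcess.options.baseline.Comment = 'Baseline: ';
sProcess.options.baseline.Type    = 'baseline';
sProcess.options.baseline.Value   = {[-0.15, 0.05], 's', 3};

% 'value' option: {value, units, precision}
sProcess.options.cutoff.Comment = 'Cutoff frequency: ';
sProcess.options.cutoff.Type    = 'value';
sProcess.options.cutoff.Value   = {40, 'Hz', 2};

% The numeric part of each option is the first cell
baseline = sProcess.options.baseline.Value{1};
cutoff   = sProcess.options.cutoff.Value{1};

% Convert the time window to sample indices
% (stand-in for panel_time('GetTimeIndices', ...))
TimeVector = (-2:3) / 10;   % -0.2 s to 0.3 s, one sample every 0.1 s
iBaseline  = find(TimeVector >= baseline(1) & TimeVector <= baseline(2));
disp(iBaseline);
```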
Categories of process
There are three different types of processes: Filter, File, Custom. The category of the process is defined by the field sProcess.Category.
Filter
Brainstorm processes each file in the input list independently (the files that have been dropped in the Process1 or Process2 file lists) and is responsible for the following operations:
- reading the input files,
- possibly splitting them if they are too big,
- writing the output files on the hard drive,
- referencing them in the database.
In the process, the function Run():
- receives the data matrix to process, one file at a time,
- applies some operation on it,
- returns the processed values.
Advantages: All the complicated things are taken care of automatically, so the functions can be very short.
Limitations: There is no control over the file names and locations, one file in input = one file in output, and the file type cannot be changed (InputTypes=OutputTypes).
For example, let's consider one of the simplest processes: process_absolute.m. It just calculates the absolute value of the input data matrix. The Run() function is only one line long:
    function sInput = Run(sProcess, sInput)
        sInput.A = abs(sInput.A);
    end
The sInput structure gives lots of information about the input file coming from the database, and one additional field "A" that contains the block of data to process. This process just applies the function abs() to the data sInput.A and returns the modified values. A new file is created by Brainstorm in the database to store this result.
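Because a Filter Run() only touches sInput.A, it can be exercised outside Brainstorm with a hand-made structure. The sketch below does this for a hypothetical operation (removing the mean of each signal over time). It also checks that processing the matrix in two blocks of signals, as Brainstorm may do when processDim=1, gives the same result as processing it at once. The function name demo_filter_split and the operation itself are assumptions made for the example.

```matlab
function demo_filter_split()
    % Fake data block: 4 signals x 5 time samples
    A = magic(5);
    A = A(1:4, :);

    % Process the whole matrix at once
    sFull = Run([], struct('A', A));

    % Process two blocks of signals separately, as with processDim=1
    sTop    = Run([], struct('A', A(1:2, :)));
    sBottom = Run([], struct('A', A(3:4, :)));

    % The operation works on each signal independently,
    % so splitting along dimension 1 does not change the result
    assert(isequal(sFull.A, [sTop.A; sBottom.A]));
    disp('Identical results with and without splitting');
end

function sInput = Run(sProcess, sInput) %#ok<INUSL>
    % Hypothetical operation: remove the mean of each signal over time
    sInput.A = bsxfun(@minus, sInput.A, mean(sInput.A, 2));
end
```

An operation that mixes signals together, such as an average reference, would fail this check when split along dimension 1, which is why such processes split in time blocks instead.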
File
Brainstorm processes each file in the input list independently. It creates a structure sInput that documents the input file but, unlike in the Filter case, does not load the data in the "A" field.
In the process, the function Run() is called once for each input file and is responsible for:
- reading the input file,
- processing it,
- saving the results in a new file on the hard drive (optional),
- referencing the new file in the database (optional),
- returning the path of the new file.
The resulting functions are much longer, but this time the process is free to do anything: there are no restrictions. The outline of a typical Run() function is the following:
    function OutputFile = Run(sProcess, sInput)
        % Load input file
        DataMat = in_bst_data(sInput.FileName);
        % Apply some function to the data in DataMat
        OutputMat = some_function(DataMat);
        % Generate a new file name in the same folder as the input file
        fileType = 'data';   % Placeholder: base name of the new file
        OutputFile = bst_process('GetNewFilename', bst_fileparts(sInput.FileName), fileType);
        % Save the new file
        save(OutputFile, '-struct', 'OutputMat');
        % Reference OutputFile in the database
        db_add_data(sInput.iStudy, OutputFile, OutputMat);
    end
Custom
Similar to the previous case "File", but this time all the input files are passed to the process at once.
The function Run() is called only once. It receives all the input file names in an array of structures "sInputs". From that, it can create zero, one or many files. The list of output files is returned in a cell array of strings "OutputFiles".
    function OutputFiles = Run(sProcess, sInputs)
        % Load input files
        % Do something interesting
        % Save new files
        % Reference the new files in the database
        % Return all the new file names in the cell-array OutputFiles
    end
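As one possible way of filling in this outline, the sketch below averages a list of recordings files into a single new file. It reuses the calls of the "File" example (in_bst_data, bst_process('GetNewFilename', ...), save, db_add_data). It assumes that all the inputs are recordings with a data matrix in the field F and the same dimensions, and that the output is stored in the folder of the first input. The base name 'data_average' and the comment string are placeholders. It is meant as an illustration, not as the way Brainstorm computes its own averages.

```matlab
function OutputFiles = Run(sProcess, sInputs) %#ok<DEFNU>
    % Load the first file and use it as a template for the output
    OutputMat = in_bst_data(sInputs(1).FileName);
    % Accumulate the recordings of the other files
    % (assumption: same channels and time samples in every file)
    for i = 2:length(sInputs)
        DataMat = in_bst_data(sInputs(i).FileName);
        OutputMat.F = OutputMat.F + DataMat.F;
    end
    % Divide the sum by the number of files to get the average
    OutputMat.F = OutputMat.F ./ length(sInputs);
    OutputMat.Comment = sprintf('Average of %d files', length(sInputs));
    % Generate a new file name in the folder of the first input file
    OutputFile = bst_process('GetNewFilename', bst_fileparts(sInputs(1).FileName), 'data_average');
    % Save the new file
    save(OutputFile, '-struct', 'OutputMat');
    % Reference the new file in the database
    db_add_data(sInputs(1).iStudy, OutputFile, OutputMat);
    % Return the list of new files (only one here)
    OutputFiles = {OutputFile};
end
```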
Input description
The structure sInput contains the following fields:
- iStudy:
- iItem:
- FileName:
- FileType:
- Comment:
- Condition:
- SubjectFile:
- SubjectName:
- DataFile:
- ChannelFile:
- ChannelTypes:
Running a process
A pipeline is an array of sProcess structures that are executed one after the other.
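From a Matlab script, a single process can be executed with bst_process('CallProcess', ...), as mentioned in the description of Run(). The sketch below applies the Z-score process to one file. The file path is hypothetical. The argument order (process name, first list of input files, second list of input files, then option name/value pairs) is an assumption to check against your version of Brainstorm. The call requires Brainstorm to be running with a protocol loaded.

```matlab
% Hypothetical input: relative path of a recordings file in the current protocol
sFiles = {'Subject01/Condition1/data_trial001.mat'};

% Run the Z-score process on it, setting its "baseline" option
% (assumed syntax: inputs A, inputs B, then option name/value pairs)
sFiles = bst_process('CallProcess', 'process_zscore', sFiles, [], ...
    'baseline', [-0.1, 0]);
```

Chaining several such calls, each one taking the output of the previous one as its input, reproduces in a script what the pipeline editor does with an array of sProcess structures.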
Alternative
Run Matlab command