Error opening nwb files

Hi Francois,
Thank you!
Unfortunately I haven't produced any nwb file myself yet, but I was practicing with these open-access files:
https://gui.dandiarchive.org/#/dandiset/000020/draft

Irene

I updated the NWB downloader, and it will automatically update the NWB library to the version currently supported by Brainstorm.
https://github.com/brainstorm-tools/brainstorm3/commit/93ad6b827c173a87962b137a6bdbdcbcde7c6308

However, I can't get the function nwbRead() to read the example files you suggested...
I set Brainstorm to use the last stable release of the matnwb library (https://github.com/NeurodataWithoutBorders/matnwb/tree/v2.2.5.0), and if I try to use it to read these NWB files, I get an error. If I use the current master branch, I get a different error...
I'm not sure what to do with that.

Are you familiar with this file format?
Can you read them with other programs in Matlab? Using this matnwb library?

@mpompolas Could you help us with this?

K>> nwb2 = nwbRead('sub-626194774_ses-637919731_icephys.nwb')
Dot indexing is not supported for variables of this type.

Error in file.fillExport>traverseRaw (line 120)
        attrmatch = strcmp({raw.attributes.name}, propname);

Error in file.fillExport>traverseRaw (line 111)
                    res = traverseRaw(suffix, raw.datasets(i));

Error in file.fillExport (line 23)
    pathProps = traverseRaw(pnm, raw);

Error in file.fillClass (line 93)
exporterFcns = file.fillExport(nonInherited, class, depnm);

Error in file.writeNamespace (line 22)
        fwrite(fid, file.fillClass(className, Namespace, processed, ...

Error in nwbRead>tryWriteSpec (line 103)
    file.writeNamespace(namespaceName);

Error in nwbRead>generateSpec (line 91)
    if ~tryWriteSpec(name)

Error in nwbRead>checkEmbeddedSpec (line 44)
    generateSpec(fid, h5info(filename, specLocation));

Error in nwbRead (line 27)
    specLocation = checkEmbeddedSpec(filename);

@mpompolas @MartinC
I discovered that I need to update the function in_tess_nwb.m in a similar way.
How does this file get used?

There was a change in NWB last year and, as far as I know, reading and writing .nwb files is now tied to the schema version. This means that if a file was created with a specific version, you have to use "generateCore" from that version as well.
The only solution that I can think of is to have a cache of all versions and call the appropriate one depending on the file that needs to be loaded.
So just supporting a single version would unfortunately not be a solution.
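
A minimal sketch of the version-cache idea above, assuming the schema version can be read from the root attribute nwb_version of the file (NWB 2.x files carry it). The cache folder names are hypothetical, and a throwaway HDF5 file stands in for a real recording so the snippet runs on its own.

```matlab
% Hypothetical cache: one matnwb checkout per schema version (folder names are made up).
cacheDirs = containers.Map( ...
    {'2.2.5', '2.3.0'}, ...
    {'matnwb_2.2.5', 'matnwb_2.3.0'});

% Stand-in for a real recording: a tiny HDF5 file carrying only the root attribute.
nwbFile = [tempname, '.nwb'];
h5create(nwbFile, '/placeholder', 1);
h5writeatt(nwbFile, '/', 'nwb_version', '2.2.5');

% Read the schema version; char() also covers the case where the attribute
% comes back as a cell array or a string instead of a char vector.
fileVersion = strtrim(char(h5readatt(nwbFile, '/', 'nwb_version')));

% Pick the matching library folder, or fail with an explicit message.
if isKey(cacheDirs, fileVersion)
    libDir = cacheDirs(fileVersion);
else
    error('No cached matnwb library for schema version %s.', fileVersion);
end
delete(nwbFile);
```

In a real loader, the selected folder would be put on the MATLAB path with addpath before calling nwbRead. Whether swapping library versions within one MATLAB session works cleanly is not something this thread establishes.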

Here are the versions:
https://github.com/NeurodataWithoutBorders/nwb-schema/releases

I can take a look at it if you still have trouble the week after the next one.

Konstantinos

Hi,
Thank you for your time.
I am not familiar with this format, I am trying to learn now.
I tried to open them with brainstorm and with Matlab using matnwb, but also in this case I had errors with generateCore and then with nwbRead.

I have opened a thread in the Neurodata Without Borders workspace on Slack and it looks like there is some problem with matnwb.
I am not sure if you can see the thread via this link https://nwb-users.slack.com/archives/C5XKC14L9/p1597230134302200
Anyway the matnwb package has been updated (https://github.com/NeurodataWithoutBorders/matnwb but click on "Code->Download ZIP") but the nwbRead function still does not work properly. I think this is why you are getting two different errors with the master branch and with the v2.2.5.0.

In the meantime I managed to open the files with python, so there shouldn't be any problem with these files.

Irene

@mpompolas Yes, we might need your help... Check this thread again whenever you have some time available.

@mpompolas @MartinC
Where are the files read by in_tess_nwb.m supposed to come from?
This ecog.extensions.yaml you refer to in your code is only available on your personal github repo?

If this "ECoG extension" is not used by anybody and there is no clear need for it, I will remove it from the Brainstorm distribution...

@IreneP
No I don't have access to the NWB Slack and can't create an account (only limited to a few institutions).
I opened a github issue listing all the errors I obtained, maybe you can also post your Slack message there so I can follow part of the discussion:

@mpompolas @MartinC
Could you please share an example of the files you used to write the functions in_fopen_nwb.m/in_fread_nwb.m?

What is currently in Brainstorm can't be used for generic EEG files, it seems extremely specific to one dataset: e.g. it needs to have one and only one recorded signal set named "raw" OR one and only one processed signal set named "lfp".

I suggest you remove the ECoG extension for now and revisit it if needed.

I wrote this based on some example files I got from the developers about 1.5 years ago. I'm sure things have changed since. The ecog.extensions.yaml was an additional file, I thought I added when I merged, maybe I missed it.

Regarding the utilization of ECoG files in Brainstorm, the developers informed me that there are two ways of storing the locations of the ECoG electrodes now: one by storing the electrode locations on the .nwb, or you can have an additional .nwbaux (or something like that) file.
What was supported was based on the second approach.

@Francois regarding the 'raw' and 'lfp' labels:
unfortunately although NWB is a standardization format, they give too much freedom in labeling.
I had this discussion in the past with the developers and this will be problematic for support by us unless we are more strict with what should be used (but then, if a file is already created and not made based on our "stricter" guidelines it won't be loaded). They suggested maybe to have a brainstorm compatibility check, but this has not been done yet.

Too much freedom is what's going on here.

As you can see in in_nwb_read, I take the "most important" signals as a guide (electrophysiological - raw, lfp), and use their sampling rate as the main one. The behavioral - 'less important' signals, are downsampled or upsampled to match the sampling rate of the raw/lfp. The only way to know which signals should be considered as the guide for the sampling rate is if I force it to be with the label raw or lfp.

If you want this to be semi-automated, I would suggest there is a pop-up when the file is originally imported (in in_fopen_nwb) that asks the user to select one of the present keys within the nwb files as the "sampling guide".
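
The "sampling guide" approach described above can be sketched as follows. This is an illustration only: the signal names, rates and data are made up, and plain linear interpolation with interp1 is used here as a simple stand-in, which is not necessarily how in_fread_nwb resamples.

```matlab
% Hypothetical signals: names, rates and samples are made up for illustration.
signals = struct( ...
    'name',  {'lfp', 'wheel_position'}, ...
    'sfreq', {1000, 50}, ...
    'data',  {randn(1, 2000), linspace(0, 1, 100)});

guideName = 'lfp';   % in the proposal, the user would pick this key in a pop-up
iGuide    = find(strcmp({signals.name}, guideName), 1);
guideRate = signals(iGuide).sfreq;
nGuide    = numel(signals(iGuide).data);
tGuide    = (0:nGuide-1) / guideRate;      % time axis of the guide signal

resampled = zeros(numel(signals), nGuide);
for i = 1:numel(signals)
    % Time axis of this signal at its own sampling rate
    t = (0:numel(signals(i).data)-1) / signals(i).sfreq;
    % Linear interpolation onto the guide time axis; query points outside the
    % signal's own time range are filled with its last value.
    resampled(i, :) = interp1(t, signals(i).data, tGuide, 'linear', signals(i).data(end));
end
```

Every row of resampled then shares the guide's time axis. Note that downsampling a fast signal this way applies no anti-aliasing filter, so a real implementation would need more care.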

I'll check if I can find a file.

I've spent almost two days on the interface of the MATNWB library to read NWB files, and I feel like I've only gone backwards, it would require a lot more work to produce something easy to use in Brainstorm. Unfortunately, I can't dedicate more time to this project at the moment.
Maybe I will wait for a few months or a year and see if it gets any better.

@mpompolas
No need to spend time sending me other files. I'll ask later if there is a clear need for a major project to revive your NWB reading code.

@IreneP
Sorry for giving up after sounding so promising...
Feel free to keep on working on the reading functions in Brainstorm if you have time for this (in_fopen_nwb.m / in_fread_nwb.m / bst_install_nwb.m). If you need to read one specific dataset, it should not be too complicated to write some code that reads it. The complicated part would be to make it generic...
Otherwise, you could write Matlab scripts to read the data you are interested in from the .nwb files and save it in the Brainstorm database directly. The data structures are all documented on the website:
https://neuroimage.usc.edu/brainstorm/Tutorials/Scripting#File_structures

FYI, the last message I posted on the github issue (https://github.com/NeurodataWithoutBorders/matnwb/issues/236):

Thank you for your prompt help and suggestions!

However, my goal is not really to read these files... I was hoping to be able to simply add the matnwb library to the Brainstorm environment (downloaded automatically from github when needed) to offer native support for the NWB file format to the Brainstorm users, as it was accepted as one of the reference file formats in the BIDS-EEG specifications.

But it looks a lot more complicated than expected. Apparently it will not be possible to include a version of the library that will be able to read all the .nwb files, it will always require some customized manipulations, and navigation between versions...

Another problem is that it doesn't seem to be clear how EEG/LFP/SEEG/ECOG signals should be stored in the NWB ontology. The various example files I could load in MATLAB do not organize the data in the same way. It doesn't seem easy to write code that could automatically find all the information it needs. Supervising this interactive import of EEG signals from .nwb files into Brainstorm, and then documenting it, would require a significant amount of work.

Our development resources on the Brainstorm project are limited and we have many projects to lead simultaneously. Unfortunately, we won't be able to invest time in writing supervision tools to manage multiple versions of the matnwb library or the schemas.
At the present time, it is not clear to me whether the matnwb library or the NWB file format are ready for a plug-and-play use from an EEG-processing software environment. Maybe I should wait for an extra year and try again later?

What are your plans for NWB/matnwb in the near future?
Will these problems of schema compatibility be handled more smoothly?
Will you provide more strict specifications/examples/tutorials for EEG/SEEG/ECoG/LFP?
Can you provide help (eg. development time) for integrating your library into external software?

Unless you have better suggestions, I will remove the NWB support currently available in the Brainstorm distribution: it was developed for one specific dataset by a former PhD student of our group (his work is mentioned on your website: https://www.nwb.org/tools/), but this code is not working with any other public dataset available using .nwb files. It needs more work than what we can provide at the moment to be maintained properly.

I'm working with 2 labs that use NWB at McGill. I will start working on this again soon.

The new version of the ephys toolbox is out!

Github commit: https://github.com/brainstorm-tools/brainstorm3/commit/f402a427f5d3a89eb59e28d3c107ab18005bdce1

Updated tutorials: https://neuroimage.usc.edu/brainstorm/Tutorials#Multiunit_electrophysiology

This includes updates of the NWB reader in Brainstorm.

Hi,

I'm new to NWB format and had problems with reading NWB data (schema version 2.2.5) in Brainstorm (Version: 21-Jun-2022). The issue is that some NWB data don't have the 'group_name' key that is required by Brainstorm.
I temporarily solved the issue by editing the code in Brainstorm but I feel it's dirty and there must be better solutions, so I recorded it here.

Here is what happened:

  1. Initially I used MATLAB 2017b and encountered the error below:

** Error: Line 34: max
** Invalid option. Option must be 'omitnan' or 'includenan'.
**
** Call stack:
** >correctType.m at 34
** >checkDtype.m at 111
** >DynamicTableRegion.m>DynamicTableRegion.validate_data at 34
** >Data.m>Data.set.data at 29
** >Data.m>Data.Data at 22
** >VectorData.m>VectorData.VectorData at 13
** >DynamicTableRegion.m>DynamicTableRegion.DynamicTableRegion at 13
** >parseDataset.m at 72
** >parseGroup.m at 22
** >parseGroup.m at 38
** >parseGroup.m at 38
** >nwbRead.m at 59
** >in_fopen_nwb.m at 49
** >in_fopen.m at 171
** >import_raw.m at 127
** >bst_call.m at 28
** >tree_callbacks.m>@(h,ev)bst_call(@import_raw,[],[],iSubject) at 660

  2. It seems to require a newer version of max(). So I changed to R2019b and the error above disappeared, but it threw another error:

** Error: Line 165: containers.Map/subsref
** The specified key is not present in this container.
**
** Call stack:
** >Set.m>Set.get at 165
** >in_fopen_nwb.m>getDeeperModule at 244
** >in_fopen_nwb.m at 90
** >in_fopen.m at 171
** >import_raw.m at 127
** >bst_call.m at 28
** >tree_callbacks.m>@(h,ev)bst_call(@import_raw,[],[],iSubject) at 660

  3. According to in_fopen_nwb.m>getDeeperModule at 244, the problem seems to come from this statement: not_ordered_groupLabels = nwb.(electrodes_path).vectordata.get('group_name').data.load; So I checked the data using nwbRead() and didn't find a key named group_name.
    Here is the output of nwb.(electrodes_path).vectordata:

14×1 Set array with properties:

        HCP: [types.hdmf_common.VectorData]
  filtering: [types.hdmf_common.VectorData]
       good: [types.hdmf_common.VectorData]
      group: [types.hdmf_common.VectorData]
       hemi: [types.hdmf_common.VectorData]
        imp: [types.hdmf_common.VectorData]
      label: [types.hdmf_common.VectorData]
   location: [types.hdmf_common.VectorData]
  pial_dist: [types.hdmf_common.VectorData]
     vertex: [types.hdmf_common.VectorData]
          x: [types.hdmf_common.VectorData]
          y: [types.hdmf_common.VectorData]
          z: [types.hdmf_common.VectorData]
       zone: [types.hdmf_common.VectorData]
  4. So I had to extract the names of electrode groups (i.e., group_name) by modifying the code as:

not_ordered_groupLabels = {nwb.(electrodes_path).vectordata.get('group').data.path};
[~, not_ordered_groupLabels] = cellfun(@(x) fileparts(x), not_ordered_groupLabels', 'UniformOutput', 0);

and finally I can read the data. I'm not sure if it's just an issue about labeling or the data was badly formatted (I obtained the data from an open database). As mentioned by mpompolas, the key names in NWB could be arbitrary.

Thanks,
Shuai

Thanks for this debugging work.

** Invalid option. Option must be 'omitnan' or 'includenan'.
It seems to require a newer version of max(). So I changed to R2019b and the error above disappeared

This error is to be handled at the level of the matnwb library.
I opened an issue here: [Bug]: Syntax valid only after Matlab 2018b · Issue #437 · NeurodataWithoutBorders/matnwb · GitHub
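
For readers stuck on an older MATLAB: the thread does not show the offending line in correctType.m, but the message above is what max() reports when it receives an option it does not know. One option that older releases lack is the 'all' flag, added in R2018b. Assuming that is the cause here (an assumption, not something confirmed in this thread), a release-independent equivalent looks like this:

```matlab
A = [1 5; 7 3];

% max(A, [], 'all') needs R2018b or later.
% Flattening the array first gives the same result in older releases too.
m = max(A(:));
```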

I'm not sure if it's just an issue about labeling or the data was badly formatted (I obtained the data from an open database). As mentioned by mpompolas, the key names in NWB could be arbitrary.

One issue with NWB is indeed the lack of specifications, which makes it difficult to obtain reproducible data structures that can be processed automatically.

If you know how to use GitHub, could you open a PR on the Brainstorm repository to fix this function?
Ideally, your fix should check for the existence of the field "group_name", and if it doesn't exist, use your new code instead.
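
The check suggested above could look roughly like the sketch below. A containers.Map with made-up columns stands in for nwb.(electrodes_path).vectordata so the snippet runs without matnwb. It assumes the matnwb Set object also answers to keys(), and the group paths are hypothetical.

```matlab
% Stand-in for nwb.(electrodes_path).vectordata, holding only a 'group' column.
vectordata = containers.Map();
vectordata('group') = {'/general/extracellular_ephys/GridA', ...
                       '/general/extracellular_ephys/GridA', ...
                       '/general/extracellular_ephys/DepthB'};   % hypothetical object paths

if ismember('group_name', keys(vectordata))
    % Original behaviour: use the labels stored in the file.
    groupLabels = vectordata('group_name');
else
    % Fallback from the post above: keep the last element of each group path.
    [~, groupLabels] = cellfun(@fileparts, vectordata('group'), 'UniformOutput', false);
end
```

One caveat with fileparts: a group name containing a dot would be split into name and extension, so only the part before the dot would be kept.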

If you don't know how to use GitHub or don't want to push this contribution yourself: please share your edited function and an example .nwb file (upload it somewhere and post the download link here).
Thanks

Hi there!

Thank you both.
Just for the context, the NWB dataset which we are working on is the one described in Dataset of human intracranial recordings during famous landmark identification (Woolnough, O., Kadipasaoglu, C.M., Conner, C.R., Forseth, K.J., Rollo, P.S., Rollo, M.J., Baboyan, V.G., Tandon, N., 2022. Sci Data 9, 28.). Colleagues nearby @John_Mosher I believe.

Best,
AnneSo

The first issue has already been fixed in the matnwb repository: [Bug]: Syntax valid only after Matlab 2018b · Issue #437 · NeurodataWithoutBorders/matnwb · GitHub

Thank you for your reply ! Now we get another error wrt. the channels (please see below an example).
The NWB data provide a DynamicTable that stores the information of electrodes (e.g., good/bad, locations), but in some participants, the raw data contain more electrodes. In the case below, the number of electrodes listed in the DynamicTable is 181. Brainstorm may somehow read the number of electrodes from raw data (?) and the number is 194.

So at line 180, ChannelMat.Channel(ii).Loc = [x(iChannel);y(iChannel);z(iChannel)];, the number of channels in ChannelMat.Channel (194) and the length of x, y, z (181) don't match.

** Error: Line 180: Index exceeds the number of array elements (181).
**
** Call stack:
** >in_fopen_nwb.m at 180
** >in_fopen.m at 159
** >import_raw.m at 126
** >bst_call.m at 28
** >tree_callbacks.m>@(h,ev)bst_call(@import_raw,[],[],iSubject) at 660
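
One possible guard against this mismatch is sketched below. It is not necessarily what the eventual fix does. The channel counts come from the description above; the coordinates and the mapping from recorded channels to table rows are placeholders.

```matlab
nSignals = 194;    % channels found in the raw data (from the post)
nTable   = 181;    % rows in the electrodes DynamicTable (from the post)
x = rand(nTable, 1); y = rand(nTable, 1); z = rand(nTable, 1);   % placeholder coordinates

% Hypothetical mapping from each recorded channel to its table row (0 = not listed).
% Here the first 181 channels are assumed to be the listed ones.
iRow = [1:nTable, zeros(1, nSignals - nTable)];

Channel = repmat(struct('Name', '', 'Loc', []), 1, nSignals);
for ii = 1:nSignals
    Channel(ii).Name = sprintf('ch%03d', ii);
    if iRow(ii) > 0
        Channel(ii).Loc = [x(iRow(ii)); y(iRow(ii)); z(iRow(ii))];
    end   % otherwise Loc stays empty instead of indexing past the table
end
```

Channels without a table row keep an empty location, so the loop never indexes past the 181 available coordinates.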

We're working on this issue and will test Brainstorm on other participants. We will open a PR once the code works smoothly.

Thank you !

The PR is now merged: NWB: Added the examination of channel information before loading data. by ws1011001 · Pull Request #556 · brainstorm-tools/brainstorm3 · GitHub

Could you please share an example dataset?
(upload a .nwb file somewhere and post the download link here)

Thanks!