Converting EEG data from .h5 file for import

Hello Francois,

Here is the link to a Google Drive folder containing the two files: https://drive.google.com/drive/folders/1G1438mfArsG367wlVgLm2Mvs91xQx4em?usp=sharing

The Virtual Brain (TVB) provides some documentation on its file formats, but I have not found any specifics about the internal structure of the files. Here is the main page on import/export file formats: https://www.thevirtualbrain.org/tvb/zwei/brainsimulator-data

That being said, I will explain what I know below. If anything is unclear or missing, let me know and I will do my best to explain it or find the answer.

TVB only exports data in the .h5 file format. The default file name is determined by the contents and the export date, but the user can change it later.

These .h5 files contain named datasets. All .h5 files are structured in groups and subgroups. In every TVB file I have looked at, the main group has no name, so a simple '/' is all that needs to precede the name of a subgroup. The number, names and contents of these subgroups (each of which is a dataset) vary depending on the file you are looking at. That is to say, a file containing EEG data will differ from one containing sensor array, sEEG, MEG or BOLD data.
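A quick way to see which datasets a given export contains is to walk the file and print each name, shape and type. Below is a minimal sketch in Python using the h5py library; the function name list_datasets and the file name in the usage note are made up for illustration. One thing to watch for: the shapes in this message are written MATLAB-style, and if they were read with MATLAB, h5py will most likely report the same dimensions in reverse order (for example (6000, 2, 63, 1) instead of [1×63×2×6000]), since the two tools use opposite axis conventions.

```python
import h5py


def list_datasets(path):
    """Return {dataset_path: (shape, dtype)} for every dataset in an HDF5 file."""
    found = {}

    def visit(name, obj):
        # visititems passes names relative to the root group, without the
        # leading '/', so it is added back here to match the paths in the text.
        if isinstance(obj, h5py.Dataset):
            found["/" + name] = (obj.shape, str(obj.dtype))

    with h5py.File(path, "r") as f:
        f.visititems(visit)
    return found
```

Calling list_datasets("eeg_export.h5") (a hypothetical file name) on an EEG export should then show entries such as '/data' and '/time' together with their dimensions.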

Within a given type of data, the subgroup names and structures appear consistent, at least for EEG and sensor data (the two I have worked with). I exported and inspected multiple files of each, taken from different simulations and different projects in TVB.

Breakdown of EEG data format:

  • '/data' contains an array of the structure [1×63×2×6000 double]. The first dimension is always 1. The second is the number of electrodes in the sensor array. The third is either 1 or 2: if it is 1, only the EEG data is present; if it is 2, metabolic data is also present and the EEG data comes first. The last is the number of time points. I sampled every millisecond, so the 6000 points in my data amount to 6 seconds of simulated recording.
  • '/time' contains an array of the structure [6000×1 double]. The first dimension is the number of time points; the second is always 1. For my data, the time starts at 0.5 and increases in steps of 1. I am not sure why it starts at 0.5, but since I set the sampling to 1 ms in the simulation and the step is what I would expect, I am guessing this is a quirk of the system that has to do with starting at time 0.
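To make the layout above concrete, here is a rough Python/NumPy sketch that takes the two arrays and returns an electrodes × time points EEG matrix plus the sampling step. It assumes the axis order exactly as I described it, (1, electrodes, variables, time points); the function name extract_eeg is only illustrative. If your reader hands you the axes reversed (for example (6000, 2, 63, 1)), calling np.transpose(data) first should restore the order assumed here.

```python
import numpy as np


def extract_eeg(data, time):
    """Split '/data' into an (electrodes x time points) EEG matrix.

    Assumes the axis order (1, electrodes, variables, time points).
    """
    data = np.asarray(data)
    time = np.asarray(time, dtype=float).ravel()   # (n_times, 1) -> (n_times,)

    if data.ndim != 4 or data.shape[0] != 1:
        raise ValueError(f"unexpected '/data' shape: {data.shape}")
    if data.shape[3] != time.size:
        raise ValueError("'/data' and '/time' disagree on the number of time points")

    # Index 0 on the third axis is the EEG signal, whether or not a second
    # (metabolic) variable is present.
    eeg = data[0, :, 0, :]

    # Sampling step in the same units as '/time' (milliseconds in my files).
    step = float(np.median(np.diff(time))) if time.size > 1 else float("nan")
    return eeg, time, step
```

With my files this should give a 63 × 6000 matrix, a time vector starting at 0.5, and a step of 1.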

Breakdown of the sensor montage data format:

  • '/labels' contains an array of the structure [5×63 char]. The first value is the number of characters in the labels and the second is the number of electrodes. The data contained in this will depend upon the montage uploaded to TVB by the user to run the simulation in the first place. The names of the electrodes may or may not be consistent with a standard format as this is dependent on what the user input originally.
  • '/locations' contains an array of the structure [3×63 double]. The first dimension should always be 3, as these are coordinates in 3D space ordered x, y, z. The second is again the number of electrodes. The space these coordinates are in (MNI, patient-specific, ...) again depends upon the user.
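As an illustration of how the two montage datasets fit together, the following Python/NumPy sketch turns the label array into a list of electrode names and pairs each one with its x, y, z coordinates. It assumes the layouts described above (characters × electrodes for the labels, 3 × electrodes for the locations) and also accepts a plain one-dimensional array of strings, since some readers may return the labels that way. The names decode_labels and build_montage are made up for this example.

```python
import numpy as np


def _to_str(value):
    # Elements may be bytes (HDF5 fixed-length strings) or ordinary strings.
    return value.decode() if isinstance(value, bytes) else str(value)


def decode_labels(labels):
    """Return a list of electrode names from the '/labels' array."""
    arr = np.asarray(labels)
    if arr.ndim == 2:
        # Character matrix (characters x electrodes): join each column.
        raw = ["".join(_to_str(ch) for ch in column) for column in arr.T]
    else:
        raw = [_to_str(item) for item in arr]
    # Short names are padded to the common width; drop spaces and nulls.
    return [name.strip(" \x00") for name in raw]


def build_montage(labels, locations):
    """Pair each electrode name with its (x, y, z) coordinates."""
    names = decode_labels(labels)
    xyz = np.asarray(locations, dtype=float)
    if xyz.shape != (3, len(names)):
        raise ValueError(f"expected a 3 x {len(names)} array, got {xyz.shape}")
    return [(name, tuple(float(v) for v in xyz[:, i]))
            for i, name in enumerate(names)]
```

A list of pairs is used rather than a dictionary so that the electrode order is kept and duplicate labels, if the user's montage has any, are not silently merged.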

The EEG data and the sensor data can only be exported separately. To my knowledge, the formats of each are consistent in their basic structure to the point where the import structure could be used by other TVB users working on other projects.

There is a Google group which serves as the forum for the TVB software and is open to join. The most responsive person on the forum who is directly involved with the software development is Marmaduke. Here is a link to the list of contributors for the software: https://www.thevirtualbrain.org/tvb/zwei/teamwork-contributors

Please let me know if you need anything else. I am new to TVB but will share all that I have learned thus far.

Best,
Zoe