Reading EDF files created from Brainstorm with MNE-Python and EEGLAB

Hi,

I exported several Brainvision files to EDF with Brainstorm, but some of the resulting EDF files fail to load in MNE-Python and in EEGLAB.

With MNE-Python, setting the encoding to "latin1" seems to work for the problematic files, but it is only a workaround and not ideal.

I added the error messages (from MNE-Python and from EEGLAB) at the end. Do you know what could cause this problem?

Just in case, here is a link to a Brainvision file that works and one that does not.

Thanks,
Corentin

Error messages:

With MNE-Python, I get this error:

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe9 in position 64: invalid continuation byte

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/mnt/3b5a15cf-20ff-4840-8d84-ddbd428344e9/ALAB1/corentin/projects/export_to_bids/code/read_data_file_with_mne.py", line 7, in <module>
    mne.io.read_raw_edf(data_file, verbose=True)
  File "/home/corentin/.local/lib/python3.8/site-packages/mne/io/edf/edf.py", line 1686, in read_raw_edf
    return RawEDF(
  File "<decorator-gen-251>", line 10, in __init__
  File "/home/corentin/.local/lib/python3.8/site-packages/mne/io/edf/edf.py", line 208, in __init__
    onset, duration, desc, ch_names = _read_annotations_edf(
  File "/home/corentin/.local/lib/python3.8/site-packages/mne/io/edf/edf.py", line 1952, in _read_annotations_edf
    raise Exception(
Exception: Encountered invalid byte in at least one annotations channel. You might want to try setting "encoding='latin1'".
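
For context, the "invalid continuation byte" message can be reproduced without any EDF file. The sketch below is only an illustration using a made-up annotation text ("Début"); it is not taken from the files in this thread. In UTF-8, byte 0xe9 announces a multi-byte sequence, so it is rejected when the next byte is an ordinary character, whereas in Latin-1 the same byte is simply "é".

```python
# Hypothetical annotation text containing an accented character,
# stored as single bytes (Latin-1), so "é" becomes the byte 0xe9.
raw = "Début".encode("latin1")  # b'D\xe9but'

# Decoding as UTF-8 fails: 0xe9 starts a multi-byte sequence,
# but the following byte ("b") is not a continuation byte.
try:
    raw.decode("utf-8")
    utf8_error = None
except UnicodeDecodeError as err:
    utf8_error = err

# Decoding as Latin-1 always succeeds, because every byte maps to a character.
latin1_text = raw.decode("latin1")

print(utf8_error)
print(latin1_text)
```

Because Latin-1 accepts any byte, decoding with it can hide the underlying problem instead of fixing it, which is why it is only a workaround.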

With EEGLAB, I get this error:

@CorentinLabelle, thank you for the heads-up.
We will have a look at it.

@CorentinLabelle, thank you for sharing the data.
The issue is now solved at: 6d8c21d


Details FYI: When exporting both the valid and the invalid examples, the EDF files were created with UTF-8 encoding. In an EDF file, annotations (events) are distributed across the records of the file (the number of records is determined by the recording length). The invalid example had fewer events, so some records (including the first one) did not contain annotations. When the EDF file was then reopened (to continue writing) without an explicit encoding, MATLAB's fopen() detected it as UTF-16LE and wrote with that encoding, which caused the bug. UTF-8 encoding is now explicit in Brainstorm when writing EDF files.
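
The mechanism described above can be sketched in Python as an analogy. This is not Brainstorm's MATLAB code: the file names, the "header;" placeholder and the "Début" text are invented for illustration, and the second-pass encoding is chosen by hand rather than auto-detected. The point is only that a file created as UTF-8 and then reopened for writing with a different encoding ends up with bytes a UTF-8 reader rejects.

```python
import os
import tempfile


def write_in_two_passes(path, second_pass_encoding):
    """Create a file as UTF-8, then reopen it and append with another encoding."""
    # First pass: create the file with UTF-8.
    with open(path, "w", encoding="utf-8", newline="") as f:
        f.write("header;")
    # Second pass: reopen to continue writing, using the given encoding.
    with open(path, "a", encoding=second_pass_encoding, newline="") as f:
        f.write("Début")


def can_read_as_utf8(path):
    """Return True if the whole file decodes cleanly as UTF-8."""
    with open(path, "rb") as f:
        data = f.read()
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


with tempfile.TemporaryDirectory() as tmp:
    good = os.path.join(tmp, "good.txt")
    bad = os.path.join(tmp, "bad.txt")
    write_in_two_passes(good, "utf-8")      # encoding explicit and consistent
    write_in_two_passes(bad, "utf-16-le")   # simulates the mismatched reopen
    good_ok = can_read_as_utf8(good)
    bad_ok = can_read_as_utf8(bad)

print(good_ok, bad_ok)
```

Passing the encoding explicitly every time the file is opened, as in the consistent case, mirrors the fix described above.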