Reference of the gain matrix - EEG OPENMEEG BEM in Brainstorm

Hello Francois,
Recently I noticed that in the bst_sourceimaging function, you apply the average reference to the EEG gain matrix just before running any of the source localization algorithms. Does that mean that the BEM model we generate in Brainstorm is reference-free and we have to convert it to average reference? Please correct me if I am wrong.
I read a few older posts on the Brainstorm forum concerning the average reference, but I am still not clear on what the currently available process in Brainstorm is. We no longer have the process that helped us convert our EEG data to average reference, so if our data are not average-referenced, what will the gain matrix reference be in such cases?
Please help me understand this.

Thanks

Hi Rasheeda,

The models in Brainstorm convert all the EEG/ECOG/sEEG recordings to an average reference, so you don't need to do it explicitly.
If you do want to apply an average reference montage to your recordings, use the process “Standardize > Apply montage”.
To learn how to use the new montage interface, please read the EEG/Epilepsy tutorial.

For questions on OpenMEEG, please contact the OpenMEEG developers:
http://openmeeg.github.io/

Cheers,
Francois

Thanks for your response.
Good to know that the recordings will be converted to average reference.
I generally do source analysis on average-referenced EEG recordings. But when I export my data and head model from Brainstorm, do I have to apply the average reference to the EEG head model before I run source localization?
I am asking because, when running source localization (wMNE, LORETA, etc.) within Brainstorm, there is a default step that applies the average reference to the head model for EEG/ECOG/sEEG in the bst_sourceimaging function.

Hi Rasheeda,

This depends on what you are doing.
The forward models (the .Gain fields) that are produced by Brainstorm and OpenMEEG are calculated for an arbitrary reference.
It doesn't matter much what reference you are using as long as you are using the same for the EEG recordings and the forward operator.
Before using this Gain matrix, you need to convert it to the same type of reference your recordings are using.

Because it's quite complicated in terms of interface to handle multiple reference systems (average / fixed / linked), we decided at some point to convert everything to an average reference. It means less flexibility in the computation model but prevents the recurring reference errors.
In the calculation of the inverse model, we replace the forward operator G with (VG), where V is the average reference operator. If S are the estimated sources and M the EEG measures, you have something like:
S = (G'G + lam.I)^-1 G'M
S = (G'V'VG + lam.I)^-1 G'V'M
S = (G'VG + lam.I)^-1 G'VM    (because V' = V and VV = V)
S = (K.V).M    with K = (G'VG + lam.I)^-1 G'

What is saved in the source files in Brainstorm is ImagingKernel=(KV).
Hence our inverse operator already contains the average reference operator embedded; there is no need to apply it manually to the recordings before the multiplication ImagingKernel*EEG.
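To make this concrete, here is a minimal NumPy sketch (not Brainstorm code; all sizes and names are illustrative) of the derivation: because the stored ImagingKernel is (K.V), applying it to the raw recordings gives the same sources as applying K to average-referenced recordings.

```python
import numpy as np

rng = np.random.default_rng(0)
n_chan, n_src, n_time = 8, 5, 20
lam = 0.1

G = rng.standard_normal((n_chan, n_src))   # forward operator, arbitrary reference
M = rng.standard_normal((n_chan, n_time))  # EEG measures

# Average reference operator: subtracts the channel mean; V' = V and VV = V.
V = np.eye(n_chan) - np.ones((n_chan, n_chan)) / n_chan

# Minimum-norm style inverse with V applied to the forward operator:
K = np.linalg.inv(G.T @ V @ G + lam * np.eye(n_src)) @ G.T
imaging_kernel = K @ V                     # what gets stored: (K.V)

S_raw = imaging_kernel @ M                 # kernel applied to the raw EEG
S_avg = K @ (V @ M)                        # K applied to average-referenced EEG
print(np.allclose(S_raw, S_avg))           # True
```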

If you use the Gain matrix directly for your own computations, you have to take this into consideration.
In Brainstorm you never have to explicitly convert the EEG to an average reference, but if you do some things manually, you may have to.

Hope this helps,
Francois

Thank you very much for the detailed explanation. Now I understand the process well.

Hi Rasheeda,

After double-checking what I was claiming in my previous post, I found a major bug in the calculation of the EEG sources (only in the case where there are bad channels).

The average reference operator has to be applied at the same time to the forward operator (Gain matrix) and to the noise covariance matrix.
This was no longer done properly in the code of bst_sourceimaging. The bug probably appeared a while ago and can considerably change the results in the case of noisy recordings.

It’s already fixed, please consider updating Brainstorm.

Cheers,
Francois

Hello Rasheeda, and Others interested in the algorithm used for EEG source modeling,

Thanks to questions from Rasheeda, and in timely parallel with my own modeling efforts, we tracked down a workflow problem in Brainstorm that was sometimes plaguing our new ability to source model EEG data while also allowing users to make their own custom EEG montages.

For those for whom this post is TL;DR, the IMPORTANT NOTE IS: all EEG source modelers should update to the latest Brainstorm, to correct a problem with the noise covariance matrix when some channels are marked "bad."

For those remaining, we wanted to take this opportunity to describe why average referencing always works in source modeling, and why we therefore ignore your specific montages.

Average Referencing

Unlike MEG, in EEG every "referential" channel is acquired with respect to some reference point, "R" say. (Note: Brainstorm at this time does not support source modeling of unique "differential" channels, where each channel may have a unique reference. We assume the common practice that scalp (or invasive) data are collected "referentially," i.e. all channels share a common reference point.) The problem is that the reference may be something we cannot model, such as ear clips, the mastoid, etc. In other cases, the reference channel is a scalp point that may not have been recorded or reported. Finally, as is the practice at my site, the reference may be one of the standard 10-20 channels (Pz), so that channel is recorded as a flat "zero" channel (since it is referenced to itself). We have a bookkeeping problem of knowing which reference a user employed and whether they included that channel in the imported data matrix "D". Fortunately, average referencing is an exact and convenient method to account for all of these reference approaches.

Consider three channels numbered 1, 2, 3, plus a fourth channel R, acquired as (1-R), (2-R), (3-R), and (R-R). In Case 1, the user gives us just the first three channels, i.e. we don't know or record R. We average them, A = ((1-R) + (2-R) + (3-R))/3 = (1 + 2 + 3)/3 - R, and then subtract A from each channel the user gave us to form the "average referenced" montage: (1 - (1 + 2 + 3)/3), (2 - (1 + 2 + 3)/3), (3 - (1 + 2 + 3)/3). In other words, R is gone from consideration.

In Case 2, the user gives us the explicit channel R as well (it's just all zeros in the channel). We still follow the same black-box approach, but now for four channels. The average reference is A = (1 + 2 + 3 + R)/4 - R, and the "average referenced" montage is (1 - (1 + 2 + 3 + R)/4), (2 - (1 + 2 + 3 + R)/4), (3 - (1 + 2 + 3 + R)/4), and (R - (1 + 2 + 3 + R)/4). Note that even the zero reference channel R now has an apparent signal in the average reference montage, which is perfectly to be expected; we still only have a rank-3 data matrix for the three channels of data provided, just now spread across four non-zero channels. The point is that the average reference approach is identical in either case: average all channels the user provided and subtract that average out.
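Both cases can be verified in a few lines of NumPy (a toy sketch, not Brainstorm code): whatever the reference potential R is, the average-referenced montage comes out the same, and including the flat R channel leaves the data rank at 3.

```python
import numpy as np

def avg_ref(d):
    """Subtract the mean across channels (rows) from every channel."""
    return d - d.mean(axis=0, keepdims=True)

rng = np.random.default_rng(1)
p = rng.standard_normal((3, 10))      # true potentials at channels 1, 2, 3

# Case 1: only (1-R), (2-R), (3-R) are given, for two different references R.
ref_a = avg_ref(p - 0.0)              # reference potential R = 0
ref_b = avg_ref(p - 5.0)              # reference potential R = 5
print(np.allclose(ref_a, ref_b))      # True: R is gone from consideration

# Case 2: the flat (R-R) channel is included as a fourth row of zeros.
case2 = np.vstack([p - 5.0, np.zeros((1, 10))])
print(np.linalg.matrix_rank(avg_ref(case2)))  # 3: still only rank-3 data
```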

Infinite Reference Model

Allow me to use my own notation. Now consider the general data equation

D = G Q + N

where D is the data matrix, say 23 channels of 10-20 EEG data for T seconds, G is the source model (gain matrix) for a dipole, Q is the dipole time series (3 x T), and N is the 23 x T matrix of unknown additive noise. G (23 x 3) is computed as the potential at the primary electrode position, for a dipole somewhere in the brain (e.g. on the cortex). What may seem strange to some users is that source modeling algorithms compute this potential with respect to a reference at "infinity," a physical impossibility, but mathematically quite tractable. To actually compute the true potential difference recorded in a channel and use it in a source model, we compute two such "infinity" electrode points and subtract them, such as (1-R) above, which neatly removes the impossible "infinity" reference from the model. But, as we noted above, we may not always know where the reference was, or we may not be able to compute the infinite reference model for it (we have no good source model for ear clips, for instance).

The average reference approach neatly solves this problem. Extending the toy case examples above, we can create a matrix W that applies the averaging to both the physical data matrix D and the infinite reference model G. For the three channel case, we see that W is

W = 
[    2/3   -1/3   -1/3
    -1/3    2/3   -1/3
    -1/3   -1/3    2/3]

If we apply this W to both D and to G, then we neatly remove the physically unspecified reference channel from D and simultaneously the physically impossible infinite reference from G, so that the equation is properly

WD = WG Q + WN

and we note that the weight matrix must also be applied to the noise matrix N. Thus we are able to use data with an arbitrary reference and a source model with an impossible infinite reference in the same modeling equation.
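For the curious, this W is just "identity minus the channel mean." A quick NumPy check (illustrative only) confirms that this construction reproduces the 3 x 3 matrix above, and that W is symmetric and idempotent:

```python
import numpy as np

n = 3
# Average reference montage: identity minus the averaging matrix.
W = np.eye(n) - np.ones((n, n)) / n
print(W)  # rows: [2/3 -1/3 -1/3], [-1/3 2/3 -1/3], [-1/3 -1/3 2/3]

# W is symmetric and idempotent (applying it twice changes nothing).
print(np.allclose(W, W.T), np.allclose(W @ W, W))  # True True
```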

Inverse Solution

For simplicity in notation, consider now the "perfect" modeling equation

D = G Q + N

where D has a known reference that has been exactly characterized in G, i.e. we're not concerned about unknown references, and G has been calculated with respect to the true reference, not infinity. The GLS (generalized least-squares) solution to this model is

Qhat = (G' C^-1 G)^-1 G' C^-1 D

where C^-1 is the inverse of the noise covariance matrix, and G' is the transpose of the gain matrix. We call this estimate the generalized least-squares solution, but if N is known to be Gaussian, then it is also known as the maximum likelihood (ML) solution.

We build the noise covariance matrix from other samples of signal-free data, either a "quiet" period of the recording or the pre-stimulus period; right-click on the data structure and select "Noise covariance" -> "Compute from recordings", for example. For a noise matrix N of Tn time samples, where Tn is not necessarily the same length as the data matrix but usually must span many tens of seconds, we can estimate C as

Chat = (N N')/(Tn-1),

but we will drop the "hat" for notational simplicity.
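For concreteness, a small NumPy sketch (names are illustrative, not Brainstorm code) of this covariance estimate: after removing each channel's mean from the baseline segment, it matches NumPy's own sample-covariance estimator.

```python
import numpy as np

rng = np.random.default_rng(2)
n_chan, Tn = 4, 5000
noise = rng.standard_normal((n_chan, Tn))   # "quiet" baseline segment
noise -= noise.mean(axis=1, keepdims=True)  # remove each channel's mean

# Chat = (N N') / (Tn - 1)
C_hat = (noise @ noise.T) / (Tn - 1)
print(np.allclose(C_hat, np.cov(noise)))    # True: matches np.cov
```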

Now if we apply an invertible W to this equation (note the technicality below), it does not change the GLS solution, provided we follow the same estimation procedure!

(WD) = (WG) Q + (WN)

yields

Qhat = (G'W' (W C W')^-1 WG)^-1 G'W' (W C W')^-1 (W D)

which simplifies to the same GLS solution as

Qhat = (G' C^-1 G)^-1 G' C^-1 D.

Thus it does not matter what W is applied, the GLS solution remains the same. Simply put, the montage is unimportant in source modeling!
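This invariance is easy to confirm numerically. The following NumPy sketch (illustrative only; a random square matrix is almost surely invertible) applies an arbitrary invertible montage W to G, C, and D and recovers the same GLS estimate:

```python
import numpy as np

rng = np.random.default_rng(3)
n_chan, n_src, T = 6, 3, 50
G = rng.standard_normal((n_chan, n_src))     # gain matrix
D = rng.standard_normal((n_chan, T))         # data matrix
A = rng.standard_normal((n_chan, n_chan))
C = A @ A.T + np.eye(n_chan)                 # a full-rank noise covariance
W = rng.standard_normal((n_chan, n_chan))    # arbitrary invertible montage

def gls(G, C, D):
    """Generalized least-squares: Qhat = (G' C^-1 G)^-1 G' C^-1 D."""
    Ci = np.linalg.inv(C)
    return np.linalg.inv(G.T @ Ci @ G) @ G.T @ Ci @ D

Q1 = gls(G, C, D)                            # original montage
Q2 = gls(W @ G, W @ C @ W.T, W @ D)          # after applying W everywhere
print(np.allclose(Q1, Q2))                   # True: the montage is unimportant
```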

Reduced Rank

But there's a technical catch in the above simplification we must now address: loss of rank. As the astute user will note, the average reference operator W is singular, losing a degree of freedom in the imported data. The inverses have to be replaced with pseudo-inverses, and (W W^-1) is no longer the identity; its replacement (W W^+) is an idempotent projection operator onto the reduced-rank space.

The concept, however, remains the same. Rather than burden this discussion with lots of pseudo-inverse symbols, allow me to simplify the discussion as follows. Assume again we have the perfect data equation,

D = G Q + N

where D explicitly includes the reference channel of all zeros, and G is appropriately calculated to include the explicit true reference, and N also includes this reference channel. Then the model is already rank-deficient, since one of the rows of this equation is already all zeros. Thus the GLS is properly

Qhat = (G' C^+ G)^-1 G' C^+ D.

We simply state that for the average reference W, whose inverse is now properly the pseudo-inverse W^+, when it is applied as

(WD) = (WG) Q + (WN)

then Qhat remains the same.

In Case 1, where the user gave no explicit reference channel (i.e. there is no row of zeros in the data or noise matrices), there is technically a loss of rank in the data matrix when we apply W; however, without explicit knowledge of the reference sensor, we have no ability to correctly calculate the source model G, so this loss of rank is necessary.
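A numerical sketch (NumPy, illustrative only) of the reduced-rank statement: with an explicit all-zero reference channel in G, N, and D, and pseudo-inverses in place of inverses, applying the singular average-reference W leaves Qhat unchanged.

```python
import numpy as np

rng = np.random.default_rng(4)
n_chan, n_src, T = 4, 2, 200
# Forward model, noise, and data computed relative to the explicit
# reference: the reference channel's row is identically zero.
G = np.vstack([rng.standard_normal((n_chan - 1, n_src)), np.zeros(n_src)])
N = np.vstack([rng.standard_normal((n_chan - 1, T)), np.zeros(T)])
Q_true = rng.standard_normal((n_src, T))
D = G @ Q_true + 0.1 * N
C = (N @ N.T) / (T - 1)                     # rank-deficient covariance

def gls_pinv(G, C, D):
    """GLS with the inverse replaced by the pseudo-inverse C^+."""
    Cp = np.linalg.pinv(C)
    return np.linalg.inv(G.T @ Cp @ G) @ G.T @ Cp @ D

W = np.eye(n_chan) - np.ones((n_chan, n_chan)) / n_chan  # singular montage
Q1 = gls_pinv(G, C, D)
Q2 = gls_pinv(W @ G, W @ C @ W.T, W @ D)
print(np.allclose(Q1, Q2))                  # True: Qhat remains the same
```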

Practical Implications

The Brainstorm user imports data matrix D, which we leave undisturbed in its "raw" state, i.e. we apply no montages. The user also calculates a noise covariance matrix C from this raw data, which again we store with NO montages applied.

For each EEG, SEEG, ECOG channel designated, an "infinite reference" forward model is built that considers only the single location of the primary channel.

When source modeling is initiated, all that matters to Brainstorm is the designation of "good channels." The average reference montage matrix W is built for the good channels only and applied to the current noise covariance matrix (which the user is free to change), to create a source "kernel"

K = (G'W' (W C W')^+ WG)^-1 G'W' (W C W')^+ W

which can then later be applied to the stored raw data D (note the W at the end). Hence many different kernels K can be built and stored for different combinations of good channels or noise covariances, yet all are applied to the same raw data D.
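This workflow can be sketched in NumPy as follows (all names and sizes are illustrative assumptions, not the actual Brainstorm implementation): raw data and covariance are stored unmontaged, and a kernel is built per set of good channels.

```python
import numpy as np

rng = np.random.default_rng(5)
n_chan, n_src, T = 8, 4, 100
G = rng.standard_normal((n_chan, n_src))  # "infinite reference" forward model
D = rng.standard_normal((n_chan, T))      # raw imported data, never montaged
noise = rng.standard_normal((n_chan, 3000))
C = np.cov(noise)                         # raw noise covariance, no montage

def make_kernel(G, C, good):
    """Kernel K = (G'W'(WCW')^+ WG)^-1 G'W'(WCW')^+ W over good channels."""
    n = good.sum()
    W = np.eye(n) - np.ones((n, n)) / n   # average reference, good chans only
    Gg, Cg = G[good], C[np.ix_(good, good)]
    WCWp = np.linalg.pinv(W @ Cg @ W.T)   # pseudo-inverse: W is singular
    A = Gg.T @ W.T @ WCWp
    return np.linalg.inv(A @ W @ Gg) @ A @ W

good = np.ones(n_chan, bool)
good[2] = False                           # one channel marked "bad"
K = make_kernel(G, C, good)
S = K @ D[good]                           # kernel applied to the raw data
print(S.shape)                            # (4, 100)
```

Because the kernel ends with W, it can be applied directly to the untouched raw data; a different set of bad channels simply yields a different stored K.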


Thanks, it’s very helpful!