= Tutorial 28: Connectivity =
'''[TUTORIAL UNDER DEVELOPMENT: NOT READY FOR PUBLIC USE] '''

''Authors: Hossein Shahabi, Mansoureh Fahimi, Francois Tadel, Esther Florin, Sergul Aydore, Syed Ashrafulla, Elizabeth Bock, Sylvain Baillet''

== Introduction ==
During the past few years, the research focus in brain imaging has moved from localizing functional regions to understanding how different regions interact. It is now widely accepted that some brain functions are not supported by isolated regions but rather by a dense network of nodes interacting in various ways.

The study of brain networks (connectivity) is a recently developed field of neuroscience that investigates interactions among brain regions. These networks can be identified using a wide range of connectivity measures applied to neurophysiological signals, in either the time or the frequency domain. This knowledge provides a more comprehensive view of brain functions and mechanisms.

This module of Brainstorm facilitates the computation of brain networks and the representation of their corresponding graphs. Figure 1 illustrates a general framework for analyzing brain networks. Preprocessing and source localization tasks for neural data are thoroughly described in the previous sections of this tutorial. The connectivity module is designed to carry out the remaining steps: the computation of connectivity measures, the statistical analysis, and the visualization of networks.

{{attachment:FlowChartGeneral.png||height="230",width="850"}}

== General terms/considerations for a connectivity analysis ==
'''Sensors vs sources: '''The connectivity analysis can be performed either on sensor data (such as EEG or MEG time series) or on reconstructed sources.

'''Nature of the signals: '''

'''Point-based connectivity vs. full network: '''Most connectivity functions give you the option to compute the connectivity either between one point (channel) and the rest of the network (1 x N) or across the entire network (N x N). While the latter computes the complete graph, the former is faster and is more useful when you are interested in the connectivity of one ROI with the other regions of the brain.
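The difference between the two modes can be sketched with plain correlation as the measure. This is only an illustration in Python/NumPy, not Brainstorm code: the function names `correlation_nxn` and `correlation_1xn` and the toy data are assumptions made for this example.

```python
import numpy as np

def correlation_nxn(data):
    """Full N x N correlation matrix for data of shape (n_channels, n_samples)."""
    return np.corrcoef(data)

def correlation_1xn(data, seed_index):
    """Correlation of one seed channel with every channel (1 x N)."""
    centered = data - data.mean(axis=1, keepdims=True)
    norms = np.linalg.norm(centered, axis=1)
    return (centered @ centered[seed_index]) / (norms * norms[seed_index])

# Toy data (hypothetical): 5 channels of white noise,
# with channel 1 partly driven by channel 0
rng = np.random.default_rng(0)
data = rng.standard_normal((5, 1000))
data[1] += 0.8 * data[0]

full_matrix = correlation_nxn(data)              # N * N values
seed_row = correlation_1xn(data, seed_index=0)   # only N values
```

The 1 x N result is simply one row of the N x N matrix, so it needs N products instead of N x N, which is where the saving comes from when only one seed is of interest.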

'''Temporal resolution: '''Connectivity networks can be computed in two ways: static and dynamic. In Table 1, metrics are classified based on this feature. Dynamic networks can capture the time-varying properties of the brain. In contrast, static graphs illustrate a general … which is also helpful in many conditions. The user needs to decide which type of network is more informative for their study.

'''Time-frequency transformation: '''

--(Consider how to choose window (length and overlap) depends on frequency bands )--

'''Output data structure:'''

__'' Consequently, computed connectivity matrices in this toolbox can have up to four dimensions; channels x channels x frequency bands x time. ''__

== Simulated data (AR model) ==
In order to compare different connectivity measures, we use simulated data with known ground truth. Three channels are

{{attachment:TransferMatrix2_AR3.png||height="400",width="550"}}
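As a rough illustration of how data with a known ground truth can be generated, the sketch below simulates three channels with a vector autoregressive (AR) model in Python/NumPy. The coefficients, the model order (2), and the function name `simulate_var` are hypothetical choices for this example; they are not the parameters used for the tutorial dataset. Only the sampling rate of 250 Hz is taken from this page.

```python
import numpy as np

def simulate_var(coefs, n_samples, noise_std=1.0, burn_in=500, seed=0):
    """Simulate x[t] = sum_k coefs[k] @ x[t-k-1] + noise[t].

    coefs has shape (order, n_channels, n_channels); coefs[k][i, j] is the
    influence of channel j at lag k+1 on channel i.
    """
    order, n_channels, _ = coefs.shape
    rng = np.random.default_rng(seed)
    total = n_samples + burn_in
    x = np.zeros((n_channels, total))
    noise = noise_std * rng.standard_normal((n_channels, total))
    for t in range(order, total):
        for k in range(order):
            x[:, t] += coefs[k] @ x[:, t - k - 1]
        x[:, t] += noise[:, t]
    return x[:, burn_in:]   # drop the initial transient

fs = 250.0                  # sampling rate in Hz
r, f0 = 0.9, 20.0           # hypothetical pole radius and peak frequency
coefs = np.zeros((2, 3, 3))
coefs[0, 0, 0] = 2 * r * np.cos(2 * np.pi * f0 / fs)  # channel 0: 20 Hz rhythm
coefs[1, 0, 0] = -r ** 2
coefs[0, 1, 1] = 0.5        # channel 1: own dynamics ...
coefs[0, 1, 0] = 0.4        # ... plus a directed influence from channel 0
coefs[0, 2, 2] = 0.5        # channel 2: independent of the others

sim = simulate_var(coefs, n_samples=5000)
```

With this design the only true interaction is from channel 0 to channel 1, so any measure that reports a connection involving channel 2 is producing a false positive.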

== Coherence (FFT-based) ==
 * Put the Simulated data in the Process1 tab.
 * Click on [Run] to open the Pipeline editor.
 * Run the process: '''Connectivity > Coherence NxN ''' <<BR>><<BR>>

{{attachment:StatCoherence_Process_ms1.PNG||height="400",width="350"}}

 * Set the options as follows:
  * '''Time window''': Select the entire signal.
  * '''Removing evoked response''': Check this box to remove the averaged evoked response from the individual trials.
  * '''Measure''': You can select either the "Magnitude-Squared" coherence or the "Imaginary" coherence. We first select the former.
  * '''Maximum frequency resolution''': This value sets the spacing between frequency bins. Smaller values give a finer frequency resolution, but the estimates are likely to be noisier, since fewer windows are then available for averaging.
  * '''Highest frequency of interest''': This specifies the highest frequency to analyze. Here, we select Fs/2 = 125 Hz to obtain the coherence at all frequencies.
  * '''Output configuration''': Select one file per input file.
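The two measures can be sketched from averaged windowed FFTs as follows. This is a minimal Python/NumPy illustration under stated assumptions, not Brainstorm's implementation: the function name `coherency` is hypothetical, the window length is assumed to be Fs divided by the frequency resolution, and the imaginary coherence is taken here as the imaginary part of the complex coherency, which may differ from the definition used by the process.

```python
import numpy as np

def coherency(x, y, fs, freq_resolution, overlap=0.5):
    """Complex coherency of two 1-D signals from averaged windowed FFTs."""
    seg_len = int(round(fs / freq_resolution))   # window length in samples
    step = max(1, int(seg_len * (1.0 - overlap)))
    window = np.hanning(seg_len)
    x = x - x.mean()
    y = y - y.mean()
    n_freqs = seg_len // 2 + 1
    sxx = np.zeros(n_freqs)
    syy = np.zeros(n_freqs)
    sxy = np.zeros(n_freqs, dtype=complex)
    for start in range(0, len(x) - seg_len + 1, step):
        fx = np.fft.rfft(window * x[start:start + seg_len])
        fy = np.fft.rfft(window * y[start:start + seg_len])
        sxx += np.abs(fx) ** 2          # auto-spectrum of x
        syy += np.abs(fy) ** 2          # auto-spectrum of y
        sxy += fx * np.conj(fy)         # cross-spectrum
    freqs = np.fft.rfftfreq(seg_len, d=1.0 / fs)
    return freqs, sxy / np.sqrt(sxx * syy)

# Toy signals (hypothetical): a shared 20 Hz rhythm, delayed by 3 samples in y
fs = 250.0
t = np.arange(int(40 * fs)) / fs
rng = np.random.default_rng(1)
x = np.sin(2 * np.pi * 20 * t) + rng.standard_normal(t.size)
y = np.sin(2 * np.pi * 20 * (t - 3 / fs)) + rng.standard_normal(t.size)

freqs, coh = coherency(x, y, fs, freq_resolution=2.0)
msc = np.abs(coh) ** 2      # magnitude-squared coherence, between 0 and 1
icoh = np.imag(coh)         # imaginary part, insensitive to zero-lag mixing
peak_freq = freqs[np.argmax(msc)]
```

Halving `freq_resolution` doubles the window length and roughly halves the number of windows that are averaged, which is the trade-off described for the "Maximum frequency resolution" option.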

In general, after running the connectivity processes, you can find a multi-dimensional matrix of connectivity in the database. In order to represent this matrix, there are several options.

Right-click on the file and select '''Power spectrum''' and '''Display as image'''. These two figures are plotted here. The right

{{attachment:rightMenuPlot.PNG||height="180",width="450"}} <<BR>><<BR>> {{attachment:StatCoherence-Results1.PNG||height="300",width="650"}} <<BR>><<BR>><<BR>>

Similarly, we can run this process again and select the "Imaginary" coherence, which gives us the following representation:

{{attachment:StatCoherence_Process_lc.PNG||height="400",width="350"}} <<BR>><<BR>> {{attachment:StatCoherence-Results_lc.PNG||height="300",width="650"}} <<BR>><<BR>>

== Granger Causality ==
Granger Causality (GC) is a method of functional connectivity, introduced by Clive Granger in the 1960s and later refined by John Geweke into the form that is used today. Granger Causality was originally formulated in economics but has caught the attention of the neuroscience community in recent years. Before this, neuroscience traditionally relied on stimulating or lesioning one part of the nervous system to study its effect on another part. Granger Causality made it possible to estimate the statistical influence without requiring direct intervention (ref: Wiener-Granger causality: a well-established methodology). <<BR>><<BR>> Granger Causality is a measure of linear dependence. It tests whether the error variance of a linear autoregressive model of a signal (A) is reduced when past values of a second signal (B) are added to the model. If this is true, signal B has a Granger causal effect on signal A, i.e., independent information in the past of B improves the prediction of A above and beyond the information contained in the past of A alone. The term independent is emphasized because it gives GC some interesting properties, such as invariance under rescaling of A and B, as well as under the addition of a multiple of A to B. The measure of Granger Causality is nonnegative, and zero when there is no Granger causality (Geweke, 1982).  <<BR>><<BR>> The main advantage of Granger Causality is that it is an asymmetrical measure, in that it can dissociate A->B from B->A. It is important to note, however, that although the directionality of Granger Causality is a step closer to measuring effective connectivity than symmetrical measures are, it should still not be confused with “true causality”. 
Effective connectivity estimates the effective mechanism generating the observed data (model-based approach), whereas GC is a measure of causal effect based on prediction, i.e., how much the model improves when interacting variables are taken into account (data-driven approach) (Barrett and Barnett, 2013). The difference with causality is best illustrated when more variables interact in a system than are considered in the model. For example, if a variable C is causing both A and B, but with a smaller delay for B than for A, then a GC measure between A and B would show a non-zero GC for B->A, even though B is not truly causing A (Bressler and Seth, 2011).
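The prediction-based definition above can be sketched for two signals in the time domain: fit A from its own past, fit A from the past of both A and B, and take the log ratio of the two error variances. This is a minimal Python/NumPy illustration, not Brainstorm's estimator; the function names and the toy signals are assumptions made for this example.

```python
import numpy as np

def lagged_design(signals, order):
    """Stack past values (lags 1..order) of each signal as regressors."""
    n = len(signals[0])
    columns = [s[order - k:n - k] for s in signals for k in range(1, order + 1)]
    return np.column_stack(columns)

def residual_variance(target, regressors):
    """Mean squared error of the least-squares fit of target on regressors."""
    coefs, *_ = np.linalg.lstsq(regressors, target, rcond=None)
    residuals = target - regressors @ coefs
    return np.mean(residuals ** 2)

def granger_causality(source, target, order):
    """GC from source to target: ln(restricted error / full error)."""
    source = source - source.mean()
    target = target - target.mean()
    future = target[order:]
    restricted = residual_variance(future, lagged_design([target], order))
    full = residual_variance(future, lagged_design([target, source], order))
    return np.log(restricted / full)

# Toy signals (hypothetical): b drives a with a one-sample delay
rng = np.random.default_rng(2)
n = 5000
a = np.zeros(n)
b = np.zeros(n)
for t in range(1, n):
    b[t] = 0.5 * b[t - 1] + rng.standard_normal()
    a[t] = 0.5 * a[t - 1] + 0.4 * b[t - 1] + rng.standard_normal()

gc_b_to_a = granger_causality(b, a, order=2)   # clearly above zero
gc_a_to_b = granger_causality(a, b, order=2)   # close to zero
```

Because the full model contains the restricted one, its error can never be larger, which is why the measure is nonnegative. The asymmetry between the two directions is what distinguishes GC from symmetrical measures such as correlation or coherence.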

{{attachment:StatCoherence_Process_lc.PNG||height="400",width="350"}} <<BR>><<BR>>

'''Input options:'''

 * '''Time window:''' specifies the time window you want to use for your model.
 * '''Remove evoked response from each trial:''' this option subtracts the average phase-locked activity (ERP) from each individual trial. Some studies measure the interdependency of ongoing brain activity after removing the average event-related potential from each trial. Some authors also recommend it because it helps meet the zero-mean stationarity requirement (it improves the stationarity of the system). However, the problem with this approach is that it does not account for trial-to-trial variability (for a discussion, see Wang et al., 2008).

'''Estimator options:'''

 * '''Model order:''' Selecting the model order is a critical issue and typically relies on criteria derived from information theory. Several criteria have been proposed, of which the most used are Akaike’s information criterion, the Bayesian (Schwarz) information criterion, and the Hannan-Quinn criterion (Koichi and Antonio, 2014). The quality of the model fit crucially depends on a proper choice of model order. Orders that are too low may miss necessary details, while orders that are too high tend to create spurious connectivity values. Note that in our simulated example, even though the simulation was created with an underlying model order of 4, a Granger model order of 6 gave decent connectivity results.
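The three criteria named above can be sketched as follows for a multivariate AR model. This is a Python/NumPy illustration using the standard textbook forms of the criteria, not a Brainstorm function; the names `fit_var_residual_cov` and `order_criteria`, and the toy data with a true order of 2, are assumptions made for this example.

```python
import numpy as np

def fit_var_residual_cov(data, order, start):
    """Least-squares VAR fit; returns the residual covariance matrix.

    data has shape (n_channels, n_samples); samples before `start` are only
    used as past values, so that all orders are compared on the same targets.
    """
    n_samples = data.shape[1]
    targets = data[:, start:].T
    lags = [data[:, start - k:n_samples - k].T for k in range(1, order + 1)]
    design = np.hstack(lags)
    coefs, *_ = np.linalg.lstsq(design, targets, rcond=None)
    residuals = targets - design @ coefs
    return residuals.T @ residuals / len(targets)

def order_criteria(data, max_order):
    """AIC, BIC and Hannan-Quinn values for orders 1..max_order."""
    n_channels, n_samples = data.shape
    n_eff = n_samples - max_order
    results = {}
    for order in range(1, max_order + 1):
        cov = fit_var_residual_cov(data, order, start=max_order)
        _, logdet = np.linalg.slogdet(cov)
        n_params = order * n_channels ** 2
        results[order] = {
            "aic": logdet + 2.0 * n_params / n_eff,
            "bic": logdet + np.log(n_eff) * n_params / n_eff,
            "hq": logdet + 2.0 * np.log(np.log(n_eff)) * n_params / n_eff,
        }
    return results

# Toy data (hypothetical): two channels generated by an order-2 model
rng = np.random.default_rng(3)
n_samples = 3000
data = np.zeros((2, n_samples))
noise = rng.standard_normal((2, n_samples))
for t in range(2, n_samples):
    data[0, t] = 1.577 * data[0, t - 1] - 0.81 * data[0, t - 2] + noise[0, t]
    data[1, t] = 0.5 * data[1, t - 1] + 0.4 * data[0, t - 1] + noise[1, t]

criteria = order_criteria(data, max_order=8)
best_bic = min(criteria, key=lambda p: criteria[p]["bic"])
```

Each criterion adds a penalty that grows with the number of coefficients to the log-determinant of the residual covariance; the order with the smallest value is retained. The penalties differ in strength, so the criteria do not always agree, and AIC in particular tends to favor higher orders.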

'''Output options:'''

 * '''Save individual results (one file per input file):''' option to save GC estimates on several files separately.
 * '''Concatenate input files before processing (one file):''' option to save GC estimates on several files as one concatenated matrix.

== Coherence and envelope (Hilbert/Morlet) ==
This process

{{attachment:FlowChartHCorr.png||height="170",width="850"}}

 * '''Input Options:''' The time range of the input signal can be specified here. Also, bad channels and the evoked response of trials can be discarded, if appropriate.
 * '''Time-frequency transformation method:''' Select the method for this transformation (Hilbert transform or Morlet wavelet). This analysis also needs further inputs, e.g. frequency ranges, number of bins, and Morlet parameters, which can be defined in an external panel as depicted in Figure 4 (by clicking on “Edit”). A complete description regarding time-frequency transformation can be found here. In the context of a connectivity study, we must analyze the complex output values of these functions, so the two other options (power and magnitude) are disabled at the bottom of this panel.
 * '''Signal splitting:''' This process can split the input data into several blocks for the time-frequency transformation and then merge them to build a single file. This feature saves a large amount of memory and, at the same time, avoids breaking a long recording into short signals, which would introduce inconsistencies in the dynamic network representation of spontaneous data. The maximum number of blocks that can be specified is 20.
 * '''Connectivity measure:''' Here, three major and widely used coherence-based measures of brain connectivity can be computed. Next, the desired windowing parameters, i.e. window length and overlap, should be set. Please note that these values are usually chosen based on the nature of the data, the purpose of the study, and the selected connectivity measure.
 * '''Parallel processing:''' This feature, which is only applicable to envelope correlation, uses the Parallel Computing Toolbox in Matlab to speed up the computation. As described in the advanced section of this tutorial, envelope correlation uses a pairwise orthogonalization approach to attenuate the cross-talk between signals. This process requires heavy computation, especially for a large number of channels. With the Parallel Computing Toolbox, however, the software distributes the calculations over several CPU threads. The maximum number of pools differs between computers and depends on the CPU.
 * '''Output configuration:''' Generally, the above calculation results in a 4-D matrix, whose dimensions represent channels (1st and 2nd dimensions), time points (3rd dimension), and frequency (4th dimension). When we analyze event-related data, we also have several files (trials). However, because of the poor signal-to-noise ratio of a single trial, an individual connectivity matrix for each trial is of little interest. Consequently, we need to average the connectivity matrices across all trials of a specific event. The second option in this section performs this averaging.
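The idea behind envelope correlation with pairwise orthogonalization can be sketched for one pair of narrow-band signals. This is a Python/NumPy illustration of one common formulation (keeping only the part of the second signal that is 90 degrees out of phase with the first), and it is not necessarily the exact computation performed by this process. The function names, the FFT-based band-pass filter, and the toy signals are assumptions made for this example.

```python
import numpy as np

def analytic_signal(x):
    """Analytic signal of a real 1-D signal, computed with the FFT."""
    n = len(x)
    weights = np.zeros(n)
    weights[0] = 1.0
    if n % 2 == 0:
        weights[n // 2] = 1.0
        weights[1:n // 2] = 2.0
    else:
        weights[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(np.fft.fft(x) * weights)

def envelope_correlation(x, y, orthogonalize=True):
    """Correlation between the envelope of x and the envelope of y."""
    ax, ay = analytic_signal(x), analytic_signal(y)
    env_x = np.abs(ax)
    if orthogonalize:
        # Remove from y everything that is in phase with x (zero-lag mixing)
        safe_env = np.maximum(env_x, np.finfo(float).eps)
        env_y = np.abs(np.imag(ay * np.conj(ax) / safe_env))
    else:
        env_y = np.abs(ay)
    return np.corrcoef(env_x, env_y)[0, 1]

def bandpass_noise(n_samples, fs, low, high, rng):
    """White noise restricted to [low, high] Hz by zeroing FFT bins."""
    spectrum = np.fft.rfft(rng.standard_normal(n_samples))
    freqs = np.fft.rfftfreq(n_samples, d=1.0 / fs)
    spectrum[(freqs < low) | (freqs > high)] = 0.0
    return np.fft.irfft(spectrum, n=n_samples)

# Toy signals (hypothetical): y is independent of x except for
# instantaneous leakage of x into y, as produced by volume conduction
fs = 250.0
n_samples = int(120 * fs)
rng = np.random.default_rng(4)
x = bandpass_noise(n_samples, fs, 8.0, 12.0, rng)
y = bandpass_noise(n_samples, fs, 8.0, 12.0, rng) + 1.5 * x

plain = envelope_correlation(x, y, orthogonalize=False)
orthogonalized = envelope_correlation(x, y, orthogonalize=True)
```

In this toy case the plain envelope correlation is high purely because of the leakage, while the orthogonalized version stays near zero. Since the orthogonalization has to be repeated for every pair of channels, the cost grows quickly with the number of channels, which is the motivation for the parallel processing option.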

== Simulated data (phase synchrony) ==
== Correlation ==
Correlation is the most basic approach to show the dependence or association between two random variables or MEG/EEG signals. While this method has been widely used in electrophysiology, it should not be considered the best technique for estimating connectivity matrices. By its nature, correlation fails to alleviate the problem of volume conduction and cannot describe the association in different frequency bands. However, it can still provide valuable information when we deal with a few narrow-band signals.

 * Put the Simulated data in the Process1 tab.
 * Click on [Run] to open the Pipeline editor.
 * Run the process: '''Connectivity > Correlation NxN ''' <<BR>>

{{attachment:StatCorrelation_Process.PNG||height="350",width="350"}}

 * Set the options as follows:
  * '''Time window''': Select the entire signal.
  * '''Estimator options''': leave the box unchecked so the means will be subtracted before computing the correlation.
  * '''Output configuration''': Select one file per input.

== Phase locking value ==
== Method selection and comparison ==
--(a)--

<<TAG(Advanced)>>

== Granger Causality - Mathematical Background ==
{{attachment:GC_Math_Time2.PNG||height="450",width="700"}}

<<BR>><<BR>> '''Practical issues about GC:'''

'''Temporal resolution:''' the high time resolution offered by MEG, EEG, and intracranial EEG allows for a very powerful application of GC and also offers the important advantage of spectral analysis. <<BR>><<BR>>

'''Stationarity:''' the GC methods described so far are all based on AR models, and therefore assume stationarity of the signal (constant auto-correlation over time). However, neuroscience data, especially task-based data such as event-related potentials, are mostly nonstationary. There are two possible approaches to solve this problem. The first is to apply methods such as differencing, filtering, and smoothing to make the data stationary (see a recommendation for time domain GC); however, dynamical changes in the connectivity profile cannot be detected with this approach. The second is to turn to versions of GC that have been adapted for nonstationary data, either by using a non-parametric estimation of GC or through measures of time-varying GC, which estimate dynamic parameters with adaptive or short-time window methods (Bressler and Seth, 2011). <<BR>><<BR>>

'''Number of variables:''' Granger causality is very time-consuming in the multivariate case with many variables (O(m^2), where m is the number of variables). Since each connection pair results in two values, there is also a large number of statistical comparisons that need to be controlled for. When performing GC in the spectral domain, this number increases even more, as statistical tests have to be performed per frequency. Therefore, it is usually recommended to select a limited number of ROIs or electrodes based on a hypothesis from the previous literature, or on some initial processing with a simpler and less computationally heavy measure of connectivity. <<BR>><<BR>>

'''Pre-processing:''' The influence of pre-processing steps such as filtering and smoothing on GC estimates is a crucial issue. Studies have generally suggested limiting filtering to artifact removal or to improving the stationarity of the data, and have cautioned against band-pass filtering to isolate causal influence within a specific frequency band (Barnett and Seth, 2011).  <<BR>><<BR>>

'''Volume Conduction:''' Granger causality can be computed either in the scalp domain or in the source domain. Although spectral-domain GC generally does not incorporate present values of the signals in the model, it is still not immune to spurious connectivity due to volume conduction (for a discussion, see Steen et al., 2016). Therefore, it is recommended to reduce the problem of signal mixing with additional processing steps, such as performing source localization and computing connectivity in the source domain. <<BR>><<BR>>

'''Data length:''' because of the number of parameters that need to be estimated, the number of data points should be sufficient for a good fit of the model. This is especially true for windowing approaches, where the data are cut into smaller epochs. A rule of thumb is that the number of estimated parameters should be several times (~10) smaller than the number of data points. <<BR>><<BR>>

== Additional documentation ==
==== References ====
==== Articles ====
 * '''Phase transfer entropy''': Lobier M, Siebenhühner F, Palva S, Palva JM [[http://www.sciencedirect.com/science/article/pii/S1053811913009191|Phase transfer entropy: A novel phase-based measure for directed connectivity in networks coupled by oscillatory interactions]], NeuroImage 2014, 85:853-872

==== Forum discussions ====
 * Forum: Connectivity matrix storage: [[http://neuroimage.usc.edu/forums/showthread.php?1796-How-the-Corr-matix-is-saved|http://neuroimage.usc.edu/forums/showthread.php?1796]]

 * Forum: Comparing coherence values: http://neuroimage.usc.edu/forums/showthread.php?1556

 * Forum: Reading NxN PLV matrix: http://neuroimage.usc.edu/forums/t/pte-how-is-the-connectivity-matrix-stored/4618/2

 * Forum: Scout function and connectivity: http://neuroimage.usc.edu/forums/showthread.php?2843

 * Forum: Unconstrained sources and connectivity: http://neuroimage.usc.edu/forums/t/problem-with-surfaces-vs-volumes/3261

 * Forum: Diagonal values: http://neuroimage.usc.edu/forums/t/choosing-scout-function-before-or-after/2454/2

<<HTML(<!-- END-PAGE -->)>>

<<EmbedContent("http://neuroimage.usc.edu/bst/get_prevnext.php?prev=Tutorials/GroupAnalysis&next=Tutorials/Scripting")>>

<<EmbedContent(http://neuroimage.usc.edu/bst/get_feedback.php?Tutorials/Connectivity)>>