### Eliminating Accidental Deviations in Human Connectomics ![:scale 40%](images/neurodata_blue.png) Eric W. Bridgeford | [ericwb.me](http://ericwb.me) --- name:talk ### Outline - [Motivation](#defn) - [Quantifying Discriminability](#statistics) - [Real Data](#results) - [Discussion](#disc) ### [Additional Content](#extra) --- name:defn ### Outline - Motivation - [Quantifying Discriminability](#statistics) - [Real Data](#results) - [Discussion](#disc) ### [Additional Content](#extra) --- ### What is Reproducibility? - .ye[Reproducibility]: ability to replicate, or reproduce, a conclusion - serves as a "first-pass" check for scientific utility - currently in a "reproducibility crisis" --- ### How do we address the Reproducibility Crisis? - fix post hoc analyses (e.g., $p$-values)? - fix measurements (e.g., measurement reproducibility)? Proposal: design experiments to maximize .ye[inter-item discriminability], rather than simply checking reproducibility after conducting the experiment --- name:statistics ### Outline - [Motivation](#defn) - Quantifying Discriminability - [Real Data](#results) - [Discussion](#disc) ### [Additional Content](#extra) --- ### What do we want of our data? If we measure a sample multiple times, then each measurement of that sample is closer to all the other measurements of that sample, as compared to any of the measurements of other samples. ![:scale 100%](images/discr/perfect_discrim.png) Perfect discriminability --- ### What do we want of our data? Imperfect discriminability ![:scale 100%](images/discr/imperfect_discr.png) --- ### What do we want of our statistic? Discriminability is the probability of a measurement from the same item being closer than a measurement from a different item. ![:scale 100%](images/discr/imperfect_discr.png) --- ### Discriminability Statistic: Step 1 - Compute $N \times N$ pairwise distance matrix between all measurements ![:scale 55%](images/discr/imperfect_discr.png) ![:scale 42%](images/discr/dummy_sim_dmtx.png) --- ### Discriminability Statistic: Step 2 - For each measurement, identify which measurements are from the same individual (
green boxes) - let $\color{green}g$ be the total number of green boxes; here $\color{green}g = 20$
--- ### Discriminability Statistic: Step 3 - For each measurement, identify measurements from other individuals that are more similar than the measurement from the same individual (
orange boxes) - let $\color{orange}f$ be the total number of orange boxes; here $\color{orange}f = 84$
--- ### Discriminability Statistic - Discr = $1 - \frac{\color{orange}f}{N(N-1) - \color{green}g} = 1 - \frac{\color{orange}{84}}{20\cdot 19 - \color{green}{20}} \approx .77$
High discriminability: same-item measurements are more similar than across-item measurements --- ### Discriminability is Construct Valid
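---
### Discriminability Statistic: Sketch

A minimal numpy sketch of the counting on the preceding slides (two measurements per individual, Euclidean distances); this is illustrative only, not the released [python](https://github.com/neurodata/hyppo) or [R](https://github.com/neurodata/r-mgc) implementation:

```python
import numpy as np

def sample_discriminability(X, subjects):
    """Sample discriminability of measurements X (one row per measurement),
    given each row's subject label. Mirrors the slides' counting, which
    assumes two measurements per subject."""
    X, subjects = np.asarray(X, float), np.asarray(subjects)
    N = len(X)
    # Step 1: N x N pairwise distance matrix
    dists = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    same = subjects[:, None] == subjects[None, :]
    np.fill_diagonal(same, False)                 # drop self-comparisons

    f = g = 0
    for i in range(N):
        across = ~same[i]
        across[i] = False                         # other individuals' measurements only
        for j in np.where(same[i])[0]:            # Step 2: "green boxes"
            g += 1
            # Step 3: "orange boxes" closer than the within-subject measurement
            f += int(np.sum(dists[i, across] < dists[i, j]))
    return 1 - f / (N * (N - 1) - g)

# toy example: 10 individuals, 2 sessions each
rng = np.random.default_rng(0)
theta = rng.normal(size=(10, 5))                  # per-individual means
X = np.repeat(theta, 2, axis=0) + 0.3 * rng.normal(size=(20, 5))
print(sample_discriminability(X, np.repeat(np.arange(10), 2)))
```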
--- name:results ### Outline - [Motivation](#defn) - [Quantifying Discriminability](#statistics) - Real Data - [Discussion](#disc) ### [Additional Content](#extra) --- ### What data will we be using? - CoRR metadataset - $N>1,700$ individuals imaged across $26$ different datasets - anatomical MRI and fMRI scans for each - Individuals are measured at least twice --- ### Analysis Procedure Process each measurement using $192$ different pipelines 1. Brain alignment (ANTs/FSL) 2. Frequency filtering (Y/N) 3. Scrubbing (Y/N) 4. Global Signal Regression (Y/N) 5. Parcellation (4 options) 6. Rescaling connectomes (Raw, Log, Pass-to-Rank) $192 = 2 \times 2 \times 2 \times 2 \times 4 \times 3$ All options represent strategies experts consider useful --- ### Pipeline impacts discriminability
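---
### Pipeline Grid: Sketch

For concreteness, a short sketch of how the six options above multiply out to $192$ pipelines (the parcellation labels are placeholders, and the actual processing calls are omitted):

```python
from itertools import product

# placeholder labels for the six pre-processing choices listed above
options = {
    "alignment": ["ANTs", "FSL"],
    "frequency_filtering": ["Y", "N"],
    "scrubbing": ["Y", "N"],
    "global_signal_regression": ["Y", "N"],
    "parcellation": ["parc1", "parc2", "parc3", "parc4"],
    "rescaling": ["Raw", "Log", "Pass-to-Rank"],
}

pipelines = list(product(*options.values()))
print(len(pipelines))  # 2 * 2 * 2 * 2 * 4 * 3 = 192
```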
--- ### Marginally most discriminable options tend to be best global options
- Each point is the pairwise difference holding all other options fixed (e.g., FNNGCP - ANNGCP) - The marginally best pipeline (FNNGCP) is the second-best pipeline overall, and not much worse (2-sample test, p=.14) than the best pipeline, FNNNCP - We may not need to try every pre-processing strategy every time --- ### Selection via Discriminability improves inference For each pre-processing strategy, for each dataset, compute: 1. Within-dataset Discr. 2. Demographic effects (sex and age) within the dataset via Distance Correlation (DCorr) 3. Within a single dataset, regress demographic effect on Discr. (see the sketch below) Question: does higher discriminability tend to yield larger effects for known biological signals? --- ### Selection via Discriminability improves inference
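---
### Selection Analysis: Sketch

A schematic of step 3 above for a single dataset, assuming the per-strategy discriminabilities and DCorr effect sizes (steps 1 and 2) have already been computed; variable names are placeholders:

```python
import numpy as np

def effect_vs_discr_slope(discr_by_strategy, effect_by_strategy):
    """Regress the demographic effect size (e.g., DCorr for sex or age) on
    discriminability across the pre-processing strategies of one dataset."""
    x = np.asarray(discr_by_strategy, float)
    y = np.asarray(effect_by_strategy, float)
    slope, intercept = np.polyfit(x, y, deg=1)
    # a positive slope suggests more discriminable strategies tend to yield
    # larger effects for known biological signals
    return slope
```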
--- name:disc ### Outline - [Motivation](#defn) - [Quantifying Discriminability](#statistics) - [Real Data](#results) - Discussion ### [Additional Content](#extra) --- ### Contributions 1. Discriminability quantifies the contributions of systematic and accidental deviations 2. Provide theoretical motivation for discriminability in connection with predictive accuracy 3. Formalize tests for assessing and comparing discriminabilities within and between collection strategies 4. Illustrate the value of discriminability for neuroscience and genomics data (genomics not discussed here) 5. Code implementations in [python](https://github.com/neurodata/hyppo) and [R](https://github.com/neurodata/r-mgc) --- ### Acknowledgements
Josh Vogelstein
Shangsi Wang
Zhi Yang
Zeyi Wang
Ting Xu
Cameron Craddock
Jayanta Dey
Greg Kiar
William Gray-Roncal
Carlo Colantuoni
Christopher Douville
Stephanie Noble
Carey Priebe
Brian Caffo
Michael Milham
Xinian Zuo
- [BioRxiv manuscript](https://www.biorxiv.org/content/10.1101/802629v6) - Code implementations in [python](https://github.com/neurodata/hyppo) and [R](https://github.com/neurodata/r-mgc) --- name:extra ### [Outline](#talk) ### Additional Content - [Theory](#theory) - [Other Reproducibility Statistics](#other) - [Limitations](#limitations) - [Extension: Discriminability Decomposition](#extension) --- name:theory ### [Outline](#talk) ### Additional Content - Theory - [Other Reproducibility Statistics](#other) - [Limitations](#limitations) - [Extension: Discriminability Decomposition](#extension) --- ### Population Discriminability - population discriminability $D$ is a .ye[property of the distribution] of measurements $D = \mathbb P(\delta(x_i^k, x_i^{k'}) < \delta(x_i^k, x_j^{k''}))$ - Probability of within-individual measurements being more similar than between-individual measurements --- ### Discriminability: unbiased and consistent - Sample Discr. $=$ fraction of times $\delta(x_i^k, x_i^{k'}) < \delta(x_i^k, x_j^{k''})$ - $i, j = 1, ..., n$ and $i \neq j$ for $n$ individuals - $k, k', k'' = 1, ..., s$ and $k \neq k'$ for $s$ sessions - Sample Discr. is an .ye[unbiased estimator] of $D$ - Sample Discr. converges to $D$ (.ye[asymptotically consistent] in $n$) --- ### Connecting Discriminability to Downstream Inference #### Assumption: Multivariate Additive Noise Setting - $y_i \sim Bern(\pi)\;i.i.d.$, - $\theta_i \sim \mathcal N(\mu(y_i), \Sigma_t)\;ind.$, - (the individual means have a center which depends on the class) -- - $\epsilon_{i}^k \sim \mathcal N(c, \Sigma_e)\;i.i.d.$ and $ind.$ of $\theta_i$, - $x_{i}^k = \theta_i + \epsilon_i^k$. - (the measurements $x_i^k$ are normally dispersed about the individual means) --- ### Connecting Discriminability to Downstream Inference Suppose $(x_i^k, y_i)$ follow the Multivar. Additive Noise Setting, where $i=1, ..., n$ and $k=1,...,s$. #### Theorem 1 There exists an increasing function of $D$, $f(D)$, which provides a lower bound on the predictive accuracy of a subsequent classification task - $f(D) \leq A$, where $A$ is the Bayes Accuracy of the classification task #### Consequence - $D \uparrow \Rightarrow f(D) \uparrow$ --- #### Corollary 2 A strategy with a higher $D$ provably provides a higher lower bound on predictive accuracy than a strategy with a lower $D$ #### Consequence Suppose $D_1 < D_2$, then since $f$ is increasing, $f(D_1) < f(D_2)$ #### Implication We should use strategies with higher discriminability, as the worst-case guarantee for subsequent inference is better than that of a strategy with lower discriminability --- ### Simulation Setup ![:scale 70%](images/discr/sims_sim.png) --- ### Discriminability and Accuracy ![:scale 70%](images/discr/sims_acc.png) Discr. decreases proportionally with accuracy --- ### Are data discriminable? ![:scale 70%](images/discr/sims_os.png) --- ### Is one dataset more discriminable than another?
![:scale 70%](images/discr/sims_ts.png) --- name:other ### [Outline](#talk) ### Additional Content - [Theory](#theory) - Other Reproducibility Statistics - [Limitations](#limitations) - [Extension: Discriminability Decomposition](#extension) --- #### Intraclass Correlation Coefficient (ICC) - can be thought of as looking at the "relative size" of the within-group vs total variance - $y_i^k = \mu + \mu_i + \epsilon_i^k$ - let $\mu_i \sim \mathcal N(0, \sigma_b^2)$, and $\epsilon_i^k \sim \mathcal N(0, \sigma_e^2)$ - $ICC = \frac{\sigma_b^2}{\sigma_e^2 + \sigma_b^2}$ - $ICC \uparrow \Rightarrow$ between-group variance "contains" most of the total variance - negative ICC? possible under the mean squared error-based estimator --- #### Image Intraclass Correlation Coefficient (I2C2) - simplest "multivariate extension" of ICC - $y_i^k = \mu + \mu_i + \epsilon_i^k$ - let $\mu_i \sim \mathcal N(0, \Sigma_b)$ and $\epsilon_i^k \sim \mathcal N(0, \Sigma_e)$ - Wilks' $\Lambda = \frac{\det(\Sigma_b)}{\det(\Sigma_b) + \det(\Sigma_e)}$ - $I2C2 = \frac{tr(\Sigma_b)}{tr(\Sigma_b) + tr(\Sigma_e)}$ - "ratio of total variability accounted for between groups" - Why I2C2 over Wilks' $\Lambda$? Ease-of-use for high-dimensional data --- #### Fingerprinting Index (Finger.) - "greedy discriminability" - $Finger. = \mathbb P(\delta(x_i^1, x_i^2) < \delta(x_i^1, x_j^2) \;\forall\; i \neq j)$ - $\forall\; i \neq j$: this property must hold for every other measurement in the second session --- #### Distance Components (Kernel) - "non-parametric ANOVA" - total dispersion is the sum of between and within-sample dispersions ($B$ and $W$) - $DISCO = \frac{\frac{B}{n - 1}}{\frac{W}{n\cdot s - n}}$ - "pseudo F" statistic --- name:limitations ### [Outline](#talk) ### Additional Content - [Theory](#theory) - [Other Reproducibility Statistics](#other) - Limitations - [Extension: Discriminability Decomposition](#extension) --- ### Limitations - experimental design is not "one-size-fits-all" - Discriminability is not sufficient for practical utility - categorical covariates are meaningful but not discriminable - fingerprints are discriminable but not typically biologically useful - These statistics are not immune to sample characteristics - confounds such as age may inflate discriminability --- name:extension ### [Outline](#talk) ### Additional Content - [Theory](#theory) - [Other Reproducibility Statistics](#other) - [Limitations](#limitations) - Extension: Discriminability Decomposition --- ### Extension: Discriminability Decomposition #### Setting $(x_{i}^k, y_i)$, $i=1, ..., n$, $k=1,...,s$, $y_i \in \{1, ..., Y\}$ - each individual has an additional categorical covariate of interest, $y_i$, taking one of $Y$ possible values - Can the population discriminability be decomposed as a weighted sum of within- and between-group discriminabilities? --- ### Within-Group Discriminability - Let $D(y) = \mathbb P(\delta(x_i^k, x_i^{k'}) < \delta(x_i^k, x_j^{k''}) | y_i = y_j = y)$ - $D(y)$ is the group discriminability for group $y$ - "How discriminable are samples from group $y$?"
-- - Note that $W = \mathbb P(\delta(x_i^k, x_i^{k'}) < \delta(x_i^k, x_j^{k''}) | y_i = y_j) = \frac{\mathbb P(\delta(x_i^k, x_i^{k'}) < \delta(x_i^k, x_j^{k''}), y_i = y_j)}{\mathbb P(y_i = y_j)}$ by the definition of conditional probability -- - Let $w(y) = \mathbb P(y_i = y_j = y)$ denote the within-group weights - With $\omega = \sum_y w(y)$, then $W = \frac{1}{\omega}\sum_y w(y) D(y)$ is the within-group Discriminability --- ### Between-Group Discriminability - Let $D(y, y') = \mathbb P(\delta(x_i^k, x_i^{k'}) < \delta(x_i^k, x_j^{k''}) | y_i = y, y_j = y')$ - $D(y, y')$ is the between-group discriminability for groups $y$ and $y'$ - "How discriminable are samples from group $y$ vs group $y'$, and vice versa?" -- - Note that $B = \mathbb P(\delta(x_i^k, x_i^{k'}) < \delta(x_i^k, x_j^{k''}) | y_i \neq y_j) = \frac{\mathbb P(\delta(x_i^k, x_i^{k'}) < \delta(x_i^k, x_j^{k''}), y_i \neq y_j)}{\mathbb P(y_i \neq y_j)}$ by the definition of conditional probability -- - Let $b(y, y') = \mathbb P(y_i = y, y_j = y')$ denote the between-group weights - With $\beta = \sum_{y\neq y'} b(y,y')$, then $B = \frac{1}{\beta}\sum_{y \neq y'}b(y,y')D(y,y')$ is the between-group Discriminability --- ### Discriminability Decomposition - $D = \omega W + \beta B$ - Population discriminability is a weighted sum of the within- and between-group Discriminabilities - Can compare the within- and between-group discriminabilities - $\frac{W}{D}$: ratio of within-group Discriminability to pop. discriminability - $\frac{B}{D}$: ratio of between-group Discriminability to pop. discriminability - are certain groups more discriminable than others? - are certain between-group discriminabilities greater than others? - "ANOVA-esque" or "DISCO-esque"
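---
### Decomposition: Sketch

A minimal empirical check of $D = \omega W + \beta B$, assuming each within-vs-between comparison has been tabulated as an indicator outcome together with the group labels of the two individuals involved (toy helper, not part of the released code):

```python
import numpy as np

def decompose_discriminability(outcomes, y_i, y_j):
    """outcomes[c] in {0, 1}: whether delta(x_i^k, x_i^{k'}) < delta(x_i^k, x_j^{k''})
    held in comparison c; y_i[c], y_j[c]: the groups of the two individuals.
    Assumes both within- and between-group comparisons are present."""
    outcomes, y_i, y_j = map(np.asarray, (outcomes, y_i, y_j))
    same = y_i == y_j
    D = outcomes.mean()                        # population discriminability (empirical)
    omega, beta = same.mean(), (~same).mean()  # within / between weights
    W = outcomes[same].mean()                  # within-group discriminability
    B = outcomes[~same].mean()                 # between-group discriminability
    assert np.isclose(D, omega * W + beta * B)
    return D, W, B, omega, beta
```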