name:opening

**Connectal Coding: Discovering the Mechanism Linking Cognitive Phenotypes to Individual Histories**
Joshua T. Vogelstein | {[BME](https://www.bme.jhu.edu/),[ICM](https://icm.jhu.edu/),[CIS](http://cis.jhu.edu/),[KNDI](http://kavlijhu.org/)}@[JHU](https://www.jhu.edu/)
.foot[[jovo@jhu.edu](mailto:jovo@jhu.edu) |
| [@neuro_data](https://twitter.com/neuro_data)]

---
class: center, middle

## .center[https://neurodata.io/graspy/]

---

## Outline

- Background
- Example Connectomes
- Connectal Coding
- Applications
- Discussion

---

## .k[Background]

---

### Two definitions

.r[Neural (activity) coding]: inferring the relationships between neural activity and stimuli or responses

--

.r[Connectal coding]: inferring the relationships between *neural circuit structure* and *individual histories*

--

- .r[Individual histories]: genetic, developmental, experiential
- .r[Neural circuit structure (aka, a connectome)]: a network of a brain

---

### Genotype, Phenotype, Connectotype

- .r[Phenotype]: a description of an individual's properties with regard to a phenomenon of interest
- .r[Genotype]: a set of genes and their variants that are associated with that phenotype
- .r[Connectotype]: a set of nodes, edges, and their properties that are associated with that phenotype

--

.center[Genotype --> Connectotype --> Phenotype]

**Connectotypes are the implementation-level mechanisms linking genotypes to phenotypes**

---

### Connectome in Literature

- Sporns et al. *PLoS CB* (2005)
- Hagmann (2005)
- PubMed (circa 2018): .r[3000+] hits.

---

### Connectome according to jovo

- .r[Network] of a brain, at a spatiotemporal precision & extent
- .r[Nodes] are distinct biophysical entities
- .r[Edges] are *structural* objects connecting a pair of nodes
- .r[Attributes] of the network, nodes, or edges are possible

--

- Example nodes: neurons, neural compartments, neural ensembles
- Example edges: synapses, gap junctions, fiber bundles

---

### Implications

- A brain could have many .r[different] connectomes, at different times and/or resolutions
- We measure properties of the brain to .r[estimate] connectomes
- Those measurements can be .r[structural] or .r[functional]
- Estimates are always .r[noisy]

--

### Some caveats

- connectomes are not comprehensive, unlike genomes
- nodes can be abstractions, eg, all neurons of type A
- attributes can be arbitrary, eg, distribution of synaptic vesicles
- connectomics is the study of neural circuitry

---

### Connectomes for Connectal Coding

Hypothesis generation, not hypothesis testing

---

## .k[Example Connectomes]

---

### C. elegans

- directed, multi

---

### Drosophila Mushroom Body

- directed

---

### Mouse Ex Vivo Diffusion MRI

- undirected

---

### Human MRI

- undirected, multi (semi-dense)

---

## .k[Connectal Coding]

---

### Connectome Analysis Styles

- bag of edges
- bag of features
- bag of parameters

---

### Bag of Edges

- treat each edge as independent
- does this completely ignore the graph structure of the data (hint: yes)?
- requires multiple hypothesis correction for valid tests
- does anybody know a good way to correct (hint: no)?
  - Benjamini-Hochberg (BH): way under-conservative (false positives)
  - Bonferroni: way over-conservative (false negatives)
  - Network-based statistics: no theoretical guarantees

---

### Bag of Features

- choose m features and compute them per node or graph
- how do I choose m? how do I choose which features (hint: arbitrary)?
- how many features are possible given a graph with n nodes (hint: many)?
- do these features characterize the brain (hint: no)?
- can we make causal claims using these features (hint: no)?
- least well known of the approaches

---

### Bag of Parameters

- build a statistical parametric model of the brain network
- does it treat edges independently (hint: no)?
- do we have ways of choosing the model and its complexity (hint: kind of)?
- do these models characterize the brain (hint: yup!)?
- can we make causal claims (hint: kind of)?

---

## What is a Code?

A code is a system of (potentially stochastic) rules that translate from one representation of information into another.

.pull-left[
Examples:

- Morse: one-to-one
- genetic: deterministic
- neural: stochastic
]

.pull-right[
]

---

## Statistical Coding

- Random variables: X, Y
- Distributions: X ~ P_X, Y ~ P_Y
- Conditionals: P[X | Y], P[Y | X]

Formally, codes are conditional distributions.

---

## Connectal Coding Model

Random variables of interest include:

- P: phenotypes
- C: connectomes
- G: genomes
- E: environments

---

## Abstract Connectome Codes

- Pr[C | G]: probability of a connectome, given a genome
- Pr[P | C]: probability of a phenotype, given a connectome
- Pr[P | C, G, E]: probability of a phenotype, given a connectome, genome, and environment

---

## .k[Statistical Models of Connectomes]

---

### Independent & Identical Edges

Erdős-Rényi (ER): akin to assuming a neuron's spike count is Poisson with a fixed rate.

- edges are binary
- all edges are independent
- all edges are sampled from an identical distribution
- $\Rightarrow$ only 1 parameter: probability of an edge

Notes

- directed vs. undirected
- loopy vs. no loops
- simplest random graph model
- lacks sufficient complexity/descriptive power for most questions

---

### Independent & Identical Edges

.r[Weighted] Erdős-Rényi: akin to a Poisson model with a bigger bin width.

- edges can take .r[any value]
- edges are independent
- edges are sampled from an identical distribution
- $\Rightarrow$ can still have only 1 parameter: expected weight of an edge

Notes

- directed vs. undirected
- loopy vs. no loops
- simplest .r[weighted] random graph model
- lacks sufficient complexity/descriptive power for most questions

---

### Independent & Identical Edges

.r[Zero-Inflated] Weighted Erdős-Rényi: akin to assuming a bursty neuron, modeling both the probability of a burst and the expected number of spikes in each burst.

- edges can take any value
- edges are independent
- edges are sampled from an identical distribution
- .r[2 parameters]: probability of an edge, and expected weight of an edge

Notes

- directed vs. undirected
- loopy vs. no loops
- simplest sparse weighted random graph model
- can provide a useful/interesting description of a connectome

---

### IIE Models of Connectomes

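A minimal sketch of fitting the two simplest IIE models, assuming the connectome is stored as a dense n x n adjacency matrix (a list of lists) with no self-loops. The function names `sample_er`, `fit_er`, and `fit_zero_inflated` are hypothetical and are not part of graspy; only the Python standard library is used.

```python
import random

def sample_er(n, p, rng):
    """Sample a directed, loopless binary ER graph as an n x n 0/1 matrix."""
    return [[1 if i != j and rng.random() < p else 0 for j in range(n)]
            for i in range(n)]

def fit_er(adj):
    """ER estimate of p: the fraction of off-diagonal entries that are edges."""
    n = len(adj)
    n_edges = sum(adj[i][j] for i in range(n) for j in range(n) if i != j)
    return n_edges / (n * (n - 1))

def fit_zero_inflated(weights):
    """Zero-inflated weighted ER summary: (prob of edge, mean weight of present edges).

    Assumption: an entry equal to 0 means "no edge".
    """
    n = len(weights)
    present = [weights[i][j] for i in range(n) for j in range(n)
               if i != j and weights[i][j] != 0]
    p_hat = len(present) / (n * (n - 1))
    mean_weight = sum(present) / len(present) if present else 0.0
    return p_hat, mean_weight

# Illustrative values only: a simulated graph, not a real connectome.
rng = random.Random(0)
A = sample_er(200, 0.1, rng)
p_hat = fit_er(A)

# A tiny hand-made weighted graph with two nonzero off-diagonal edges.
W = [[0, 2, 0],
     [0, 0, 4],
     [0, 0, 0]]
zi_p_hat, zi_mean_weight = fit_zero_inflated(W)
```

With one number (ER) or two numbers (zero-inflated weighted ER) per graph, connectomes can be compared directly, at the cost of ignoring all structure beyond density and weight.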
---

### Independent Edge Model

- edges are binary
- edges are independent
- edges are sampled from .r[different] distributions
- .r[n*n parameters]: probability of an edge between each pair
- P[ A(i,j) = 1 ] = p(i,j)

Notes

- same generalizations as above apply here as well
- n*n parameters is much larger than 1
- still ignores structure
- can't fit without lots of samples or further assumptions/restrictions

---

## Categorical Conditionally Independent Edge Models

Stochastic Block Model (SBM): akin to assuming neurons are in different states, each of which determines a Poisson rate.

- edges are binary
- edges are .r[conditionally] independent
- each node has a class assignment
- P[ A(i,j) = 1 ] = B(class i, class j)

Notes

- directed vs. undirected
- loopy vs. no loops
- simplest >2 parameter model

---

## Connectome SBMs

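A minimal sketch of the SBM as a generative model and of estimating its block matrix B, assuming the class assignment of every node is already known (in practice the classes usually have to be inferred). The names `sample_sbm` and `fit_sbm` are hypothetical and are not graspy functions.

```python
import random

def sample_sbm(labels, B, rng):
    """Sample a directed, loopless SBM: P[A(i,j) = 1] = B[class i][class j]."""
    n = len(labels)
    return [[1 if i != j and rng.random() < B[labels[i]][labels[j]] else 0
             for j in range(n)]
            for i in range(n)]

def fit_sbm(adj, labels, k):
    """Estimate B as the edge density of each (class a, class b) block."""
    edges = [[0] * k for _ in range(k)]   # observed edges per block
    pairs = [[0] * k for _ in range(k)]   # possible edges per block
    n = len(adj)
    for i in range(n):
        for j in range(n):
            if i == j:
                continue  # no self-loops
            a, b = labels[i], labels[j]
            edges[a][b] += adj[i][j]
            pairs[a][b] += 1
    return [[edges[a][b] / pairs[a][b] if pairs[a][b] else 0.0
             for b in range(k)]
            for a in range(k)]

# Illustrative values only: two classes of 100 nodes each.
rng = random.Random(0)
labels = [0] * 100 + [1] * 100
B = [[0.5, 0.1],
     [0.1, 0.3]]
A = sample_sbm(labels, B, rng)
B_hat = fit_sbm(A, labels, 2)
```

Conditioned on the labels, every edge is an independent coin flip, so each entry of B is estimated exactly as the single ER parameter would be, one block at a time.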
---

## Generalized SBMs

Weighted Stochastic Block Model (SBM)

- edges are .r[weighted]
- edges are conditionally independent
- each node has a class assignment
- E[ A(i,j) ] = B(class i, class j) is the expected weight of the connection

Notes

- directed vs. undirected
- loopy vs. no loops

---

## Generalized SBMs

Zero-Inflated Weighted Stochastic Block Model (SBM)

- edges are weighted
- edges are conditionally independent
- each node has a class assignment
- the distribution of A(i,j) is defined by a matrix of connection probabilities and a matrix of expected weights

Notes

- directed vs. undirected
- loopy vs. no loops

---

## Continuous Conditionally Independent Edge Models

Random Dot Product Graphs (RDPG): akin to latent state models in population coding

- edges are binary
- edges are conditionally independent
- each node has a .r[latent position in d dimensions]
- P[ A(i,j) = 1 ] = f(latent position i, latent position j)
- for example, P[ A(i,j) = 1 ] is the dot product of the latent positions

Notes

- directed vs. undirected
- no loops is ickier
- generalizes previous models

---

## Connectome RDPGs

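A minimal sketch of sampling an undirected, loopless RDPG from given latent positions, assuming every pairwise dot product lies in [0, 1]. The names `rdpg_probabilities` and `sample_rdpg` are hypothetical and are not graspy functions. Giving all nodes in a class the same latent position reproduces an SBM, which is one way to see that the RDPG generalizes the previous models.

```python
import random

def dot(x, y):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, y))

def rdpg_probabilities(X):
    """Edge probability matrix: P[i][j] = <x_i, x_j>."""
    n = len(X)
    P = [[dot(X[i], X[j]) for j in range(n)] for i in range(n)]
    if any(not 0.0 <= P[i][j] <= 1.0 for i in range(n) for j in range(n)):
        raise ValueError("latent positions must give dot products in [0, 1]")
    return P

def sample_rdpg(X, rng):
    """Sample a symmetric 0/1 adjacency matrix with a zero diagonal."""
    P = rdpg_probabilities(X)
    n = len(X)
    A = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            A[i][j] = A[j][i] = 1 if rng.random() < P[i][j] else 0
    return A

# Illustrative latent positions: two classes, one shared position per class.
# This special case is a 2-block SBM with B = [[0.40, 0.24], [0.24, 0.40]].
rng = random.Random(0)
u, v = (0.6, 0.2), (0.2, 0.6)
X = [u] * 100 + [v] * 100
P = rdpg_probabilities(X)
A = sample_rdpg(X, rng)

# Empirical edge density among the first class, for comparison with <u, u>.
within = sum(A[i][j] for i in range(100) for j in range(i + 1, 100)) / (100 * 99 / 2)
```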
---

## Generalized RDPG

- Weighted RDPG: edges have weights
- Zero-Inflated Weighted RDPG: edges have probabilities and expected weights

---

## Latent Structure Models

- Special case of RDPG, where latent positions are organized into .r[structures]
- Examples
  - each node class has a distribution of latent positions, eg, Gaussian
  - latent positions are hierarchical, eg, multiscale atlas
  - repeated motif, eg, cortical columns
  - latent positions are curved

---

## Drosophila Mushroom Bodies

---

## Population Graph Models

- Mixture of RDPG
- Joint Heterogeneous RDPG

---

## .k[Discussion]

---

## Summary and Next Steps

- Connectomes are the mechanistic link: .center[.r[genotype --> phenotype]]
- Extend ideas from coding theory to support these analyses
- Connectomic, genetic, and phenotypic data are available

---

### Acknowledgements

Eric Bridgeford
Ben Pedigo
Jaewon Chung
Carey Priebe
Randal Burns
Michael Miller
Daniel Tward
Vikram Chandrashekhar
Drishti Mannan
Jesse Patsolic
Benjamin Falk
Kwame Kutten
Eric Perlman
Alex Loftus
Brian Caffo
Minh Tang
Avanti Athreya
Vince Lyzinski
Daniel Sussman
Youngser Park
Cencheng Shen
Shangsi Wang
Tyler Tomita
James Brown
Disa Mhembere
Greg Kiar
Jeremias Sulam
♥, 🦁, 👪, 🌎, 🌌
---
class:center