CSCI 2951-N Overview

Published Genome-Wide Associations through 2011
1,617 published GWA at p ≤ 5×10⁻⁸ for 249 traits
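The 5×10⁻⁸ cutoff is conventionally read as a Bonferroni correction: a family-wise error rate of 0.05 spread over roughly one million independent tests. The sketch below only illustrates that arithmetic. The one-million figure is the usual rule of thumb and an assumption here, not a number stated on the slide, and the function names are hypothetical.

```python
# Bonferroni-style genome-wide significance threshold (illustrative sketch).
ALPHA = 0.05                      # desired family-wise error rate
N_INDEPENDENT_TESTS = 1_000_000   # assumed rule-of-thumb number of independent tests


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test p-value cutoff that keeps the family-wise error rate at or below alpha."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


def is_genome_wide_significant(p_value: float) -> bool:
    """Compare a single association p-value against the assumed cutoff."""
    return p_value <= bonferroni_threshold(ALPHA, N_INDEPENDENT_TESTS)
```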
The GWAS Human Genome
Autism marker
Multiple Sclerosis Marker
Genetic Heterogeneity
The Common Disease Common Variant (CDCV)
hypothesis is dead.
Long live the Common Disease Many Rare Variants
hypothesis!
The CDCV hypothesis's classical metaphor of “needles in
the haystack,” a few needles that all look alike in
a large haystack, now needs to be replaced with a van
Gogh-like drawing: many needles, each looking different
and private to its own area of the large haystack.
Vincent
The Missing Heritability Puzzle
Additivity of alleles? Just a convenient approximation,
friendly to “heritability” measured as a correlation
coefficient.
Ronald
Application Topics include
• haplotype phasing, linkage disequilibrium, tagging SNPs,
identical by descent (IBD), pedigrees, trios
• coalescent theory, Polya urn game, Ewens sampling
lemma, genome-wide graph theory algorithms
• the genetic heterogeneity problem, the missing
heritability problem
• statistical models of disease, association tests and
multiple hypothesis testing
• autism, multiple sclerosis, type 2 diabetes
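The urn and Ewens sampling topics in the list above can be illustrated with Hoppe's urn, a Polya-like scheme whose allele partitions follow the Ewens sampling formula. This is a minimal sketch under that assumption; the slides do not say which urn construction the course uses, and `theta` (the scaled mutation rate) and the function names are illustrative.

```python
import random


def hoppe_urn(n: int, theta: float, rng: random.Random) -> list:
    """Sample n allele labels from Hoppe's urn.

    When i alleles have already been drawn, the next draw is a brand-new
    allele with probability theta / (theta + i); otherwise it copies the
    label of a uniformly chosen earlier draw.
    """
    labels = []
    next_label = 0
    for i in range(n):
        # For i == 0 the probability is 1, so the first draw is always new.
        if rng.random() < theta / (theta + i):
            labels.append(next_label)
            next_label += 1
        else:
            labels.append(rng.choice(labels))
    return labels


def expected_num_alleles(n: int, theta: float) -> float:
    """Expected number of distinct alleles in a sample of size n."""
    return sum(theta / (theta + i) for i in range(n))
```

Averaging `len(set(hoppe_urn(n, theta, rng)))` over many runs should approach `expected_num_alleles(n, theta)`.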
Genomic Foundations
• Modeling and Measuring Evolution: Linkage Disequilibrium
(LD), Urn Models
• Genome-Wide Association Studies (GWAS): Statistical
associations, the missing heritability problem, genetic
heterogeneity, genomic privacy
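Linkage disequilibrium between two biallelic loci is commonly summarized by the coefficient D and the squared correlation r². The sketch below computes both from a haplotype frequency and two allele frequencies. The function and parameter names are illustrative, and the slides do not state which LD measure the course emphasizes.

```python
def ld_stats(p_ab: float, p_a: float, p_b: float):
    """Return (D, r2) for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A at locus 1 and B at locus 2
    p_a:  frequency of allele A at locus 1
    p_b:  frequency of allele B at locus 2
    """
    # D measures the departure of the haplotype frequency from independence.
    d = p_ab - p_a * p_b
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both loci must be polymorphic")
    # r2 is the squared correlation between the two allele indicators.
    r2 = d * d / denom
    return d, r2
```

Two loci whose alleles always travel together give r² = 1, while independent loci give D = 0 and r² = 0.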
Algorithms
• Maximum Likelihood and Expectation-Maximization Algorithms
Biological Problem: Inferring haplotype frequencies in populations.
• Set-cover and Minimum Informative Subset Algorithms
Biological Problem: Tagging SNPs selection, LD.
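The first pairing above, EM for haplotype frequencies, can be sketched for the smallest interesting case: two biallelic SNPs with unphased genotypes. This is an illustrative sketch, not the course's implementation. It assumes Hardy-Weinberg equilibrium, genotypes given as tuples of alternate-allele counts (0, 1, or 2 per SNP), and a fixed iteration count instead of a convergence test.

```python
HAPLOTYPES = [(0, 0), (0, 1), (1, 0), (1, 1)]


def compatible_pairs(genotype):
    """All unordered haplotype pairs whose allele counts add up to the genotype."""
    pairs = []
    for i, h1 in enumerate(HAPLOTYPES):
        for h2 in HAPLOTYPES[i:]:
            if (h1[0] + h2[0], h1[1] + h2[1]) == genotype:
                pairs.append((h1, h2))
    return pairs


def em_haplotype_frequencies(genotypes, n_iter=100):
    """Estimate the four haplotype frequencies from unphased two-SNP genotypes."""
    freqs = {h: 0.25 for h in HAPLOTYPES}  # uniform starting point
    for _ in range(n_iter):
        counts = {h: 0.0 for h in HAPLOTYPES}
        for g in genotypes:
            pairs = compatible_pairs(g)
            # E-step: probability of each phasing under Hardy-Weinberg
            # (factor 2 for a pair of distinct haplotypes).
            weights = [(1 if h1 == h2 else 2) * freqs[h1] * freqs[h2]
                       for h1, h2 in pairs]
            total = sum(weights)
            for (h1, h2), w in zip(pairs, weights):
                counts[h1] += w / total
                counts[h2] += w / total
        # M-step: expected haplotype counts divided by the number of chromosomes.
        n_chrom = 2 * len(genotypes)
        freqs = {h: c / n_chrom for h, c in counts.items()}
    return freqs
```

Only the double heterozygote `(1, 1)` is ambiguous here: it can be phased as (0,0)/(1,1) or as (0,1)/(1,0), and the E-step splits it between the two in proportion to the current frequency estimates.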
Algorithms
• Markov Chain Monte Carlo Algorithms
Biological Problem: Population Substructure
• Knapsack Algorithms and Statistical Hypothesis Testing,
The Neyman-Pearson Lemma, Multiple Testing
Biological Problem: Statistical Associations in GWAS
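A single-SNP association in a case-control GWAS is often tested with an allelic chi-square test on a 2×2 table of allele counts. The sketch below is one common choice, swapped in for illustration; the slides do not say which test statistic the course uses, and the function name is hypothetical. The p-value uses the identity that a chi-square variable with one degree of freedom has survival function erfc(√(x/2)).

```python
import math


def allelic_chi_square(case_alt, case_ref, control_alt, control_ref):
    """Pearson chi-square test (1 degree of freedom) on a 2x2 allele-count table.

    Returns (chi2, p_value).
    """
    a, b, c, d = case_alt, case_ref, control_alt, control_ref
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("every row and column of the table must be non-empty")
    chi2 = n * (a * d - b * c) ** 2 / denom
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value
```

In a genome-wide scan the resulting p-value would be compared against a multiple-testing-corrected threshold, not the usual 0.05.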
Algorithms
• Voting Theory Algorithms, von Neumann-Morgenstern Utility
Theory & The Social Network of Protein Folds
Biological Problem: The Protein Folding Problem
- Individual Preferences of Amino Acids and the
Thermodynamic "Social Choice" Hypothesis
- The Protein Folding Energy Function Inference Problem