I. Comparing genome sequences

Comparative Genomics II: Functional comparisons
Caterino and Hayes, 2007
Overview
I. Comparing genome sequences
• Concepts and terminology
• Methods
  - Whole-genome alignments
  - Quantifying evolutionary conservation (PhastCons, PhyloP, GERP)
  - Identifying conserved elements
• Utility and limitations of conservation
• Available datasets at UCSC
II. Comparative analyses of function
• Evolutionary dynamics of gene regulation
• Case studies
• Insights into regulatory variation within and across species
Functional variation within and among species
Human
Chimp
Rhesus
Mouse
Regulatory variation contributes to human phenotypic
variation
Modularity of developmental gene expression
[Figure panels: overall expression of gene A; limb TFs, gene A, limb; brain TFs, gene A, forebrain; neural TFs, gene A, neural tube]
Regulatory changes introduce variance without disrupting protein
Regulatory mutations affecting pleiotropic genes cause
discrete developmental changes
Lettice et al. Hum Mol Genet 12:1725 (2003)
Sagai et al. Development 132:797 (2005)
Patterns of selection on gene expression and regulation
Neutral
Constrained
Romero et al., Nat Rev Genet. 13:505 (2012)
Directional
Comparative approaches to identify conserved and
variant regulatory functions
Regulatory conservation
Regulatory rewiring
Visel and Pennacchio, Nat Genet 42:557
Genetic drivers of gene regulatory variation
Furey and Sethupathy, Science
Comparative analysis of ChIP-seq datasets
Human: H3K4me2, H3K27ac
Mouse: H3K4me2, H3K27ac
Compare TF binding, histone modifications,
DNase hypersensitivity
in equivalent tissues
Requires a statistical framework to
reliably quantify changes in
ChIP-seq signals
Issues in comparative functional genomics
•Input data are noisy: ChIP-seq and RNA-seq data are signal based and subject
to considerable experimental variation
•Using comparable biological states within and across species
(e.g., human liver vs. mouse liver) = variation across tissues?
•How do epigenetic states and gene expression diverge among
individuals and across species (Neutral? Constrained?)
•Can we identify variants or substitutions that drive regulatory
changes?
Science 328: 232 (2010)
•Targets:
RNA Polymerase II
NFkB
•10 human lymphoblastoid cell lines
3 major population groups: European, East Asian, Nigerian
9 females, 1 male
9 analyzed by HapMap and 1000 Genomes
Variation in TF binding is common
[Figure: NFkB and PolII; axes: fraction of regions bound, # individuals; pairwise difference in binding]
Science 342: 747 (2013)
•Targets:
RNA Polymerase II
H3K4me1, H3K4me3, H3K27ac, H3K27me3
DNase hypersensitivity
•10 human lymphoblastoid cell lines
1 population group (Nigerian)
All analyzed by HapMap and 1000 Genomes
Measuring allelic imbalance in histone modification profiles
[Figure: ChIP-seq reads mapped to the G allele vs. the T allele, showing allelic imbalance]
Need to map reads reliably to individual alleles
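As a minimal sketch of how allelic imbalance at a single site might be tested, the Python below applies an exact two-sided binomial test to per-allele read counts. The function name, the example counts, and the null proportion of 0.5 are illustrative assumptions, not the method of the cited study. The sketch also assumes reads have already been assigned to alleles without mapping bias, which is the hard part noted above.

```python
from math import comb

def allelic_imbalance_pvalue(ref_reads: int, alt_reads: int, p_null: float = 0.5) -> float:
    """Exact two-sided binomial test for allelic imbalance at one site (hypothetical helper)."""
    n = ref_reads + alt_reads
    if n == 0:
        return 1.0  # no reads, no evidence of imbalance
    # Probability of every possible first-allele read count under the null.
    pmf = [comb(n, k) * p_null**k * (1 - p_null) ** (n - k) for k in range(n + 1)]
    observed = pmf[ref_reads]
    # Sum the probabilities of all outcomes at least as unlikely as the observed one.
    p_value = sum(prob for prob in pmf if prob <= observed * (1 + 1e-9))
    return min(1.0, p_value)

# Hypothetical counts: 30 reads carry the G allele, 10 carry the T allele.
print(allelic_imbalance_pvalue(30, 10))
```

A small p-value suggests that the two alleles are not equally represented among the ChIP-seq reads. Testing many sites would additionally call for multiple-testing correction.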
Cis-quantitative trait loci
~1200 identified
Science 328: 1036 (2010)
•Targets:
CCAAT/enhancer binding protein α (CEBPA)
Hepatocyte nuclear factor 4α (HNF4A)
Essential for normal liver development and function
•Tissue:
Adult liver from 4 mammal species plus chicken
Lineage-specific gain and loss of CEBPA binding in liver
Lineage-specific: 0 bp overlap
in multiple species alignment
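The "0 bp overlap" criterion can be sketched as a simple interval check. The code below is an illustration under stated assumptions: peaks are (chrom, start, end) tuples with half-open coordinates, and peaks from the other species have already been projected into a shared coordinate system through the multiple-species alignment. The function names are hypothetical.

```python
def overlap_bp(a_start, a_end, b_start, b_end):
    """Number of bases shared by two half-open intervals [start, end)."""
    return max(0, min(a_end, b_end) - max(a_start, b_start))

def is_lineage_specific(peak, other_species_peaks):
    """True if a peak shares 0 bp with every peak projected from other species.

    Assumes all intervals are (chrom, start, end) tuples in one shared
    coordinate system derived from the multiple-species alignment.
    """
    chrom, start, end = peak
    return all(
        other_chrom != chrom or overlap_bp(start, end, o_start, o_end) == 0
        for other_chrom, o_start, o_end in other_species_peaks
    )

# Hypothetical example: one human peak compared with two projected mouse peaks.
human_peak = ("chr1", 1000, 1200)
mouse_peaks = [("chr1", 1200, 1400), ("chr2", 1000, 1200)]
print(is_lineage_specific(human_peak, mouse_peaks))
```

A linear scan is enough for illustration; genome-scale comparisons would normally use sorted intervals or an interval tree.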
Widespread variation in CEBPA binding in mammals
Cell 154: 530 (2013)
Single TF binding events may not indicate regulatory function
• Many TFs are present at high concentration in the nucleus
• TF motifs are abundant in the genome
• Single TF binding events may be incidental
[Figure label: Enhancer-associated histone modification]
Combinatorial TF binding events are more conserved
Many TF binding changes do not have obvious genetic causes
In mammalian liver:
Many TF binding changes do not have obvious genetic causes
In mouse liver:
Cell 154: 185 (2013)
Human
Rhesus
Mouse
Bud stage; digit
specification
Digit separation
Identifying human-lineage changes in promoter
and enhancer function
• Compare H3K27ac signal at orthologous sites
• ‘Stable marking’: 1.5-fold or less change in
H3K27ac among human, rhesus and mouse
• Human gain: require significant, reproducible
gain in human versus all 12 datasets in
rhesus and mouse
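The two categories above can be sketched in Python as follows. The slide does not state which significance test defines a human gain, so the sketch substitutes a plain fold threshold (`min_fold`), which is an assumption, as are the function names and the example signal values. Only the 1.5-fold bound for stable marking comes from the slide.

```python
def is_stably_marked(human, rhesus, mouse, max_fold=1.5):
    """True if mean H3K27ac signal differs by at most max_fold across species."""
    means = [sum(values) / len(values) for values in (human, rhesus, mouse)]
    if min(means) <= 0:
        return False  # fold change is undefined without signal in every species
    return max(means) / min(means) <= max_fold

def is_human_gain(human, rhesus, mouse, min_fold=1.5):
    """True if every human replicate exceeds every rhesus and mouse dataset.

    Assumption: a fold threshold stands in for the significance test,
    which the slide does not describe.
    """
    others = list(rhesus) + list(mouse)
    return min(human) > min_fold * max(others)

# Hypothetical normalized signals at one orthologous site (one value per dataset).
human, rhesus, mouse = [30.0, 32.0], [10.0, 11.0], [9.0, 12.0]
print(is_stably_marked(human, rhesus, mouse), is_human_gain(human, rhesus, mouse))
```

Requiring the weakest human replicate to exceed the strongest non-human dataset is one conservative reading of "gain in human versus all 12 datasets".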
Mapping active promoters and enhancers in human limb
ENCODE cell lines
H3K27ac
Gains in promoter and enhancer activity
• Bone morphogenesis
• Chondrogenesis
• Digit malformations in mouse
Human-specific H3K27ac marking correlates with changes in
enhancer function
Epigenetic signatures reflect tissue identity and
species relationships
H3K27ac signal in human and mouse
H3K27ac in human, rhesus, mouse
Primate
Mouse
Nature 478: 343 (2011)
• Human
• Chimpanzee
• Bonobo
• Gorilla
• Orangutan
• Macaque
• Mouse
• Opossum
• Platypus
• Chicken
• Custom gene models based on Ensembl + RNA-seq
• 5,636 1:1 orthologs in amniotes
• 13,277 1:1 orthologs in primates
• Only constitutive exons
Global patterns of gene expression differences
Gene expression recapitulates species phylogenies
Gene expression divergence rates are tissue-specific
testis
liver
brain
Gene expression divergence increases with evolutionary time
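One way to put a number on expression divergence between two species is 1 minus Spearman's rank correlation of expression levels across 1:1 orthologs. The slides do not say which metric was used, so this choice, the function names, and the example values below are assumptions made for illustration.

```python
def ranks(values):
    """Average ranks (1-based); tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result

def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

def expression_divergence(expr_a, expr_b):
    """1 - Spearman correlation between two species' ortholog expression vectors."""
    return 1.0 - pearson(ranks(expr_a), ranks(expr_b))

# Hypothetical expression levels for five 1:1 orthologs in one tissue.
human_liver = [5.1, 0.3, 12.0, 7.7, 2.2]
mouse_liver = [4.8, 0.9, 10.5, 2.0, 8.1]
print(expression_divergence(human_liver, mouse_liver))
```

Computing this value for species pairs at increasing phylogenetic distance, one tissue at a time, is one way to examine the trend stated above.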
Conservation of core organ functions restricts divergence
Summary
•Comparative functional genomics identifies regulatory differences
within and among species
•TF binding is variable within species and highly variable among
species
•Epigenetic comparisons provide more insight into biologically
relevant regulatory diversity and divergence
•Gene regulation and expression diverge with increasing
phylogenetic distance, mirroring the neutral expectation