Genomics in Drug Discovery


The evolution of the immune
system in chicken and higher
vertebrates
@ Organon, Oss
2005-09-20
Tim Hulsen
Biorange Project SP3.2.2
• Chicken immunosystem project is part of
WP1, “Translational Medicine through
Comparative Genomics and Integration”
• Partners:
– Animal Breeding and Genetics Group,
Wageningen UR (Prof. dr. Martien Groenen)
– Avian Cytokines Group, Institute for Animal
Health, Compton (UK) (Prof. dr. Pete Kaiser)
• Jack Leunissen (WUR) also part of WP1
M. Groenen: chicken sequencing
Kaiser: chicken immune system
Introduction
• Goal: gain insight into the recent
evolution of the immune system
• Use of a more distant species: chicken
(recently sequenced)
• Support by experimental data
Overview
1. Find IS-related proteins
2. Determine orthologies
3. Pfam annotation
4. Panther annotation
5. Zooming in
Step 1: Find IS-related proteins
• IRIS: “Immunogenetic Related
Information Source”
• number of immune genes: 1562 (out of 21389 in
LocusLink)
• percentage of genome related to immunity:
7.30%
• 1562 LocusLink proteins mapped to our Protein
World set: 1381 proteins
Step 1: Find IS-related proteins
• GO: Gene Ontology
• collaborative effort to address the need for
consistent descriptions of gene products in
different databases
• Checked human GO annotation for certain
terms: “immunology”, “cytokine”, etc.
• 1515 proteins in human Protein World set
Step 1: Find IS-related proteins
• Result:
– 1381 proteins through IRIS
– 1515 proteins through GO
– 1929 proteins total
– Overlap: 414 IRIS only, 967 IRIS & GO,
548 GO only
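The three overlap counts follow from the IRIS, GO and total counts by inclusion-exclusion. A minimal sketch of that arithmetic is given here; the function name `venn_counts` and its dictionary keys are illustrative only and do not come from the original analysis.

```python
def venn_counts(size_a: int, size_b: int, size_union: int) -> dict:
    """Split two overlapping sets into 'a only', 'both' and 'b only' counts.

    Uses inclusion-exclusion: |A and B| = |A| + |B| - |A or B|.
    """
    both = size_a + size_b - size_union
    return {"a_only": size_a - both, "both": both, "b_only": size_b - both}


# Counts from the slide: 1381 via IRIS, 1515 via GO, 1929 in total.
counts = venn_counts(1381, 1515, 1929)
print(counts)  # {'a_only': 414, 'both': 967, 'b_only': 548}
```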
Step 2: Determine orthologies
• Study evolution from chicken (Gg) to rat
(Rn), mouse (Mm) and human (Hs):
– Hs<->Mm
– Hs<->Rn
– Hs<->Gg
– Mm<->Rn
– Mm<->Gg
– Rn<->Gg
• Two methods: Best Bidirectional Hit (BBH)
and PhyloGenetic Tree (PGT)
Best Bidirectional Hit (BBH)
• Very easy and quick
• Human protein (1) → SW → best
hit in mouse/rat (2)
• Mouse/rat protein (2) → SW →
best hit in human (3)
• If 3 equals 1, the human and
mouse/rat protein are considered
to be orthologs
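The BBH rule described above can be sketched in a few lines. This is only an illustration under stated assumptions: the score dictionaries stand in for precomputed SW search results, the protein identifiers and scores are made up, and the function names `best_hits` and `bidirectional_best_hits` are hypothetical.

```python
def best_hits(scores):
    """Return the top-scoring subject for each query.

    `scores` maps (query, subject) pairs to a similarity score.
    On ties, the first pair encountered is kept.
    """
    best = {}
    for (query, subject), score in scores.items():
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def bidirectional_best_hits(a_vs_b, b_vs_a):
    """Pairs (a, b) where b is the best hit of a AND a is the best hit of b."""
    forward = best_hits(a_vs_b)
    backward = best_hits(b_vs_a)
    return sorted((a, b) for a, b in forward.items() if backward.get(b) == a)


# Made-up scores for three human and two mouse proteins.
human_vs_mouse = {
    ("HsA", "MmA"): 950.0, ("HsA", "MmB"): 400.0,
    ("HsB", "MmB"): 700.0, ("HsC", "MmB"): 300.0,
}
mouse_vs_human = {
    ("MmA", "HsA"): 940.0, ("MmB", "HsA"): 390.0,
    ("MmB", "HsB"): 710.0, ("MmB", "HsC"): 310.0,
}
orthologs = bidirectional_best_hits(human_vs_mouse, mouse_vs_human)
print(orthologs)  # [('HsA', 'MmA'), ('HsB', 'MmB')]
```

HsC is dropped because its best mouse hit, MmB, points back to HsB rather than to HsC. This also shows why BBH yields at most one ortholog per protein.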
Step 2: Determine orthologies
BBH      Hs     Mm     Rn     Gg
Hs     1929   1145   1046    704
Mm        -      -    863    345
Rn        -      -      -    488
Gg        -      -      -      -
PhyloGenetic Tree (PGT)
• Proteomes: human, mouse, rat, chicken
• Selection of homologs (Hs, Mm, Rn, Gg):
Z>20, RH>0.5*QL
• ~25,000 groups
• Alignments and trees
• Phylome
• Tree scanning
• Lists of Hs-Mm, Hs-Rn, Hs-Gg, Mm-Rn,
Mm-Gg and Rn-Gg pairs
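The homolog-selection thresholds (Z>20, RH>0.5*QL) amount to a simple filter over search hits. The sketch below assumes that RH is the length of the region of homology and QL the length of the query sequence; the slide does not spell these out. The `Hit` class, its field names and the example hits are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    query: str
    subject: str
    z_score: float
    homology_length: int  # RH: assumed to be the length of the homologous region
    query_length: int     # QL: assumed to be the length of the query sequence


def is_homolog(hit: Hit, min_z: float = 20.0, min_fraction: float = 0.5) -> bool:
    """Apply both thresholds from the slide: Z > 20 and RH > 0.5 * QL."""
    return (hit.z_score > min_z
            and hit.homology_length > min_fraction * hit.query_length)


# Made-up hits: only the first passes both thresholds.
hits = [
    Hit("HsA", "GgA", z_score=85.0, homology_length=310, query_length=400),
    Hit("HsA", "GgB", z_score=12.0, homology_length=350, query_length=400),
    Hit("HsA", "GgC", z_score=40.0, homology_length=120, query_length=400),
]
selected = [hit.subject for hit in hits if is_homolog(hit)]
print(selected)  # ['GgA']
```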
Step 2: Determine orthologies
PGT      Hs     Mm     Rn     Gg
Hs     1929   2301   1819   1129
Mm        -      -   2873   2087
Rn        -      -      -   2142
Gg        -      -      -      -
Step 3: Pfam annotation
• Pfam: “Protein families database of
alignments and HMMs”
• collection of protein families and domains
• Pfam contains multiple protein alignments and
profile-HMMs of these families
• 75% of protein sequences have at
least one match to Pfam
• 1700 IS-related proteins mapped to 584 Pfam
families (2814 mappings)
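The three numbers in the last bullet (proteins, families, mappings) are different counts over one list of protein-to-family assignments, since one protein can carry several domains. A minimal sketch with made-up identifiers follows; the function name `summarize_mappings` and the placeholder IDs are hypothetical and are not real Pfam accessions.

```python
def summarize_mappings(mappings):
    """Count distinct proteins, distinct families and distinct mappings.

    `mappings` is an iterable of (protein_id, family_id) pairs;
    duplicate pairs are counted once.
    """
    unique = set(mappings)
    proteins = {protein for protein, _ in unique}
    families = {family for _, family in unique}
    return len(proteins), len(families), len(unique)


# Placeholder data: P1 has two domains, P1 and P2 share a family.
toy_mappings = [
    ("P1", "FAM_A"), ("P1", "FAM_B"),
    ("P2", "FAM_A"),
    ("P3", "FAM_C"),
]
n_proteins, n_families, n_mappings = summarize_mappings(toy_mappings)
print(n_proteins, n_families, n_mappings)  # 3 3 4
```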
Step 3: Pfam annotation
BBH          Hs     Mm     Rn     Gg
Hs         1929   1145   1046    704
Mm            -      -    863    345
Rn            -      -      -    488
Gg            -      -      -      -

PGT          Hs     Mm     Rn     Gg
Hs         1929   2301   1819   1129
Mm            -      -   2873   2087
Rn            -      -      -   2142
Gg            -      -      -      -

BBH pfam     Hs     Mm     Rn     Gg
Hs         1776   1069    974    639
Mm            -      -    795    312
Rn            -      -      -    442
Gg            -      -      -      -

PGT pfam     Hs     Mm     Rn     Gg
Hs         1776   2135   1700   1040
Mm            -      -   2846   2070
Rn            -      -      -   2125
Gg            -      -      -      -
Step 3: Pfam annotation
• Pie chart legend: Hs, Hs+Mm, Hs+Mm+Rn,
Hs+Mm+Rn+Gg
• Pie chart values: 14%, 13%, 11%, 62%
Step 4: Panther annotation
• PANTHER: “Protein ANalysis
THrough Evolutionary Relationships”
• designed to classify proteins (and their genes) in
order to facilitate high-throughput analysis
• proteins have been classified according to
families and subfamilies, molecular functions,
biological processes, pathways
• contains over 6683 protein families, divided into
31,705 functionally distinct protein subfamilies
• 1872 IS-related proteins mapped to 970 Panther
families (4667 subfamilies, 14737 mappings)
Step 4: Panther annotation
BBH            Hs     Mm     Rn     Gg
Hs           1929   1145   1046    704
Mm              -      -    863    345
Rn              -      -      -    488
Gg              -      -      -      -

PGT            Hs     Mm     Rn     Gg
Hs           1929   2301   1819   1129
Mm              -      -   2873   2087
Rn              -      -      -   2142
Gg              -      -      -      -

BBH panther    Hs     Mm     Rn     Gg
Hs           1872   1125   1029    688
Mm              -      -    846    339
Rn              -      -      -    481
Gg              -      -      -      -

PGT panther    Hs     Mm     Rn     Gg
Hs           1872   2266   1793   1118
Mm              -      -   2729   1970
Rn              -      -      -   2008
Gg              -      -      -      -
Step 4: Panther annotation
• Pie chart legend: Hs, Hs+Mm, Hs+Mm+Rn,
Hs+Mm+Rn+Gg
• Pie chart values: 26%, 39%, 20%, 15%
Step 5: Zooming in
• Which families are ‘new’ in human?
• Which orthologs have a different domain
structure through evolution?
• Which human proteins don’t have
orthologs in the other species?
• Any other interesting stuff
Future directions
• Include paralogs in our analysis (makes
it possible to check which families exist
only in mouse/rat/chicken)
• Combine our findings with research at
WUR: synteny between human and
chicken
• Look at the ratio of non-synonymous to
synonymous substitutions (dN/dS)
Credits
• NV Organon:
– Peter Groenen
– Wilco Fleuren
• Wageningen UR:
– Martien Groenen
– Hindrik Kerstens
• Compton (UK):
– Pete Kaiser