Computational Skills Course
week 4
Mike Gilchrist
NIMR May-July 2011
WEEK FOUR
Individual project plans
Laurent
Veronique
Aims of the project:
- Compare the transcriptional profiles of forelimbs (FL) and hindlimbs (HL) over an embryonic time course:
- Identify additional candidate “limb-type modifiers”
- Compare the transcriptional profile dynamics of common forelimb/hindlimb GRNs
between FL and HL to establish a limb-type signature
RNA-seq: reads from Solexa
Alignment on the transcriptome
Normalization
Analysis of the transcriptional dynamics of chosen gene regulatory networks
[Figure: Shh and Ptch1 levels in forelimb vs hindlimb at time points 9.5, 10, 10.5 and 11]
Guilherme
Our lab generated a hypomorphic Lhx6 allele which expresses reduced levels of
mRNA. This allele specifically affects differentiation of a subset of cortical
interneurons. This results in the development of seizures. These unique mutants
allow the study of mechanisms of specification of cortical interneuron subtypes and
the generation of seizures.
Timeline:
From about E11.5: reduced Lhx6 levels
From E13.5: reduced Sst levels
P15: mRNA-Seq experiment
P40 - P60: physiological defects at inhibitory dendritic synapses
After P60: spontaneous seizures
mRNA-Seq samples (mRNA extracted from cortex):
+/-: controls (4)
-/-: nulls (3)
LacZ/-: hypomorphs (3)
Questions:
1) Molecular processes affected
2) Molecular markers for cell types affected
George
How do binding sites for the T-box transcription factor Brachyury change over
time during frog embryogenesis?
James
Data
• mRNA-seq in chick neural cells + and - Shh
• database of chick transcription factors (TFs)
• ChIP-seq analysis identifying binding sites of several TFs in
mouse neural cells responding to Shh
Analysis
• map mRNA-seq data to chick genome
• measure gene expression levels and identify differentially expressed genes
• identify subset of regulated genes that are TFs
• identify mouse orthologs of differentially expressed TFs (and genes)
• identify clusters of TFs binding near regulated genes
• ask whether there is (i) any enrichment for clusters of binding
sites near regulated genes; (ii) any correlation between combination of TFs
bound and type of regulation; (iii) predictive value in the ChIP-seq data for
the regulation of gene expression.
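Question (i) could be approached with a one-sided hypergeometric test, which is one common choice and not necessarily the one intended here. The sketch below asks how likely it is to see at least the observed number of regulated genes among the genes that have a binding-site cluster nearby, if clusters were placed without regard to regulation. The function name and all counts are hypothetical.

```python
from math import comb

def hypergeom_enrichment_p(n_genes, n_regulated, n_with_cluster, n_overlap):
    """One-sided hypergeometric tail probability P(X >= n_overlap).

    n_genes        : all genes considered
    n_regulated    : genes called as regulated
    n_with_cluster : genes with a binding-site cluster nearby
    n_overlap      : genes that are both regulated and have a cluster
    """
    total = comb(n_genes, n_with_cluster)
    upper = min(n_regulated, n_with_cluster)
    tail = sum(
        comb(n_regulated, k) * comb(n_genes - n_regulated, n_with_cluster - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total

# Hypothetical toy counts, chosen only to keep the arithmetic checkable by hand.
p_value = hypergeom_enrichment_p(n_genes=10, n_regulated=4,
                                 n_with_cluster=3, n_overlap=2)
print(p_value)
```

A small value would suggest that clusters sit near regulated genes more often than expected by chance. The choice of background gene set (all genes, or only expressed genes) changes the answer and would need to be decided explicitly.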
Siggi
The ‘PROJECT’ - Siggi Sato (Parasitology)
Gene expression in P. falciparum
Nuclear genome
Identify genetic elements determining the limit of the intron
Protein/RNA factors for splicing and controlling organelles
Organellar genomes (Plastid, Mitochondrion)
Identify genetic elements for replication and transcription
New anti-P. falciparum
Finding new substances and identifying their targets
Alaremycin (patent filed)
MRC-T “small molecules”
( Others? )
Ashleigh
The molecular regulation of IL-10 and IL-12 in innate cells: Investigating the differential
production of IL-10 and IL-12 in commonly used inbred mouse strains
• C57BL/6 and BALB/c macrophages produce reciprocal levels of IL-10 and IL-12 when stimulated with
bacterial products. This could influence their responses to infection. Other inbred mouse strains also
differentially regulate IL-10 and IL-12.
WHY?
• To complement wet-lab experiments, we would like to use re-sequencing data generated by ourselves
and by the Sanger Institute Mouse Genomes Project to home in on genetic differences in 5 different
mouse strains that could contribute to these phenotypes (starting with candidate loci based on our in
vitro studies).
• Key initial questions include:
o Are there differences (SNPs/deletions) in the IL-10/IL-12/type I IFN loci?
o Are these differences in regulatory elements (TF binding sites/3’UTR) or protein coding regions?
http://www.ensembl.org/Mus_musculus/Location/
Alex
Are there any genes in the Xenopus tropicalis genome that do not have a
corresponding EST in the Xenopus laevis database? And vice versa?
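One way to approach this question, assuming the gene sequences have already been compared against the EST set with a tool that writes tab-separated hits (query ID in the first column, as in BLAST tabular output), is a set difference between all gene IDs and the IDs with at least one hit. This is only a sketch: the function name, the IDs and the hit lines are hypothetical, and what counts as a "corresponding" EST (identity and coverage cutoffs) would have to be applied when producing the hit table.

```python
def ids_without_hits(all_ids, tabular_hit_lines):
    """Return the IDs from all_ids that never appear as a query in the hit table."""
    with_hits = set()
    for line in tabular_hit_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        with_hits.add(line.split("\t")[0])  # first column = query ID
    return sorted(set(all_ids) - with_hits)

# Hypothetical example data
tropicalis_genes = ["geneA", "geneB", "geneC"]
hits_vs_laevis_ests = [
    "geneA\tEST001\t98.5",
    "geneC\tEST417\t91.2",
    "geneA\tEST002\t95.0",
]

missing = ids_without_hits(tropicalis_genes, hits_vs_laevis_ests)
print(missing)
```

The reverse question can reuse the same function with the roles swapped: X. laevis EST IDs as `all_ids`, and a hit table from searching the ESTs against the X. tropicalis genes.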
Jose
DB IMGT
RF http://imgt.cines.fr/cgi-bin/IMGTlect.jv?query=5+AB019437
AC AB019437
SP Human
GL IGHV
GN V7-81
NA caggtgcagctggtgcagtctggccatgaggtgaagcagcctggggcctcagtgaaggtc
NA tcctgcaaggcttctggttacagtttcaccacctatggtatgaattgggtgccacaggcc
NA cctggacaagggcttgagtggatgggatggttcaacacctacactgggaacccaacatat
NA gcccagggcttcacaggacggtttgtcttctccatggacacctctgccagcacagcatac
NA ctgcagatcagcagcctaaaggctgaggacatggccatgtattactgtgcgagata
AA QVQLVQSGHEVKQPGASVKVSCKASGYSFTTYGMNWVPQAPGQGLEWMGWFNTYTGNPTY
AA AQGFTGRFVFSMDTSASTAYLQISSLKAEDMAMYYCAR
//
Flat file database of mouse and human sequences from databases IMGT, ABG,
NCBI and VBASE2 in EMBL format.
Load (how?) into MySQL (database design: tables and primary key?)
Remove redundancy in sequences but retain pointers to other fields.
Flexible querying, with output of different sets of sequences for e.g. BLAST searches.
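A minimal sketch of the loading step follows, under several assumptions: records use two-letter tags followed by a space, end with `//`, and repeat the NA/AA tags on continuation lines, as in the record shown. Python's built-in `sqlite3` stands in for MySQL so the example runs without a server; the table and column names are hypothetical, and the two toy records are invented. Redundancy is removed by storing each distinct nucleotide sequence once and letting every source entry point to it.

```python
import sqlite3

def parse_records(text):
    """Yield one dict per flat-file record; a record ends with '//'."""
    record = {}
    for line in text.splitlines():
        line = line.rstrip()
        if line == "//":
            if record:
                yield record
            record = {}
        elif line:
            tag, _, value = line.partition(" ")
            # NA and AA span several lines, so repeated tags are concatenated.
            record[tag] = record.get(tag, "") + value.strip()

def load(conn, records):
    """Store each distinct sequence once; entries keep a pointer to it."""
    conn.executescript("""
        CREATE TABLE IF NOT EXISTS sequence (
            seq_id INTEGER PRIMARY KEY,
            na     TEXT NOT NULL UNIQUE,
            aa     TEXT
        );
        CREATE TABLE IF NOT EXISTS entry (
            source_db TEXT NOT NULL,
            accession TEXT NOT NULL,
            species   TEXT,
            locus     TEXT,
            gene      TEXT,
            seq_id    INTEGER NOT NULL REFERENCES sequence(seq_id),
            PRIMARY KEY (source_db, accession)
        );
    """)
    for rec in records:
        conn.execute("INSERT OR IGNORE INTO sequence (na, aa) VALUES (?, ?)",
                     (rec["NA"], rec.get("AA")))
        seq_id = conn.execute("SELECT seq_id FROM sequence WHERE na = ?",
                              (rec["NA"],)).fetchone()[0]
        conn.execute("INSERT OR REPLACE INTO entry VALUES (?, ?, ?, ?, ?, ?)",
                     (rec["DB"], rec["AC"], rec.get("SP"), rec.get("GL"),
                      rec.get("GN"), seq_id))
    conn.commit()

# Two invented records from different source databases with the same sequence.
EXAMPLE = """\
DB IMGT
AC X0001
SP Human
GL IGHV
GN V-test
NA caggtgcag
NA ctggtgcag
AA QVQLVQ
//
DB VBASE2
AC Y0002
SP Human
GL IGHV
GN V-test
NA caggtgcagctggtgcag
AA QVQLVQ
//
"""

conn = sqlite3.connect(":memory:")
load(conn, parse_records(EXAMPLE))
n_sequences = conn.execute("SELECT COUNT(*) FROM sequence").fetchone()[0]
n_entries = conn.execute("SELECT COUNT(*) FROM entry").fetchone()[0]
print(n_sequences, n_entries)
```

With this layout, a non-redundant FASTA file for a BLAST search is a single `SELECT` on the `sequence` table, while a join with `entry` recovers every database and accession that shares a given sequence. In MySQL the `INSERT OR IGNORE` and `INSERT OR REPLACE` statements would need that dialect's equivalents.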
IL-10 Associated Histone Modification Pattern of T helper Subsets
Leona
10BiT (IL-10 reporter), Foxp3GFP, TCR7 Rag1-/-
FACS purify CD4+CD44loCD25-Foxp3GFP-10BiT- T cells from SPN
Culture with:
- HEL peptide
- DCs
- Skewing cytokines/blocking antibodies
FACS purify CD4+10BiT+ vs CD4+10BiT- T cells
ChIP-Seq: Histone Modification
Populations:
Naive: CD4+10BiT-
Th0: CD4+10BiT-
Th1: CD4+10BiT-, CD4+10BiT+
Th2: CD4+10BiT-, CD4+10BiT+
Th17: CD4+10BiT-, CD4+10BiT+
Treg: CD4+10BiT-, CD4+10BiT+
[Diagram labels (cytokines): IL-2; IL-12, aIL-4; IFN-g; IL-10; TGFb]
Marks profiled in each population: H3K4me3, H3K36me3, H3K27me3, H3K4me1, H3K27Ac
List of histone marks of all genes in the different subsets
Histone Modification Pattern Maps (10BiT+ vs 10BiT-)
[Diagram labels: IL-4, IL-10, IL-17, IL-10?; Tbet, Foxp3, Rorgt, Gata3]
Identify differences in histone patterns between IL-10 secreting vs non-secreting T helper cells
Compare histone patterns in the different T helper cell subsets; gaining insight into “housekeeping” vs activation vs lineage defining
Gene status          Histone pattern
Permissive TSS       H3K4me3
Transcribed gene     H3K4me3 + H3K36me3
Bivalent domain      H3K4me3 + H3K27me3
Repressed TSS        H3K27me3
Poised enhancer      H3K4me1
Active enhancer      H3K4me1 + H3K27Ac
Assign histone patterns to genes in the different T helper cell subsets
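The assignment step could be sketched as a lookup from the set of marks called at a gene to the status listed above. This assumes that peak calling has already reduced each gene in each population to a set of present marks, and that only exact combinations from the list are classified; both are simplifications. The function name and the example calls are hypothetical.

```python
# Exact mark combinations and the status they indicate.
PATTERN_TO_STATUS = {
    frozenset({"H3K4me3"}): "Permissive TSS",
    frozenset({"H3K4me3", "H3K36me3"}): "Transcribed gene",
    frozenset({"H3K4me3", "H3K27me3"}): "Bivalent domain",
    frozenset({"H3K27me3"}): "Repressed TSS",
    frozenset({"H3K4me1"}): "Poised enhancer",
    frozenset({"H3K4me1", "H3K27Ac"}): "Active enhancer",
}

def gene_status(marks):
    """Map a collection of histone marks to a status, or 'unclassified'."""
    return PATTERN_TO_STATUS.get(frozenset(marks), "unclassified")

# Hypothetical mark calls: population -> gene -> marks present.
calls = {
    "Th1 10BiT+": {"Il10": {"H3K4me3", "H3K36me3"}, "Gata3": {"H3K27me3"}},
    "Th1 10BiT-": {"Il10": {"H3K4me3", "H3K27me3"}, "Gata3": {"H3K27me3"}},
}

statuses = {
    population: {gene: gene_status(marks) for gene, marks in genes.items()}
    for population, genes in calls.items()
}
for population, genes in statuses.items():
    print(population, genes)
```

Genes whose status differs between the 10BiT+ and 10BiT- populations of the same subset are then the candidates for IL-10 associated patterns. Combinations outside the list are reported as unclassified and not forced into a category.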
Mustafa
Khokha Lab – Computational Goal
Define All Exons in X. tropicalis
How?
• Combine current gene models, transcriptome
assemblies, available and soon to be available
RNA-seq
Why?
• Genome sequence/annotation - imperfect
• Exon Capture Sequencing – mutant gene
identification
• Gap Capture – gene model improvements
• Analysis of RNA-seq data
When?
• Tomorrow would be good
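The "How?" step, combining exon evidence from several sources, could be sketched as an interval merge per scaffold and strand. This assumes every source has already been reduced to (scaffold, strand, start, end) tuples with 1-based inclusive coordinates; the function name and all coordinates are hypothetical. Merging overlapping exons gives the union of exonic sequence, which may suit capture design, but it discards alternative exon boundaries.

```python
from collections import defaultdict

def merge_exons(*sources):
    """Merge overlapping exon intervals pooled from several sources."""
    by_key = defaultdict(list)
    for source in sources:
        for scaffold, strand, start, end in source:
            by_key[(scaffold, strand)].append((start, end))

    merged = []
    for (scaffold, strand), intervals in sorted(by_key.items()):
        intervals.sort()
        cur_start, cur_end = intervals[0]
        for start, end in intervals[1:]:
            if start <= cur_end:
                # Overlaps the current interval: extend it.
                cur_end = max(cur_end, end)
            else:
                merged.append((scaffold, strand, cur_start, cur_end))
                cur_start, cur_end = start, end
        merged.append((scaffold, strand, cur_start, cur_end))
    return merged

# Hypothetical exon evidence from two sources.
gene_models = [("scaffold_1", "+", 100, 200), ("scaffold_1", "+", 500, 600)]
rnaseq_exons = [("scaffold_1", "+", 150, 250), ("scaffold_1", "+", 900, 950),
                ("scaffold_1", "-", 120, 180)]

exons = merge_exons(gene_models, rnaseq_exons)
for exon in exons:
    print(exon)
```

New sources, such as RNA-seq assemblies that become available later, can be passed as further arguments without changing the function.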
Mary
Madhu
One exercise I am trying now is to predict potential PfSUB2 substrates
(PfSUB2 is a subtilisin-like protease with relative sequence specificity
at the cleavage site) in the Plasmodium falciparum protein database. I
downloaded the predicted protein sequences for one chromosome
(Chr 13) and, using grep, detected 38 sequences. After editing the
output with sed to make it look like a FASTA file (with line numbers as
identifiers for each sequence), I queried it against the chr13 protein
database (blastp with max_target_seqs 1) to obtain accession
numbers.
I have yet to write or try any script.
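As a possible first script, the grep, sed and blastp steps could be folded into one pass that reads the FASTA file, joins wrapped sequence lines, and reports the identifier of every protein matching a motif. This is a sketch under assumptions: the motif below is a made-up placeholder and not the PfSUB2 cleavage-site pattern, and the protein records are invented.

```python
import re

def read_fasta(lines):
    """Yield (identifier, sequence) pairs, joining wrapped sequence lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:].split()[0], []
        elif line:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)

def find_motif(records, pattern):
    """Yield (identifier, 1-based position, matched text) for the first match."""
    regex = re.compile(pattern)
    for seq_id, seq in records:
        match = regex.search(seq)
        if match:
            yield seq_id, match.start() + 1, match.group()

# Placeholder motif for illustration only; replace with the real pattern.
PLACEHOLDER_MOTIF = r"K[ST].E"

# Invented protein records.
FASTA = """\
>prot1 hypothetical protein
MKAAKSAE
LLV
>prot2
MAAAK
TQEGG
>prot3
MGGGG
"""

hits = list(find_motif(read_fasta(FASTA.splitlines()), PLACEHOLDER_MOTIF))
for hit in hits:
    print(*hit, sep="\t")
```

Because the identifiers travel with the sequences, no BLAST step is needed to recover accession numbers. Joining the lines first also catches motifs that straddle a line break in a wrapped FASTA file, as in `prot2`, which a line-by-line grep would miss.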
What I would like to do: I have a Pfmsp7 knockout parasite line
which shows an invasion phenotype. Once the technical hurdles are
passed (like getting rid of rRNA sequences with polyA in them), we
want to analyse it by RNA-seq in order to identify genes that may have
been affected.