Introduction to BICFx

Outline
1. Goals
2. Structure
3. Services
4. Flagship Projects
BICF Goals
• To build a core facility open to the research
community at UTSW with recognizable impact
on the scope and quality of clinical and basic
cancer research.
• Major Funding Source: CPRIT Core Grant and
Cancer Center
BICF Structure
BICF Recruits
New Hires
• Brandi Cantarel, Ph.D., Computational Biologist III, started on
11/1/2015
• He Zhang, Ph.D., Computational Biologist III, started on 1/22/2016
• Xiu Luo, Ph.D., Data Scientist I, in process.
Open Positions
• Computational Biologist – OPEN, high activity
• Computational Biologist – OPEN, high activity
• Data Scientist – OPEN SOON
• Scientific Programmer – OPEN SOON
BICF Services
1. Curated Databases
2. Software Pipelines
3. Educational Programs
4. Helpdesk Consults
5. Fellows Program
6. Program Development
7. Collaborations
8. Flagship Projects
Public / Curated Databases
• Research data are increasingly available in the public domain
– Gene Expression Omnibus (GEO)
– The Cancer Genome Atlas (TCGA)
– Cancer Cell Line Encyclopedia (CCLE)
– Literature…
• Utilizing these data is challenging
– Scattered across different places and frequently updated
– Organized in different formats and described with different terminology
– Require specialized expertise to process
• BICF provides centralized infrastructure for public and curated
databases and intuitive-to-use tools to access these resources.
Next-generation Sequencing Analysis Pipelines
Next Step: Incorporate the pipeline into BioHPC cloud system
and set up educational programs for all researchers on campus.
Custom Software
Probemapper – EntrezToProbe engine for handling mappings between probes and
genes in microarray data.
MBCB – R package that provides model-based background correction incorporating
negative control beads in Illumina BeadArray data.
Pipeclip – Pipeline for identifying cross-linking sites in PAR-CLIP, HITS-CLIP, and iCLIP
data.
SbacHTS – R package for detecting and correcting spatial background noise in RNAi
screening experimental results.
BAYSIC – Variant integration tool that uses Bayesian posterior probabilities to
identify high-confidence variants from the combined predictions of multiple variant
calling tools.
Term2Gene – Online tool for identifying a list of genes associated with specific
diseases/biological pathways using PubMed's query term definitions.
Lung Cancer Explorer – Online tool that lets you explore and analyze gene expression
data from dozens of lung cancer datasets.
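The idea behind combining several variant callers with Bayesian posterior probabilities can be illustrated with a small sketch. This is not the BAYSIC implementation: it assumes that callers err independently and that each caller's sensitivity and false-positive rate are known, and the caller names, rates, and prior below are hypothetical values chosen only for illustration.

```python
from math import prod


def variant_posterior(calls, sensitivity, false_positive_rate, prior=0.01):
    """Posterior probability that a site is a true variant.

    calls: dict mapping caller name -> True if that caller reported the site
    sensitivity: dict mapping caller name -> assumed P(call | true variant)
    false_positive_rate: dict mapping caller name -> assumed P(call | no variant)
    prior: assumed prior probability that a candidate site is a true variant

    Assumption (not from the slides): callers make independent errors.
    """
    # Likelihood of the observed call pattern if the variant is real.
    like_true = prod(
        sensitivity[c] if called else 1.0 - sensitivity[c]
        for c, called in calls.items()
    )
    # Likelihood of the same pattern if there is no variant.
    like_false = prod(
        false_positive_rate[c] if called else 1.0 - false_positive_rate[c]
        for c, called in calls.items()
    )
    evidence = prior * like_true + (1.0 - prior) * like_false
    return prior * like_true / evidence


# Hypothetical callers and error rates, for illustration only.
callers = ["caller_a", "caller_b", "caller_c"]
sens = {c: 0.9 for c in callers}
fpr = {c: 0.01 for c in callers}

p_all = variant_posterior({c: True for c in callers}, sens, fpr)
p_one = variant_posterior(
    {"caller_a": True, "caller_b": False, "caller_c": False}, sens, fpr
)
```

Under these assumed rates, a site reported by all three callers receives a much higher posterior than a site reported by only one, which is the behavior a consensus approach is meant to capture.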
BICF Educational Programs
• Monthly Lectures – part of BioHPC training.
• Nano Courses – 2-day courses with lectures
and hands-on exercises.
Check the BioHPC Training Calendar.
BICF Upcoming Courses
Organizer: Dr. Brandi Cantarel <[email protected]>
• 03/23: Introduction to Statistical Testing: Going beyond the T-test to
pick the right test for your data analysis.
• 04/27: Sequence Homology and Alignments: Understanding how
BLAST and FASTA work in order to optimize your sequence
similarity searches.
• 05/25: Next Generation Sequence Technologies: From Sanger to
MinIONs, choosing the sequencing technology for your
experiments.
• 06/22: Introduction to Sequence Variation: SNVs, indels, and SVs,
predicting human variation in healthy vs. disease populations.
• 07/27: Introduction to ClipSeq Analysis: ClipSeq as a
method for generating genome-wide protein-RNA
interaction maps.
• 09/28: RNAseq Analysis: Gene expression profiling to
determine molecular functional differences in cells and
tissues.
• 11/16: Introduction to Microbiome ‘Omics Technologies:
What is a microbiome and how can it be studied?
BICF Help Desk
• Open to the entire UTSW research community.
– consultation re data access
– consultation re routine data analysis
– consultation re study design
– consultation re grant application
• Triage of proposals for collaboration.
• Triage of proposals for hourly consulting.
• Help desk hours:
Daily 10:00am – 11:00am
NB5.604
[email protected]
BICF Hourly Consulting
• Help Desk consults needing more than brief
interactions can be scheduled on a fee-for-service basis.
– more involved data access
– performance of simple data analysis
– more involved standard support in study design
– more involved standard support with grant
application
• Review / triage toward the program development and
collaboration level
BICF Fellows Program
• 5 - 10 slots available
• Postdoctoral fellows and graduate students engaged in
bioinformatics analysis move to BICF
– Technical supervision by trained staff
– Environment with quantitative thinking
– Peer-to-peer training among fellows
– Integration with the Department of Bioinformatics
• Scientific responsibility and financing remain with the
fellow’s lab
• Use of CPRIT funds
– Bridge funds for unbillable FTEs of staff working with
fellows
Program Development
• To provide bioinformatics expertise to newly
forming project teams.
– Attendance at project meetings
– Dedicated contributions to study design
– Acquisition of preliminary data
– Participation in grant application
• Application through helpdesk consultation.
• Prioritization by Steering Committee.
BICF Collaborations
• Provide cancer community with ‘bioinformatics personnel on demand’.
• Application through helpdesk consultation.
• Prioritization by Steering Committee.
– Staff stays located in BICF or moves temporarily to
project lab.
– FTEs paid by project lab.
Selected publications
1. Cell. 2012 May 11;149(4):768-779
2. Cell. 2013 Sep 12;154(6):1269-84
3. Cell. 2013 Aug 29;154(5):1085-99
4. Nature. 2011 Dec 1;480(7375):113-7
5. Nature. 2012 Jan 26;481:511-515
6. Nature. 2014 Aug 17. doi: 10.1038/nature13671
7. Science. 2012 Nov;338(6109):956-959
8. Science. 2014 Sep 5;345(6201):1139-45
9. Lancet. 2012 Mar 3;379(9818):785-7
10. Nature Biotechnology. 2014 Dec;32(12):1213-22
11. Nature Biotechnology. 2015 Aug 10. [Epub ahead of print]
12. Science Signaling. 2013 Oct 15;6(297):ra90
Flagship Projects
• Clinical Sequencing Project (Collaborate with Dr.
Jim Malter, Dr. Ward Wakeland)
• Lung Cancer Project (Collaborate with Dr. David
Gerber and John Minna)
• Kidney Cancer Project (Collaborate with Dr. Jim
Brugarolas)
• Clinical Database Developments
Flagship Projects
• Clinical Sequencing Project (Collaborate with Dr.
Jim Malter, Dr. Ward Wakeland)
– Set up the clinical test server with local backups
– Developed and tested the germline mutation calling
pipeline
– Developed the somatic mutation calling pipeline
(testing in progress)
– Developed a database and web portal for storing the
raw data and results
– Curating available databases and resources for clinically
actionable mutations
Flagship Projects
• Lung Cancer Project (Collaborate with Dr.
David Gerber and John Minna)
– Developed Lung Cancer Explorer for public data
– Team joined the IRB protocol for accessing UTSW
lung cancer patient and tumor sample data
– Working with John Minna and Adi Gazdar to develop
a lung cancer cell line database
Comprehensive database development: Lung Cancer Explorer
• Data curation and analysis
– Data curation and quality assessment
– Identification of appropriate datasets
– Alignment and integration
– Data processing
– Statistical analysis
• Analysis functions for database infrastructure
– Group comparison
– Survival analysis
– Correlation analysis
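As an illustration of the kind of survival analysis such a database could offer, the sketch below computes a Kaplan-Meier survival curve in plain Python. It is not code from Lung Cancer Explorer: the function name and the toy follow-up data are hypothetical, and a production tool would more likely rely on an established statistics package.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier estimate of the survival function.

    times: follow-up time for each patient
    events: 1 if the event (e.g. death) was observed, 0 if censored
    Returns a list of (time, survival_probability) pairs, one per
    distinct time at which at least one event occurred.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all patients who share the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            # Multiply by the fraction of at-risk patients surviving time t.
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        # Both events and censored patients leave the risk set after t.
        at_risk -= removed
    return curve


# Toy data (hypothetical): five patients, two of them censored.
curve = kaplan_meier([1, 2, 2, 3, 4], [1, 1, 0, 1, 0])
```

Censored patients contribute to the risk set up to their last follow-up time but never trigger a drop in the curve, which is why the final censored observation at time 4 adds no point to the result.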
Flagship Projects
• Kidney Cancer Project (Collaborate with Dr.
Jim Brugarolas)
– Working with IR, Jim Brugarolas, Payal Kapur and
others on clinical data curation
– Developing prototype database and web portal
– Transferred, stored, and analyzed large amounts of
sequencing data
Pilot Kidney Cancer Explorer Data
Kidney Cancer Program    Total    RNA-Seq    Mutation
Patients                 1402     95 [1]     101 [2]
Goal of the clinical database
• Secure Account System
• User-friendly Data Input and Search
• Track Account Login History
• Collaborators Online Record Tool
• Track Clinical Data Change History
Thank you