Aims of the Biostatistics Core
Biostatistics Bioinformatics Core
Personnel
Elizabeth Garrett, PhD Biostatistician
Giovanni Parmigiani, PhD Biostatistician
Data analysis and System support staff
Hardware
DELL server; linux OS
Linux and Windows workstations
Software
GeneX Database; R-based analysis tools
Labs: Affy Suite, others TBA
Contact Information
Elizabeth S. Garrett
[email protected]
Suite 1103, 550 Building
410-614-2588
Giovanni Parmigiani
[email protected]
Suite 1103, 550 Building
410-614-3426
Aims of the Biostatistics Core
Specific Aim 1:
To provide biostatistical consultation and
support to projects in the program.
Special emphasis will be to assist in
visualization, analysis, quantitative
modeling and interpretation of results.
Specific Aim 2:
To help in identifying the appropriate data
structures; ensuring data quality and data
confidentiality; and developing efficient
data transferring and interfacing for data
analysis and data visualization under
different platforms.
Two important stages where we get involved
• Planning Stage:
Before the study: How can I best address my hypothesis using minimal resources to get maximal information?
– Experimental Design
• How many samples?
• How many replicates?
• Housekeeping genes?
• Dye swapping?
– What’s the big deal? You could spend a lot of time and money and still not be able to answer your questions because of experimental errors, etc.
• Analysis Stage:
After the study: Now that I have this enormous amount of data, how do I summarize it and answer my questions?
– Visualization
– Data Exploration
– Analytic Tools and Models
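As an illustration of the planning-stage question "How many replicates?", here is a minimal sketch of a textbook sample-size approximation for comparing two groups with a two-sided z-test. It is not the Core's own procedure. The function name and the inputs `delta` and `sigma` are hypothetical and would have to come from pilot data. For an array experiment with thousands of genes, `alpha` would normally be set much lower than 0.05 to allow for multiple testing.

```python
from math import ceil
from statistics import NormalDist


def replicates_per_group(delta, sigma, alpha=0.05, power=0.80):
    """Approximate replicates per group for a two-sided, two-sample z-test.

    delta: smallest difference in mean log expression worth detecting
           (hypothetical input, e.g. 1.0 for a two-fold change on a log2 scale)
    sigma: within-group standard deviation of log expression
           (hypothetical input, ideally estimated from pilot data)
    """
    if delta <= 0 or sigma <= 0:
        raise ValueError("delta and sigma must be positive")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_beta = NormalDist().inv_cdf(power)           # quantile for target power
    n = 2 * ((z_alpha + z_beta) * sigma / delta) ** 2
    return ceil(n)


# Assumed numbers for illustration only.
n_default = replicates_per_group(delta=1.0, sigma=0.7)
n_strict = replicates_per_group(delta=1.0, sigma=0.7, alpha=0.001)
```

Tightening `alpha` or shrinking `delta` raises the required number of replicates, which is why these choices are worth settling before any arrays are run.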
What we do
• One-on-one consultations with investigators for
planning experiments
• One-on-one consultations with investigators for
visualization, data exploration, and analysis.
• Tutorials for helping investigators use some of the
software for exploration and visualization
independently.
• Tutorials on basic statistical concepts, including
experimental design in gene expression studies
and basic analytic tools.
GeneX
• Web-based database, data mining,
and data analysis tool
• Supports
* multiple users
* multiple species
* multiple microarray platforms
Common Denominator for data analysis
GeneX Components
• Curation Tool (imports data)
• Database (OpenSource SQL)
• XML Data Exchange Protocol
• Query and analytic routines
-- mining
-- biostatistics in R
Analytical Tools and Applications Included
or Co-developed with GeneX
• Clustering
• Visualization
• Principal Component Analysis
and Multi-Dimensional Scaling
• Significance testing with R
• Integration with other databases
Regulation of extracellular matrix changes
and fibrosis in inflammatory bowel disease.
Shukti Chakravarti
Feng Wu
Department of Medicine
Johns Hopkins University
[Figure: TNBS-colon, Control vs. TNBS]
TNBS-induced colitis model
[Figure: timeline with TNBS dose time points (weeks) 0, 2, 4, 6, 8, 12; stages labeled disease initiation, inflammation, fibrosis]
Harvest
• RNA
• Protein
• Histology
• Intestinal fibroblasts
[Figure: activity versus time, with curves labeled inflammation and ECM/fibrosis]
Analysis Plan
• Expression estimates using dChip
• Additional normalization for scanner effect
• Two-level regression model
• Identification of reliably estimable time trends in gene expression
• Grouping genes by patterns
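The last two steps of the plan above can be illustrated with a deliberately simplified sketch. It fits an ordinary least-squares line to each gene separately and labels the gene by the sign of its slope when the slope is large relative to its standard error. This single-level fit is a stand-in, not the two-level regression model named on the slide. The gene names, expression values, the use of weeks 0 to 12 as sampling times, and the `min_ratio` cutoff are all hypothetical.

```python
from math import sqrt


def time_trend(times, values):
    """Ordinary least-squares fit of expression on time for one gene.

    Returns (slope, standard_error_of_slope).
    """
    n = len(times)
    if n != len(values) or n < 3:
        raise ValueError("need at least 3 paired observations")
    t_mean = sum(times) / n
    y_mean = sum(values) / n
    sxx = sum((t - t_mean) ** 2 for t in times)
    if sxx == 0:
        raise ValueError("time points must not all be equal")
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(times, values))
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean
    # Residual sum of squares around the fitted line.
    rss = sum((y - (intercept + slope * t)) ** 2 for t, y in zip(times, values))
    se = sqrt(rss / (n - 2) / sxx)
    return slope, se


def group_by_trend(times, expression, min_ratio=2.0):
    """Label each gene 'up', 'down', or 'flat' from |slope| / standard error."""
    groups = {}
    for gene, values in expression.items():
        slope, se = time_trend(times, values)
        reliable = (slope != 0) if se == 0 else abs(slope) / se >= min_ratio
        if not reliable:
            groups[gene] = "flat"
        else:
            groups[gene] = "up" if slope > 0 else "down"
    return groups


# Hypothetical log-scale expression values at assumed sampling weeks.
weeks = [0, 2, 4, 6, 8, 12]
expression = {
    "gene_a": [5.0, 5.4, 6.1, 6.3, 7.0, 8.1],
    "gene_b": [9.0, 8.5, 7.9, 7.6, 7.0, 6.0],
    "gene_c": [6.0, 6.3, 5.8, 6.1, 5.9, 6.2],
}
groups = group_by_trend(weeks, expression)
```

A two-level model would additionally share information across genes when estimating each slope. The per-gene fit here only shows what "reliably estimable time trend" could mean in the simplest case.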
Normalization
Empirical Bayes Ranking versus Statistical Significance
FDR < 1/2
P-value < .05
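To show why an FDR cutoff and a raw p-value cutoff can select different genes, here is a small sketch using the Benjamini-Hochberg adjustment. This is a swapped-in, widely used FDR procedure, not the empirical Bayes ranking referred to on the slide. The p-values are made up for illustration.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical per-gene p-values.
p = [0.001, 0.008, 0.039, 0.041, 0.60]
q = benjamini_hochberg(p)

by_p_value = [i for i, value in enumerate(p) if value < 0.05]
by_fdr = [i for i, value in enumerate(q) if value < 0.05]
```

With these made-up numbers, the raw p-value cutoff keeps more genes than the adjusted cutoff at the same threshold. The two criteria answer different questions, so the two gene lists need not agree.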
Patterns of gene expression over time
Red: positive slope, low FDR
Green: negative slope, low FDR
Orange and Brown: low p-value