Building a Knowledge Discovery System

Shuang Liang ● Southern Medical University
"The real act of discovery consists not in finding new lands but in seeing with new eyes." - Marcel Proust
Seeing a picture is better than reading a thousand words
Uncover the hidden links
Bioinformatics
Data: Gene, Protein, miRNA
Knowledge: Disease, Symptom, SM Drug, TCM
Focus
Data Mining
Data Integration
Platform Building
Translational Research
[Diagram labels: omics, literature, HT data, annotation, structure, sequence, etc.; analytical pipeline, database, algorithm, prediction/analytical tools; mining clinical & HT data, TCM, records/images, literature, etc.; drug screening, target ID, medical diagnosis, prognosis]
Data Sources
Categories: Literature, HT data, Structure, Others
• PubMed
• KEGG
• iHOP
• DrugBank
• GO
• Locate
• OMIM
• MGI
• NCBI - GEO
• GenBank
• EBI – ArrayExpress
• Transfac
• NextGen Sequencing
• miRBase
• Connectivity Map
• InterPro
• GWAS/SNP/aCGH
• TarFisDock
Expression Profiling of TSG
Gene X: Tissue Specific
Gene Y: Tissue Selective
Tissue Types
TSG Mining
~130 Tissues, ~4000 Samples (GeneLogic + Novartis)
COXPRESdb
Novartis Human tissue compendium
liver
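The slides do not say which measure was used to call a gene tissue specific or tissue selective. As a minimal sketch, assuming the commonly used tau index as a stand-in, a specificity score over a gene's expression profile could look like the following. The function name and the expression values are hypothetical and chosen only for illustration.

```python
def tau_index(expression):
    """Tissue-specificity index tau for one gene (an assumed stand-in measure).

    expression: non-negative expression values, one per tissue.
    Returns a value in [0, 1]: 0 = uniform across tissues, 1 = one tissue only.
    """
    n = len(expression)
    if n < 2:
        raise ValueError("need at least two tissues")
    peak = max(expression)
    if peak <= 0:
        return 0.0  # not expressed anywhere; treat as non-specific
    # Average shortfall of each tissue relative to the peak tissue.
    return sum(1 - x / peak for x in expression) / (n - 1)


# Made-up profiles over four tissues, not data from the slides.
specific_profile = [0.0, 0.0, 12.0, 0.0]   # expressed in a single tissue
selective_profile = [1.0, 1.0, 10.0, 8.0]  # elevated in a few tissues
uniform_profile = [5.0, 5.0, 5.0, 5.0]     # same level everywhere

tau_specific = tau_index(specific_profile)
tau_selective = tau_index(selective_profile)
tau_uniform = tau_index(uniform_profile)
```

Under this assumption, a cutoff on tau would separate tissue-specific from tissue-selective genes; any such cutoff would be a tuning choice, not something stated in the slides.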
Making the connection
Drug-TSG
Disease-TSG
Disease-Drug
TCM: symptom, component, prescription
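One way to read this slide is that Disease-Drug links are inferred through TSGs shared by a disease and a drug. The sketch below illustrates that idea under that assumption. All identifiers and the function name are hypothetical placeholders, not KDS data or the authors' implementation.

```python
from collections import defaultdict

# Toy relationships; identifiers are placeholders, not KDS content.
disease_tsg = {
    "disease_A": {"TSG1", "TSG2", "TSG3"},
    "disease_B": {"TSG4"},
}
drug_tsg = {
    "drug_1": {"TSG2", "TSG3"},
    "drug_2": {"TSG4", "TSG5"},
    "drug_3": {"TSG6"},
}


def link_diseases_to_drugs(disease_tsg, drug_tsg):
    """Return {disease: {drug: shared TSGs}} for pairs sharing at least one TSG."""
    links = defaultdict(dict)
    for disease, disease_genes in disease_tsg.items():
        for drug, drug_genes in drug_tsg.items():
            shared = disease_genes & drug_genes  # set intersection
            if shared:
                links[disease][drug] = shared
    return dict(links)


links = link_diseases_to_drugs(disease_tsg, drug_tsg)
```

In practice the size of the shared set would be weighed against chance, for example with an enrichment p-value, before a link is reported.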
Data Type                           Number
Tissue/Cell types                   127
TSGs                                3960
TSG - Disease relationships         5672
TSG - Drug relationships            2171
TSG - Subcellular Localization      3687
TSG - GO annotation                 47418
TSG - Pathway                       6359
TSG - Mammalian Phenotype           32397
Functional Modules
Batch View
Tissue View
Multiple View
Gene View
Discovery: From Diseases to Drug
Enrichment (p < 1E-5):
immune response
inflammatory response
Cytokine-cytokine interaction
Toll-like receptor signaling
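The slides report enrichment at p < 1E-5 without naming the statistical test. A minimal sketch, assuming a one-sided hypergeometric test (a common choice for gene-set enrichment), is shown below. The function name and all counts are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(population, annotated, sample, overlap):
    """P(X >= overlap) when drawing `sample` genes from `population` genes,
    of which `annotated` carry the term of interest."""
    upper = min(annotated, sample)
    total = comb(population, sample)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Made-up counts: 20000 genes in total, 200 annotated with a term,
# 100 genes in the disease-associated set, 15 of them annotated.
p_example = hypergeom_enrichment_p(20000, 200, 100, 15)
is_enriched = p_example < 1e-5
```

When many terms are tested at once, a multiple-testing correction would normally be applied before comparing against a threshold.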
Discovery: From Diseases to Drug
p < 0.05
TNF inhibitors: Etanercept, Adalimumab
Ortiz P, Bissada NF, et al. Periodontal therapy reduces the severity of active rheumatoid arthritis in patients treated with or without tumor necrosis factor inhibitors. J Periodontol 80: 535–540, 2009.
Discovery: From Drug to Diseases
p < 1E-5
Simvastatin: hypercholesterolemia, cardiovascular disease
Discovery: From Drug to Diseases
p < 0.05
Bruner-Tran KL, Osteen KG, Duleba AJ. Simvastatin protects against the development of endometriosis in a nude mouse model. J Clin Endocrinol Metab 94: 2489–2494, 2009.
KDS (still growing)
Literature mining
Drug: 4324 – drugbank; 562 – C-Map; 544 – Compound; 1305+ – TCM
Gene: 3960 – drugbank; 17119 – non-TSG; 2741 – gene set; 611 – miRNA
Pathway: 880 – KEGG + Reactome; 38611 – gene ~ pathway
Disease: 15188 – MeSH+OMIM; 86 – TCM symptom; 8703 – mammalian phenotype
Localization: 3687 – TSG-related; 52532 – gene ~ GO CC
TFBS
PFM
TFBS Prediction
Research strategy
MAXLaps: TFBS Prediction Tool
Human Gene TFBS Prediction
Motif    Length     MEME   NMICA   this study
HLF      200 bp     Yes    Yes     Yes
HLF      800 bp     No     No      Yes
C-FOS    200 bp     Yes    Yes     Yes
C-FOS    600 bp     No     No      Yes
HFH-1    800 bp     Yes    Yes     Yes
HFH-1    1400 bp    No     No      Yes
TFBS Prediction for TSG
TFBS PFM
Tissue types: T1, T2, T3
CRM: CRM1 ... CRMk
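The TFBS slides refer to PFMs (position frequency matrices) as the input for binding-site prediction. The sketch below shows the generic PFM-to-log-odds scan that such tools commonly build on. It is not the MAXLaps algorithm, whose details the slides do not give. The motif counts, pseudocount, uniform background, and threshold are all assumptions for illustration.

```python
import math

BASES = "ACGT"


def pfm_to_pwm(pfm, pseudocount=0.5, background=0.25):
    """Convert a PFM {base: [counts per position]} into log2-odds scores."""
    length = len(pfm["A"])
    pwm = []
    for i in range(length):
        total = sum(pfm[b][i] for b in BASES) + 4 * pseudocount
        pwm.append({
            b: math.log2(((pfm[b][i] + pseudocount) / total) / background)
            for b in BASES
        })
    return pwm


def scan(sequence, pwm, threshold):
    """Return (start, score) for every window scoring at or above threshold."""
    width = len(pwm)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        if any(ch not in BASES for ch in window):
            continue  # skip windows with ambiguous bases such as N
        score = sum(pwm[i][ch] for i, ch in enumerate(window))
        if score >= threshold:
            hits.append((start, score))
    return hits


# Made-up 4-position motif with consensus TGAC; not a real TF matrix.
toy_pfm = {
    "A": [0, 0, 10, 0],
    "C": [0, 0, 0, 10],
    "G": [0, 10, 0, 0],
    "T": [10, 0, 0, 0],
}
toy_pwm = pfm_to_pwm(toy_pfm)
hits = scan("AATGACCGTGACTT", toy_pwm, threshold=4.0)
hit_positions = [start for start, _ in hits]
```

Here `hit_positions` holds the 0-based start of each window that matches the toy consensus; with real matrices the threshold would need calibration per motif.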
Future Plan
• More data collection & integration
• Multiple verticals & prioritization
• Mining capability & feature enrichment
• Hypothesis generation & validation
• Translational use
• Collaboration
Xiaoqin Yang, Xia Chen, Guiping Wang, Yun Ye, Xuezhong Zhou
NCBI, KEGG, Reactome, MGI, Locate, Drugbank, GO, ……
NSFC, GD EA, SMU
Shuang Liang ● Southern Medical University