Transcript PPT

2nd ASEAN China International Bioinformatics Workshop
Biological Data Sharing and
International Cooperation
Yixue Li, Ph.D.
Shanghai Center For Bioinformation
Technology (SCBIT)
Key Lab. of Systems Biology, SIBS, CAS
1
Outline

China has become one of the world's significant biological data producers
and will play a significant role in the future

Biological data sharing promoted by central and local governments

Progress of biological data sharing in China

A proposal for future cooperative activities
2
China has become one of the world's significant biological data producers
and an important contributor to bioinformatics research,
and will play a significant role in the future
3
China's position in world bioinformatics research,
2001-2005:
Ranked by number of bioinformatics research publications
and by citation rate,
China placed 4th and 11th in the world,
respectively.
What is behind it?
4
Money, People, Collaborations, etc.
5
2001-2010: ~1 billion USD
Biological and biomedical research
program supported by MoST of China
6
Field: Bio-technology, Med-technology &
Pharma-technology (2001-2005)
Main Research Fields:
• Bio-engineering Technology
• Gene Manipulation Technology
• Bioinformation Technology
• Biomedical Technology
Main Key Projects:
• Functional Genomics and Bio-Chip
• Novel Drugs
• Tissue (Organ) Engineering
• Bioreactors
• Breeding of New Varieties of High-Quality and High-Yield Crops
7
Field: Bio-technology, Med-technology &
Pharma-technology (2006-2010)
Main Research Fields:
• Industrial Bio-technology
• Gene Manipulation Technology
• Bioinformation & Biocomputing Technology
• Biomedical Technology
Main Key Projects:
• Drug design
• Nano-biotechnology
• Bio-chip & biomedical instrumentation
• Biomarkers & molecular profiling of major diseases
• Genome & proteome technology
8
Bioinformation Technology
Seven Research Directions: (2001-2005)

Technology for Bio-Data Acquisition and Mining

Technology for Bio-Data Application

Structure Genomics and Proteomics

Molecular and Drug Design

Bio-Chip

High-Throughput Drug Screening

Novel Drugs
9
New Budget Allocated to the Bioinformatics Field from
the 863 Program (2006-2010)

Topic Program: Bioinformatics and Bio-computing

Key Program: Drug design and molecular design

National Program:

Functional genomics and proteomics

Molecular genotyping of major diseases and personalized
health care
10
Budget Allocated to Biological Data Production
and Management from the national high-tech program (2001-2010)
Biological data production: ~150 million USD
Data sharing platform:
Central government: ~10 million USD
Local government (Shanghai): ~5 million USD
11
New budget will be launched over the next three years
• Bioinformatics and Neuroinformatics:
synthetic biology, gene regulation networks,
protein-protein interaction, genome-wide
association studies, computational systems biology,
neuroinformatics, etc.
• Biological databases:
genomics, proteomics, transcriptomics, metabolomics,
national biological data center, etc.
• Bioinformatics products:
biomedical databases, clinical research support platform,
clinical trials information management system, TCM
informatics, etc.
Total budget: at least 10 million USD

12
China has become a significant biological data
producer
• Genome data:
many new projects for microbial genome sequencing, metagenome
sequencing and functional genomics research;
many new sequencers have already been introduced by genome research
centers: ABI, MegaBACE, 454, Solexa, SOLiD, etc.
• Bio-chip data:
two national centers and many institutions, universities,
hospitals, key laboratories, etc.
• Proteome data:
many research centers, laboratories, hospitals, etc.
• Metabolic data:
metagenome projects, industrial biotechnology
13
14
Biological data sharing promoted by central
and local governments
15

Found a national bioinformatics and information base

Set up nationwide bioinformatics service platforms

Promote the development of secondary biological databases
16

Develop algorithms and bioinformatics
platforms for genome annotations, functional
genomics research, candidate drug target
identification, population genetics research

Market-driven algorithms and software tools
for computational biology research and
biotechnology instrumentation
17
Progress of biological data
sharing in China
18
CBI Database System and Bioinformatics
Service Platform
19
20
21
BGI Rice genome database, with
related search tools and accessory
data such as EST, Bio-Chip,
Proteome, etc.
• Data Download
• Map View
  • Overview
  • Scaffold View
  • Gene View
  • cDNA View
  • Compare View
22
RePS Package
http://rise.genomics.org.cn/rice/link/ts.jsp
23
Silkworm Genome Database
• Data & Statistics
• MapView
• Search
• Report
• Tools & Services
• Download
• Schema
Contig Search
Gene Search
24
BGI Chicken Variation Database
Information Page
System Structure
Home Page
25
Chimpanzee Database Structure
(CHGC & SCBIT)
Public tier
Integrated protein database
Enzyme
GO
XML format
Domain
Pathway database
Private tier
Chimpanzee Genome
GFF format
26
Chimpanzee Database
27
ISHIP: Human Protein-Protein Interaction Database
Source databases: STRING, DIP, BIND, MINT, HPRD, IntAct
Script processing
HPPI-ID
DDI Prediction
Ortholog Prediction
Web Search
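The integration step on this slide (several source databases combined by script processing into one set) can be sketched roughly as follows. This is only an illustration under assumptions: the function name merge_interactions, the record layout and the toy identifiers are hypothetical and are not taken from ISHIP itself.

```python
from collections import defaultdict


def merge_interactions(sources):
    """Merge {source_name: [(protein_a, protein_b), ...]} into one map
    from an unordered protein pair to the set of sources reporting it."""
    merged = defaultdict(set)
    for source_name, pairs in sources.items():
        for a, b in pairs:
            # A-B and B-A describe the same interaction, so sort the pair.
            key = tuple(sorted((a, b)))
            merged[key].add(source_name)
    return dict(merged)


# Toy records; the identifiers are placeholders, not real database entries.
example = {
    "DIP": [("P1", "P2"), ("P2", "P3")],
    "HPRD": [("P2", "P1")],
    "IntAct": [("P3", "P4")],
}
merged = merge_interactions(example)
for pair, reported_by in sorted(merged.items()):
    print(pair, sorted(reported_by))
```

Keeping the set of reporting sources per pair makes it possible to rank interactions by how many databases support them.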
28

Multiple data search
• Protein accession
• Gene symbol
• GeneID
• IPI accession
• Keyword
• BLAST
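A search box that accepts several identifier types usually has to guess which type the user typed. The sketch below shows one possible way to do that with regular expressions. The patterns and the name classify_query are assumptions made for illustration and do not describe the actual ISHIP implementation; BLAST searches take a sequence rather than an identifier and are left out.

```python
import re

# Illustrative patterns only; real identifier formats vary between databases.
# Order matters: the more specific patterns are tried first.
PATTERNS = [
    ("ipi_accession", re.compile(r"^IPI\d{8}(\.\d+)?$")),
    ("gene_id", re.compile(r"^\d+$")),
    ("protein_accession", re.compile(r"^[A-Z][0-9][A-Z0-9]{3}[0-9]$")),
    ("gene_symbol", re.compile(r"^[A-Z][A-Z0-9\-]{1,9}$")),
]


def classify_query(text):
    """Return the guessed query type, falling back to a keyword search."""
    query = text.strip()
    for kind, pattern in PATTERNS:
        if pattern.match(query):
            return kind
    return "keyword"


for q in ["IPI00021439", "7157", "P04637", "TP53", "tumor suppressor"]:
    print(q, "->", classify_query(q))
```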
29
Microbial Genome Databases and
Annotation Platform (SMIGA)
Most microbial genome data sequenced in house have been deposited in
this database system, for example Leptospira interrogans,
Staphylococcus epidermidis, Schistosoma, and Xanthomonas
campestris pv. campestris.
30
Gene Regulatory
Information Database
Southeast University
31
ncRNA Database: NONCODE
Institute of Biophysics, CAS
32
Integrated Protein-Protein Interaction
Database System: PPiDB
33
Bioinformatics Service Platforms
34
35
36
BOD (Bioinformatics On-Demand) System from
Tsinghua University: Pipeline
37
BOD Overview
38
Technology Structure of Microbial Genome
Databases and Annotation Platform (SMIGA)
Input: Genomic Sequence, Protein Sequence
Workflow Selection and Parameter Setting
Functional Annotation (TIGR Genomes Annotation Engine)
• ORFs (for genomic)
• GC Content
• Promoter
• tRNA
• COG classification
• Protein features
• Pathway
• …
Genome Comparison
• Genome vs Genome
• Genome vs Fragment
• Genome vs Gene
• …
Insert into DataBase
Directed to SRS
• Homology Search
• Multi-sequence Alignment
• Further EMBOSS analysis
• …
Export
Visualization
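Two of the annotation features listed on this slide, GC content and ORFs, are simple enough to illustrate directly. The sketch below is a deliberately simplified version (forward strand only, ATG start, standard stop codons). The function names and the min_codons parameter are hypothetical, and this is not the method used by SMIGA or the TIGR annotation engine.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def gc_content(seq):
    """Fraction of G and C bases in a DNA sequence (0.0 for empty input)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def find_orfs(seq, min_codons=3):
    """Return (start, end) pairs of forward-strand ORFs, end exclusive.

    An ORF here runs from the first ATG in a reading frame to the next
    in-frame stop codon, and its length in codons includes the stop.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None
    return sorted(orfs)


dna = "CCATGAAATAGGG"
print(gc_content(dna))
print(find_orfs(dna))
```

A production annotation pipeline would also scan the reverse complement and handle alternative start codons, which this sketch omits for brevity.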
39
40
Software Packages Developed in SCBIT: Bio-Science
Information Knowledge Discovery Workflow
Real Time Data Integration
Discovery Services
Literature
Databases
Operational Data
Dynamic Application Integration
Intellectual Property Management
Using Distributed Resources
Images
Instrument Data
41
Microbial Genome Pathway Mapping: ComPath
42
Biological data sharing platform
located at Shanghai
43
Biological database system
providing services for data
submission, download,
cleaning, search and
analysis, developed by SCBIT
and supported by Shanghai
Municipality
44
DNA Data Submission Platform
45
Schistosoma Database
46
The Schistosoma genome draft sequence was released to the world by the Human Research
Center at Shanghai on May 16, 2006, through the data
sharing platform developed by SCBIT. The Vice Minister of the Ministry of
Public Health, the Vice Mayor of Shanghai Municipality and the Vice
President of CAS attended the data release meeting.
47
48
49
2.5 million USD was spent on computational facilities
to support the data sharing platform in Shanghai.
50
Scientific Publications of SCBIT
First or corresponding authors:
Science, Nature, PNAS, NAR, JBC, Bioinformatics, Oncogene,
BMC Bioinformatics, PLoS Computational Biology, BMC Genomics,
Molecular Systems Biology, BBRC, Comput Biol Chem, FEBS
Letters, J Theor Biol, etc.
Participant authors:
Nature, Nature Biotechnology, Mol Cell Proteomics, Nature
Genetics, etc.
Since 2002: over 100 scientific papers.
51
Basic Problems of Biological Data
Sharing in China
We have already spent a great deal of money on various
biological database and service platform projects and
made significant progress, but:
• Concerted efforts are needed: we still do not have a national
database system to manage the biological data produced with national
funding
• Centralized databases are essential: we still have no authoritative
national platform to support biological data sharing and manage
related issues

52
What Could We Do?
International cooperation and collaboration are needed
for data sharing and management.
• We need cooperation and help to follow existing
biological data standards and to develop new data
standards for newly emerging high-throughput biotechnologies.
• We should also join existing international biological
alliances so that biologists worldwide can share the large volume
of data produced by Chinese scientists.
• Grid technology may provide a good basis for
worldwide sharing of biological data.

53
Proposal for collaborative activities in
data-sharing:
Institution-to-institution and
Scientist-to-scientist
To establish a formal biological
data sharing committee and to
conduct regular international
collaborative activities
54
A Stepwise Procedure
• Forming a steering committee with leaders from multiple sides to
promote collaboration between ASEAN countries and their
Chinese counterparts
• Establishing mirror sites of public international biological
databases at selected places in China (such as Shanghai and
Beijing)
• Starting joint research projects on novel data standards and
ontologies
• Helping to establish a national biological database system in
China
• Starting data exchange between China and existing
alliances in ASEAN, Korea, Japan and other countries
55
Timeline and Major Milestones 2008-2012
• 2008: Set up a steering committee to plan collaborative
activities and data sharing protocols
• 2009: Host workshops on specific topics so that scientists can
get to know each other and establish collaborative projects
• 2010: Implement mirror sites of public international
biological databases at selected places in China and begin providing
service
• 2011: Set up a formal alliance between the Chinese national
biological database system and its international counterparts
in ASEAN, Korea, Japan, etc., and start regular data sharing
• 2012: Improve and expand activities and the scope of collaboration
56
Thank you for your
attention!
57