Development of Generation CP Domain Models and Ontology
Development of the Generation Challenge Program Ontology for Crops
Elizabeth Arnaud (Bioversity International), Rosemary Shrestha (CRIL-CIMMYT) and Richard Bruskiewich (IRRI)
TDWG 2008 Annual Conference,
20-25 October 2008
Fremantle, Western Australia
The Generation Challenge Programme
Science for better crops in the tropics
For the majority of crop farmers in the developing world, the ravages of drought, low soil fertility, crop pests and diseases are aggravated by their limited access to improved crops.
TDWG annual conference, 20-25 October 2008, Fremantle, Western Australia
The Generation Challenge Programme
Science for better crops in the tropics
By using advances in molecular biology and harnessing the rich global stocks of crop genetic resources, the Generation CP creates and provides a new generation of plants that meet farmer needs.
http://www.generationcp.org/
Consultative Group on International Agricultural Research (CGIAR)
GCP subprograms
SP1 - Genetic Diversity of Global Genetic Resources
SP2 - Genomics towards gene discovery
SP3 - Trait Capture for Crop Improvement
SP4 - Bioinformatics and Crop Information Systems
Building an 'integrated platform' of molecular biology and bioinformatics tools = Molecular breeding platform
SP5 - Capacity Building and Enabling Delivery
The Generation Challenge Programme
Target areas
Drought-prone environments
Mandate crops
All the CGIAR mandate crops = 22 crops
Commissioned and competitive projects
275 projects in 5 years
GCP New Challenge initiatives
Cereals
1. Rice/drought/Africa
2. Wheat/drought/Asia
3. Sorghum/drought/Africa
4. Rice-Sorghum-Maize/soil problem/Asia & Africa
Legumes
5. Cowpeas/drought/Africa
6. Chickpeas/drought/Africa and Asia
Roots and tubers
7. Cassava/virus/Africa
Integration across diverse crop datasets
The volume and complexity of biological data are increasing
Historical data are scattered in numerous crop-specific databases
Each database uses slightly different terminology for phenotype-related terms
Integration across Diverse GCP Crop Data
[Diagram: five linked data domains, labelled with the subprogrammes SP1, SP2 and SP3 and joined by "has" and "affects" relationships]
Germplasm
• Inventory
• Identification (passport)
• Genealogy
Genotype
• Genetic Maps
• Physical Maps
• DNA Sequence
• Functional Annotation
• Molecular Variation (Natural or Induced)
Molecular Expression
• Transcriptome
• Proteome
• Metabolome
• Physiology
Phenotype
• Anatomical
• Developmental
• Field Performance
• Stress Response
Environment
• Location (GIS)
• Climate
• Day Length
• Ecosystem
• Agronomy
• Stresses
An integrated platform for molecular breeding
To support and encourage researchers to share and reuse information among agricultural databases
To form the basis for the generation of data templates, web services and software
GCP Scientific Domain Model
Germplasm identification ("passport") and pedigree data
Phenotypic characterization and evaluation data
Geographic location and environmental descriptions
Genotype and molecular data
Genomic map data for markers and loci
Functional genomics data
The exchange of new findings and joint work on projects presuppose that all those involved have the same understanding of the terms they use. This calls for an extensively standardized description of plant development stages with phenological characteristics and coding.
Prof. Dr. F. Klingauf
President of the Federal Biological Research Centre for Agriculture and Forestry, Berlin and Braunschweig
Importance of crop ontology
Similar plant structures are described by their species-specific terms.
Fruit:
• Grain or caryopsis in Rice
• Kernel in Maize
• Pod in Beans
• Grain in Wheat
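One way to picture this is a small lookup from (crop, local term) pairs to a shared concept. The sketch below is illustrative only: it assumes "Fruit" is the shared concept at the centre of the slide, the table holds just the examples shown there, and the names `LOCAL_TERMS` and `shared_concept` are hypothetical rather than part of any GCP tool.

```python
# Illustrative only: species-specific names for the same plant structure,
# taken from the slide. The table layout and names are assumptions.
SHARED_CONCEPT = "fruit"

LOCAL_TERMS = {
    ("rice", "grain"): SHARED_CONCEPT,
    ("rice", "caryopsis"): SHARED_CONCEPT,
    ("maize", "kernel"): SHARED_CONCEPT,
    ("beans", "pod"): SHARED_CONCEPT,
    ("wheat", "grain"): SHARED_CONCEPT,
}


def shared_concept(crop, term):
    """Return the shared concept for a crop-specific term, or None if unmapped."""
    return LOCAL_TERMS.get((crop.lower(), term.lower()))


print(shared_concept("Maize", "Kernel"))
print(shared_concept("Beans", "Pod"))
```

With such a mapping, records that say "kernel" in a maize database and "pod" in a bean database could be retrieved by a single query on the shared concept.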
The GCP Ontology
"Thesaurus" of biological concepts that can be shared and used across species, and with which genetic and phenotypic data can be associated
Enables integrative data mining on GCP annotated data using the platform and web services
Developed with crop experts, for plant structure, developmental stages, traits and expression of the traits
For selected priority GCP crops: Wheat, Maize, Sorghum, Chickpea, Banana & Plantain
GCP Sources for mapping the terms
International Crop Information Systems
• ICIS model (http://www.icis.cgiar.org)
• IMIS (maize)
• IRIS (rice)
• IWIS (wheat)
Musa germplasm information system (http://www.musa-diversity.org)
ICRISAT information system (sorghum, chickpea)
CIP information system (potato)
Crop descriptors for traits (Bioversity International)
GCP data templates
GCP datasets
http://www.generationcp.org
GCP Ontology
Developing the GCP ontology
[Workflow diagram with four numbered steps]
1. Crop DB
2. GCP crop ontology (GCP concept ID)
3. Mapping (DBXref) to PO concept ID & TO concept ID
• Plant Structure ontology: www.plantontology.org/
• Trait Ontology: www.gramene.org/plant_ontology/
4. GCP data templates: data annotation with GCP ontology
GCP ontology term has:
Term: plant height
ID: GCP_322*.0000021
Namespace: maize_trait
Definition: Measurement of plant height from soil surface to the highest point in plant.
Synonyms: PHT, PTHT, Planth. Shoot height
Dbxrefs: PO:10202, TO:0000207, IMIS_TRAITID:1008
is_a: GCP_322.0000108
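The fields on this slide correspond closely to the tags of a term stanza in the OBO flat-file format. The sketch below renders the example as such a stanza and reads it back with a minimal parser. Several things here are assumptions rather than content from the GCP files: the stanza text itself, the `EXACT` synonym scope, dropping the asterisk from the printed ID (treated here as a footnote marker), and leaving out the PO cross-reference because it appears truncated on the slide. `parse_stanza` is a hypothetical helper, not a library function.

```python
from collections import defaultdict

# Assumed OBO-style rendering of the slide's example term (not copied from
# the GCP release files). Only two of the listed synonyms are shown.
STANZA = """\
[Term]
id: GCP_322.0000021
name: plant height
namespace: maize_trait
def: "Measurement of plant height from soil surface to the highest point in plant." []
synonym: "PHT" EXACT []
synonym: "PTHT" EXACT []
xref: TO:0000207
xref: IMIS_TRAITID:1008
is_a: GCP_322.0000108
"""


def parse_stanza(text):
    """Parse one [Term] stanza into a dict mapping each tag to a list of values."""
    tags = defaultdict(list)
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("["):
            continue  # skip blank lines and the [Term] header
        tag, _, value = line.partition(": ")
        tags[tag].append(value)
    return dict(tags)


term = parse_stanza(STANZA)
print(term["name"][0])
print(term["xref"])
print(term["is_a"])
```

Keeping every tag as a list makes repeated tags such as `synonym` and `xref` straightforward to handle.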
Building the ontology with OBO-Edit
http://oboedit.org/
Terms are linked by relationships such as
is-a
part-of
has-a
disjoint from
derived from, etc.
The ontology is structured as a hierarchical directed acyclic graph (DAG)
Terms can have more than one parent and zero, one or more children
Draft releases of the OBO-formatted ontology files for rice, wheat and maize traits are available at http://cropforge.org/projects/gcpontology/
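The DAG structure can be illustrated with a short sketch. The term names and links below are hypothetical, chosen only to show a term with two parents; they are not taken from the GCP ontology files. The functions `ancestors` and `is_acyclic` are likewise illustrative helpers.

```python
# Hypothetical terms and links, chosen only to illustrate the structure.
# Each term maps to a list of (relationship, parent) pairs.
PARENTS = {
    "plant height": [("is_a", "stature trait"), ("is_a", "morphological trait")],
    "stature trait": [("is_a", "trait")],
    "morphological trait": [("is_a", "trait")],
    "trait": [],
}


def ancestors(term, parents=PARENTS):
    """Collect every term reachable by following parent links upward."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for _relationship, parent in parents.get(current, []):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def is_acyclic(parents=PARENTS):
    """The graph is a DAG if no term is its own ancestor."""
    return all(term not in ancestors(term, parents) for term in parents)


print(sorted(ancestors("plant height")))
print(is_acyclic())
```

Because a term may have several parents, a query on a broad term such as "trait" reaches "plant height" through either path, while the acyclicity check guards against a term ending up as its own ancestor.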
Complex trait names created by breeders in the crop databases
Complex trait name: %HSATIVUM_TILLER1_FLAG_1
Description: The trait is scored for severity of the disease caused by Helminthosporium sativum (leaf spot) at tiller 1 and flag 1 stage, as a percentage.
Such names are to be decomposed into simple terms that are readable by both humans and computers and mapped against the ontology
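A decomposition of this kind might look like the following sketch. It assumes that a leading "%" denotes the unit and that underscores separate the remaining parts, with a bare number attached to the part before it. The lookup tables and the function `decompose` are hypothetical, built only from the example on the slide, and do not describe how GCP tools actually process such names.

```python
# Hypothetical lookup tables: the codes come from the slide's example, but
# the mapping itself is an assumption made for this sketch.
UNIT_PREFIXES = {"%": "percent"}
CODES = {
    "HSATIVUM": "Helminthosporium sativum (leaf spot)",
    "TILLER1": "tiller 1 stage",
    "FLAG1": "flag 1 stage",
}


def decompose(name):
    """Split a breeder-style trait name into a unit and a list of simple terms."""
    unit = None
    if name and name[0] in UNIT_PREFIXES:
        unit = UNIT_PREFIXES[name[0]]
        name = name[1:]
    parts = []
    for token in name.split("_"):
        if token.isdigit() and parts:
            parts[-1] += token  # join "FLAG" and "1" into "FLAG1"
        else:
            parts.append(token)
    # Unknown codes are passed through unchanged so nothing is silently lost.
    return {"unit": unit, "terms": [CODES.get(part, part) for part in parts]}


result = decompose("%HSATIVUM_TILLER1_FLAG_1")
print(result["unit"])
print(result["terms"])
```

Each resulting simple term could then be mapped to an ontology concept separately.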
Ontology for Crops
[Diagram: the phenotype, the effects acting on it, and the ontologies that describe each part]
Phenotype
• Plant structure and development stages: Plant Ontology
• Phenotypic qualities: PATO
• "Values" have qualities and "units": Units Ontology (units implicitly indicate the attribute)
• Qualifier
• Assessment Methods Ontology (e.g. ICIS)
Effects
• Genotype factor (G): markers/alleles/sequence ontology
• External environmental data (E): treatment, location, climatic variables/water, growth conditions, stress, management/agronomy
• Temporal factor (T): Time Ontology
• Experiment factor (ED): experimental design
GCP Ontology – present and future prospects:
[Diagram: data sources feeding the GCP Ontology, which is queried through a web interface and linked to external ontologies]
Data Source
• GCP data templates
• ICIS dataset
• CGIAR
GCP Ontology
• General Science Ontology
• GCP Domain Module Ontology
• General Germplasm Ontology
• Taxonomic Ontology
• Plant Anatomy & Development Ontology
• Location & Environment Ontology
• Structural & Functional Genomic Ontology
• Phenotype & Trait Ontology
Web Interface (Chado/koios): Query
Linkage to external ontologies
http://pantheon.generationcp.org
Thank you!
Crops' Harvest Celebration
San Isidoro Feria
Lucban, Philippines