33-6-ET-V1-S1__biomi.. - e-Acharya Integrated E


Mining Biological Data
Protein
Enzymatic Proteins
Transport Proteins
Regulatory Proteins
Storage Proteins
Hormonal Proteins
Receptor Proteins
Synaptic activity
Pre-Synaptic and Post-Synaptic Activity
• The cells are held together by cell adhesion.
• Neurotransmitters are stored in bags called synaptic vesicles.
• Synaptic vesicles fuse with the pre-synaptic membrane and release their content into the synaptic cleft.
• Post-synaptic receptors recognize them as a signal and get activated, which then transmit the signal on to other signaling components.
Source - http://www.mun.ca/biology/desmid/brian/BIOL2060/BIOL2060-13/1319.jpg
Why do we need data mining?
Limitations of Human Analysis
– Inadequacy of the human
brain when searching for
complex multifactor
dependencies in data
Several Repositories and Databases
• Several protein data repositories and databases
are available online, from which we can obtain
the necessary information about a protein.
UniProt
[the Universal Protein Resource]
• Central repository of protein sequence and function,
created by joining the information contained in
Swiss-Prot, TrEMBL and PIR
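As a rough illustration of how records from such a repository can be prepared for mining, the sketch below reads FASTA text and splits a UniProtKB-style header into fields. It assumes the commonly used header layout `>db|Accession|EntryName ProteinName OS=Organism OX=TaxID ...`, where `db` is `sp` (Swiss-Prot) or `tr` (TrEMBL). The function names `read_fasta` and `parse_uniprot_header` are hypothetical, and the example sequence is shortened for illustration only.

```python
import re

# Assumed UniProtKB FASTA header layout:
#   >db|Accession|EntryName ProteinName OS=Organism OX=TaxID [GN=Gene] PE=n SV=n
HEADER_RE = re.compile(r"^>(sp|tr)\|([^|]+)\|(\S+)\s+(.*?)\s+OS=(.*?)\s+OX=(\d+)")

# Map the database tag to the section of UniProtKB it denotes.
SECTION = {"sp": "Swiss-Prot", "tr": "TrEMBL"}


def parse_uniprot_header(line):
    """Split one FASTA header line into a dict of named fields."""
    m = HEADER_RE.match(line)
    if m is None:
        raise ValueError("not a UniProtKB-style header: " + line)
    db, accession, entry_name, protein, organism, taxid = m.groups()
    return {
        "section": SECTION[db],
        "accession": accession,
        "entry_name": entry_name,
        "protein": protein,
        "organism": organism,
        "taxid": int(taxid),
    }


def read_fasta(text):
    """Return a list of (header, sequence) pairs from FASTA-formatted text."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line, []
        else:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


# Illustrative record; the sequence is truncated and split over two lines.
example = """>sp|P69905|HBA_HUMAN Hemoglobin subunit alpha OS=Homo sapiens OX=9606 GN=HBA1 PE=1 SV=2
MVLSPADKTN
VKAAWGKVGA
"""

for header, sequence in read_fasta(example):
    fields = parse_uniprot_header(header)
    print(fields["accession"], fields["section"], len(sequence))
```

Keeping the parsing separate from the reading makes it easy to swap in a different header layout if the downloaded file does not follow the assumed one.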
PROSITE
Database of protein families and domains
• PROSITE consists of:
– biologically significant sites,
– patterns
– and profiles
that help to reliably identify to which known
protein family (if any) a new sequence belongs
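To make the idea of a pattern concrete, here is a minimal sketch that translates a PROSITE-style pattern into a Python regular expression and scans a sequence with it. It assumes the usual pattern notation (elements separated by `-`, `x` for any residue, `[...]` for allowed residues, `{...}` for forbidden residues, `(n)` or `(n,m)` for repetition, `<` and `>` for sequence ends). The function names are hypothetical, the test sequence is made up, and rarer syntax such as `>` inside brackets is not handled.

```python
import re


def prosite_to_regex(pattern):
    """Translate a PROSITE-style pattern string into a regex string."""
    parts = []
    for element in pattern.rstrip(".").split("-"):
        anchor_start = element.startswith("<")
        anchor_end = element.endswith(">")
        element = element.strip("<>")
        # Split the element into its core and an optional (n) or (n,m) suffix.
        core, low, high = re.fullmatch(
            r"(.+?)(?:\((\d+)(?:,(\d+))?\))?", element
        ).groups()
        if core == "x":
            regex = "."                      # any residue
        elif core.startswith("{") and core.endswith("}"):
            regex = "[^" + core[1:-1] + "]"  # any residue except these
        else:
            regex = core                     # single residue or [ABC] class
        if low is not None:
            regex += "{" + low + ("," + high if high else "") + "}"
        if anchor_start:
            regex = "^" + regex
        if anchor_end:
            regex += "$"
        parts.append(regex)
    return "".join(parts)


def find_sites(pattern, sequence):
    """Return (start, matched_text) for each non-overlapping match."""
    compiled = re.compile(prosite_to_regex(pattern))
    return [(m.start(), m.group()) for m in compiled.finditer(sequence)]


# N-glycosylation site pattern, applied to a made-up sequence.
print(prosite_to_regex("N-{P}-[ST]-{P}"))
print(find_sites("N-{P}-[ST]-{P}", "MKNGSAANPSL"))
```

Note that `finditer` reports non-overlapping matches only; a lookahead-based search would be needed if overlapping sites matter.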
What kind of knowledge can be
mined?
• Bioinformatics has become one of the most
important application areas of data mining.
– DNA sequences
– Protein sequences
– Protein folding
– Microarray data
– ……
Contributions of Data Mining
• Semantic integration of heterogeneous,
distributed genomic and proteomic databases
• Alignment, indexing, similarity search, and
comparative analysis of multiple
nucleotide/protein sequences
• Discovery of structural patterns and analysis of
genetic networks and protein pathways
• Identifying co-occurring gene sequences and
linking genes to different stages of disease
development
• Visualization tools in genetic data analysis
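The alignment and similarity search item in the list above can be sketched with a small global alignment scorer in the style of the Needleman-Wunsch dynamic programming algorithm. This is an illustration under stated assumptions, not a method taken from the slides: the scoring values (match +1, mismatch -1, gap -1), the function names, and the toy sequences are all hypothetical choices.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-1):
    """Best global alignment score of sequences a and b (linear gap cost)."""
    # prev[j] holds the best score for aligning a[:i-1] with b[:j].
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]  # aligning a[:i] against an empty prefix of b
        for j in range(1, len(b) + 1):
            diagonal = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = prev[j] + gap        # gap in b
            left = curr[j - 1] + gap  # gap in a
            curr.append(max(diagonal, up, left))
        prev = curr
    return prev[-1]


def best_hit(query, database):
    """Return the (name, score) of the database entry most similar to query."""
    scored = [(name, global_alignment_score(query, seq))
              for name, seq in database.items()]
    return max(scored, key=lambda item: item[1])


# Toy nucleotide database, made up for illustration.
database = {"seq1": "GATTACA", "seq2": "GCATGCT"}
print(best_hit("GATTACA", database))
```

Only two rows of the scoring table are kept in memory because the example needs the score and not the alignment itself; recovering the aligned strings would require storing the full table and tracing back through it.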