A Method for Protein Functional Flow Configuration and Validation
Woo-Hyuk Jang1
[email protected]
Suk-Hoon Jung1
[email protected]
Dong-Soo Han1
[email protected]
1 School of Engineering, Information and Communications University,
119, Munjiro, Yuseong-gu, Daejeon, 305-714, Korea
ABSTRACT
With the explosive growth of PPI databases, computational prediction and configuration of PPI networks has become a major stream in bioinformatics. Recent research
increasingly considers the physicochemical properties of proteins and delivers high-resolution results by integrating experimental data. Given this trend, a complete PPI
network configuration for each organism is likely in the near future. However, applying a PPI network directly to real problems is difficult, because a PPI network is only a set of
co-expressed proteins or gene products, and its links represent simple physical binding rather than in-depth knowledge of a biological process. In this paper, we suggest a protein
functional flow model, a directed network based on the relations among protein functions in signal transduction pathways. Each vertex of the model is a molecular function annotated
by Gene Ontology, and the relations among the vertices are the edges. The model therefore makes it easy to trace the transition of a specific function, and it can serve as a constraint
for extracting meaningful sub-paths from a whole PPI network. To evaluate the model, 11 functional flow models of Homo sapiens were built from KEGG, and Cronbach's alpha values
were measured (average alpha = 0.67). Among 1023 functional flows, 765 showed alpha values of 0.6 or higher.
Background
Motivation & Related Work
In early studies of PPI prediction, many techniques relied on only a few
features of a protein (e.g., domain frequency in the interacting protein pair),
so they suffered from low prediction accuracy. Recent research, however,
increasingly considers the physicochemical properties of proteins and delivers
high-resolution results by integrating experimental data. Given this trend, a
complete PPI network configuration for each organism is likely in the near
future. Signal transduction is the process by which a cell changes in response
to an external stimulus, and it plays an important role in most fundamental
cellular processes. Most signal transduction is initiated by an extracellular
signal; ligand-receptor binding is then followed by a cascade of intracellular
activities such as protein phosphorylation and dephosphorylation, PPI, and
interactions between proteins and small molecules. Because a signal
transduction pathway is a cascade of protein activities, identifying the
participants in a specific signal transduction and their relationships within
the whole PPI network is essential. However, even when the PPI network is
completely configured, extracting a signal transduction pathway is difficult,
because a PPI network is only a set of co-expressed proteins or gene products,
and its links represent simple physical binding rather than in-depth knowledge
of a biological process. As a result, most known signal transduction pathways
have been discovered manually so far. To overcome this problem, we suggest a
protein functional flow model that can serve as a constraint for extracting
meaningful sub-paths from a whole PPI network. The suggested model is a
directed network based on the relations among protein functions in signal
transduction pathways.
▣ Conventional PPI networks cannot be applied directly to real problems such as
signal transduction pathway prediction or metabolic pathway prediction,
because they lack in-depth biological knowledge annotations.
▣ Researchers' focus is moving from inspecting the possibility of a single
protein-protein interaction to extracting meaningful sub-networks from the
whole protein interaction network.
▣ Protein pairs show specific functional patterns on the PPI network.
▣ Interacting genes correlate with GO annotations (~12% of interacting genes
had exactly the same annotations; 27% had very similar annotations)[1].
▣ Functional patterns were found in PPI networks and compared to random
patterns with respect to MIPS and KEGG, respectively[2].
▣ Functional template modeling by abstraction of enzyme functions[3].
Figure 1. To meet the needs of applied areas, researchers' focus is
moving from single protein pairs to functionally related
sub-networks. In this situation, it is essential to develop a
reference model or rules for navigating the paths.
Figure 2. Gene Ontology (GO) annotations are hierarchically related to each
other, so a function can be replaced by its parent function. With this
strategy, the authors of [3] finally build the most general functional
template, the Pathway Functionality Template (PFT).
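The parent-replacement strategy mentioned in the Figure 2 caption can be illustrated with a minimal sketch. The `PARENT` map below is a toy fragment with a single parent per term, and the function names are hypothetical; the real Gene Ontology is a directed acyclic graph in which a term may have several parents, so this is only an approximation of the idea, not the method of [3].

```python
# Toy fragment of the GO molecular function hierarchy (child -> parent).
# Assumption: one parent per term; the real ontology allows several.
PARENT = {
    "GO:0005515": "GO:0005488",  # protein binding -> binding
    "GO:0005488": "GO:0003674",  # binding -> molecular_function (root)
}


def generalize(term, levels=1, parent=PARENT):
    """Replace a term by its ancestor `levels` steps up, stopping at the root."""
    for _ in range(levels):
        if term not in parent:
            break
        term = parent[term]
    return term


def distance_from_root(term, parent=PARENT):
    """Count the number of parent steps needed to reach the root."""
    steps = 0
    while term in parent:
        term = parent[term]
        steps += 1
    return steps


one_up = generalize("GO:0005515")            # one level more general
most_general = generalize("GO:0005515", 5)   # capped at the root
depth = distance_from_root("GO:0005515")
```

Generalizing every function in a flow by the same number of levels trades specificity for coverage: more protein pairs match the template, but the template says less about each of them.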
Functional Flow
▣ Concept 1: Meaningful protein interaction paths, such as signal transduction pathways, contain
functional flow patterns.
▣ Concept 2: If a certain protein pair shows a functional pattern, the relation type of this pair (e.g.,
activation) is similarly detected in other protein pairs that have the same functional pattern.
▣ Concept 3: Usually, multiple molecular functions are annotated for one protein, but only one of
them is selected by the other protein.
Figure 3. Some proteins and their relationships in the Hedgehog signaling pathway.
[Figure not reproduced. Proteins shown: RAB23, SMO (SMOH), PTCH2, SHH (HPE3, HLP3), LRP2.
Accession numbers shown: Q68DJ6, Q9ULC3, A4D1K5, Q99835, Q13635, Q9Y6C5, P98164, Q14623,
Q43323, … (4). Relation labels on the edges: activation, inhibition, binding/association,
dissociation. One node carries the label "Unknown".]
GO terms in Figure 3:
GO:0005515, protein binding
GO:0004872, receptor activity
GO:0004888, transmembrane receptor activity
GO:0005113, patched binding
GO:0015485, cholesterol binding
GO:0043237, laminin-1 binding
▣ Figure 3 shows some proteins of the Hedgehog signal transduction pathway and their
relationships. Each protein has general molecular functions, and the functions have relations
such as inhibition or dissociation.
▣ Based on the relation “binding/association” between “LRP2” and “SHH”, we consider that the protein
binding (GO:0005515), cholesterol binding (GO:0015485), patched binding (GO:0005113), and
laminin-1 binding (GO:0043237) functions have a “binding/association” relation. Proteins whose
function is unknown were removed manually, and the number of times a functional flow recurred was
used as its weight score. In the same way, a total of 1023 functional flows were extracted from 11
H. sapiens signal transduction pathways in the KEGG database.
Validation
▣ Top 10 functional flows, including and excluding the general function protein binding (GO:0005515)

Including protein binding (GO:0005515):
Function 1    Function 2    Sub Type                        Count
GO:0005515    GO:0005515    phosphorylation                 35
GO:0005515    GO:0005515    activation                      29
GO:0005515    GO:0005515    compound                        16
GO:0004435    GO:0005515    compound                        13
GO:0005515    GO:0005515    inhibition                      12
GO:0005515    GO:0005515    binding/association             11
GO:0004722    GO:0005515    compound                        10
GO:0005515    GO:0004674    phosphorylation                 9
GO:0005515    GO:0003700    phosphorylation                 9
GO:0005515    GO:0004672    phosphorylation                 8

Excluding protein binding (GO:0005515):
Function 1    Function 2    Sub Type                        Count
GO:0004722    GO:0004708    inhibition/dephosphorylation    5
GO:0000287    GO:0004708    phosphorylation                 5
GO:0004707    GO:0003700    phosphorylation                 5
GO:0046332    GO:0043565    activation                      4
GO:0046332    GO:0003700    activation                      4
GO:0004722    GO:0005545    compound                        4
GO:0004722    GO:0005545    compound                        4
GO:0004722    GO:0005158    compound                        4
GO:0004722    GO:0016303    compound                        4
GO:0004722    GO:0019903    compound                        4
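The extraction rule described for LRP2 and SHH, where a relation between two proteins is spread over every pair of their annotated functions and recurring flows are counted as weights, could be sketched as follows. The function name, the input layout, and the example annotations (including the placeholder proteins `P_X`, `P_Y`, and `P_UNKNOWN`) are illustrative assumptions, not the annotation sets from Figure 3 or the authors' implementation.

```python
from collections import Counter
from itertools import product


def extract_functional_flows(relations, annotations):
    """Count directed functional flows (function 1, function 2, sub type).

    relations:   iterable of (source protein, target protein, sub type)
    annotations: dict mapping a protein to its list of GO molecular functions
    Proteins without annotations contribute no flows, which mirrors the
    removal of proteins whose function is unknown.
    """
    flows = Counter()
    for source, target, sub_type in relations:
        source_terms = annotations.get(source, [])
        target_terms = annotations.get(target, [])
        for f1, f2 in product(source_terms, target_terms):
            flows[(f1, f2, sub_type)] += 1  # recurrence count = weight score
    return flows


# Illustrative annotations only (assumed for this example).
annotations = {
    "LRP2": ["GO:0005515"],
    "SHH": ["GO:0015485", "GO:0005113", "GO:0043237"],
    "P_X": ["GO:0005515"],   # hypothetical protein from another pathway
    "P_Y": ["GO:0015485"],   # hypothetical protein from another pathway
    "P_UNKNOWN": [],         # no annotated function, so it is skipped
}
relations = [
    ("LRP2", "SHH", "binding/association"),
    ("P_X", "P_Y", "binding/association"),
    ("P_UNKNOWN", "SHH", "activation"),
]

flows = extract_functional_flows(relations, annotations)
top_flow, top_weight = flows.most_common(1)[0]
```

Keeping the sub type inside the key means the same pair of functions with a different relation type is counted as a separate flow, which is what a ranking by sub type requires.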
▣ The internal consistency of the functional flows was measured with Cronbach’s alpha. Cronbach’s
alpha checks the consistency, or similarity, of the items in a questionnaire when a single concept
is measured by many different items. The variables in Cronbach’s alpha are defined as
follows.
$$\alpha = \frac{N}{N-1}\left(1 - \frac{\sum_{i=1}^{N}\sigma_i^{2}}{\sigma_X^{2}}\right)$$

• $N$ = total count of functional flows extracted from a specific
signaling transduction pathway.
• $\sigma_i^{2}$ = the variance of a specific functional flow across the 11 signaling
transduction pathways.
• $\sigma_X^{2}$ = the variance of all functional flows in a specific signaling
transduction pathway.
▣ Note that when the relation type of a functional flow conflicts with its type in another signaling
transduction pathway, we decrease its appearance value, which ranges from 11 down to zero. The
average alpha over all functional flows was 0.67, and 765 functional flows had alpha values of 0.6 or higher.
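As a rough illustration of how such an alpha value is computed, the sketch below implements the standard textbook form of Cronbach's alpha. It assumes that functional flows play the role of items and pathways the role of observations, and it takes the denominator as the variance of the per-observation total scores; this mapping and the score values are assumptions for the example, not the authors' data or code.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for scores[i][j] = score of item i in observation j.

    Uses alpha = N / (N - 1) * (1 - sum of item variances / variance of totals),
    where N is the number of items.
    """
    n_items = len(scores)
    if n_items < 2:
        raise ValueError("at least two items are required")
    item_variances = [variance(row) for row in scores]
    totals = [sum(column) for column in zip(*scores)]  # one total per observation
    total_variance = variance(totals)
    if total_variance == 0:
        raise ValueError("total scores have zero variance")
    return n_items / (n_items - 1) * (1 - sum(item_variances) / total_variance)


# Two items that rise and fall together across three observations.
consistent = cronbach_alpha([[1, 2, 3], [1, 2, 3]])
# Same direction but different scale, so alpha drops below 1.
scaled = cronbach_alpha([[2, 4, 6], [1, 2, 3]])
```

Because alpha depends only on a ratio of variances, using the sample variance or the population variance gives the same value as long as one choice is applied throughout.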
[Chart: Cronbach's alpha per KEGG signaling pathway. Y-axis: Alpha (0 to 0.9). X-axis: KEGG Sig. Path. (04010, 04012, 04310, 04330, 04340, 04350, 04370, 04630, 04020, 04070, 04150).]
[Chart: GO term distance from root per KEGG signaling pathway. Y-axis: Distance (0 to 7). X-axis: KEGG Sig. Path. (04010, 04012, 04310, 04330, 04340, 04350, 04370, 04630, 04020, 04070, 04150).]
[1] Tong, et al., “Global mapping of the yeast genetic interaction network”, Science, 2004.
[2] Mehmet E Turanalp and Tolga Can, “Discovering functional interaction patterns in protein-protein
interaction networks”, BMC Bioinformatics, 2008.
[3] Ali Cakmak and Gultekin Ozsoyoglu, “Mining biological networks for unknown pathways”,
Bioinformatics, 23(20), 2007.
▣ This research was financially supported by the Ministry of Education, Science and Technology (MEST) and
the Korea Industrial Technology Foundation (KOTEF) through the Human Resource Training Project for
Regional Innovation.
Intelligent Service Integration Laboratory, http://isilab.icu.ac.kr