MICROARRAY GAMES
Download
Report
Transcript MICROARRAY GAMES
GAME THEORY
F. Ascione, R. Liuzzi, R. D’Apolito,
A. Carciati, C. Taddei
APPLIED TO
GENE
EXPRESSION
ANALYSIS
OUTLINES
Biology introduction: from DNA to protein
Microarray technology and introduction to game theory
Cooperative games
Axiomatic characterization of the Shapley value for the microarray games
Cooperative games applied to microarray analysis
base pairs
A gene is a segment of a chromosome made up of DNA
Genes contain instructions for making proteins
FROM GENE TO PROTEIN
Cells are governed by a chain of command
DNA RNA protein
Transcription: the synthesis of RNA
under the direction of DNA; produces
messenger RNA (mRNA)
Translation: the actual synthesis of a
polypeptide, which occurs under the
direction of mRNA
Each gene specifies a protein via
transcription and translation
THE GENETIC CODE
The genetic code consists of 64
triplets of nucleotides. These triplets
are called codons.
Gene 2
DNA
molecule
Gene 1
Gene 3
DNA strand
(template)
3
A
C
C
A
A A C C
U
G
G
U
U U G
G
A G
T
5
TRANSCRIPTION
mRNA
5
G
C U
C
A
3
Codon
TRANSLATION
Protein
Trp
Phe
Gly
Ser
Amino acid
The genetic code is redundant but not ambiguous;
no codon specifies more than one amino acid
Each codon specifies one of the
20 aminoacids used in the synthesis
of proteins
PROTEIN STRUCTURE AND FUNCTION
Proteins are fundamental components of all
living cells, performing a variety of biological
tasks
Each protein has a particular 3D structure
that determines its biological function
DNA gets
all the glory,
but proteins do
all the work!
DNA MUTATIONS
ORIGINAL
DNA
SEQUENCE
BASE
SUBSTITUTION
BASE
INSERTION
BASE
DELETION
SICKLE CELL ANEMIA
The change in amino acid sequence causes hemoglobin molecules to crystallize when oxygen
levels in the blood are low. As a result, red blood cells sickle and get stuck in small vessels.
MICROARRAY
WHAT THEY ARE?
Collection of microscopic DNA probes attached to
a solid surface such as glass, plastic, or silicon
chip forming an array (matrix)
WHAT THEY DO?
They allow to simultaneously examine the
presence of many genes within a DNA sample
HOW DO THEY WORK?
Hybridization technique: it consists in fixing all
segments of DNA (probes) on a support and label
the nucleic acid that we want to identify (target).
OBJECTIVE
To compare gene expression profile of a patient with that of a
healthy one to identify which genes are involved in the
disease.
This step is repeated for the
wild type sample and for the
patient sample
RED: cDNA of cancerous cells
GREEN: cDNA of wild type cells
Hybridization technique
This step is repeated for the
wild type sample and for the
patient sample
RED: the mutated gene is present
GREEN: the wild type gene is present
YELLOW: both genes are present at
different levels
FROM MICROARRAY TO GAMES
Game theory is a branch of mathematics that deals with interactive decision making,
namely with situations where two or more individuals (Players) make decision that
affect each other. Game theory is widely applied also in biomedical field.
Non-Cooperative games:
Individual behavior; agents cannot make
agreement except for those which are
established by the rules of the game.
Cooperative games:
There is negotiations among rational agents
who can make binding agreements about how
to play the game. The emphasis is on the
COALITIONS of the players.
1) With whom to cooperate?
2) How to share profits/costs?
CHARACTERISTIC FORM GAME
A cooperative game with transferable utility or TU-game, is a pair (N, v), where N denotes
the finite set of players and v : 2N → R the characteristic function, with v(ø) = 0.
A n-person game in characteristic form with player set N={1,2,…,n} is a pair Γ= (N, v)
(TU-games)
2N subsets of N, i.e. coalitions
N grand coalition; |S| cardinality of S
v: 2N → R s.t. v(ø) = 0 characteristic function describes how much collective payoff a set
of players can gain by forming a coalition
THE PIRATES GAMES
3- players game P1, P2, P3;
Treasure value 1000 doubloons;
River width d;
2
P1, P2, P3 possess a perch of length 𝑑.
3
N=(1,2,3)
𝑣 ∅ = 0; 𝑣 1 = 𝑣 2 = 𝑣 3 =0; 𝑣 1, 2 = 𝑣 1, 3 = 𝑣 2, 3 = 𝑣 1, 2, 3 = 1000
SOLUTION
Given a TU-game (N,v) a solution concept is a vector 𝜙 ϵ RN that represents
the allocation to each player.
AXIOMS
SOLUTION
Given a TU-game (N,v) a solution concept is a vector 𝜙 ϵ RN that represents
the allocation to each player.
There exist several solution concepts based on different notions of fairness.
THE SHAPLEY VALUE
Theorem: the Shapley value φ of the game <N, v> is a vector 𝜙 𝑣 =[ϕ1 𝑣 , … , 𝜙𝑛 𝑣 ]
given by the general formula:
𝜙𝑖 𝑣 =
𝑆⊆𝑁:𝑖∉𝑆
𝑠! 𝑛 − 1 − 𝑠 !
[𝑣(𝑆 ∪ 𝑖 − 𝑣 𝑆 ]
𝑛!
∀𝑖 ∈ 𝑁
for each i ∈ N, where s=|S| and n=|N| are the cardinality of coalitions S and N,
respectively.
The value 𝜙𝑖 is a weighted sum of marginal contributions of player i to all coalitions he
can join; it can be interpreted as the expected marginal contribution of each player when
he enters a coalition.
PROPERTIES
Characterization: 𝜙 𝑣 =[𝜙1 𝑣 , … , 𝜙𝑛 𝑣 ] is the only vector satisfying eff, sym,
dum, add;
Existence for each game <N, v>;
Uniqueness for each game <N, v>;
Computational difficult for n large.
THE PIRATES GAMES
𝑣 ∅ = 0; 𝑣 1 = 𝑣 2 = 𝑣 3 =0; 𝑣 1, 2 = 𝑣 1, 3 = 𝑣 2, 3 = 𝑣 1, 2, 3 = 1000
𝑖 = 1; 𝑐𝑜𝑎𝑙𝑖𝑡𝑖𝑜𝑛𝑠 𝑛𝑜𝑡 𝑐𝑜𝑛𝑡𝑎𝑖𝑛𝑖𝑛𝑔 1:
3𝜙
, 3(2,3)
𝜙1 𝑣 =∅𝜙,2 2𝑣 , =
𝑣 =
𝜙𝑖 𝑣 =
𝑆⊆𝑁:𝑖∉𝑆
𝑠! 𝑛 − 1 − 𝑠 !
[𝑣(𝑆 ∪ 𝑖 − 𝑣 𝑆 ]
𝑛!
=
+
SV =
1000
3
𝜙1 𝑣1000
= 1000
1000
;
; 3
3
3
0! 3 − 1 − 0 !
1! 3 − 1 − 1 !
𝑣 1 −𝑣 ∅ +
𝑣 1,2 − 𝑣 2 +
3!
3!
1! 3 − 1 − 1 !
1! 3 − 1 − 2 !
𝑣 1,3 − 𝑣 3 +
𝑣 1,2,3 − 𝑣 2,3 =
3!
3!
If you want to go fast,
go alone.
If you want to go far,
1
1
go0together.
=
1000 − 0 +
1000 −
=
3!
3!
- African proverbe -
1000
=
3
THE CORE
An imputation is a vector x ϵ RN satisfying
1. Efficency
2. Individual rationality
An imputation that satisfies the coalition rationality assumption is in the CORE
PROPERTIES
Existence not always
Uniqueness not always
Computation mathematical programming
Sometimes... the proportional rule is in the core (ω1 v (N), …., ωn v (N)),
with 0 ≤ ω1 ≤ 1, i = 1, …., n and
𝑛
𝑖=1 ωi
=1
THE PIRATES GAMES
𝑣 ∅ = 0; 𝑣 1 = 𝑣 2 = 𝑣 3 =0; 𝑣 1, 2 = 𝑣 1, 3 = 𝑣 2, 3 = 𝑣 1, 2, 3 = 1000
𝑥1, 𝑥2, 𝑥3 𝑖𝑠 𝑖𝑛 𝑡ℎ𝑒 𝑐𝑜𝑟𝑒 𝑖𝑓:
a)
b)
c)
d)
e)
x1 + x2 + x3=1000;
xi ≥ 0, i = 1,2,3;
x1 + x2 ≥ 1000;
x1 + x3 ≥ 1000;
x2 + x3 ≥ 1000.
2 𝑥1 + 𝑥2 + 𝑥3 ≥ 3000
𝑎𝑑𝑑𝑖𝑛𝑔 𝑐), 𝑑), 𝑒) 𝑤𝑒 ℎ𝑎𝑣𝑒 2 𝑥1 + 𝑥2 + 𝑥3 ≥ 3000 that is impossible by
assumption a).
C(v) = ∅
SV =
1000 1000 1000
; 3 ; 3
3
THE SHAPLEY VALUE FOR MICROARRAY GAMES
Microarray game are a class of cooperative games with transferable utility (TU-game)
Solution
Core
Shapley Value
The Shapley value of a microarray game has been proposed as an index suitable to evaluate the
role covered by each gene in realizing the association between the expression property and the
biological condition of the original cell.
But we are sure to use the Shapley value for microarray games?
Axiomatic characterization
PARTNERSHIP OF GENES
In order to characterize the Shapley value by means of properties with genetic
interpretation, the definition of partnership of genes takes a basic role.
In other words a group of genes S such that does not exist a proper () subset of S which
contributes in changing the worth of genes outside S.
Example
These two sets are
partnerships of genes
in the corresponding
microarray game
s1
g1 0
g2 0
g3 1
s2
1
1
0
s3
1
1
1
AXIOMATIC CHARACTERIZATION OF THE SHAPLEY
VALUE FOR THE MICROARRAY GAMES
We call relevance index for genes a solution 𝐹: ℳ 𝑁 → ℝ𝑁 on the class of microarray games
with the set of genes N as the set of players. The properties for relevance indexes, related to
the concept of partnership of genes, are the following:
Partnership rationality
The PR property determines a lower bound of the power of a partnership, i.e., the total
relevance of a partnership of genes in determining the onset of the tumor in the individuals
should not be lower than the average number of cases of tumor enforced by the partnership
itself.
AXIOMATIC CHARACTERIZATION OF THE SHAPLEY
VALUE FOR THE MICROARRAY GAMES
Partnership feasibility
The PF properties determines an upper bound of the power of a partnership, i.e., the total
relevance of a partnership of genes in determining the tumor onset in the individuals should
not be greater than the average number of cases of tumor enforced by the grand coalition.
AXIOMATIC CHARACTERIZATION OF THE SHAPLEY
VALUE FOR THE MICROARRAY GAMES
Partnership monotonicity
Disjoint
Equivalent
Exhaustive
The genes in the smaller partnership should receive not less power index than genes in the
larger partnership.
AXIOMATIC CHARACTERIZATION OF THE SHAPLEY
VALUE FOR THE MICROARRAY GAMES
Equal splitting
Each sample should receive the same level of reliability. So the power of a gene on two
samples should be equal to the sum of the power on each sample divided by two.
Example
AXIOMATIC CHARACTERIZATION OF THE SHAPLEY
VALUE FOR THE MICROARRAY GAMES
Null gene
A gene which does not contribute to change the value (of activations of the tumor) of any
coalition of genes, should receive zero power.
The aim of these properties is to state how a relevance index should behave in very
simple situations of genes interaction.
Theorem
The Shapley value is the unique solution which satisfies NG, ES,
PM, PR, PF on the class of Microarray Games.
How can we calculate the Shapley value of thousands of genes?
THE SHAPLEY VALUE FOR MICROARRAY GAMES
In general let N be a set of players and R ⊆ N . The unanimity games (N, uR) is defined as
follows:
1 𝑖𝑓 𝑅 ⊆ 𝑇
0 𝑜𝑡ℎ𝑒𝑟𝑤𝑖𝑠𝑒
Every TU-game (N, v) can be written as a linear combination of unanimity games in a
unique way:
∀ T ⊆ N,
𝑢𝑅 𝑇 =
𝑣=
𝜆𝑠 (𝑣)𝑢𝑠
𝑆⊆𝑁,𝑆≠ø
The coefficients 𝜆𝑠 are called unanimity coefficients of the game (N, v)
The general formula of the Shapley value for the game (N, v) is:
𝜙𝑖 𝑣 =
𝑆⊆𝑁:𝑖∉𝑆
𝑠! 𝑛 − 1 − 𝑠 !
[𝑣(𝑆 ∪ 𝑖 − 𝑣 𝑆 ]
𝑛!
∀𝑖 ∈ 𝑁
An alternative representation of the Shapley value can be given in terms of the unanimity
𝜆 (𝑣)
coefficient 𝑆𝑆 of a game (N, v):
𝜙𝑖 𝑣 =
𝑆⊆𝑁:𝑖∉𝑆
𝜆𝑆 (𝑣)
𝑆
∀𝑖 ∈ 𝑁
MICROARRAY GAMES
…
Array1
Array2
Array3
Level of expression of gene 5 in
the sample 4
MICROARRAY GAMES
GOAL: Quantifying the relative relevance of genes on the basis of the information provided by
microarray experiments, taking into account the level of interaction among the genes.
Microarray expression data from
disease samples
Discretized matrix
Microarray expression data from
normal samples
Decision rule:
Cutoffs
MICROARRAY GAMES
The most important gene is gene 2 followed by gene 1 and 3 with the same score
But…
MICROARRAY GAMES
... a typical experiment consists of a table of numbers with more than 22000 rows (genes)
and 60 of arrays (samples).
…2 22000
coalitions…
Specific software for microarray analysis
MICROARRAY GAMES
Top ten genes with highest Shapley value on the microarray game
Some of these genes were previously observed in association with other cancer type
REFERENCES
1. S. Moretti, "Statistical analysis of the Shapley value for microarray games." Computers
& Operations Research 37.8 (2010): 1413-1418.
2. S. Moretti, F. Patrone and S. Bonassi. "The class of microarray games and the
relevance index for genes.“, Top 15, (2007): 256-280.
3. L.S. Shapley, “A value for n-person games” Contributions to the Theory of Games II.
Annals of Mathematics Studies, 28, (1953): 307–317.