What SAT can do for Bioinformatics

Download Report

Transcript What SAT can do for Bioinformatics

What SAT can do for
BioInformatics ?
Compact Models and Efficient Algorithms
Haplotype Inference
• SAT and SAT extensions provide compact models
and efficient algorithms
• Problem Description: Given a set G of n genotypes,
find a set H of 2n haplotypes (not necessarily
distinct), such that for each genotype gi in G there is
at least one pair of haplotypes (hj,hk), with hk and hj
in H such that the pair (hj,hk) explains gi
• Efficient solvers are available for both SAT and
SAT extensions
• Technology used: SAT and Pseudo Boolean
Optimization
• SAT is a well known problem and a area of
intensive research
• Reference: J. Marques-Silva and I. Lynce, AAAI 06
Checking Pedigree Consistency
SAT
&
SAT Extensions
• Problem Description: Given a pedigree check if it is
consistent with the Mendelian laws of inheritance,
i.e., check if every individual inherits exactly one
allele from each of its parents
• Technology used: SAT by translation into CNF
• Reference: P. Manolios, M. Galceran Oms and S.
Oliva Valls, TACAS 07
Genome Rearrangement Distance
• Problem Description: Compute the exact genomic
distance between two genomes in the presence of
duplications, namely, determining the maximum
number of common intervals between the genomes
• Technology used: Pseudo Boolean Optimization
• Reference: S.Angibaud, G.Fertin, I.Rusu and
S.Vialette, RECOMB-CG 06
Protein Folding
• Problem Description: Given a sequence of
amino acids of a protein, determine the
structure to which the protein folds
• Technology used: Pseudo Boolean
Optimization (PBO)
• Current stage: currently a first attempt to
directly translate the HP Lattice Model to
PBO has been done
Other…?
• Besides SAT itself, other extensions of SAT might
be used to solve problems of BioInformatics:
o Pseudo Boolean Optimization
o Satisfiability Module Theories
o SAT Enumeration
o Counting SAT
o Quantified Boolean Formulas
o…
• Given the sequence
“PHPPHPHP” the
energy of the
conformation in the
right is -2
• Expected future steps:
 Inclusion of lower bounds that are able to
efficiently reduce the search space
 Introduction of techniques to break and
remove symmetries
António Morgado - [email protected]
Dr. João Marques Silva - [email protected]
This work is being supported by Microsoft Research
through its European PhD Scholarship Programme