Efficient Implementation of
Complex Interventions in Large
Scale Epidemic Simulations
Jiangzhuo Chen
Joint work with Yifei Ma, Keith Bisset, Suruchi
Deodhar, and Madhav Marathe
Winter Simulation Conference
December 14, 2011
Network Dynamics & Simulation Science Laboratory
Talk Outline
• Background
– Interventions in large scale epidemic simulations
– Indemics simulation framework
• Productivity Enhancement with Indemics
– Efficient intervention implementation
– Comparison results from real experiences
• Performance Modeling and Prediction
– Methodology: explained by examples
– Experiment results
• Summary
Large Scale Agent-Based Epidemic Simulation
• Disease diffusion in a population (millions of agents)
through agent-agent contacts (billions)
• Real-world intervention policies for epidemics can be very
complex and difficult to predefine or code.
• Many possible interventions with multiple configurable
parameters: large (factorial) simulation design
Ideally we would like:
• Fast simulation
• Capability to represent complicated realistic
interventions
• An appropriate experiment design for a given study with a
deadline
Complex Interventions
• Vaccinate randomly chosen people
• Vaccinate people with high degree in the contact
network
• Keep all school-age children home for 2 weeks
• Each county decides to close its schools if the
number of diagnosed students in the county
exceeds a threshold; students from closed schools
stay home.
• Same as above, plus for each student that stays
home, if age<12 then a guardian must stay
home too
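The last rule in this list can be sketched in a few lines of Python. This is only an illustration of the rule as worded on the slide, not Indemics or EpiFast code; the `Student` record, its fields, and the `stay_home_set` helper are hypothetical names introduced here.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Student:
    # Hypothetical record layout, assumed for this sketch only.
    pid: int
    age: int
    county: str
    guardian_pid: int
    diagnosed: bool = False


def stay_home_set(students, threshold):
    """Return the pids that must stay home under the county-level rule."""
    # Count diagnosed students per county.
    diagnosed_per_county = defaultdict(int)
    for s in students:
        if s.diagnosed:
            diagnosed_per_county[s.county] += 1

    # A county closes its schools when the count exceeds the threshold.
    closed = {c for c, n in diagnosed_per_county.items() if n > threshold}

    home = set()
    for s in students:
        if s.county in closed:
            home.add(s.pid)
            # A guardian stays home with every student younger than 12.
            if s.age < 12:
                home.add(s.guardian_pid)
    return home


students = [
    Student(1, 8, "A", 101, diagnosed=True),
    Student(2, 15, "A", 102, diagnosed=True),
    Student(3, 10, "B", 103, diagnosed=True),
    Student(4, 9, "B", 104),
]
home = stay_home_set(students, threshold=1)
```

Here county A closes (two diagnosed students), so students 1 and 2 stay home along with guardian 101; county B stays open.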
Indemics: System Architecture
Diagram components:
• Indemics database, running on a data server: semistructured
database, temporal database, relational database
• Indemics Server, running on head node of HPC
• Indemics Adapters
• HPC Epidemic Simulator (e.g. EpiFast)
• Interactive Client and Batch Client: Indemics web-interface client
on PC (analyst sees only this module)
• Arrow labels: New Interventions; Queries & Interventions; New
epidemic dynamics
Epidemic Intervention Implementation
• EpiFast alone: diffusion code (C++) plus intervention code (C++)
• EpiFast with Indemics: diffusion code (C++) in EpiFast;
intervention script, framework code (Java), and DBMS in Indemics
Scenario 1: Benefit of Indemics
• EpiFast is a fast epidemic simulation tool in our lab
• It can represent interventions of the form:
– if predefined global conditions and local conditions are
satisfied for a predefined set of nodes, then change node
properties and/or labels of edges incident on them
– (a) antiviral prophylaxis for randomly chosen people
– (b) keep all primary school students home
• Implementing the following in EpiFast took too much coding
effort (weeks):
– (a’) antiviral treatment for sick people
– (b’) keep all primary school students home and have their
guardians stay home too
• With Indemics, it took only hours to script (a’) or (b’)
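The intervention form described above (global condition, local condition, predefined node set, then changes to node properties and incident edge labels) can be sketched as follows. This is a minimal Python illustration, not EpiFast's C++ code; the dictionary-based graph and the names `apply_intervention`, `node_update`, and `edge_relabel` are assumptions made for the example.

```python
def apply_intervention(nodes, edges, node_set, global_cond, local_cond,
                       node_update, edge_relabel):
    """Apply one intervention; return the pids whose properties changed."""
    # The global condition is checked once for the whole population.
    if not global_cond(nodes):
        return []

    changed = []
    for pid in node_set:
        # The local condition is checked per node in the predefined set.
        if local_cond(nodes[pid]):
            nodes[pid].update(node_update)
            changed.append(pid)

    # Relabel edges incident on any changed node.
    changed_set = set(changed)
    for edge in edges:
        if edge["u"] in changed_set or edge["v"] in changed_set:
            edge["label"] = edge_relabel(edge["label"])
    return changed


# Example (b): keep all primary school students home.
nodes = {
    1: {"age": 8, "at_home": False},
    2: {"age": 9, "at_home": False},
    3: {"age": 40, "at_home": False},
}
edges = [
    {"u": 1, "v": 2, "label": "school"},
    {"u": 1, "v": 3, "label": "home"},
]
primary_students = [1, 2]  # the predefined node set
changed = apply_intervention(
    nodes, edges, primary_students,
    global_cond=lambda ns: True,
    local_cond=lambda props: props["age"] < 12,
    node_update={"at_home": True},
    edge_relabel=lambda label: "inactive" if label == "school" else label,
)
```

Case (b’) does not fit this shape as easily: the guardians are not a predefined node set but depend on which students end up staying home, which is the kind of relational lookup a database query expresses directly.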
Performance and Productivity
• We have been concerned about the performance of
HPC simulation tools
• Human effort is becoming the bottleneck
– Understand, implement, and verify new intervention
strategies designed by epidemiologists
– Set up simulations; run simulations
– Post simulation analysis
• Indemics: improve human productivity while
maintaining simulation performance
– Development cost reduction
Compare Different Ways to Implement
Interventions
• Development cost

                  EpiFast implementation      Indemics implementation
intervention      lines of code   effort      lines of script   effort
triggered         150             1 week      < 100             1 hour
targeted          600             12 weeks    < 100             1 hour
school/block      500             8 weeks     < 100             8 hours

• Triggered intervention: when the fraction of diagnosed school-age
children exceeds 20%, close all schools
• Targeted intervention: treat diagnosed school-age children with
antivirals
• School (block) intervention: vaccinate all people in any school (census
block) if over 5% of the people in that school (block) are diagnosed
Improvement in Study Total
Turnaround Time
Big Saving in Human Effort
Reasonable Performance Overhead
Scenario 2: Motivation for Performance Modeling
• An epidemiologist in our lab wanted to run a large
experiment (factorial design) with complex
interventions
• Simulation results were needed in a week
• We decided to use Indemics. But could the simulation finish
in time?
• We applied the performance model and predicted a running
time of two weeks
• The epidemiologist revised the experiment design (cut it in half)
• Simulation done in one week
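The arithmetic behind this scenario can be sketched in Python. All numbers below (factors, levels, replicates, minutes per run) are hypothetical and chosen only so that the full design comes out at two weeks; the slides give no such figures. The sketch also assumes runs execute one after another.

```python
from itertools import product

MINUTES_PER_DAY = 24 * 60


def predicted_days(factors, replicates, minutes_per_run):
    """Total predicted running time in days for a full factorial design."""
    # One cell per combination of factor levels.
    cells = list(product(*factors.values()))
    total_minutes = len(cells) * replicates * minutes_per_run
    return total_minutes / MINUTES_PER_DAY


# Hypothetical design: 4 x 3 x 2 = 24 cells.
full_design = {
    "intervention": ["triggered", "targeted", "school", "block"],
    "compliance": [0.3, 0.6, 0.9],
    "threshold": [0.05, 0.10],
}
# Revised design: drop one level of a two-level factor, halving the cells.
half_design = dict(full_design, threshold=[0.05])

full_days = predicted_days(full_design, replicates=10, minutes_per_run=84)
half_days = predicted_days(half_design, replicates=10, minutes_per_run=84)
deadline_days = 7
fits = half_days <= deadline_days
```

With these assumed inputs the full design is predicted at 14 days and the halved design at 7 days, which mirrors the decision described above.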
Example of Indemics Intervention Script
School intervention: provide vaccines to all students
in any school where more than 5% of the students are sick
initialization;
define School_Trigger as
SCHOOL_DIAGNOSED_TOTAL.persons > 0.05 * SCHOOL_INTERVENED.size;
reset table SCHOOL_INTERVENED intervened_day = NULL;
for Day = 1 to 10 do
  count new_diagnosed each school save_to SCHOOL_DIAGNOSED_TODAY;
  count accum_diagnosed each school save_to SCHOOL_DIAGNOSED_TOTAL;
  set SCHOOL_INTERVENED intervened_day = Day
    if intervened_day = NULL and School_Trigger = true;
  apply Vaccination to school in SCHOOL_INTERVENED
    where intervened_day = Day;
done
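To make the control flow of the script concrete, the following Python sketch emulates it on in-memory dictionaries. It is not how Indemics executes the script (Indemics translates it to SQL against its database); the function name and the input layout are assumptions made for illustration.

```python
def run_school_intervention(school_size, diagnosed_by_day, days=10,
                            fraction=0.05):
    """Return {school: day it was intervened, or None if never}."""
    # reset table SCHOOL_INTERVENED intervened_day = NULL
    intervened_day = {school: None for school in school_size}
    # Running totals, playing the role of SCHOOL_DIAGNOSED_TOTAL.
    total = {school: 0 for school in school_size}

    for day in range(1, days + 1):
        # count new_diagnosed / accum_diagnosed for each school
        for school, new_cases in diagnosed_by_day.get(day, {}).items():
            total[school] += new_cases

        # School_Trigger: accumulated diagnosed > fraction * school size,
        # applied only to schools that have not been intervened yet.
        for school, size in school_size.items():
            if intervened_day[school] is None and total[school] > fraction * size:
                intervened_day[school] = day
                # The real script would apply Vaccination to this
                # school's students here.
    return intervened_day


school_size = {"S1": 100, "S2": 200}
diagnosed_by_day = {
    1: {"S1": 3},
    2: {"S1": 3, "S2": 4},
    3: {"S2": 7},
}
result = run_school_intervention(school_size, diagnosed_by_day)
```

In this made-up input, S1 crosses 5% on day 2 (6 of 100) and S2 on day 3 (11 of 200).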
Translated into SQL
initialization;
define School_Trigger as
SCHOOL_DIAGNOSED_TOTAL.persons > 0.05 * SCHOOL_INTERVENED.size;
reset table SCHOOL_INTERVENED intervened_day = NULL;
    update SCHOOL_INTERVENED set intervened_day = -1;
for Day = 1 to 10 do
  count new_diagnosed each school save_to SCHOOL_DIAGNOSED_TODAY;
    insert into SCHOOL_DIAGNOSED_TODAY
    select school, count(pid) as persons, Day as diag_time from STUDENT,
    DIAGNOSED where pid = diagnosed_pid and diagnosed_time = Day
    group by school;
  count accum_diagnosed each school save_to SCHOOL_DIAGNOSED_TOTAL;
  set SCHOOL_INTERVENED intervened_day = Day
    if intervened_day = NULL and School_Trigger = true;
  apply Vaccination to school in SCHOOL_INTERVENED
    where intervened_day = Day;
    select pid from STUDENT, SCHOOL_INTERVENED where
    STUDENT.school = SCHOOL_INTERVENED.school and intervened_day = Day;
done
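The two translated queries can be tried out with Python's built-in `sqlite3` module. This is a stand-in only: the slides report Oracle 10g, and the table schemas below are guesses based on the column names that appear in the queries. The per-school count needs a `GROUP BY school` clause to run, and the manual `UPDATE` stands in for the trigger step, whose SQL translation is not shown on the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed schemas, inferred from the column names used in the queries.
conn.executescript("""
CREATE TABLE STUDENT (pid INTEGER PRIMARY KEY, school TEXT);
CREATE TABLE DIAGNOSED (diagnosed_pid INTEGER, diagnosed_time INTEGER);
CREATE TABLE SCHOOL_DIAGNOSED_TODAY (school TEXT, persons INTEGER,
                                     diag_time INTEGER);
CREATE TABLE SCHOOL_INTERVENED (school TEXT, size INTEGER,
                                intervened_day INTEGER);
""")
conn.executemany("INSERT INTO STUDENT VALUES (?, ?)",
                 [(1, "S1"), (2, "S1"), (3, "S2")])
conn.executemany("INSERT INTO DIAGNOSED VALUES (?, ?)",
                 [(1, 1), (2, 1), (3, 2)])
# intervened_day = -1 plays the role of NULL, as in the translation.
conn.executemany("INSERT INTO SCHOOL_INTERVENED VALUES (?, ?, ?)",
                 [("S1", 2, -1), ("S2", 1, -1)])

day = 1

# count new_diagnosed each school save_to SCHOOL_DIAGNOSED_TODAY
conn.execute("""
    INSERT INTO SCHOOL_DIAGNOSED_TODAY
    SELECT school, COUNT(pid) AS persons, :day AS diag_time
    FROM STUDENT, DIAGNOSED
    WHERE pid = diagnosed_pid AND diagnosed_time = :day
    GROUP BY school
""", {"day": day})
counts = conn.execute(
    "SELECT school, persons, diag_time FROM SCHOOL_DIAGNOSED_TODAY "
    "ORDER BY school").fetchall()

# Stand-in for the trigger step: mark S1 as intervened today.
conn.execute(
    "UPDATE SCHOOL_INTERVENED SET intervened_day = :day WHERE school = 'S1'",
    {"day": day})

# apply Vaccination to school in SCHOOL_INTERVENED where intervened_day = Day
targets = [row[0] for row in conn.execute("""
    SELECT pid FROM STUDENT, SCHOOL_INTERVENED
    WHERE STUDENT.school = SCHOOL_INTERVENED.school
      AND intervened_day = :day
    ORDER BY pid
""", {"day": day})]
```

On day 1 of this toy data, both diagnosed students belong to S1, so the count table holds a single row for S1 and the vaccination targets are its two students.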
Performance Prediction
• Prepare a performance lookup table for atomic SQL statements (atoms)
• Given an Indemics intervention script (S) for a study:
– Generate the corresponding SQL query statements (Q1, Q2, …,
Qi)
– Decompose the query statements into atoms
– Estimate the configuration of each atom (size of table and
result)
– Look up the running time AP(a) of each atom a
– For each query Q, compute the query time QP(Q) = sum of AP(a)
over its atoms
– Compute the script running time SP(S) = sum of the query times
QP(Q)
Predicted running time is only a rough estimate.
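The procedure above amounts to two nested sums over a lookup table. A minimal Python sketch follows; the atom names, table sizes, and millisecond timings are invented for illustration and are not measurements from the talk. It also assumes that the queries inside the script's loop run once per simulated day.

```python
# Hypothetical lookup table AP: (atom type, table size) -> time in ms.
ATOM_TIME_MS = {
    ("select", 10**6): 800,
    ("join", 10**6): 2500,
    ("group_count", 10**5): 300,
    ("insert", 10**3): 20,
    ("update", 10**3): 15,
}


def atom_time(atom):
    """AP(a): look up the running time of one atom."""
    return ATOM_TIME_MS[atom]


def query_time(atoms):
    """QP(Q): sum of AP(a) over the atoms of one query."""
    return sum(atom_time(a) for a in atoms)


def script_time(daily_queries, days):
    """SP(S): sum of QP(Q) over all queries, repeated for each day."""
    return days * sum(query_time(q) for q in daily_queries)


# Assumed decomposition of the three per-day queries of a script.
count_today = [("join", 10**6), ("group_count", 10**5), ("insert", 10**3)]
mark_schools = [("select", 10**6), ("update", 10**3)]
pick_students = [("join", 10**6)]

predicted_ms = script_time([count_today, mark_schools, pick_students],
                           days=10)
```

A real table would need an entry (or an interpolation rule) for every atom configuration a script can produce; exact-key lookup is used here only to keep the sketch short.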
Atomic SQL Statements
Examples of basic SQL statements in Indemics scripts
Example of Performance Lookup Table
• Data collected for Oracle 10g on server with 4
quad-core 2.4GHz Xeon processors and 64GB
memory
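One way such a lookup table could be collected is to time each atomic statement at several table sizes. The Python sketch below uses the built-in `sqlite3` module as a stand-in for the Oracle server described above; the table `T`, the two atom types, and the chosen sizes are all assumptions, and the timings it produces say nothing about the talk's hardware.

```python
import sqlite3
import time


def time_atom(conn, sql, repeats=3):
    """Return the best wall-clock time (seconds) over a few runs."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        conn.execute(sql).fetchall()
        best = min(best, time.perf_counter() - start)
    return best


def build_lookup(sizes):
    """Time two example atoms at each table size."""
    table = {}
    for n in sizes:
        conn = sqlite3.connect(":memory:")
        conn.execute("CREATE TABLE T (pid INTEGER, school INTEGER)")
        conn.executemany("INSERT INTO T VALUES (?, ?)",
                         ((i, i % 50) for i in range(n)))
        table[("select", n)] = time_atom(
            conn, "SELECT pid FROM T WHERE school = 7")
        table[("group_count", n)] = time_atom(
            conn, "SELECT school, COUNT(pid) FROM T GROUP BY school")
        conn.close()
    return table


lookup = build_lookup([1_000, 10_000])
```

Taking the best of several runs reduces noise from caching and scheduling; a production table would also record result sizes, since the prediction step estimates both table and result size for each atom.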
School Intervention in Relational Algebra
DB Query Time for School
Intervention: Predicted vs. Actual
Summary
• Indemics is a database-driven high-performance
high-productivity epidemic simulation
framework
• It enables realistic representation and efficient
implementation of complex intervention
strategies
• We provide a performance model that predicts simulation
running time before the simulation is run.