Towards Computational Epidemiology

Designing an Infectious Disease Outbreak Simulator
Armin R. Mikler
Department of Computer Science and Engineering
Department of Biological Sciences
University of North Texas
Towards Computational Epidemiology
• Address broader aspects of Epidemiology
– Disease Tracking, Analysis, and Surveillance
– High Performance Computing (HPC)
– Simulation
– Data visualization
• Design and implement computational tools
– investigating Tuberculosis outbreaks and risk assessment in spatially delineated environments
– modeling and simulating details of specific instances of Tuberculosis occurrences in North Texas
– applicable to a wide variety of disease outbreaks in spatially well-defined settings
• Contribute towards establishing computational epidemiology as a new research domain
Disease Outbreak Model
• Local
– Delineated space
• Factory, homeless shelter, school
– Airflow
– Heating and cooling
– Distances in feet
– Architectural properties
• Global
– Demography
– Socio-economics
– Travel
– Transportation
– Geography
– Culture
Global Stochastic Cellular Automata and the SWARM
• Top Layer: Cellular Automata (Global)
• Middle Layer: Cellular Automata (Regional)
• Bottom Layer: SWARM (Local)
The Focus of Study--Locality based
This study proposes to model the dynamics of
tuberculosis transmission within two facilities in
North Texas - a homeless shelter facility providing
both long and short-term occupancy with 800 beds,
and a factory.
Data was previously collected through interviews
during targeted surveillance screening of workers
in the factory and homeless people who use the
shelter.
Data has been de-identified.
Homeless Shelter Data and Findings
For the homeless shelter, the data set
comprises screening records for each
case including:
• Date tested (relative to t0)
• Status of tuberculosis
• Location in the facility
• Length of time spent in the facility
• Other variables
Results of initial analysis suggest
that TB risk is not uniformly
distributed but depends on the
location of the sleeping bed and
duration and frequency of stay at
the night shelter.
Factory Data and Findings
In addition to the basic screening records collected for the homeless
shelter, other available data for the factory include measures of
duration of exposure and proximity to an infected person, such as:
• Hours per week in the factory
• Hours per week in the same workspace
• Hours per week within 3 feet of infected person
• Usual work area.
Results of initial analysis
indicate that proximity of
workspace to infected person
was a major determinant of
infection.
In fact 100% of those who
worked directly in the same
space with one infected
person were infected with the
same strain of TB.
Factory Layout
The Paint Area
Air vent system
The Restroom
The Eating Area
Modeling Approaches
• Agent-based modeling
– Level of exposure
– Emergent behavior defined by individuals’ actions.
– The average number of bacilli that are emitted (through coughing, sneezing, etc.)
– Spatial interaction.
• Stochastic Cellular Automata
– Ambient temperature and airflow
– Particle Suspension and Dispersion
– Intrinsically stochastic.
From GIS data to Agent-Based
Simulation to Visualization
GIS/ Epidemiologic Data
Social Interactions
Particle suspension & Airflow
Visualization
Movement and Desire
[Figure: grid of cells labeled A, B, C, D, with S/D marked and an agent at (x_i, y_i)]
Desire Functions
[Figure: desire f(x) plotted against Time (t), for t from 1 to 96 and f(x) from 0 to 7, with curves for Thirst and Smoking and their thresholds (Thirst Threshold, Smoking Threshold)]
Example of functions that model different types of desire as a function of time.
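A minimal sketch of how threshold-driven desires like these could be coded. The slide does not give the function shapes, so the linear growth, the reset-on-satisfaction rule, and all rates and thresholds below are illustrative assumptions, and the names `Desire` and `simulate` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Desire:
    """A desire that grows over time until it crosses a threshold."""
    name: str
    rate: float        # growth per time step (assumed linear)
    threshold: float   # level at which the agent acts on the desire
    level: float = 0.0

    def step(self) -> bool:
        """Advance one time step; return True if the desire fires."""
        self.level += self.rate
        if self.level >= self.threshold:
            self.level = 0.0  # assumption: satisfying the desire resets it
            return True
        return False


def simulate(desires, steps):
    """Return (time, name) events for every desire that crossed its threshold."""
    events = []
    for t in range(1, steps + 1):
        for d in desires:
            if d.step():
                events.append((t, d.name))
    return events


if __name__ == "__main__":
    # Illustrative parameters only; not taken from the study.
    agent_desires = [Desire("thirst", rate=0.25, threshold=5.0),
                     Desire("smoking", rate=0.5, threshold=3.0)]
    for t, name in simulate(agent_desires, 96):
        print(t, name)
```

In an agent-based simulation, each fired event would presumably redirect the agent toward a location that satisfies the desire, such as the eating area.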
Particle Suspension and Dispersion
Settling of bacilli
Time
As a function of time, bacilli settle
toward the ground and may spread to
neighboring cells
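The settling and spreading behavior can be sketched as a stochastic update on a grid of airborne particle counts. This is only an illustration under stated assumptions: the probabilities `p_settle` and `p_spread`, the four-direction drift, and the function name `settle_step` are hypothetical and not taken from the study, and airflow and temperature are ignored.

```python
import random


def settle_step(air, settled, p_settle=0.1, p_spread=0.3, rng=random):
    """One stochastic update of airborne bacilli counts.

    air     -- 2D list of airborne particle counts per cell
    settled -- 2D list of settled particle counts, updated in place
    Returns the new airborne grid.
    """
    rows, cols = len(air), len(air[0])
    new_air = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            for _ in range(air[i][j]):
                u = rng.random()
                if u < p_settle:
                    settled[i][j] += 1          # particle reaches the ground
                elif u < p_settle + p_spread:
                    di, dj = rng.choice(((1, 0), (-1, 0), (0, 1), (0, -1)))
                    ni, nj = i + di, j + dj
                    if 0 <= ni < rows and 0 <= nj < cols:
                        new_air[ni][nj] += 1    # drifts to a neighboring cell
                    else:
                        new_air[i][j] += 1      # assumption: walls keep particles inside
                else:
                    new_air[i][j] += 1          # stays suspended in place
    return new_air


if __name__ == "__main__":
    rng = random.Random(42)
    air = [[0] * 5 for _ in range(5)]
    air[2][2] = 100                             # a single emission event
    settled = [[0] * 5 for _ in range(5)]
    for _ in range(20):
        air = settle_step(air, settled, rng=rng)
    print("airborne:", sum(map(sum, air)), "settled:", sum(map(sum, settled)))
```

Particles are conserved in this sketch: the airborne and settled totals always add up to the number emitted.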
• State of each cell C_{i,j} depends on
C_{i,j+1}, C_{i,j-1}, C_{i+1,j}, C_{i-1,j}, C_{i+1,j-1}, C_{i+1,j+1},
C_{i-1,j-1}, C_{i-1,j+1}
• The color of a cell changes based on the majority color of its
neighbors
[Figure: the grid at T0 and T1]
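The eight-neighbor majority rule above can be sketched as follows. The two-color encoding (0 and 1), the tie rule (a cell keeps its color), and the treatment of the border (off-grid neighbors are ignored) are assumptions made for the example; the slide does not specify them.

```python
# Offsets of the eight surrounding cells.
MOORE = [(-1, -1), (-1, 0), (-1, 1),
         (0, -1),           (0, 1),
         (1, -1),  (1, 0),  (1, 1)]


def majority_step(grid, offsets=MOORE):
    """Return the next generation of a two-color (0/1) majority automaton."""
    rows, cols = len(grid), len(grid[0])
    new = [row[:] for row in grid]              # synchronous update: read old, write new
    for i in range(rows):
        for j in range(cols):
            neigh = [grid[i + di][j + dj] for di, dj in offsets
                     if 0 <= i + di < rows and 0 <= j + dj < cols]
            ones = sum(neigh)
            zeros = len(neigh) - ones
            if ones > zeros:
                new[i][j] = 1
            elif zeros > ones:
                new[i][j] = 0
            # on a tie the cell keeps its current color (assumption)
    return new


if __name__ == "__main__":
    t0 = [[1, 1, 1],
          [1, 0, 1],
          [1, 1, 1]]
    for row in majority_step(t0):
        print(row)
```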
Visualization--‘Simulated’ Simulation
Legend:
• Healthy Person / Weaker Person / Sick Person / Removed
• Pathogen Content: Normal, Low/Med TB, High TB
• Obstructability: Floor, Obstacle, Wall
The Future: Clusters and the GRID
• Faster hardware and new high-bandwidth networks demand that we explore new cluster architectures.
• Larger, more complex cluster environments make it imperative to invest in new efficient and scalable tools.
• Grand Challenge problems will continue to drive the development of computing infrastructure.
• Distributed HPC will become commonplace. (DOE SciDAC)
• Management tools designed for single hosts or small clusters are likely NOT to scale.
• New types of middleware are needed to decouple the underlying distributed infrastructure from the applications.
Grid Layers…virtualization
[Figure: Grid layer diagram]
• Data Grid, Comp. Grid, Bio Grid
• Applications
• Application-Specific Grid Services (APIs)
• Middleware
• General Grid Services
• Grid Engines (six shown)
• Grid Access
• Internet / Private Networks
Callout: i.e., Scientific Discovery through Advanced Computing
Matter of Facts….
• There is increasing demand for harnessing computational resources.
• There is increasing demand for Grid-based computing in the private sector.
• Computing power will become a commodity like water, gas, etc.
• As with ISPs, Grid Access Providers (GAPs) will have to guarantee Quality of Service.
• Through Grid Services, we can provide a global computing infrastructure and facilitate services for a large number of application domains in the private and public sectors.
Examples: Healthcare, Education, Industrial R&D, Entertainment, Sciences, etc.
Cluster Semantics
[Figure: cluster diagram showing the Cluster Nodes, the MASTER NODE, and the Networking Interconnect]
People Behind - The Group
A Final Push to Control TB
Because the number of cases of
TB in the U.S. is lower than
it has ever been, we have the
opportunity to finally control TB in
the U.S.
Recent research suggests that
focusing on the dynamics of how
TB is transmitted in specific
locations is a much-needed final
push to TB control.
Homeless shelters and
overcrowded areas constitute
reservoirs of TB infection.
Yet little research exists on the
dynamics of localized TB
transmission in homeless
shelters.
Little attention has been given to
places like factories,
warehouses, healthcare facilities,
or schools where people work in
close proximity for long periods
of time.
Cray Y-MP & IBM Power4
• “Common” supercomputer in early 1990s
• ~$1 million from Cray
• Max speed: 2.3 gigaflops (record speed)
• Pentium III 1 GHz processors. Same processors sold “off the shelf”
• 64 gigaflops
• 198th on Top500 list (http://www.top500.org)
Big Mac @ Virginia Tech
• Macintosh G5 workstations
• InfiniBand networking interconnect
• 3rd fastest supercomputer in the world
Cellular Automata
(4 Neighbors – von Neumann)
• State of each cell C_{i,j} depends on
the neighbors C_{i,j+1}, C_{i,j-1}, C_{i+1,j}, C_{i-1,j}
• For example, the color of a cell depends on the majority
color of its neighbors
[Figure: the grid at T0 and T1]
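A short sketch of the four-neighbor version of the majority rule, here allowing any number of colors. The color labels, the tie rule (a cell keeps its color unless one neighbor color is strictly most common), the border handling (off-grid neighbors are ignored), and the function name `von_neumann_step` are assumptions made for illustration.

```python
from collections import Counter

# Offsets of the four orthogonal neighbors (up, down, left, right).
VON_NEUMANN = [(-1, 0), (1, 0), (0, -1), (0, 1)]


def von_neumann_step(grid):
    """Return the next generation under a four-neighbor majority rule."""
    rows, cols = len(grid), len(grid[0])
    new = [row[:] for row in grid]              # synchronous update
    for i in range(rows):
        for j in range(cols):
            counts = Counter(grid[i + di][j + dj] for di, dj in VON_NEUMANN
                             if 0 <= i + di < rows and 0 <= j + dj < cols)
            ranked = counts.most_common(2)
            # Adopt the top color only if it strictly beats the runner-up.
            if ranked and (len(ranked) == 1 or ranked[0][1] > ranked[1][1]):
                new[i][j] = ranked[0][0]
    return new


if __name__ == "__main__":
    t0 = [["w", "b", "w"],
          ["b", "w", "b"],
          ["w", "b", "w"]]
    for row in von_neumann_step(t0):
        print(" ".join(row))
```

On a checkerboard every cell is surrounded only by the opposite color, so each step inverts the pattern.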