The Role of Subway Travel in an Influenza Epidemic: a New York
City Simulation
Philip Cooley^a, Shawn T. Brown^b,c, James Cajka^a, Bernadette Chasteen^a, Laxminarayana
Ganapathi^a, John J. Grefenstette^b, Craig R. Hollingsworth^a, Bruce Y. Lee^b, Burton Levine^a,
William D. Wheaton^a, Diane K. Wagener^a
^a RTI International, Research Triangle Park, NC, USA.
^b University of Pittsburgh, PA, USA.
^c Pittsburgh Supercomputing Center, Pittsburgh, PA, USA.
Correspondence: Phil Cooley, RTI International, 3040 Cornwallis Road, P.O. Box 12194,
Research Triangle Park, NC 27709, USA.
E-mail: pcc@rti.org
Keywords: influenza, pandemic, H1N1, mathematical model, subway transportation, commuting
behaviour
SUPPLEMENTARY INFORMATION
Contents
1. Agent Data
Region included
Generating Synthesized Households and Persons
Household Size and Age Structure of the Synthetic NYC Population
School Data and Allocation Model
Data Sources for Schools
Allocating Students to Public Schools
Allocating Students to Private Schools
Assignment Results for Public and Private schools
Data Sources for Schools and Assignments
Roads
Private Schools
Assignment Results for Public and Private Schools
Workplace Data and Allocation Model
Commuting Data
2. Model Details
Details of the Transmission Model
Infection seeding
Interventions
Computational resources
3. Natural history and transmission parameters
Pandemic influenza model parameterization
Contacts within Household
Estimation of R0 from inter-pandemic influenza household data
Assumptions about Population Behavior during a Pandemic
Transmissibility Scenarios
Sensitivity Analyses
Worker/Student Ratio
Transmissions by Workers Employed Outside of NYC
Within-place Group Structure and Targeting
Proportion of Infections that become Clinical Cases
Behavior of Symptomatic Individuals
Parameter Estimation Strategy Summary
Estimates of the Proportion of Infections that Occur on the Subway
Figures
Figure SI-1: The New York City’s Five Boroughs Region
Figure SI-2: Impact of Reducing Contacts on Subways for a R0 = 1.4 Epidemic
Figure SI-3: Impact of Vaccinating NYC Adults
Figure SI-4: Latent Density (days 1-3) and Incubation Density (days 4 – 7)
Figure SI-5. Transmissibility Scenarios: with and without Immunity
Figure SI-6. Comparison of Calibration Rules for R0 = 1.4 Epidemics
Figure SI-7. Comparison of Baseline Epidemic with and without non NYC Workplace Infections
Figure SI-8. Comparison of Epidemic Curves for the 67% versus 50% Symptomatic Case Assumption
Figure SI-9. Comparison of Epidemic Curves for the 25%, 50% - Baseline, 75% Stay Home assumptions.
Tables
Table SI-1: Synthetic population age distribution compared to Census data for NYC
Table SI-2: Synthesized household size compared to Census data for NYC
Table SI-3: Student school Assignment Summary by Age, Borough and Private versus Public Schools
Table SI-4: Distribution of Firms by Size and Borough
Table SI-5: Distribution of Workers by Borough of Residence
Table SI-6: Estimated Distance-to-Work Distribution Derived from Census Commuting Data and Synthesized Household Locations
Table SI-7: Transmission Parameters
Table SI-8: Effect of Subway Specific Contact Reducing Interventions
Table SI-9: Effect of Contact Reducing Interventions
Table SI-10: Estimated Person-to-Person Contact Values (Contacts per Day)
Table SI-11: Age Specific Attack Rates (AR) for Alternative Calibration Methods
Table SI-12: Estimates of Subway Rider Infections
1. Agent Data
Region included
Our region of study is shown in dark shading on Figure SI-1; it includes New York City’s five
borough region. We used a proportional iterative method developed by Beckman, et al. [1] to
generate an agent population from the US Census Bureau’s Public Use Microdata Sample (PUMS) files
and Census aggregated data [2]. See Wheaton, et al. for a detailed description [3]. Our model
contained a total of 7,847,465 computer agents, or virtual people. Each person resided in one of
the five boroughs and had a set of socio-demographic characteristics and daily behaviors that
included age, sex, employment status, occupation, and household location and membership. A
total of 2,005,024 people were under 18 years of age, and 848,590 were over 65.
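The proportional iterative method can be illustrated on a small two-way table. The sketch below is a generic iterative proportional fitting routine, not the authors' implementation: the function name `ipf`, the seed table (standing in for a PUMS cross-tabulation), and the marginal targets (standing in for Census aggregate counts) are all hypothetical.

```python
def ipf(seed, row_targets, col_targets, iterations=100, tol=1e-9):
    """Scale a seed table until its row and column sums match the targets.

    seed: list of rows (hypothetical sample cross-tabulation).
    row_targets, col_targets: hypothetical aggregate totals; they must
    share the same grand total for the fit to converge.
    """
    table = [list(row) for row in seed]
    for _ in range(iterations):
        # Row step: rescale each row to its target total.
        for i, row in enumerate(table):
            row_sum = sum(row)
            if row_sum > 0:
                factor = row_targets[i] / row_sum
                table[i] = [value * factor for value in row]
        # Column step: rescale each column to its target total.
        for j, target in enumerate(col_targets):
            col_sum = sum(row[j] for row in table)
            if col_sum > 0:
                factor = target / col_sum
                for row in table:
                    row[j] *= factor
        # Stop once the row sums are also within tolerance.
        worst = max(abs(sum(row) - t) for row, t in zip(table, row_targets))
        if worst < tol:
            break
    return table


# Hypothetical example: 2 age groups by 2 household sizes.
fitted = ipf([[1.0, 2.0], [3.0, 4.0]], [40.0, 60.0], [35.0, 65.0])
```

The fitted cell values keep the association pattern of the seed while matching both sets of totals; households would then be drawn from the sample in proportion to those cells.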
Figure SI-1: The New York City’s Five Boroughs Region
Generating Synthesized Households and Persons
We used three primary data sources to generate the synthesized agents and households. All three
data sources are produced by the US Census Bureau.
US Census Bureau TIGER Data. The TIGER (Topologically Integrated Geographic Encoding
and Referencing) data provide the spatial context for decennial Census data collection [4].
TIGER defines, among many other things, the boundaries of states, counties, Census tracts, block
groups, and blocks. Census tabulation data are aggregated into these various geographic
boundaries. The smallest Census geographic boundary for which the full suite of Census
variables (including socioeconomic variables) is available is the Census block group. In addition,
the TIGER files include data on boundaries of bodies of water and road networks, which we also
used in generating the synthesized database.
Summary File 3 (SF3) Data. The SF3 data contain the demographic variables from the Census,
organized and aggregated to many different geographic boundaries [5]. Data variables on
population and housing are available in these files.
Public Use Microdata Sample (PUMS). The PUMS data contain records representing a five
percent sample of responses to Census long-form questionnaires that retain family structure
information [6]. Data on households (including number of persons in the household, number of
bedrooms, age of building, access to telephone service, type of heating, mortgage data, and many
other variables) are provided. PUMS also provides data on individuals within each household
(including age, sex, ethnicity, language spoken, school enrollment, occupation, travel time to
work, military service, and many other variables). In addition, the PUMS data set maintains
linkages between individuals and households, thus allowing the household population structure to
be brought forward through further analyses.
The PUMS data are available for predefined Census areas known as Public Use Microdata Areas
(PUMAs). PUMAs are defined by each state, rather than by the US Census Bureau. The Census
Bureau requires that each PUMA contain about 100,000 persons, but otherwise, states have wide
latitude to define the shape and extent of each PUMA. PUMAs tend to be relatively small in
densely populated areas and relatively large in sparsely populated areas.
Household Size and Age Structure of the Synthetic NYC Population
Table SI-1 compares the age structure of the NYC synthetic population with the 2000 Census
data. Overall, the comparisons are close, with slightly more synthetic agents under 20 years of
age and over 70 years of age compared to the Census data. Note that the synthetic data only
represented household dwellers, whereas the Census data included persons living in group
quarters (prisons, university dorms, and nursing homes).
Table SI-1: Synthetic population age distribution compared to Census data for NYC
Age       Synthetic*    Census**
0-4          565,376     532,676
5-9          579,563     562,978
10-19      1,056,597   1,047,294
20-39      2,488,302   2,608,925
40-69      2,508,337   2,577,237
70+          649,290     593,334
Total      7,847,465   7,922,444
* Household counts only
** Household + non household counts
Table SI-2 compares the distribution of household sizes of the NYC synthetic population with the
2000 Census data, comparing household occupants only for both sources of data. The numbers
agree very closely.
Table SI-2: Synthesized household size compared to Census data for NYC
Number of Household Members       Model      Census
1                               964,234     961,941
2                               802,494     801,950
3-4                             866,228     868,563
5-6                             309,364     310,022
7+                               80,154      80,001
Total                         3,022,474   3,022,477
School Data and Allocation Model
Our model also depicted NYC schools and assigned persons of school age in the synthetic
population to schools using methods described below. To begin, school age individuals were
assigned to either public schools or private schools. The information about the type of school that
individuals might potentially attend is available from the 2000 Census of Population and Housing
Public Use Microdata sample. Cajka et al. describe these data and assignment methods [7].
Assignment methods depend on the school types.
Data Sources for Schools
To identify the schools in our study area and their location, we used a database that included all
public and private NYC schools. The schools’ addresses were geocoded and used to assign
school-aged children in the synthetic data to schools. The dataset included information on
numbers of students by school year, which we used to specify an age-specific student capacity for
each school.
The National Center for Education Statistics (NCES) [8] maintains downloadable files, available
at http://nces.ed.gov/ccd/bat, which contain information about all known public schools in the
United States. Using these files, we were able to retrieve enrollment data by grade for each US
public school, as well as additional information, including the school’s name, address, and NCES
School ID. We converted these data into a text file and sent it to Tele Atlas, North America, to be
geocoded according to the school’s address. With the Environmental Systems Research
Institute’s (ESRI’s) ArcGIS software product, we then used the latitude and longitude pairs in the
geocoded data set to convert the text file into a spatial data layer. We interactively processed this
spatial data layer to resolve any ambiguous geocoding results (e.g., address not found, Post
Office box address), using a variety of Internet mapping resources and aerial photography. We
also checked to ensure that, at a minimum, the school was located in the correct county. Spot
checking revealed that overall, schools were very well located. After the geographic locations
were checked, we loaded the data into a SQL Server database running ESRI’s ArcSDE
middleware.
Allocating Students to Public Schools
Which public school a student attends depends on the individual school district’s specific goals
and on many other factors, such as geographic proximity, socioeconomic
characteristics, physical barriers, availability of busing, and politics. There is no one formula or
set of criteria that applies to school assignments nationwide. To create a process by which
students were assigned to public schools in NYC, we used the method described in Cajka et al.
[7]. The major underlying assumptions we used were:
• Geographic proximity is a major criterion for making assignments.
• Students are assigned to a school on the basis of distance along a network (roads) rather
than distance along a straight line.
• Students attend school only in their Borough of residence.
• Students are assigned to a school according to the school’s capacity for their grade.
• No special allowances are made to assign siblings to the same school, other than the fact
that they shared the same geographic location and therefore should be assigned to the
closest school that had capacity for their grade levels.
We selected the LocateAllocate command from within the ArcPlot module of the workstation
version of ArcGIS to assign students to schools. One of the required parameters was specifying a
network along which the assignments could occur. In this case, the roads network connected
potential students to their neighboring schools. The US Census Bureau’s 2000 TIGER/Line files
supplied the roads data.
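The assignment assumptions listed above can be sketched as a simple greedy procedure. This is a simplified stand-in for the ArcPlot LocateAllocate solver, not a reproduction of it: the function name, the record layouts, and the precomputed road-network distance table `network_km` are hypothetical.

```python
from typing import Dict, List, Optional, Tuple


def assign_public_students(
    students: List[dict],
    schools: List[dict],
    network_km: Dict[Tuple[str, str], float],
) -> Dict[str, Optional[str]]:
    """Assign each student to the nearest in-borough school with a free seat.

    students: records with "id", "borough", "grade" (hypothetical layout).
    schools: records with "id", "borough", "capacity" ({grade: seats}).
    network_km: road-network distance for each (student id, school id) pair.
    """
    remaining = {s["id"]: dict(s["capacity"]) for s in schools}
    assignment: Dict[str, Optional[str]] = {}
    for student in students:
        # Candidates: same borough and at least one seat left for the grade.
        candidates = [
            school["id"]
            for school in schools
            if school["borough"] == student["borough"]
            and remaining[school["id"]].get(student["grade"], 0) > 0
        ]
        if not candidates:
            assignment[student["id"]] = None  # left unassigned
            continue
        nearest = min(candidates, key=lambda sid: network_km[(student["id"], sid)])
        remaining[nearest][student["grade"]] -= 1
        assignment[student["id"]] = nearest
    return assignment
```

Because siblings share a location, they receive the same nearest school whenever it still has seats for both grades, which matches the last assumption without any special sibling rule.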
Allocating Students to Private Schools
The principal assumption in making private-school assignments was that while students could
attend a school anywhere within the NYC region, they would be more likely to attend a school
close to their residence. To start this process, we extracted persons aged 4 to 17 years whose
enrollment code in the Census data indicated they attended private school from the synthetic
population database, and we extracted private schools with total enrollment greater than zero
from the schools layer.
The Cajka et al. private-school assignment process divided the area around each private school
into three concentric rings. The first ring extended out from the school location to a distance of
10 km; the second ring started at 10 km and extended out to 15 km; and the third ring extended
from 15 km to 20 km. The assignment process used the ArcPlot Reselect command and selected
50 percent of the students from the first ring, 25 percent from the second ring, and 25 percent
from the third ring. We established these proportions after running the command several times
and after reviewing the results for assignment completion rates. Despite our having established
these proportions, at the end of the process not all privately enrolled students were assigned to a
private school. Students were unassigned because the private schools reached their capacity, or
because privately enrolled students lived more than 20 km from a private school that taught their
grade. To assign these students, we created a private-school post-processing step that selected
unassigned students and assigned them to the nearest private school within 80 km that taught
their grade. This step made the allocation much more complete, although it did overfill some
schools. These assignments are shown in Table SI-3.
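The ring-based selection and the post-processing step can be sketched as follows. This is an illustration under stated assumptions rather than the ArcPlot Reselect workflow: the function names are hypothetical, ring quotas are rounded down, ring boundaries are treated as half-open intervals, and distances are assumed to be precomputed.

```python
import random

# (inner km, outer km, share of the school's seats) for the three rings.
RINGS_KM = [(0.0, 10.0, 0.50), (10.0, 15.0, 0.25), (15.0, 20.0, 0.25)]


def select_ring_students(capacity, candidates, rng):
    """Pick students for one school, ring by ring.

    candidates: list of (student_id, distance_km) for unassigned students
    in the right grades. Each ring's quota is drawn at random.
    """
    chosen = []
    for inner, outer, share in RINGS_KM:
        quota = int(capacity * share)  # assumption: round quotas down
        in_ring = [sid for sid, km in candidates if inner <= km < outer]
        rng.shuffle(in_ring)
        chosen.extend(in_ring[:quota])
    return chosen


def nearest_within(distances_km, max_km=80.0):
    """Post-processing: nearest school teaching the grade within max_km.

    distances_km: {school_id: distance_km}. Capacity is deliberately
    ignored here, which is why this step can overfill some schools.
    """
    eligible = {sid: km for sid, km in distances_km.items() if km <= max_km}
    return min(eligible, key=eligible.get) if eligible else None
```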
Assignment Results for Public and Private schools
We included 2,073 public and private schools in the NYC region, attended by 1,400,430
students of school age (5 – 17). Table SI-3 below identifies the total population of
students assigned by age within each of the five boroughs.
Our model also depicts NYC public schools, and assigns synthetic students to schools using the
methods described below. School age individuals were selected for assignment to public or
private schools. The information about type of school that individuals might potentially attend is
available from the 2000 Census of Population and Housing, Public Use Microdata sample. Cajka
et al. describe these data and assignment methods [7]. Again, assignment methods depend on the
school types.
We used a database that included all public and private NYC schools and had each school’s
address geocoded. We used these data to assign children of school age in the synthetic data to
schools. The dataset included information on number of students by school year, which was used
to specify an age-specific student capacity for each school. Again, as there is no one formula or
set of criteria applicable to school assignments nationwide, we used the Cajka et al.
method to create a process to assign students to public schools in NYC [7].
The major underlying assumptions are:
• Geographic proximity is a major criterion for making assignments.
• Students are assigned to a school based on distance along a network (roads) rather than
distance along a straight line.
• Students attend school only in their Borough of residence.
• Students are assigned to a school according to the school’s capacity for their grade.
• No special allowances are made to assign siblings to the same school, other than the fact
that they shared the same geographic location and therefore should be assigned to the
closest school that had capacity for their grade levels.
Data Sources for Schools and Assignments
As described above, we used the NCES downloadable files, which contain
information about all known public schools in the United States [8]. We used these files to
retrieve enrollment data by grade for each US public school, as well as the school’s name, address,
and NCES School ID; we then converted these school data into a text file and sent it to Tele
Atlas, North America, where it was geocoded. Using ArcGIS, we converted the text file into a
spatial data layer, then interactively processed this spatial data layer to resolve any ambiguous
geocoding results (e.g., address not found, Post Office box address), using a variety of Internet
mapping resources and aerial photography. As before, we checked that each school was
located in the correct county and spot-checked that
schools were well located. After checking the geographic locations, we loaded the data into a
SQL Server database running ESRI’s ArcSDE middleware.
Roads
Again, we used the LocateAllocate command from within the ArcPlot module of ArcGIS to
assign students to schools. One of the required parameters was specifying a network along which
the assignments could occur. In this case, we used the roads network to connect students to their
neighboring schools, and the US Census Bureau’s 2000 TIGER/Line files provided the roads
data.
Private Schools
Our principal assumption in making private-school assignments was that students could attend a
school anywhere within the NYC region but would more likely attend a school closer to their
residence. To start this process, we extracted persons aged 4 to 17 years whose Census data
enrollment code indicated they attended private school from the synthetic population database.
We also extracted private schools with total enrollment greater than zero from the schools layer.
Again, the private-school assignment process divided the area around each private school into
three concentric rings starting at the school out to 10 km, from 10 km to 15 km, and from 15 km
to 20 km. The assignment process used the ArcPlot Reselect command and selected 50 percent of
the students from the first ring, 25 percent from the second, and 25 percent from the third.
We established these proportions after running the command several times and after reviewing
the results for assignment completion rates. Despite our having established these proportions, not
all privately enrolled students were assigned to a private school, either because private schools reached
their capacity or because privately enrolled students lived more than 20 km from a school that taught
their grade.
To assign these students, we created a private-school post-processing step that selected
unassigned students and assigned them to the nearest private school within 80 km that taught
their grade. In general this step made the allocation much more complete, although it did overfill
some schools. The student-to-school assignments are shown in Table SI-3.
Assignment Results for Public and Private Schools
A total of 2,073 public and private schools were included in the NYC region with a total of
1,400,430 students of school age (5 – 17) in attendance. Table SI-3 below identifies the total
population of students assigned by age within each of the five boroughs.
Table SI-3: Student school Assignment Summary by Age, Borough and Private versus Public Schools
Schools           Age        005       047       061       081      085       Total
Public Schools    5-11   146,897   218,911    82,972   172,955   36,245     657,980
Public Schools    12-14   53,250    89,187    31,928    65,948   13,855     254,168
Public Schools    15-18   50,537    86,038    31,374    65,653   12,463     245,905
Total Public             250,524   394,136   146,274   304,556   62,563   1,158,053
Private Schools   5-11    22,190    51,438    22,255    31,482   10,693     138,058
Private Schools   12-14    8,478    20,099     9,017    12,671    4,400      54,665
Private Schools   15-18    8,473    17,524     8,001    10,977    4,682      49,657
Total Private             39,149    89,061    39,273    55,130   19,775     242,380
Total All                289,665   483,197   185,547   359,686   82,338   1,400,430
Column headings are county FIPS codes: 005 = Bronx, 047 = Brooklyn (Kings), 061 = Manhattan (New York), 081 = Queens, 085 = Staten Island (Richmond).
The total of 1,400,430 students includes approximately 1,158,000 public school students and 242,000 private
school students. This compares favorably with a total elementary and high school enrollment of
1.3 million children reported for the years 2006-2008 by the US Census Bureau, American
FactFinder (http://factfinder.census.gov/servlet/NPTable?_bm=y&-geo_id=16000US3651000&qr_name=ACS_2008_3YR_G00_NP01&-ds_name=&-redoLog=false).
Workplace Data and Allocation Model
To create a network of employed agents linked to their places of employment, we needed to
specify an assignment process that established a realistic workplace social network in our study
area. Assigning workplaces required a different approach and different data sources than the
student-to-school assignments. In particular, the assumption that people live close to their
workplace does not generally hold.
We needed two sources of information to implement this process: (1) a count of the number of
persons who lived in one Census tract but worked in another and (2) a count of firms by size by
the same Census tract. The US Census Bureau published the Census 2000 Special Tabulation
Product 64 (STP64) [9], which summarizes the number of persons by Census tract of work and
Census tract of residence, combined. In addition, the commercial company InfoUSA has
compiled the number of US firms by firm size category and Census block group. We used the
InfoUSA firm data to create synthetic workplaces and locate them at the centroid of the block
group indicated by the firm’s address. We then assigned workers to those workplace sites using
the STP64 data. Specifically, workers who reported working in a specific block group were assigned
at random to a firm located within that block group. The workplaces also included schools,
hospitals and other types of institutions that could be used to specifically track special synthetic
agents such as teachers, health care workers, and others.
We also had to account for persons not assigned a specific work tract in the STP64 data in this
process. In most cases, the STP64 data file contained the number of persons who lived in one
Census tract and worked in another. However, some records listed the number of persons who
lived in one Census tract and worked somewhere in that same county. To assign these workers,
we apportioned them to tracts within the county in accordance with the number of other persons
working in those tracts. This meant that the tracts already employing the most people received
most of these additional workers. We also used basic US Census counts to calculate the
percentage of persons who were 15 to 54 years old and who were 55 to 74 years old for each
Census tract. This calculation allowed the program to assign workers proportionately in areas
with a larger proportion of older people (e.g., retirement areas) and in areas with a larger
proportion of younger people (e.g., college towns).
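The proportional apportionment of workers whose workplace is known only at the county level can be sketched as below. The function name and the use of largest-remainder rounding to obtain whole workers are assumptions for illustration; the text specifies only that allocation follows the number of other persons already working in each tract.

```python
def apportion(unplaced, workers_by_tract):
    """Split `unplaced` county-level workers across tracts.

    workers_by_tract: {tract_id: workers already assigned to that tract}.
    Shares are proportional to existing counts; fractional workers are
    resolved with largest-remainder rounding (an assumption).
    """
    total = sum(workers_by_tract.values())
    if total == 0:
        raise ValueError("no placed workers to apportion against")
    exact = {t: unplaced * n / total for t, n in workers_by_tract.items()}
    allocation = {t: int(x) for t, x in exact.items()}
    leftover = unplaced - sum(allocation.values())
    # Give the remaining workers to tracts with the largest fractional parts.
    by_fraction = sorted(exact, key=lambda t: exact[t] - allocation[t], reverse=True)
    for tract in by_fraction[:leftover]:
        allocation[tract] += 1
    return allocation


# Hypothetical county with three tracts and 10 county-only workers.
example = apportion(10, {"A": 600, "B": 300, "C": 100})
```

As the text notes, tracts that already employ the most people receive most of the additional workers.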
The NYC five borough region included 184,617 workplaces with 2,856,727 employees. The
distribution of firms by size and borough is presented in Table SI-4. The distribution of firms by
size is:
• 4,857 with 100 or more employees,
• 6,019 with 50 to 99 employees,
• 9,216 with 20 to 49 employees,
• 21,425 with 10 to 19 employees, and
• 143,100 with fewer than 10 employees.
Nearly 70% of these workers commute to work five to seven days a week by subway, by bus, by
their own vehicles, or as part of a car pool, but a significant percentage (> 16%) walk to work
[10].
Table SI-4: Distribution of Firms by Size and Borough
Size       Total    Bronx   Brooklyn   Manhattan   Queens   Staten Is.
1-9      143,100   12,590     30,003      66,291   28,425        5,791
10-19     21,425    1,614      5,184       9,886    3,807          934
20-49      9,216      847      1,896       4,551    1,598          324
50-99      6,019      580      1,297       2,811    1,101          230
100+       4,857      337        916       2,452      704          178
Total    184,617   15,968     39,296      85,991   35,635        7,457
The 184,617 firms in NYC employ a total of 2,856,727 workers who live in the five borough
region. This is more than 91.5% of the total workers living in NYC.
Table SI-5 presents information on the distribution of NYC workers by borough of residence.
Approximately eight (8.09) percent of people who live in NYC work outside of the five borough
region, mainly in other NY state locations but also in New Jersey and Connecticut. The
Other – Travel row in Table SI-5 represents the less than one half percent of persons who were traveling
beyond NYC and its suburbs when they filled out the STP64 part of the census survey. We
assigned these workers randomly to firms within their borough of residence.
In terms of risk, the eight percent of NYC workers who work outside of the city are at risk for
infection from their fellow workers. If transmission of influenza occurs at the workplace, they
would return to NYC and transmit the infection via their household, subway and community
mixing behaviors. Consequently, we assess the impact of including this source of transmission in
the Sensitivity Analysis section below.
Table SI-5: Distribution of Workers by Borough of Residence
Borough of Residence   Number of Workers   % working in NYC   % of All Workers
Bronx                            214,978               7.53               6.89
Brooklyn                         571,976              20.02              18.32
Manhattan                      1,506,944              52.75              48.26
Queens                           459,598              16.09              14.72
Staten Is.                       103,231               3.61               3.31
NYC                            2,856,727              100.0              91.50
Outside NYC                      252,564                  -               8.09
Other – Travel                    12,680                  -               0.41
Total                          3,121,971                  -              100.0
Commuting Data
The number of commuters by their mode of travel is available from the 2000 US Census data.
We used these data to estimate the number of synthetic agents commuting to work by subway.
We obtained non-commuting patterns of travel from the New York Household Travel Patterns: A
Comparison Analysis [10], and the 1997/8 Regional Travel/Household Interview Survey [11].
The 2006 Community NYC Health survey identified characteristics of subway riders [12]. We
assigned each virtual agent a probability of using the subway based on the model developed by
Levine et al. [13] that linked agent traits to the incidence of subway ridership.
Because the STP64 data identify a workplace only to the level of the census block group, not by
specific address, we used the centroid of the workplace block group as the
commute destination. Using the location of
the household and the workplace destination, we constructed distributions of the distance
travelled (assuming block distance) to work shown in Table SI-6.
Table SI-6: Estimated Distance-to-Work Distribution Derived from Census Commuting Data and Synthesized
Household Locations
Borough of Workplace   Average Miles   Maximum Distance
Bronx                          8.430             42.929
Brooklyn                       7.007             32.447
Manhattan                      3.854             39.531
Queens                         8.548             45.324
Staten Is.                    11.068             42.626
Total                          7.101             45.324
The estimate of 7.1 miles per trip compares favorably with the Travel 2001 survey estimate of
8.82 miles (See [10], Table 3.2b, page 3-7).
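The distance calculation behind Table SI-6 can be sketched as follows, reading "block distance" as city-block (Manhattan) distance between the household location and the workplace block-group centroid. That reading, the function names, and the assumption that coordinates are already projected into miles are illustrative and not confirmed by the source.

```python
from collections import defaultdict


def block_distance_miles(home_xy, work_xy):
    """City-block distance between two (x, y) points given in miles."""
    return abs(home_xy[0] - work_xy[0]) + abs(home_xy[1] - work_xy[1])


def summarize_by_borough(trips):
    """Mean and maximum commute distance per workplace borough.

    trips: iterable of (workplace_borough, distance_miles).
    Returns {borough: (average_miles, maximum_miles)}.
    """
    grouped = defaultdict(list)
    for borough, miles in trips:
        grouped[borough].append(miles)
    return {b: (sum(v) / len(v), max(v)) for b, v in grouped.items()}
```

City-block distance is never shorter than straight-line distance for the same pair of points, so averages computed this way are an upper bound on the corresponding straight-line averages.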
One important issue in the STP64 data is how the Census asks the question that is the source of
the commuting estimate. Respondents were asked to identify the place they spent the most time
working at in the previous week. This means that the US dataset contains data on regular
commutes to the individual’s typical workplace as well as occasional work-related trips. As work
trips lasting most of a week can be expected to involve longer distances than a typical commute,
one might attribute the greater than expected number of very long distance commutes to such
occasional work-related travel.
Our response was to reassign the small number of NYC workers who indicated employment
outside of the New York, New Jersey, and Connecticut region to firms located within the NYC
region. We also assumed that workers living in NYC but working in one of those three states
commuted to those states.
2. Model Details
Details of the Transmission Model
Our transmission model is a stochastic, spatially structured individual-based simulation. Each
simulation weekday, agents moved among their respective households, their assigned workplaces
(or schools, depending on age), and various locations in the community, where they interacted
with other proximal agents. Agents interacted more frequently with agents with whom they had
closer relationships (e.g., family members, household members, classmates, and office mates).
Large firm employees interacted more closely with their office mates but also encountered people
who worked in different offices of the same firm. Workers in firms that have a single workplace
(office) repeatedly contacted the same people each day. On weekends, schools and many
workplaces were closed. This caused agents to suspend their activities but increase their
community interactions by 50%. A minority (20%) of employees continued to work on
weekends.
Following the approach of Longini et al. [14], we described transmission within each contact
group by a contact transmission probability ci (Table SI-7), which depended on the age of both
the infectious and susceptible persons. This contact probability represented the likelihood (within
each 24-hour simulated time period) of two individuals in a distinct social setting having a
contact of duration and closeness sufficient to transmit an infectious dose of influenza virus. The
different settings in Table SI-7 are the important social networks represented in the model and
consist of households, schools, classrooms within schools, firms (corporate level), workplaces
within firms, subway commuters, and non-commuter subway riders.
A community social network represented a “catchall” class of contacts that portrayed all other
contacts not explicitly represented by the other categories. Because the Longini et al. and
Ferguson et al. models do not consider subway riders as distinct social networks, the set of
community contacts included in our model excludes the set of contacts that would occur while
riding the subway. The probability of transmission, given a specific contact, P, is a single number
that multiplies each contact probability. The value of P, which sets the degree of contagiousness, can be linked to the
basic reproductive number, R0, without modifying the underlying social interaction network
parameter assumptions. We did not allow for any seasonal or weekly effects, but we did
represent weekend effects by reducing or eliminating contact rates in some social networks
(workplaces, subway commutes and schools) and increasing them in other networks (households
and students within the community network). No births or nonflu-related deaths were
represented.
The model computed the probability of infection for each susceptible individual each day based
on the transmission probabilities for each potential infectious contact, $p_i(a_1,a_2) = P \times c_i(a_1,a_2)$,
where $a_1$ and $a_2$ each take the value a (an adult) or c (a child); for example, $p_i(a,c)$ is the
probability of an infectious child transmitting influenza to a susceptible adult. If the infectious contact was
receiving treatment of some type, this transmission probability was multiplied by $(1 - T_i)$, where $T_i$ is the
treatment efficacy for infectiousness. The transmission probability $p_i$ can be further reduced for asymptomatic (yet
infectious) contacts. The probability $P_t$ of a susceptible agent becoming infected in a given day
was computed as one minus the product, over all of the infectious contacts that occur that day, of the probabilities of escaping infection.
For a susceptible adult with one infectious child contact in the household, one infectious adult contact in the workplace, one infectious subway contact, and three infectious community contacts, the probability of becoming infected is
$$P_t = 1 - (1 - p_{hh}(a,c))\,(1 - p_{wp}(a,a))\,(1 - p_s)\,(1 - p_c)^3,$$
where c and a denote child and adult, respectively. A Bernoulli trial
results by generating a random number with two possible outcomes: transmission and no transmission. Thus, if the random number generated was less than $P_t$, the susceptible adult
became infected and entered the infection stage. The source of infection could be determined by
tracking the first occurrence of infection that occurred from the sequence of individual
contributions of each infectious contact. In the case above, all six infectious contacts had a
finite probability of being identified as the source. To assess the bias resulting from the order of
social network processing, we varied the order as part of the sensitivity analysis assessment and
noted that it had a negligible impact.
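The daily infection calculation and source attribution described above can be sketched as follows. This is an illustration under stated assumptions rather than the authors' C++ code: the function names, the value of P, and the subway contact probability are hypothetical, while the household, workplace, and community values of c are taken from Table SI-7. Drawing one Bernoulli trial per contact in sequence gives the same overall infection probability as a single trial against the combined probability, and the first successful trial identifies the source.

```python
import random


def infection_probability(contact_probs):
    """Return 1 minus the product of escape probabilities over all infectious contacts."""
    escape = 1.0
    for _setting, p in contact_probs:
        escape *= 1.0 - p
    return 1.0 - escape


def draw_infection_source(contact_probs, rng):
    """Run one Bernoulli trial per contact in list order.

    Returns the setting of the first successful trial, or None if the
    susceptible agent escapes infection from every contact that day.
    """
    for setting, p in contact_probs:
        if rng.random() < p:
            return setting
    return None


P = 0.1  # hypothetical transmission probability given contact
contacts = [
    ("household", P * 0.3),     # infectious child, susceptible adult (Table SI-7)
    ("workplace", P * 0.0575),  # adult to adult (Table SI-7)
    ("subway", P * 0.01),       # hypothetical contact probability
    ("community", P * 0.0048),  # three community contacts (Table SI-7)
    ("community", P * 0.0048),
    ("community", P * 0.0048),
]

p_t = infection_probability(contacts)
source = draw_infection_source(contacts, random.Random(42))
```

In this sketch the list order determines which contact is credited when more than one trial would have succeeded, which is the ordering effect the sensitivity analysis examines.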
Community transmission depended explicitly on distance, as it represented random contacts
associated with travel within the region for shopping, attending recreational events, personal
visits, and the like.
Table SI-7: Transmission Parameters
Contact Group        Infected    Susceptible    Transmission Probability (a)
Household            Adult       Adult          0.4
Household            Child       Adult          0.3
Household            Adult       Child          0.3
Household            Child       Child          0.6
Elementary School    Student     Student        0.0435
Middle School        Student     Student        0.0375
High School          Student     Student        0.0315
Workplace            Adult       Adult          0.0575
Community            All         All            0.0048
(a) Contact transmission probability estimates ci from Longini et al. [14].
Infection seeding
Initially, we seeded each run with 10 infected cases selected at random. Additional scenarios
explored the effects of changing the number and the timing of introducing new influenza cases to
NYC and changing the initial seed from 10 to 100 cases. These changes did not affect the general
findings.
We also examined the effects of adding additional daily and weekly seeds reflecting the infection
being imported by workers living in NYC but working outside of the city. These additional seeds
induced a more aggressive epidemic with a higher peak (5,000 additional cases) that occurred a
few days earlier, and higher cumulative infections (4.4%) than the epidemic without the external
seeding patterns. Figure SI-7 below presents this comparison.
Interventions
We examined a small number of different interventions to evaluate the effect of targeted strategies
intended to restrict disease transmission on the subway only. We then compared these policies
with a more realistic intervention applied to the total NYC population. Because we do not
represent the bus riding agents explicitly (they are included in community action), we do not
examine explicitly the effects of reducing or eliminating subway service because that would
result in subway riders switching to an alternative mode of travel. Instead, we simulated the
“perfect” intervention by eliminating all infections occurring to all subway riders by other riders.
This would illustrate the effect of any completely effective subway intervention and would
provide a frame of reference against which all related interventions could be compared.
Also, we did not simulate school closure, as that has been studied by others; see Lee et al. [15].
Instead, we focused on personal hygiene and social distancing behaviours as well as vaccination
policies to explore reducing transmission for a fixed level of contacts. The policies we
investigated are:
Hand Washing, Microbial use, and Mask Wearing on subways: We investigated the
collective effects of restricting contacts only on subways. We do not argue that these are realistic,
well crafted interventions. Instead, we investigated whether they would be effective enough to
pursue their adoption. In the first instance, we assume that some combination of Hand Washing,
Microbial Applications and Mask Wearing that specifically targeted subway riders would reduce
the effective number of contacts by a fixed percent. We further assumed a 10%, 20% and 30%
reduction in transmissions on subways. The impact of these assumptions was very low, as shown
in Figure SI-2.
Figure SI-2: Impact of Reducing Contacts on Subways for a R0 = 1.4 Epidemic
[Line plot of incident infections (0 to 120,000) by day (1 to 208). Series: Subway 100, Subway 30, Subway 20, Subway Int 10.]
The subway-only interventions had a small effect on the peak daily infection rate and the
cumulative number of infections. These are further illustrated in Table SI-8. Even the totally
effective subway-targeted intervention dropped the peak from 104,944 to 84,604 (19%) and the
cumulative infections from around 2,600,000 to 2,270,000 (12%). The principal point is that
because of the relatively small portion of total infections that occur via the subway mixing
process, subway-targeted interventions can only have a limited effect on containing an epidemic.
Table SI-8: Effect of Subway Specific Contact Reducing Interventions
Contact Reductions %    Peak       Subway Infections    Total Infections
0                       101,557    114,377              2,596,176
10                      100,354    100,427              2,564,504
20                      97,974     87,434               2,512,033
30                      88,110     74,232               2,491,400
100                     84,604     0                    2,271,697
Hand Washing, Microbial use, and Mask Wearing in the Community: We also simulated
contact-reducing effects within the general community by assuming contact reductions occurred
on the subway as well as in the community at large. We also assumed that contacts within the
workplace, the home and the school remained at the same level. As above, we
investigated the effects of a 10%, 20% and 30% reduction in contacts, which are shown in Table
SI-9.
In these scenarios, the interventions were broader and directed at a larger population and a
population that also rides the subway. These larger impacts are best illustrated by noting that
even a 10% Contact Reducing intervention in the community sector drops the peak infection by
19% and cumulative infections by 11%, which is as large as the “totally effective” subway
targeted intervention.
Table SI-9: Effect of Contact Reducing Interventions
Contact Reductions %    Peak       Subway Infections    Total Infections
0                       101,557    114,377              2,596,176
10                      82,092     80,167               2,315,767
20                      75,047     75,047               2,060,896
30                      53,276     64,093               1,824,884
Vaccination programs: We also evaluated a low compliance vaccination program as a
potentially interesting intervention. We were motivated to analyze vaccination because Levine et
al. [13] reported that subway commuters specifically have lower vaccination rates than other
segments of the population. Here we did not specifically target subway riders, nor limit our focus
to contact behaviours outside of the workplace and school. Instead, we simulated the effect of
targeting all adults with a 10%, 20% and 30% vaccination rate applied on day 14 of the epidemic
to all adults. The efficacy rate of the vaccine was assumed to be 80%.
The infection curves are shown in Figure SI-3. The curves indicate a steady decline in cumulative
infections (2,475,566; 2,362,695; 2,268,802) with an increase in vaccination rate (10%, 20%,
30%). The proportion of infections that occurred on the subway was consistently just above 4%
of the total.
Figure SI-3: Impact of Vaccinating NYC Adults
[Line plot of incident infections (0 to 120,000) by day (1 to 208). Series: Vaccinate - 10%, Vaccinate - 20%, Vaccinate - 30%.]
Computational resources
The simulation was written in C++ using the GNU compiler. The US simulation used
approximately 15GB of RAM, and each realization ran in 10-15 minutes on a single CPU
Opteron 854 based server.
3. Natural history and transmission parameters
We employed the influenza natural history model described by Longini et al.; see [14] and [17].
This model has infected people progressing through the latent state (mean, 1.2 days) followed by
an infectious state (mean, 4.1 days), after which they recover with immunity or die. The
incubation period is longer than the latent period, so that people who are infected and develop
influenza symptoms will do so on average 1.9 days after infection. A portion of those infected
(33%) do not develop symptoms. The probability distributions of the latent, incubation, and
infectious periods are shown in Figure SI-4. The 67% of infected people who develop influenza
are twice as infectious as those without influenza symptoms. Additionally, this model withdraws
a significant fraction of symptomatic persons from all of their mixing groups except household
family members.
Figure SI-4: Latent Density (days 1-3) and Incubation Density (days 4 – 7)
[Chart of percent (0 to 50) by day (1 to 7). Series: Latent Density (days 1-3), Incubation Density (days 4 – 7).]
The natural history model assumes constant infectiousness from the end of infection latency to
recovery. Others, including Ferguson et al. [16], use an incubation function distinct from the
Longini et al. model, in that variable infectiousness is incorporated. Their estimated generation
time for human influenza of Tg=2.6 days is shorter than the function used by the Longini [14]
model.
Pandemic influenza model parameterization
The potential pandemic influenza strain was assumed to have the age-dependent attack rate
pattern of the historical 1957-8 “Asian” influenza A (H2N2), see Longini et al. [14].
Accordingly, we calibrated our model using the Ferguson et al. approach from historical (1957–
58, 1968–69) influenza pandemics. Our calibration targets followed the guidelines established by
others, [15], [16], [17], and [18]. We specifically used the 30–70 rule developed by Ferguson et
al. in [16] in which 70% of all transmission occurred outside the household—33% in the general
community and 37% in schools and workplaces.
The additional requirement that transmission rates in schools are double those in workplaces was
a sensitivity analysis target in [16] as well as our NYC model. Calibrating the model involved
targeting an epidemic with a 33% attack rate (AR) consistent with the age specific parameters
derived from the 1957-58 pandemic. Daily contact rates were treated as endogenous parameters
and were interpreted as the daily contact rates that reproduced a pandemic with a 33% AR in a
population with no acquired immunity and satisfied the 30–70 rule. Therefore, our estimated
contact patterns produced a NYC epidemic designed to be similar in transmissibility to the 1957–58
epidemic with an AR of 33% and a basic reproductive number (R0) of approximately 1.4. Table
SI-10 lists the estimated number of contacts per day we applied per social network category.
These estimates offer an additional means of validation: if daily contact estimates from some
other source were available, they could be compared with ours to confirm (or not) the
calibration.
We also assumed that 50% of sick students and workers stayed at home and did not interact with
anyone outside of the household. Our workplace absentee rate is consistent with other models.
However, we used a school absentee rate that is generally lower than other models (Ferguson et
al. [16] use a 90% absentee rate). Additionally, we assumed that student community and adult ⁄
community contacts increased by 50% on weekends.
Table SI-10: Estimated Person-to-Person Contact Values (Contacts per Day)
Place                              Participant          Mean Contacts Per Day    Social Network
*Within School                     Student              14.98                    School
*Per firm                          Worker               1.84                     Workplace
*Subway                            Worker (commuter)    33.88                    Subway
*Subway                            Non-Worker           6.75                     Subway
*Community                         Non-Student          34.80                    Community
*Household                         All                  .922 (1)                 Household
**Classroom                        Student              29.96                    School
**Community weekday non-School     Student              7.50                     Community
**Community weekend non-School     Student              11.24                    Community
**Per office                       Worker               3.68                     Workplace
(1) Daily contact probability per person.
* Estimated.
** Based on estimates in rows 1-6.
Contacts within Household
We observed that daily mutual contacts between each member of the family overstated
transmission within households. This made calibration to the 30-70 target criteria impossible
unless within-household contacts were restricted to between 2 and 5 contacts per week per
interacting pair. Accordingly, this parameter was treated as an endogenous variable in our model
and was estimated as part of the calibration process.
Estimation of R0 from inter-pandemic influenza household data
We calculated the value of R0 by two different methods. The first was to average the number of
secondary infections that resulted from the agents used to seed the epidemic. The difficulty with
this approach is that the sample size was too small, leading to statistical errors too large to determine
R0 to within a precision of 0.1.
The second method is an approximation based on the slope of the cumulative number of cases;
see Lipsitch et al. [19]. In this method, the reproductive number is
$$R(t) = 1 + \lambda\gamma + f(1-f)(\lambda\gamma)^2,$$
where λ is the combined duration of the latent and infectious periods (5.3 days for our model), f is the
relative duration of the latent period (i.e., 1.2 days / 5.3 days for our model), and γ is the time
derivative of the logarithm of the cumulative number of cases N, i.e., γ = d[ln(N)] / dt. We then
calculated the basic reproductive number as a function of time. Although there are large
oscillations at early times due to the larger statistical errors (from fewer cases), it is clearly
noticeable that R(t) is largest early in the epidemic and drops off rapidly as the epidemic
progresses. This is related to school children being particularly important spreaders in the initial
stages of an influenza outbreak due to their strong household and school interactions.
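The second method can be sketched in a few lines. This is a minimal illustration, assuming that λ denotes the 5.3-day combined latent and infectious duration and that the growth rate γ is estimated by a central difference on daily cumulative counts; the function names and the synthetic case series are hypothetical.

```python
import math

LATENT_DAYS = 1.2
INFECTIOUS_DAYS = 4.1
LAMBDA_DAYS = LATENT_DAYS + INFECTIOUS_DAYS  # 5.3 days
F = LATENT_DAYS / LAMBDA_DAYS                # relative duration of the latent period


def growth_rate(cumulative_cases, day):
    """Central-difference estimate of d ln(N)/dt at an interior day index."""
    later = math.log(cumulative_cases[day + 1])
    earlier = math.log(cumulative_cases[day - 1])
    return (later - earlier) / 2.0


def reproductive_number(gamma):
    """R = 1 + lambda*gamma + f(1-f)(lambda*gamma)^2."""
    x = LAMBDA_DAYS * gamma
    return 1.0 + x + F * (1.0 - F) * x * x


# Synthetic cumulative counts growing exponentially at 7% per day (illustrative only).
cumulative = [10.0 * math.exp(0.07 * day) for day in range(30)]
r_estimates = [reproductive_number(growth_rate(cumulative, day)) for day in range(1, 29)]
```

On real simulation output the early entries of such a series would fluctuate strongly, because the logarithmic derivative is noisy when the cumulative count is small.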
The first R0 calculation method can only provide an averaged static value as an estimator,
whereas the second method can give information about the time development of R(t). Here, the
early-time fluctuations and the early enhancement of the reproductive number clearly demonstrate the
difficulty in measuring this quantity from available data in a real epidemic. Furthermore, this
behavior is complicated even more by the spatiotemporal spread of the epidemic, which causes
local variations of R(t) in time.
Assumptions about Population Behavior during a Pandemic
During a pandemic, researchers naturally assume that various spontaneous behavior changes
would occur, including temporary school and workplace closures, once disease incidence exceeds
some threshold. We know that during past pandemics, schools and theatres closed, some mass
gatherings were cancelled and healthy adults stayed at home to look after sick family members;
see Barry [20]. However, even after the H1N1 epidemic of 2009, there is little or no data that
researchers can use to accurately parameterize changes in contact rates. Accordingly, we adopted
a baseline epidemic scenario that assumed no such behavioral changes – and then we examined
social distancing and other interventions against this baseline.
The baseline scenarios used here should therefore be viewed as worst cases in terms of the speed
with which the epidemic peaks and therefore the consequent peak height. While early spread may
occur at the maximal transmission rates seen in past pandemics, later spread may be slowed by
spontaneous behavioral changes that reduce peak incidence and extend the duration of the
epidemic rather than reduce cumulative attack rates.
We assumed clinical disease affects individual behavior. Thus, while we assumed that clinical
cases are twice as intrinsically infectious as non-clinical cases, we also assumed that 50% of
symptomatic individuals reduced their school, workplace and community contact rates by 50%.
This represented the effect of sickness-related absenteeism and withdrawal to the home. We were
unable to find data from past pandemics to estimate these contact rate reductions.
Transmissibility Scenarios
In light of the analysis above, we focused on two transmissibility scenarios: (1) a 1957-58 type
epidemic with a R0 =1.4, and (2) a 2009 scenario of R0 =1.2, which represented the 1957-58
epidemic but with acquired immunity levels consistent with those reported by Miller et al. [21] in
England for the recent H1N1 epidemic (see Figure SI-5).
Figure SI-5. Transmissibility Scenarios: with and without Immunity
[Line plot of incident infections (0 to 120,000) by day (1 to 210). Series: With Immunity, Without Immunity.]
Sensitivity Analyses
Worker/Student Ratio
As indicated above, we made assumptions about the proportion of transmission that occurs in
schools and workplaces, because there were no data to allow us to estimate these parameters. For
the NYC model, our baseline assumption was that 37% of transmission occurred in these contexts,
with the within-school transmission coefficient being twice the within-workplace coefficient.
However, this choice was arbitrary. We tested the sensitivity of the calibration scheme against an
alternative calibration method also described by Ferguson et al. [17]. This test investigated the
alternative 30-70 calibration rule that assumed equal transmission rates in schools and
workplaces (in contrast to the rule that assumed a 2 to 1 transmission ratio of schools to
workplaces).
Figure SI-6 depicts both alternatives applied to a 1957-58 type, R0 = 1.4 epidemic. The
assumption of equal transmission rates produced flatter infection curves with lower peak attack
rates and small shifts to the right, but with only a slight reduction (1%) in cumulative incidence.
The more rapid spread in the school-age population under the 2 to 1 assumption is contrasted with the
1 to 1 epidemic in Table SI-11. The age-specific attack rate columns show the differential
spread in school-aged children under each assumption. The higher student transmission
rates under the 2 to 1 assumption result in infections spreading earlier and faster
in children, but the epidemic also dissipates faster. There is no clear evidence for favoring one
alternative over the other.
However, because an epidemic incorporating a 2 to 1 transmission peaks more rapidly, policy
makers may find it a more difficult scenario to contend with, and for this reason and no other we
assigned it as the baseline scenario.
Figure SI-6. Comparison of Calibration Rules for R0 = 1.4 Epidemics
[Line plot of incident infections (0 to 120,000) by day (1 to 209). Series: Baseline 2-1, Baseline 1-1.]
Table SI-11: Age Specific Attack Rates (AR) for Alternative Calibration Methods
Age Low    Age High    Infected (1)    AR (1)    Infected (2)    AR (2)    Pop
0          4           129,587         22.92     125,042         22.12     565,376
5          14          778,264         69.26     661,243         58.85     1,124,576
15         24          385,727         36.69     390,984         37.18     1,051,279
25         44          735,965         28.96     798,454         31.42     2,541,471
45         64          403,019         24.25     335,045         20.16     1,662,265
65+        -           162,656         18.02     172,872         19.15     902,498
Total      -           2,595,218       33.07     2,593,640       33.05     7,846,465
(1) 2 to 1 transmission rate assumption
(2) 1 to 1 transmission rate assumption
Transmissions by Workers Employed Outside of NYC
Workers living in NYC but who work in one of the three surrounding states were assumed to
commute to those states. These workers would likely be at risk from a comparable epidemic;
some of them would become infected at their workplace and possibly infect their family members
upon returning home. We used two basic assumptions. Our first ignored this possibility and
assumed that no external infections would occur. In the second, we modified our seeding
submodel and seeded new infections that occurred at a rate comparable to the main epidemic.
The different curves produced by the two assumptions are presented in Figure SI-7.
We used a concept devised by Ferguson et al. [16] to seed the infection in the NYC region. We
modeled the epidemic outside of NYC with a simple deterministic SEIR model [22] and with the
same secondary infection rate ( R0 =1.4).
The simple model calculated the incidence of infection through time, I(t). The expected number
of imported infections per day, M(t), is then given by
$$M(t) = \frac{nL}{365\,N}\, I(t),$$
where n = 252,564 is the number of workers working outside of NYC, L = 1.2
days is the average latent period of the disease (we assumed symptomatic individuals would not
travel to work) and N = 19.1 × 10^6 is the population of the greater metropolitan NY region.
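The external seeding calculation can be sketched as below. This is an illustration under stated assumptions, not the authors' code: it integrates a textbook deterministic SEIR model with a simple Euler step, takes the 4.1-day infectious period from the natural history section, and uses a hypothetical number of initial infections and step size. The scaling of I(t) into M(t) follows the expression above as printed, with n, L, and N taken from the text.

```python
def seir_incidence(r0, latent_days, infectious_days, population, seeds, days, dt=0.1):
    """Euler-integrated deterministic SEIR model; returns new infections per day."""
    beta = r0 / infectious_days          # transmission rate
    progression_rate = 1.0 / latent_days  # E -> I
    recovery_rate = 1.0 / infectious_days  # I -> R
    s, e, i = population - seeds, 0.0, float(seeds)
    steps_per_day = int(round(1.0 / dt))
    incidence = []
    for _ in range(days):
        new_today = 0.0
        for _ in range(steps_per_day):
            new_infections = beta * s * i / population * dt
            new_infectious = progression_rate * e * dt
            new_recovered = recovery_rate * i * dt
            s -= new_infections
            e += new_infections - new_infectious
            i += new_infectious - new_recovered
            new_today += new_infections
        incidence.append(new_today)
    return incidence


def imported_infections(incidence, n_outside_workers=252_564,
                        latent_days=1.2, population=19.1e6):
    """M(t) = n * L / (365 * N) * I(t), applied to each day's incidence."""
    factor = n_outside_workers * latent_days / (365.0 * population)
    return [factor * value for value in incidence]


# Hypothetical run: 10 initial infections in the greater metropolitan region.
regional_incidence = seir_incidence(r0=1.4, latent_days=1.2, infectious_days=4.1,
                                    population=19.1e6, seeds=10, days=365)
daily_imports = imported_infections(regional_incidence)
```

In the individual-based simulation the expected value M(t) would presumably be converted to an integer number of seeds each day, for example by a Poisson draw; that step is not described in the text and is omitted here.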
Figure SI-7 shows the baseline case with and without the effects of workplace infections that
occur in workers employed in places outside of the five borough region. The additional infections
caused a slightly more rapid epidemic with a slight increase of 3% in cumulative infections.
Within-place Group Structure and Targeting
In a manner similar to the scenario reported by Ferguson et al. [17] we incorporated into our
model a group structure within the definition of schools and firms to represent school classes or
groups of closely working colleagues in a workplace. We assumed that the school classroom and
the workplace are social units in which the same people see each other every school/work day in
relatively confined spaces. We also assumed that they are social units that might be targeted by
control policies. While everyone in a school or workplace can contact any other person in that
location, we assumed higher contact rates between people in the same group than between people
in different groups.
We also assumed that contacts in schools and firms were double those of classroom and
workplace settings. This assumption produced slightly larger epidemics and overall had very
little impact on baseline dynamics.
Figure SI-7. Comparison of Baseline Epidemic with and without non NYC Workplace Infections
[Line plot of incident infections (0 to 120,000) by day (1 to 208). Series: Baseline 2-1, External Seeds 2-1.]
Proportion of Infections that become Clinical Cases
We assumed 67% of infections would be clinically severe enough for individuals to potentially
seek healthcare and thus be defined as a clinical case. We also assumed these individuals to be
more infectious than non-cases and more likely to be absent from school or work. This is
consistent with Longini et al. [15] and the methods described in Halloran et al. [23].
Other authors [16] assume that half of infections become cases. Making the same assumption for
our model decreased baseline cumulative clinical attack rates from 33% for R0 =1.4 to 30%, or an
8% drop in cumulative infections. Consequently, we believe, based on this modest change in
cumulative infections, that assuming 67% of infections, while producing a more intense
epidemic, also produces one that responds to interventions more dramatically and consequently
could provide a more optimistic assessment regarding the feasibility of certain control measures.
Figure SI-8. Comparison of Epidemic Curves for the 67% versus 50% Symptomatic Case Assumption
[Line plot of incident infections (0 to 120,000) by day (1 to 208). Series: Asym 67%, Asym 50%.]
Figure SI-8 shows the effect of the change in the assumed symptomatic case rate for a R0 =1.4
epidemic with all other parameters identical. Epidemics were seeded on day 0 with 10 infections.
Behavior of Symptomatic Individuals
Our default assumption was that half of the 67% of children who are severely symptomatic (i.e.,
clinical cases) will stay home from school, and that half of the symptomatic adults will also stay home from work.
Additionally, we assumed that symptomatic individuals have a 50% reduced community contact
rate. We have found no data with which to verify these assumptions, particularly in a pandemic
setting.
To check whether these assumptions significantly affected model projections, we raised this rate
to 75% (from 50%) as well as lowering it to 25% in all cases. Therefore, we generated
simulations that assumed 75% (and 25%) of symptomatic adults were absent from work, and that
community contact rates were reduced by 75% (25%) for all symptomatic individuals. Keeping
R0 fixed, the effect of these changes is shown in Figure SI-9 and represents a swing of
approximately 6% in cumulative attack rates.
Figure SI-9. Comparison of Epidemic Curves for the 25%, 50% - Baseline, 75% Stay Home assumptions.
[Line plot of incident infections (0 to 140,000) by day (1 to 208). Series: Retire to home 25, Retire to home 50, Retire to home 75.]
Parameter Estimation Strategy Summary
In our approach, we:
• obtained ridership data from published sources;
• used this data to create two separate subway rider groups, commuters and others, in which
  membership in both groups is possible;
• treated the total number of subway commuters as a fixed (exogenous) variable;
• assumed that classroom daily contacts were twice the estimated school contacts and that
  workplace contacts were twice the office/firm contacts; and
• treated the set of contacts as endogenous variables and “fit” the epidemic function to
  reconstruct disease spread with the properties of a 1957-58 pandemic occurring in present
  day NYC.
The criterion we used to estimate the contacts was the sum of the absolute differences between a
target value and a model estimate for the following 6 response variables:
• overall Attack Rate (a proxy for R0),
• the proportion of subway riders that are commuters,
• source of infections – a proportion: households,
• source of infections – a proportion: schools,
• source of infections – a proportion: workplaces, and
• source of infections – a proportion: community.
The 6 endogenous contact variables we estimated to minimize this criterion are:
• daily workplace (flu transmissible) contacts,
• daily school contacts,
• daily community contacts,
• daily subway contacts by commuters,
• daily subway contacts by non-commuters, and
• daily household contacts.
We included the probability that a NYC resident rides the subway for non-commuting purposes.
The estimated set of contacts is intended to reproduce the 1957-58 pandemic in a NYC scenario
and constitute the baseline run.
The assessments of interventions are based on the set of contacts that define the baseline
epidemic.
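The calibration criterion summarized above, a sum of absolute differences between target values and model estimates, can be sketched as follows. The function and dictionary key names are hypothetical. The attack rate, household, and community targets come from the 33% AR and the 30-70 rule described earlier; the split of the 37% school and workplace share and the commuter share of subway riders are placeholder values, since the text does not give them, and the model estimates shown are invented for illustration.

```python
def calibration_error(targets, estimates):
    """Sum of absolute differences between target values and model estimates."""
    return sum(abs(targets[name] - estimates[name]) for name in targets)


targets = {
    "attack_rate": 0.33,             # from the text
    "commuter_share_of_riders": 0.80,  # hypothetical placeholder
    "household_share": 0.30,         # 30-70 rule
    "school_share": 0.22,            # hypothetical split of the 37%
    "workplace_share": 0.15,         # hypothetical split of the 37%
    "community_share": 0.33,         # 30-70 rule
}

# Invented output of one candidate simulation run (illustrative only).
estimates = {
    "attack_rate": 0.35,
    "commuter_share_of_riders": 0.78,
    "household_share": 0.31,
    "school_share": 0.21,
    "workplace_share": 0.16,
    "community_share": 0.32,
}

error = calibration_error(targets, estimates)
```

A calibration loop would evaluate this error for each candidate set of the six contact variables and keep the set with the smallest value; the search procedure itself is not specified in the text.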
Estimates of the Proportion of Infections that Occur on the Subway
A major question we sought to address is “what is the role of the subway in influenza
transmission?” Clearly, our estimate was based on many assumptions we incorporated in the
model logic. Our model provides a method for estimating the answer to this question, but it is an
estimate with substantial variance. We estimated the variance of the contact estimates based on
the Monte Carlo process and found the variance estimates to be overly optimistic (i.e., small) because they
were based on a fixed set of assumptions. To provide an estimate that reflects
changing assumptions, we present summary results in Table SI-12, which records the proportion
of infections occurring on the subway for each of the assumptions we used in the sensitivity
analysis assessment. Two potential sources of sensitivity not discussed above are included. These
new sources are the probability of infection given contact with an infected person = .0048 (as
defined in Table SI-7) plus and minus 10% and the proportion of commuters that ride the subway
= .47 (see Table 3) plus and minus 10%. The average proportion of total infections that occur on
the subway is 4.29 percent and the 99 percent confidence interval is 4.06 to 4.54.
In summary, we estimated the proportion of total influenza cases that transmit while riding the
subway to be 3.5 to 5.0%.
Table SI-12: Estimates of Subway Rider Infections
Assumption                                   Subway     Total        Proportion of Subway Infections (%)
Baseline 2-1                                 153,084    2,598,059    4.38
Baseline 1-1                                 135,091    2,599,910    4.73
Baseline 2-1 + WP Seeds                      164,442    2,733,654    4.10
R0 = 1.2                                     133,982    1,592,351    3.66
Stay Home 75%                                122,236    2,430,375    4.09
Stay Home 25%                                184,126    2,753,788    4.67
Asymptomatic = .5                            125,789    2,365,661    4.26
Transmission probability – 10%               90,736     2,314,707    3.92
Transmission probability + 10%               133,908    2,838,113    4.72
Estimate of commuter subway riders + 10%     117,325    2,601,461    4.51
Estimate of commuter subway riders – 10%     108,807    2,596,812    4.19
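The reported average of 4.29 percent can be checked directly from the proportion column of Table SI-12. The short sketch below computes only the mean, the sample standard deviation, and the range across the eleven scenarios; it does not attempt to reproduce the 99 percent interval, because the text does not state how that interval was calculated. The variable names are illustrative.

```python
from statistics import mean, stdev

# Proportion of infections occurring on the subway (%), one value per scenario in Table SI-12.
proportions = [4.38, 4.73, 4.10, 3.66, 4.09, 4.67, 4.26, 3.92, 4.72, 4.51, 4.19]

average = mean(proportions)
spread = stdev(proportions)  # sample standard deviation across scenarios
lowest, highest = min(proportions), max(proportions)
```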
References
Beckman RJ., Baggerly K, McKay M. Creating synthetic baseline populations. Transportation
Research Part A: Policy and Practice. 1996; 30(6): 415-429.
Centers for Disease Control and Prevention (CDC). Serum cross-reactive antibody response to a
novel influenza A (H1N1) virus after vaccination with seasonal influenza vaccine. MMWR.
2009;58(19):521-4.
Wheaton, WD, Cajka, JC, Chasteen, BM, Wagener, DK, Cooley, PC, Ganapathi, L. Synthesized
population databases: A U.S. geospatial database for agent-based models. RTI Press. 2009;
http://www.rti.org/pubs/mr-0010-0905-wheaton.pdf
US Census Bureau; Department of Commerce, Economics and Statistics Administration. 2000
Topologically Integrated Geographic Encoding and Referencing (TIGER) system. Washington,
DC: US Census Bureau; 2005.
US Census Bureau; Department of Commerce, Economics and Statistics Administration. 2000
Census of population and housing, summary file 3. Washington, DC: US Census Bureau; 2005
March.
US Census Bureau; Department of Commerce, Economics and Statistics Administration. 2000
Washington, DC: US Census Bureau; 2005 Dec.
Cajka, JC, Cooley, PC, Wheaton, WD. Attribute Assignment to a Synthetic Population in
Support of Agent-Based Disease Modeling. RTI Press. 2010; http://www.rti.org/pubs/mr-0019-1009cajka.pdf
National Center for Education Statistics. Common Core of Data Build a Table [Internet].
Washington, DC: US Dept. of Education [date unknown] [cited 2010 Sep 3]. Available from
http://nces.ed.gov/ccd/bat.
US Census Bureau. Census 2000 special tabulation: Census tract of work by Census tract of
residence (STP 64) [CD-ROM]. Washington, DC: US Dept. of Commerce, Economics and
Statistics Administration; 2004 [cited 2010 Sep 3]; Available from:
http://www.census.gov/mp/www/cat/decennial_census_2000/census_2000_special_tabulation_census_tract_of_work_by_census_tract_of_residence_stp_64.html
Hu PS, Reuscher TR. New York Household Travel Patterns: A Comparison Analysis. Office of
Transportation Policy and Strategy, New York State Department of Transportation, prepared by
Oak Ridge National Laboratory. 2007. ORNL/TM-2006/624.
New York City Department of City Planning. Changes in Employment and Commuting Patterns
among Workers in New York City and the New York Metropolitan Area, 2000-2007. 2008;
http://www.nyc.gov/html/dcp/pdf/census/census_commute_patterns0007.pdf.
New York City Department of Health and Mental Hygiene. 2006 Community Health Survey.
http://home2.nyc.gov/html/doh/html/survey/survey-2006.shtml. Accessed September 16, 2010.
Levine, B, Wilcosky, T, Wagener, D, Cooley, P. Mass commuting and influenza vaccination
prevalence in New York City: Protection in a mixing environment. Epidemics. 2010;
doi:10.1016/j.epidem.2010.07.002.
Longini I, Nizam A, Xu S, et al. Containing pandemic influenza at the source. Science.
2005;309:1083-1087.
Lee BY, Brown ST, Cooley P, Potter MA, Wheaton WD, Voorhees RE, et al. Simulating School
Closure Strategies to Mitigate an Influenza Epidemic. J Public Health Manag Pract.
2010;16(3):252-61.
Ferguson NM, Cummings DA, Fraser C, Cajka JC, Cooley PC, Burke DS. Strategies for
mitigating an influenza pandemic. Nature. 2006;442(7101):448-52.
Germann T, Kadau K, Longini IJ, Macken C. Mitigation strategies for pandemic influenza in the
United States. PNAS. 2006;103(15): 5935-5940.
Ferguson N, Cummings D, Cauchemez S, et al. Strategies for containing an emerging influenza
pandemic in Southeast Asia. Nature. 2005;437:209-214.
Lipsitch M, Cohen T, Cooper B, Robins JM, Ma S, James L, Gopalakrishna G, Chew SK, Tan CC,
Samore MH, et al. Transmission dynamics and control of severe acute respiratory syndrome.
Science. 2003;300:1966-1970.
Barry JM. The Great Influenza: The Epic Story of the Deadliest Plague in History. Penguin;
2005.
Miller E, Hoschler K, Hardelid P, Stanford E, Andrews N, Zambon M. Incidence of 2009
pandemic influenza A H1N1 infection in England: a cross-sectional serological study. Lancet.
2010; 375(9720): 1100-1108.
Anderson RM, May RM. Infectious Diseases of Humans: Dynamics and Control. Oxford: Oxford
University Press; 1991.
Halloran ME, Eubank S, Ferguson NM, Longini IM, Barrett C, Beckman R, Burke DS,
Cummings DA, Fraser C, Germann TC, Kadau K, Lewis B, Macken CA, Vullikanti A,
Wagener DK, Cooley PC. Modeling targeted layered containment of an influenza
pandemic in the USA. Proceedings of the National Academy of Sciences. 2008; published online.