Slides - Center for AIDS Prevention Studies (CAPS)

Network Scale-Up to Estimate the
Population Size of High-Risk Groups for HIV
Methods Core Seminars – Center for AIDS Prevention Studies/UCSF – 20 Sep. 2013
Ali Mirzazadeh, MD, MPH, PhD
Institute for Health Policy Studies / Global Health Sciences Institute
UCSF, San Francisco, CA, USA [ali.mirzazadeh@ucsf.edu]
Regional Knowledge Hub, and WHO Collaborating Center for HIV Surveillance,
Kerman University of Medical Sciences, Kerman, Iran [ali.mirzazadeh@hivhub.ir]
This presentation has the following parts
• P1: PSE methods overview
• P2: Network Scale-up method overview
• P3: Network size estimation
• P4: Correction for biases in NSU
Part 1
PSE methods overview
Why do MARP size estimates?
• Know, track, and predict your epidemic
– Disproportionate impact in low level,
concentrated, and generalized epidemics
• Program planning
– Advocacy, development, M&E
• Because you were asked to
– UNAIDS, UNGASS, PEPFAR, MOH
• Resource allocation
– Right population, right priority, right amount on
right programs
How to do MARP size estimates?
• There is no gold standard, no census
• We do not know which method is best
• We are not able to fully calibrate or correct
• Many methods
[Figure axes: scientific rigor vs. cost]
[Figure: size estimation methods arranged by scientific rigor and cost]
Group labels in the figure:
• Done in surveys of the general population
• Done with surveys of MARPs
• Done by literature review, experts, stakeholders, models
Methods shown:
• Census
• Population-based survey
• Network scale up
• Oil wells
• Multiple sample recapture
• Capture-recapture
• Plant recapture
• Unique object multiplier
• Truncated Poisson
• Multipliers, multiple multipliers
• Unique event multiplier
• Mapping with census and enumeration
• Nomination counting
• Place, RAP, ethnography
• Registries, police, SHC, drug treatment, unions, workplace
• Wisdom of the crowds
• Delphi
• Consensus
• Discrepancies
• Soft modeling
• Borrow from thy neighbor
• Conventional Wisdom
• Straw man
Direct Methods
Done with surveys of MARPs
Direct questions to population-based surveys
Strengths
• Surveys are familiar
• Easy if a survey is underway
• Straightforward to analyze
• Sampling is easy to defend scientifically (“gold standard”)
Weaknesses
• Low precision when the behaviors are rare
• Respondents are unlikely to admit to stigmatized behaviors
• Only reaches people residing in households
• Privacy, confidentiality, risk to subjects
Mirzazadeh A, Haghdoost AA, Nedjat S, Navadeh S, McFarland W, Mohammad K. Accuracy of HIV-Related Risk Behaviors Reported by Female Sex Workers, Iran: A Method to Quantify Measurement
Bias in Marginalized Populations. AIDS Behav. 2013 Feb;17(2):623-31
Census and enumeration
Ali Mirzazadeh, Faran Emmanuel, Fouzia Gharamah, Abdul Hamed Al-Suhaibi, Hamidreza Setayesh, Willi
McFarland, Ali Akbar Haghdoost; HIV prevalence and related risk behaviors in men who have sex with men,
Yemen 2011; AIDS Behav. 2013 Jul 23. [Epub ahead of print]
Census and enumeration
Strengths
• It is a real count, not an estimate
• Can produce credible lower limit
• Can be used to inform other methods
• Use in program planning, implementation, evaluation
Weaknesses
• At-risk populations hidden, methods miss some members (China: multiply by 2 – 3!)
• Stigma may cause members to not identify themselves
• Time-consuming and expensive
• Staff safety
• Subject safety
Capture-recapture
Strengths
• Relatively easy
• Does not require much data
• When no other data or studies are available
Weaknesses
• 4 conditions hard to meet:
1) two samples must be independent, not correlated
2) each population member should have equal chance of selection
3) each member must be correctly identified as ‘capture’ or ‘recapture’
4) no major in/out migration
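The slide does not name an estimator. As a minimal sketch, the classic two-sample Lincoln-Petersen estimator, which relies on the four conditions listed above, could be computed as follows. The function name and the counts are hypothetical illustration values, not data from the presentation.

```python
def lincoln_petersen(n_first, n_second, n_both):
    """Two-sample capture-recapture estimate of population size.

    n_first:  people counted ("captured") in the first sample
    n_second: people counted in the second sample
    n_both:   people found in both samples ("recaptured")
    """
    if n_both == 0:
        raise ValueError("no recaptures: the estimate is undefined")
    return n_first * n_second / n_both


# Hypothetical counts, for illustration only.
estimate = lincoln_petersen(n_first=200, n_second=150, n_both=30)
print(estimate)
```

If the two samples are positively correlated (condition 1 violated), the overlap is inflated and this estimate is too low.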
Nomination method
S Navadeh, A Mirzazadeh, L Mousavi, AA Haghdoost, N Fahimfar, A Sedaghat; HIV, HSV2 and Syphilis
Prevalence in Female Sex Workers in Kerman, South-East Iran; Using Respondent-Driven Sampling Iran J
Public Health. 2012 Dec 1;41(12):60-5. Print 2012.
Nomination method
Strengths
• Relatively easy
• Snowball or chain sampling methods
Weaknesses
• Need the target group to be connected/network
• Time consuming
• Broken chains
• Biased to visible and accessible part of a target population (new statistical methods coming)
• Promise to provide services
Multiplier methods
STI Clinic
Johnston LG, Prybylski D, Raymond HF, Mirzazadeh A, Manopaiboon C, McFarland W. Incorporating the Service
Multiplier Method in Respondent-Driven Sampling Surveys to Estimate the Size of Hidden and Hard-to-Reach
Populations: Case Studies From Around the World Sex Transm Dis. 2013 Apr;40(4):304-10
Multiplier methods
Strengths
• Uses available data sources
• Flexible in sampling methods
• When already doing an IBBSS
Weaknesses
• Two sources of data must be independent
• Data sources must define population in the same way
• Time periods, age, geographic areas must align
• Inaccuracy of program data and survey data
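The slides give no formula for the multiplier method. A minimal sketch of the usual service-multiplier calculation, assuming a program count of unique clients (for example, from an STI clinic) and a survey of the same population asking about use of that service, is shown below. All names and numbers are hypothetical.

```python
def multiplier_estimate(service_count, survey_n, survey_reporting_service):
    """Service-multiplier estimate of population size.

    service_count:            unique population members recorded by the service
    survey_n:                 number of survey respondents
    survey_reporting_service: respondents who report using that service
    """
    proportion = survey_reporting_service / survey_n
    if proportion == 0:
        raise ValueError("no respondent reports the service: estimate undefined")
    return service_count / proportion


# Hypothetical numbers: 500 clinic clients; 100 of 400 respondents used the clinic.
print(multiplier_estimate(500, 400, 100))
```

The weaknesses listed above map directly onto the inputs: both must cover the same population definition, time period, age range, and geographic area.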
Indirect Methods
Done in surveys of the general population
Proxy respondent method
[Diagram labels: Respondent; Proxy Respondent (Alter); Member of Hidden Pop.]
Mirzazadeh A, Danesh A, Haghdoost AA. Network scale-up and proxy respondent methods
in prisons [ongoing]
Proxy respondent method
Strengths
• Estimates from general population rather than hard-to-reach populations
• Doesn’t require directly asking sensitive questions or lengthy behavioral survey
Weaknesses
• Some subgroups may not associate with members of the general population
• Respondent may be unaware the alter engages in the behavior of interest
• Biases may arise by types of questions asked
Network scale-up
Shokoohi M, Baneshi MR, Haghdoost AA. Size Estimation of Groups at High Risk of HIV/AIDS
using Network Scale Up in Kerman, Iran. Int J Prev Med. 2012 Jul;3(7):471-6.
Network scale-up
Strengths
• Estimates from general population rather than hard-to-reach populations
• Doesn’t require directly asking sensitive questions or lengthy behavioral survey
Weaknesses
• Average personal network size difficult to estimate
• Some subgroups may not associate with members of the general population
• Respondent may be unaware someone in network engages in the behavior of interest
• Biases may arise by types of questions asked
Part 2
NSU method overview
NSU Basic Concepts
• A random sample of the general population
describes their social networks
– network sizes (C)
– the presence of individuals belonging to special
sub-populations of interest
• Based on the prevalence and presence of subpopulations in the social network of the
selected sample, the sizes of the hidden subpopulations in a community are estimated.
NSU – Main Questions
• How many people have you known over the past two years?
• Of those, how many injected drugs (over the past two years)?
• Do you know at least one person in your network who injected drugs (over the past two years)?
NSU – Frequency Approach
• T = the total population, with size t
• C = one individual’s acquaintances (personal network size), with size c
• m = the number of individuals belonging to the target population among those acquaintances
• E = the hidden population, with size e

$$\frac{e}{t} = \frac{m}{c}$$
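The proportion e/t = m/c can be turned into a size estimate by solving for e. The sketch below pools all respondents as a ratio of sums, which is one common choice and an assumption here, since the slide states the relation for a single individual. The function name and sample values are hypothetical.

```python
def nsu_frequency_estimate(t, m, c):
    """Frequency-approach NSU estimate: e = t * sum(m) / sum(c).

    t: size of the total population
    m: per-respondent counts of target-population members they know
    c: per-respondent personal network sizes (same order as m)
    """
    if len(m) != len(c):
        raise ValueError("m and c must have one entry per respondent")
    total_c = sum(c)
    if total_c == 0:
        raise ValueError("network sizes sum to zero")
    return t * sum(m) / total_c


# Hypothetical survey of four respondents in a population of 100,000.
print(nsu_frequency_estimate(100_000, m=[1, 0, 2, 1], c=[200, 300, 250, 250]))
```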
NSU – Probability Approach
[Figure: probability approach compared with frequency approach]
Confidence Interval 95% - Conventional
• Frequency approach (P = m/c, the estimated proportion; se = its standard error):
– Point estimate: E = P × t
– 95% CI upper limit: E = (P + 1.96 se) × t
– 95% CI lower limit: E = (P − 1.96 se) × t
Confidence Interval - Bootstrap
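The content of this slide did not survive as text. One common way to obtain a bootstrap interval for an NSU estimate is a percentile bootstrap that resamples respondents with replacement. The sketch below assumes that approach and the ratio-of-sums estimator; the function names, the number of resamples, and the data are all hypothetical.

```python
import random


def nsu_estimate(t, pairs):
    """pairs: list of (m, c) per respondent; returns t * sum(m) / sum(c)."""
    total_m = sum(m for m, _ in pairs)
    total_c = sum(c for _, c in pairs)
    return t * total_m / total_c


def bootstrap_ci(t, pairs, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval, resampling respondents with replacement."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(pairs)
    estimates = sorted(
        nsu_estimate(t, [pairs[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_boot)
    )
    lower = estimates[int((alpha / 2) * n_boot)]
    upper = estimates[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Hypothetical respondents: (number known in target group, network size).
sample = [(1, 200), (2, 300), (1, 250), (3, 400), (1, 150), (2, 350)]
print(nsu_estimate(100_000, sample), bootstrap_ci(100_000, sample))
```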
Size Estimation of Groups at High Risk of HIV/AIDS using
Network Scale Up in Kerman, Iran
Kerman, males aged 15-45: T = 132,651
Shokoohi M, Baneshi MR, Haghdoost AA. Size Estimation of Groups at High Risk of HIV/AIDS
using Network Scale Up in Kerman, Iran. Int J Prev Med. 2012 Jul;3(7):471-6.
Seems to be easy / but challenging
• What do you mean by ‘know’?
• Subgroups (Sex, Age Groups, Local/National)?
• How to define MARPs (IDU, FSW, MSM)?
Example – Data Collection Tool
Part 3
Network size estimation
Definition of social network
• Global network
• Active network
• Supportive network
• Sexual network
• Sub-networks
– Family
– Coworkers
– Classmates
– Sport
– …
Definition of “know”
• People whom you know and who know you, in
appearance or by name, with whom you can
interact, if needed.
AND
• With whom you have been in contact over the last two
years in person, or by telephone or e-mail
AND
• Living in your area/country
AND
• ………..
Direct methods
• Overall question: How many people do you know?
– Active network
– Supporting network
– Sub-networks
• Sub-groups (summation method)
– Family
– Coworkers
– Sports
– Ex-classmates
– Clubs
– Church
– ……..
Disadvantages of direct methods
• Reliability and validity issues
• Double counting in the summation method
Indirect methods
• C is estimated from how often respondents know members of sub-populations with known sizes (reference groups):
– Number of births in the last year
– Number of deaths due to cancer or car accidents in the last year
– Number of marriages in the last year
– Number of people with a specific first name
• It is a type of back calculation
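The back calculation described above can be sketched as follows. The formula used, c = t × (sum of people known across reference groups) / (sum of the real sizes of those groups), is the usual known-population scale-up estimator; the slide that showed the formula is not available as text, so treat the exact form as an assumption. Names and numbers are hypothetical.

```python
def network_size(t, known_counts, group_sizes):
    """Personal network size of one respondent.

    t:            size of the total population
    known_counts: number of people the respondent knows in each reference group
    group_sizes:  real (known) size of each reference group, same order
    """
    return t * sum(known_counts) / sum(group_sizes)


def average_network_size(t, respondents, group_sizes):
    """Average of the individual network sizes over all respondents."""
    sizes = [network_size(t, row, group_sizes) for row in respondents]
    return sum(sizes) / len(sizes)


# Hypothetical data: three reference groups, three respondents.
group_sizes = [10_000, 5_000, 5_000]
respondents = [[3, 1, 2], [1, 0, 1], [4, 2, 2]]
print(average_network_size(1_000_000, respondents, group_sizes))
```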
C – Network Size
Specific criteria for reference groups
• Prevalence between 0.1-4%
• one-syllable name
• Stable prevalence over time and in different
ethnicities
Back calculation of the size of reference groups
• At least 20 reference groups are needed in the first step
• Some of these reference groups may generate biased estimates
• Step by step, non-eligible reference groups have to be detected and dropped from the calculation:
– Ratio Method
– Regression Method
Ratio method algorithm
• Step 1: including all reference groups, calculate C
• Step 2: back-calculate the size of all reference groups (given C)
• Step 3: calculate the bias ratio (Real size / Estimated size) for every reference group
• Step 4: exclude the most biased reference group, and recalculate C
• Step 5: back-calculate the size of all remaining reference groups (given the new C)
• Step 6: recalculate the bias ratio for every reference group
• Step 7: check if all bias ratios are between 0.5 and 1.5
• Step 8: if not, go to step 4 and continue until all bias ratios fall between 0.5 and 1.5
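The steps above can be sketched as a short loop. This is an illustration under stated assumptions, not the lab solution: C is computed as t × (sum of mean m) / (sum of real sizes), the "most biased" group is taken to be the one whose ratio is farthest from 1, and every group is assumed to have a mean m above zero. All names and data are hypothetical.

```python
def estimate_c(t, mean_m, real_size, groups):
    """Average network size from the reference groups currently in use."""
    return t * sum(mean_m[g] for g in groups) / sum(real_size[g] for g in groups)


def ratio_method(t, mean_m, real_size, low=0.5, high=1.5):
    """Drop reference groups one at a time until all bias ratios are in range."""
    groups = list(real_size)
    while True:
        c = estimate_c(t, mean_m, real_size, groups)
        # Back-calculated size of group g given C is t * mean_m[g] / c.
        ratios = {g: real_size[g] / (t * mean_m[g] / c) for g in groups}
        if all(low <= r <= high for r in ratios.values()):
            return c, groups, ratios
        # Assumption: "most biased" means the ratio farthest from 1.
        worst = max(groups, key=lambda g: abs(ratios[g] - 1))
        groups.remove(worst)


# Hypothetical reference groups; group "D" is reported far too rarely.
real_size = {"A": 10_000, "B": 20_000, "C": 5_000, "D": 10_000}
mean_m = {"A": 2.0, "B": 4.0, "C": 1.0, "D": 0.4}
c, kept, ratios = ratio_method(1_000_000, mean_m, real_size)
print(c, kept)
```

The loop always terminates, because with a single remaining group the bias ratio is exactly 1.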
Computer Lab 1
Calculate the network size
C_estimation(withoutsolution).xml
Real vs. predicted size for 23 Ref. groups – Kerman NSU
[Scatter plot: predicted size (y-axis) vs. real size (x-axis) for the 23 reference groups; fitted line y = 0.4025x + 167392, R² = 0.4692]
Ratio Method – Kerman NSU
Columns: # Steps, C, Min-Ratio, Max-Ratio, Removing group. The Min-Ratio, Max-Ratio, and Removing group columns each have 14 entries against 15 steps, so they are listed in order rather than aligned to rows.
• C (Step 1 to Step 15): 268.6, 288.8, 333.1, 366, 414.7, 383.6, 379.2, 371, 365.9, 360.2, 382.5, 398.9, 386.1, 372.2, 365.5
• Min-Ratio: 0.06, 0.07, 0.08, 0.09, 0.23, 0.25, 0.27, 0.29, 0.41, 0.44, 0.46, 0.44, 0.49, 0.57
• Max-Ratio: 2.57, 2.61, 1.96, 1.91, 1.76, 1.74, 1.71, 1.68, 1.66, 1.57, 1.54, 1.49, 1.43, 1.41
• Removing group: m8, m1, m7, m5, m12, m21, m10, m19, m11, m16, m9, m15, m20, m14
Ratio Method – Kerman NSU
Plot real versus predicted size of reference groups
variable   Real      Estimate    Ratio
m2         478423    409956.1    1.17
m3         610018    535452.8    1.14
m4         252786    347730.6    0.73
m6         137200    103534.8    1.33
m13        206942    274524.2    0.75
m17        119784    113992.9    1.05
m18        249592    177264.2    1.41
m22        81321     143798.4    0.57
m23        73800     103534.8    0.71
[Plot: estimate (y-axis) vs. real size (x-axis) for the remaining reference groups]
Regression Method
• NSU assumes a linear association between the prevalence of reference groups in the society (e/t) and the average number of people respondents knew in each reference group (average of m)
• To detect reference groups that do not satisfy the linearity assumption, fit a regression line and calculate the standardized DFBETA (SDFBETA) for all reference groups.
• The reference group with the highest SDFBETA is excluded.
• The process is continued iteratively to remove all reference groups with SDFBETA higher than 3/√n (n is the number of reference groups)
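A rough Python sketch of this procedure follows. It computes the standardized DFBETA of the slope by refitting the line with each reference group left out, then removes groups iteratively. Two choices are assumptions: the absolute value of SDFBETA is compared with the cutoff, and n is the number of groups still in the regression at each step. Function names and data are hypothetical.

```python
import math


def fit_line(x, y):
    """Least squares fit of y = a + b*x; returns (a, b, residual_sd, sxx)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = mean_y - b * mean_x
    sse = sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y))
    return a, b, math.sqrt(sse / (n - 2)), sxx


def slope_dfbetas(x, y):
    """Standardized DFBETA of the slope for each observation (leave-one-out)."""
    _, b_full, _, sxx_full = fit_line(x, y)
    out = []
    for i in range(len(x)):
        _, b_drop, sd_drop, _ = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        out.append((b_full - b_drop) / (sd_drop / math.sqrt(sxx_full)))
    return out


def regression_method(x, y, labels):
    """Iteratively drop the group with the largest |SDFBETA| above 3/sqrt(n)."""
    x, y, labels = list(x), list(y), list(labels)
    while len(x) > 3:  # the leave-one-out fit needs at least 3 points
        scores = [abs(v) for v in slope_dfbetas(x, y)]
        worst = max(range(len(x)), key=scores.__getitem__)
        if scores[worst] <= 3 / math.sqrt(len(x)):
            break
        del x[worst], y[worst], labels[worst]
    return labels


# Hypothetical data: the last group lies far off the line through the others.
prevalence = [1, 2, 3, 4, 5, 10]
mean_known = [1.1, 1.9, 3.2, 3.8, 5.1, 30.0]
print(regression_method(prevalence, mean_known, ["a", "b", "c", "d", "e", "f"]))
```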
Regression Method – Iran NSU
• STATA commands:
reg meanm propm, beta
dfbeta
disp 3/sqrt(23)
id     propm      meanm      _dfbeta_1
m1     .015655    1.75703    .9383446
m2     .006402    2.00512    -.1981893
m3     .008163    2.61893    -.6207414
m4     .003383    1.70077    -.2671462
m5     .011498    2.1509     .3982361
m6     .001836    .506394    -.014008
m7     .008614    1.09974    .0457343
m8     .008182    .624041    -.2765261
m9     .003746    .910486    -.0042227
m10    .000292    .450128    .062993
m11    .000259    .329923    .0412282
m12    .000311    1.43478    -.2617438
m13    .002769    1.34271    -.0880711
m14    .000497    .381074    .0366919
m15    .000839    .731458    .050291
m16    .005542    1.2046     .0195055
m17    .001603    .557545    .009407
m18    .00334     .867008    -.0029265
m19    .000204    .283887    .0327013
m20    .000866    .754476    .0480117
m21    .000148    .242967    .0244004
m22    .001088    .703325    .043369
m23    .000988    .506394    .0317765
Real vs. Predicted Size
Ratio and Regression Methods
[Scatter plot: estimated size (y-axis) vs. real size (x-axis), with points and fitted lines for the regression and ratio methods]
Final Network Size
Ratio M. 380
Regression M. 308
Glob J Health Sci. 2013 Jun 17;5(4):217-27. doi: 10.5539/gjhs.v5n4p217.
The estimation of active social network size of the Iranian population.
Rastegari A, Haji-Maghsoudi S, Haghdoost A, Shatti M, Tarjoman T, Baneshi MR.
Part 4
Correction for biases in NSU
Main Biases in NSU
• Transmission effect: a respondent may be
unaware someone in his/her network engages
in the behavior of interest.
• Barrier effects: some subgroups may not
associate with members of the general
population.
NSU adjustment factors (1)
Transparency (also known as visibility ratio, transmission error, transparency rate, transmission rate, and masking)
Respondents may know people who are drug users but might not know whether they inject drugs, a phenomenon called information transmission error.
-> Failure to adjust for it may lead to an underestimate of the unknown size
NSU adjustment factors (2)
Barrier Effect (also known as popularity ratio, degree ratio)
People with high-risk behaviors might, on average, have smaller networks than the general population, making them less likely to be counted by individuals reporting on people they know.
-> Failure to adjust for it may lead to an underestimate of the unknown size
NSU adjustment factors (3)
Social Desirability Bias (also known as response bias)
Respondents may know people who are, for example, sex workers, but may be unwilling to provide this information because of the possible stigma involved.
-> Failure to adjust for it may lead to an underestimate of the unknown size
Mirzazadeh A, Haghdoost AA, Nedjat S, Navadeh S, McFarland W, Mohammad K. Accuracy of HIV-Related
Risk Behaviors Reported by Female Sex Workers, Iran: A Method to Quantify Measurement Bias in
Marginalized Populations. AIDS Behav. 2013 Feb;17(2):623-31
Game of contacts: transmission rate
• In a sample of the target high-risk population, say 300 IDU, we ask how many people they know with the first name A, B, etc.
• We then ask how many of those people (i.e. the people named A, B, …) know about the respondent’s behavior (e.g. injecting drugs).
• The transmission rate is estimated by dividing the total number of the respondents’ alters who are aware of their behavior by the total number of alters.
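The calculation described above is a single ratio of sums. A minimal sketch, with a hypothetical function name and made-up responses:

```python
def transmission_rate(responses):
    """Share of alters who are aware of the respondent's behavior.

    responses: one (alters_known, alters_aware) pair per respondent from
    the target population, where alters_aware <= alters_known.
    """
    total_alters = sum(known for known, _ in responses)
    total_aware = sum(aware for _, aware in responses)
    if total_alters == 0:
        raise ValueError("no alters reported")
    return total_aware / total_alters


# Hypothetical answers from three respondents.
print(transmission_rate([(10, 6), (8, 4), (12, 5)]))
```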
Game of contacts: popularity ratio
• The game of contacts estimates the relative personal network size of members of the high-risk population and the general population
– Selecting a list of first names
– Asking a sample of the general population how many people they know with one of the selected first names
– Asking a sample of the target population how many people they know with one of the selected first names
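The slide does not state how the two samples are combined. A natural reading, assumed here, is that the popularity ratio is the mean number of people known with the selected names in the target sample divided by the same mean in the general-population sample. Names and values below are hypothetical.

```python
def popularity_ratio(target_counts, general_counts):
    """Relative network size of the target vs. the general population.

    target_counts:  per-respondent counts of people known with the selected
                    first names, from the target-population sample
    general_counts: the same counts from the general-population sample
    """
    target_mean = sum(target_counts) / len(target_counts)
    general_mean = sum(general_counts) / len(general_counts)
    return target_mean / general_mean


# Hypothetical counts from two small samples.
print(popularity_ratio([4, 6, 5], [8, 10, 12]))
```

A ratio below 1 indicates that target-population members have, on average, smaller networks than the general population.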
Visibility and Popularity Factors – NSU Iran
• VF for
– IDU: 54% (95% CI: 50%, 58%)
– FSW: 44% (95% CI: 41%, 49%)
• PF for
– IDU: 69% (95% CI: 59%, 80%)
– FSW: 74% (95% CI: 68%, 81%)
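The slides do not spell out how these factors enter the size estimate. A common convention, assumed in this sketch, is to divide the crude NSU estimate by the product of the visibility factor and the popularity factor, which raises the estimate when either factor is below 1. The crude estimate below is a made-up number; only the IDU factors (0.54 and 0.69) come from the slide.

```python
def adjusted_estimate(crude_estimate, visibility_factor, popularity_factor):
    """Scale a crude NSU estimate up for transmission and barrier effects."""
    if not (0 < visibility_factor <= 1):
        raise ValueError("visibility factor must be in (0, 1]")
    if popularity_factor <= 0:
        raise ValueError("popularity factor must be positive")
    return crude_estimate / (visibility_factor * popularity_factor)


# Hypothetical crude estimate, adjusted with the IDU factors from the slide.
print(adjusted_estimate(100_000, visibility_factor=0.54, popularity_factor=0.69))
```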
Visibility and Popularity Factors – NSU Iran
Sample size = 12,814 people (400 per province)
Total Pop size = 75,149,699
Iran NSU                        Pop. Size Estimate (Point Estimate)   Pop. Size Estimate (95% CI)   Prevalence, % (95% CI)
Alcohol                         1,300,858                             (1,195,530 - 1,426,513)       1.73 (1.59 - 1.90)
Opium                           1,101,411                             (973,129 - 1,273,240)         1.47 (1.29 - 1.69)
Opium sap (Shireh)              493,156                               (437,521 - 565,938)           0.66 (0.58 - 0.75)
Amphetamine, ecstasy and LSD    224,357                               (205,823 - 247,362)           0.30 (0.27 - 0.33)
Crystal                         439,861                               (387,124 - 502,428)           0.59 (0.52 - 0.67)
Heroin / Crack                  262,344                               (235,188 - 296,184)           0.35 (0.31 - 0.39)
Marijuana / Hashish             352,592                               (311,572 - 402,857)           0.47 (0.41 - 0.54)
Any drug injection              207,722                               (182,671 - 238,363)           0.28 (0.24 - 0.32)
Validation Study: Social Desirability Bias
Mirzazadeh A, Haghdoost AA, Nedjat S, Navadeh S, McFarland W, Mohammad K.
Accuracy of HIV-Related Risk Behaviors Reported by Female Sex Workers, Iran: A
Method to Quantify Measurement Bias in Marginalized Populations. AIDS Behav.
2013 Feb;17(2):623-31
Mirzazadeh A, Mansournia MA, Nedjat S, Navadeh S, McFarland W, Haghdoost
AA, Mohammad K; Bias analysis to improve monitoring an HIV epidemic and its
response: approach and application to a survey of female sex workers in Iran; J
Epidemiol Community Health 2013;67:10 882-887 Published Online First: 27
June 2013
Key Resources
• Rastegari A, Haji-Maghsoudi S, Haghdoost A, Shatti M, Tarjoman T, Baneshi
MR. The estimation of active social network size of the Iranian
population. Glob J Health Sci. 2013 Jun 17;5(4):217-27
• Shokoohi M, Baneshi MR, Haghdoost AA. Size Estimation of Groups at
High Risk of HIV/AIDS using Network Scale Up in Kerman, Iran. Int J Prev
Med. 2012 Jul;3(7):471-6.
• Bernard HR, Hallett T, Iovita A, Johnsen EC, Lyerla R, McCarty C, Mahy M,
Salganik MJ, Saliuk T, Scutelniciuc O, Shelley GA, Sirinirund P, Weir S,
Stroup DF; Counting hard-to-count populations: the network scale-up
method for public health; Sex Transm Infect. 2010 Dec;86 Suppl 2:ii11-5.
• www.hivhub.ir (publications)
Thank You So Much