Network Scale-Up to Estimate the Population Size of High-Risk Groups for HIV
Methods Core Seminars – Center for AIDS Prevention Studies/UCSF – 20 Sep. 2013

Ali Mirzazadeh, MD, MPH, PhD
Institute for Health Policy Studies / Global Health Sciences Institute, UCSF, San Francisco, CA, USA [ali.mirzazadeh@ucsf.edu]
Regional Knowledge Hub, and WHO Collaborating Center for HIV Surveillance, Kerman University of Medical Sciences, Kerman, Iran [ali.mirzazadeh@hivhub.ir]

This presentation has the following parts
• P1: PSE methods overview
• P2: Network Scale-up method overview
• P3: Network size estimation
• P4: Correction for biases in NSU

Part 1
PSE methods overview

Why do MARP size estimates?
• Know, track, and predict your epidemic
– Disproportionate impact in low level, concentrated, and generalized epidemics
• Program planning
– Advocacy, development, M&E
• Because you were asked to
– UNAIDS, UNGASS, PEPFAR, MOH
• Resource allocation
– Right population, right priority, right amount on right programs

How to do MARP size estimates?
• There is no gold standard, no census
• We do not know which method is best
• We are not able to fully calibrate or correct
• Many methods
[Figure axes: scientific rigor, cost]

[Figure: size estimation methods placed by scientific rigor and cost, in three groups: done in surveys of the general population; done with surveys of MARPs; done by literature review, experts, stakeholders, models]
Methods shown: Census; Population-based survey; Network scale up; Oil wells; Multiple sample recapture; Capture-recapture; Plant recapture; Unique object multiplier; Truncated Poisson; Multipliers, multiple multipliers; Unique event multiplier; Mapping with census and enumeration; Nomination counting; Place, RAP, ethnography; Registries, police, SHC, drug treatment, unions, workplace; Wisdom of the crowds; Delphi; Consensus; Discrepancies; Soft modeling; Borrow from thy neighbor; Conventional wisdom; Straw man

Direct Methods
Done with surveys of MARPs

Direct questions to population-based surveys
Strengths
• Surveys are familiar
• Easy if a survey is underway
• Straightforward to analyze
• Sampling is easy to defend scientifically (“gold standard”)
Weaknesses
• Low precision when the behaviors are rare
• Respondents are unlikely to admit to stigmatized behaviors
• Only reaches people residing in households
• Privacy, confidentiality, risk to subjects
Mirzazadeh A, Haghdoost AA, Nedjat S, Navadeh S, McFarland W, Mohammad K. Accuracy of HIV-Related Risk Behaviors Reported by Female Sex Workers, Iran: A Method to Quantify Measurement Bias in Marginalized Populations. AIDS Behav. 2013 Feb;17(2):623-31

Census and enumeration
Ali Mirzazadeh, Faran Emmanuel, Fouzia Gharamah, Abdul Hamed Al-Suhaibi, Hamidreza Setayesh, Willi McFarland, Ali Akbar Haghdoost; HIV prevalence and related risk behaviors in men who have sex with men, Yemen 2011; AIDS Behav. 2013 Jul 23.
[Epub ahead of print]

Census and enumeration
Strengths
• It is a real count, not an estimate
• Can produce credible lower limit
• Can be used to inform other methods
• Use in program planning, implementation, evaluation
Weaknesses
• At-risk populations hidden, methods miss some members (China: multiply by 2 – 3!)
• Stigma may cause members to not identify themselves
• Time-consuming and expensive
• Staff safety
• Subject safety

Capture-recapture [figure]

Capture-recapture
Strengths
• Relatively easy
• Does not require much data
• When no other data or studies are available
Weaknesses
• 4 conditions hard to meet:
1) two samples must be independent, not correlated
2) each population member should have equal chance of selection
3) each member must be correctly identified as ‘capture’ or ‘recapture’
4) no major in/out migration

Nomination method
S Navadeh, A Mirzazadeh, L Mousavi, AA Haghdoost, N Fahimfar, A Sedaghat; HIV, HSV2 and Syphilis Prevalence in Female Sex Workers in Kerman, South-East Iran; Using Respondent-Driven Sampling. Iran J Public Health. 2012 Dec 1;41(12):60-5. Print 2012.

Nomination method
Strengths
• Relatively easy
• Snowball or chain sampling methods
Weaknesses
• Need the target group to be connected/network
• Time consuming
• Broken chains
• Biased to visible and accessible part of a target population (new statistical methods coming)
• Promise to provide services

Multiplier methods
[Figure label: STI Clinic]
Johnston LG, Prybylski D, Raymond HF, Mirzazadeh A, Manopaiboon C, McFarland W. Incorporating the Service Multiplier Method in Respondent-Driven Sampling Surveys to Estimate the Size of Hidden and Hard-to-Reach Populations: Case Studies From Around the World. Sex Transm Dis.
2013 Apr;40(4):304-10

Multiplier methods
Strengths
• Uses available data sources
• Flexible in sampling methods
• When already doing an IBBSS
Weaknesses
• Two sources of data must be independent
• Data sources must define population in the same way
• Time periods, age, geographic areas must align
• Inaccuracy of program data and survey data

Indirect Methods
Done in surveys of the general population

Proxy respondent method
[Figure labels: Respondent; Proxy Respondent (Alter); Member of Hidden Pop.]
Mirzazadeh A, Danesh A, Haghdoost AA. Network scale-up and proxy respondent methods in prisons [ongoing]

Proxy respondent method
Strengths
• Estimates from general population rather than hard-to-reach populations
• Doesn’t require directly asking sensitive questions or lengthy behavioral survey
Weaknesses
• Some subgroups may not associate with members of the general population
• Respondent may be unaware the alter engages in the behavior of interest
• Biases may arise by types of questions asked

Network scale-up
Shokoohi M, Baneshi MR, Haghdoost AA. Size Estimation of Groups at High Risk of HIV/AIDS using Network Scale Up in Kerman, Iran. Int J Prev Med. 2012 Jul;3(7):471-6.

Network scale-up
Strengths
• Estimates from general population rather than hard-to-reach populations
• Doesn’t require directly asking sensitive questions or lengthy behavioral survey
Weaknesses
• Average personal network size difficult to estimate
• Some subgroups may not associate with members of the general population
• Respondent may be unaware someone in network engages in the behavior of interest
• Biases may arise by types of questions asked

Part 2
NSU method overview

NSU Basic Concepts
• A random sample of the general population describes their social networks
– network sizes (C)
– the presence of individuals belonging to special sub-populations of interest
• Based on the prevalence and presence of sub-populations in the social networks of the selected sample, the sizes of the hidden sub-populations in a community are estimated.

NSU – Main Questions
• How many people have you known over the past two years?
• Of those, how many injected drugs (over the past two years)?
• Do you know at least one person in your network who injected drugs (over the past two years)?

NSU – Frequency Approach
• T = total population, of size t
• C = one individual’s acquaintances (personal network size), of size c
• m = the number of individuals belonging to the target population among those acquaintances
• E = the hidden population, of size e
$\frac{e}{t} = \frac{m}{c}$

NSU – Frequency Approach [figure]

NSU – Probability Approach
[Figure: probability approach compared with frequency approach]

Confidence Interval 95% – Conventional
• Frequency approach (P = m/c, se = standard error of P):
– Point estimate: E = P × t
– 95% CI upper limit: E = (P + 1.96 se) × t
– 95% CI lower limit: E = (P − 1.96 se) × t

Confidence Interval – Bootstrap [figure]

Size Estimation of Groups at High Risk of HIV/AIDS using Network Scale Up in Kerman, Iran
Kerman: T = 132,651 (males aged 15-45)
Shokoohi M, Baneshi MR, Haghdoost AA. Size Estimation of Groups at High Risk of HIV/AIDS using Network Scale Up in Kerman, Iran. Int J Prev Med. 2012 Jul;3(7):471-6.
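
The frequency approach and the two confidence-interval slides can be sketched in a few lines of Python. This is only an illustration: the respondent answers are hypothetical, the pooled form e = t × Σm / Σc is one common way to combine respondents, and the binomial-type standard error is an assumption, since the slides do not define se. Only t = 132,651 is taken from the Kerman example.

```python
import math
import random


def nsu_frequency_estimate(m, c, t):
    """Pooled frequency approach: P = sum(m) / sum(c), E = P * t."""
    p = sum(m) / sum(c)
    return p, p * t


def conventional_ci(p, c, t, z=1.96):
    """E = (P +/- 1.96 se) * t, with an assumed binomial-type se."""
    se = math.sqrt(p * (1 - p) / sum(c))  # assumption, not from the slides
    return (p - z * se) * t, (p + z * se) * t


def bootstrap_ci(m, c, t, reps=2000, seed=1):
    """Percentile bootstrap: resample respondents with replacement."""
    rng = random.Random(seed)
    n = len(m)
    estimates = []
    for _ in range(reps):
        idx = [rng.randrange(n) for _ in range(n)]
        estimates.append(t * sum(m[i] for i in idx) / sum(c[i] for i in idx))
    estimates.sort()
    return estimates[int(0.025 * reps)], estimates[int(0.975 * reps) - 1]


# Hypothetical respondents: m = members of the target group known,
# c = personal network size of the same respondent.
m = [0, 1, 0, 2, 0, 1, 3, 0]
c = [250, 300, 200, 400, 350, 300, 500, 150]
t = 132_651  # Kerman, males aged 15-45 (from the slide)

p, e = nsu_frequency_estimate(m, c, t)
low, high = conventional_ci(p, c, t)
b_low, b_high = bootstrap_ci(m, c, t)
print(f"P = {p:.5f}, E = {e:.0f}")
print(f"conventional 95% CI: {low:.0f} to {high:.0f}")
print(f"bootstrap 95% CI: {b_low:.0f} to {b_high:.0f}")
```

With so few respondents the two intervals can differ noticeably; the bootstrap makes no distributional assumption about P, which is the usual reason for preferring it.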

Seems to be easy / but challenging
• What do you mean by ‘know’?
• Subgroups (Sex, Age Groups, Local/National)?
• How to define MARPs (IDU, FSW, MSM)?

Example – Data Collection Tool [figure]

Part 3
Network size estimation

Definition of social network
• Global network
• Active network
• Supportive network
• Sexual network
• Sub-networks
– Family
– Coworkers
– Classmates
– Sport
– …

Definition of “know”
• People whom you know and who know you, by appearance or by name, with whom you can interact if needed
AND
• With whom you have had contact over the last two years, in person or by telephone or e-mail
AND
• Living in your area/country
AND
• ………..

Direct methods
• Overall question: How many do you know?
– Active network
– Supporting network
– Sub-networks
• Sub-groups (summation method)
– Family
– Coworkers
– Sports
– Ex-classmates
– Clubs
– Church
– ……..

Disadvantages of direct methods
• Reliability and validity issues
• Double counting in summation method

Indirect methods
• C is estimated based on the frequency of members belonging to sub-populations with known sizes (reference groups):
– Number of births in the last year
– Number of deaths due to cancer/car accident in the last year
– Number of marriages in the last year
– Number of people with a specific first name
• It is a type of back calculation

C – Network Size [figure]

Specific criteria for reference groups
• Prevalence between 0.1% and 4%
• One-syllable name
• Stable prevalence over time and in different ethnicities

Back calculation of the size of reference groups
• At least 20 reference groups are needed in the first step
• Some of these reference groups may generate biased estimates
• Step by step, non-eligible reference groups have to be detected and dropped from the calculation:
– Ratio Method
– Regression Method

Ratio method algorithm
• Step 1: including all reference groups, calculate C
• Step 2: back-calculate the size of all reference groups (given C)
• Step 3: calculate bias ratio [(Real size/Estimated
size)–1] for every reference group
• Step 4: exclude the most biased reference group, and recalculate C
• Step 5: back-calculate the size of all remaining reference groups (given new C)
• Step 6: recalculate bias ratio for every reference group
• Step 7: check whether the Real size/Estimated size ratio of every reference group is between 0.5 and 1.5
• Step 8: if not, go to step 4 and continue until all ratios fall between 0.5 and 1.5

Computer Lab 1
Calculate the network size
C_estimation(withoutsolution).xml

Real vs. predicted size for 23 Ref. groups – Kerman NSU
[Figure: scatter plot of predicted size against real size; fitted line y = 0.4025x + 167392, R² = 0.4692]

Ratio Method – Kerman NSU
Step   C       Min ratio   Max ratio   Group removed
1      268.6   -           -           m8
2      288.8   0.06        2.57        m1
3      333.1   0.07        2.61        m7
4      366     0.08        1.96        m5
5      414.7   0.09        1.91        m12
6      383.6   0.23        1.76        m21
7      379.2   0.25        1.74        m10
8      371     0.27        1.71        m19
9      365.9   0.29        1.68        m11
10     360.2   0.41        1.66        m16
11     382.5   0.44        1.57        m9
12     398.9   0.46        1.54        m15
13     386.1   0.44        1.49        m20
14     372.2   0.49        1.43        m14
15     365.5   0.57        1.41        -

Ratio Method – Kerman NSU
Variable   Real      Estimate    Ratio
m2         478423    409956.1    1.17
m3         610018    535452.8    1.14
m4         252786    347730.6    0.73
m6         137200    103534.8    1.33
m13        206942    274524.2    0.75
m17        119784    113992.9    1.05
m18        249592    177264.2    1.41
m22        81321     143798.4    0.57
m23        73800     103534.8    0.71
[Figure: plot of real versus predicted size of reference groups]

Regression Method
• NSU assumes a linear association between the prevalence of reference groups in the society (e/t) and the average number of people respondents knew in each reference group (average of m)
• To detect reference groups that do not satisfy the linearity assumption, fit a regression line and calculate the standardized DFBETA (SDFBETA) for all reference groups.
• The reference group with the highest SDFBETA is excluded.
• The process is continued in an iterative fashion to remove all reference groups with SDFBETA higher than 3/√n (n is the number of reference groups)

Regression Method – Iran NSU [figure]

Regression Method – Iran NSU
• STATA commands:
    reg meanm propm, beta
    dfbeta
    disp 3/sqrt(23)

id     propm      meanm      _dfbeta_1
m1     .015655    1.75703     .9383446
m2     .006402    2.00512    -.1981893
m3     .008163    2.61893    -.6207414
m4     .003383    1.70077    -.2671462
m5     .011498    2.1509      .3982361
m6     .001836    .506394    -.014008
m7     .008614    1.09974     .0457343
m8     .008182    .624041    -.2765261
m9     .003746    .910486    -.0042227
m10    .000292    .450128     .062993
m11    .000259    .329923     .0412282
m12    .000311    1.43478    -.2617438
m13    .002769    1.34271    -.0880711
m14    .000497    .381074     .0366919
m15    .000839    .731458     .050291
m16    .005542    1.2046      .0195055
m17    .001603    .557545     .009407
m18    .00334     .867008    -.0029265
m19    .000204    .283887     .0327013
m20    .000866    .754476     .0480117
m21    .000148    .242967     .0244004
m22    .001088    .703325     .043369
m23    .000988    .506394     .0317765

Real vs. Predicted Size – Ratio and Regression Methods
[Figure: estimate against real size for the regression and ratio methods, each with a linear fit]
Final network size: Ratio method 380; Regression method 308
Rastegari A, Haji-Maghsoudi S, Haghdoost A, Shatti M, Tarjoman T, Baneshi MR. The estimation of active social network size of the Iranian population. Glob J Health Sci. 2013 Jun 17;5(4):217-27. doi: 10.5539/gjhs.v5n4p217.

Part 4
Correction for biases in NSU

Main Biases in NSU
• Transmission effect: a respondent may be unaware that someone in his/her network engages in the behavior of interest.
• Barrier effects: some subgroups may not associate with members of the general population.
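
Before the corrections are covered, the ratio-method algorithm from Part 3 can be sketched in Python using the propm (e/t) and meanm values listed with the Stata commands. Two points are assumptions rather than statements from the slides: C is computed as the sum of meanm divided by the sum of propm, and "most biased" is read as the largest absolute value of (Real/Estimated − 1).

```python
# propm = known prevalence e/t, meanm = mean number known per respondent,
# in the order m1 ... m23 as listed on the regression slide.
PROPM = [.015655, .006402, .008163, .003383, .011498, .001836, .008614,
         .008182, .003746, .000292, .000259, .000311, .002769, .000497,
         .000839, .005542, .001603, .00334, .000204, .000866, .000148,
         .001088, .000988]
MEANM = [1.75703, 2.00512, 2.61893, 1.70077, 2.1509, .506394, 1.09974,
         .624041, .910486, .450128, .329923, 1.43478, 1.34271, .381074,
         .731458, 1.2046, .557545, .867008, .283887, .754476, .242967,
         .703325, .506394]
GROUPS = {f"m{i + 1}": pm for i, pm in enumerate(zip(PROPM, MEANM))}


def network_size(groups):
    """Assumed estimator: C = sum(mean m) / sum(e/t) over reference groups."""
    return (sum(m for _, m in groups.values())
            / sum(p for p, _ in groups.values()))


def ratio_method(groups, low=0.5, high=1.5):
    """Drop the most biased group until all ratios lie in [low, high]."""
    groups = dict(groups)
    history = []  # one (C, group removed or None) pair per step
    while True:
        c = network_size(groups)
        # Real/Estimated size = (e/t) / (mean m / C)
        ratios = {g: p * c / m for g, (p, m) in groups.items()}
        if all(low <= r <= high for r in ratios.values()):
            history.append((c, None))
            return c, list(groups), history
        worst = max(ratios, key=lambda g: abs(ratios[g] - 1))
        history.append((c, worst))
        del groups[worst]


final_c, kept, history = ratio_method(GROUPS)
for step, (c, removed) in enumerate(history, start=1):
    print(step, round(c, 1), removed)
```

On the listed values the first pass gives C ≈ 268.6 and removes m8, and the second gives C ≈ 288.8 and removes m1, in line with the first two rows of the ratio-method table. Later steps may drift slightly from the table because the inputs are rounded.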

NSU adjustment factors (1)
Transparency (also known as visibility ratio, transmission error, transparency rate, transmission rate, and masking)
Respondents may know people who are drug users but might not know whether they inject drugs, a phenomenon called information transmission error
-> Failure to adjust for it may lead to an underestimate of the unknown size

NSU adjustment factors (2)
Barrier Effect (also known as popularity ratio, degree ratio)
People with high-risk behaviors might, on average, have smaller networks than the general population, making them less likely to be counted by individuals reporting on people they know.
-> Failure to adjust for it may lead to an underestimate of the unknown size

NSU adjustment factors (3)
Social Desirability Bias (also known as response bias)
Respondents may know people who are, for example, sex workers, but may be unwilling to provide this information because of the possible stigma involved.
-> Failure to adjust for it may lead to an underestimate of the unknown size
Mirzazadeh A, Haghdoost AA, Nedjat S, Navadeh S, McFarland W, Mohammad K. Accuracy of HIV-Related Risk Behaviors Reported by Female Sex Workers, Iran: A Method to Quantify Measurement Bias in Marginalized Populations. AIDS Behav. 2013 Feb;17(2):623-31

Game of contacts: transmission rate
• In a sample of the target high-risk population, say 300 IDU, we ask each respondent how many people they know with the first name A, B, etc.
• We then ask how many of those people (i.e. those named A, B…) know about the respondent's behavior (e.g. injecting drugs).
• The transmission rate is estimated by dividing the total number of respondents' alters who are aware of their behavior by the total number of alters.
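
The transmission-rate calculation described above amounts to a ratio of two sums, as in the following Python sketch. The respondent data are hypothetical, and the final step (dividing a crude NSU estimate by the transmission rate) is the usual way such a factor is applied, stated here as an assumption because the slides do not give the adjustment formula.

```python
def transmission_rate(responses):
    """responses: one (alters_known, alters_aware) pair per respondent.

    Returns the total number of aware alters divided by the total
    number of alters, pooled over all respondents.
    """
    total_alters = sum(known for known, _ in responses)
    aware_alters = sum(aware for _, aware in responses)
    return aware_alters / total_alters


def adjust_for_transmission(crude_estimate, rate):
    """Assumed adjustment: scale the crude estimate up by 1 / rate."""
    return crude_estimate / rate


# Hypothetical answers from four members of the target population:
# (people known with the listed first names, of whom aware of the behavior)
responses = [(10, 6), (8, 4), (12, 5), (5, 3)]

rate = transmission_rate(responses)
print(f"transmission rate = {rate:.3f}")
print(f"adjusted size = {adjust_for_transmission(1000, rate):.0f}")
```

Pooling the counts before dividing gives respondents with larger networks more weight than averaging per-respondent rates would; which of the two is preferable depends on the study design.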

Game of contacts: transmission rate [figure]

Game of contacts: popularity ratio
• The game of contacts estimates the relative personal network size of members of the high-risk population and the general population
– Selecting a list of first names
– Asking a sample of the general population how many people they know with one of the selected first names
– Asking a sample of the target population how many people they know with one of the selected first names

Game of contacts: popularity ratio [figure]

Visibility and Popularity Factors – NSU Iran
• VF for
– IDU: 54% (95% UL: 50%, 58%)
– FSW: 44% (95% UL: 41%, 49%)
• PF for
– IDU: 69% (95% CI: 59%, 80%)
– FSW: 74% (95% CI: 68%, 81%)

Visibility and Popularity Factors – NSU Iran
Sample size = 12814 people (400 per province)
Total Pop size = 75,149,699

Iran NSU                        Pop. size estimate: point estimate (95% CI)   Prevalence, % (95% CI)
Alcohol                         1,300,858 (1,195,530 - 1,426,513)             1.73 (1.59 - 1.90)
Opium                           1,101,411 (973,129 - 1,273,240)               1.47 (1.29 - 1.69)
Opium sap (Shireh)              493,156 (437,521 - 565,938)                   0.66 (0.58 - 0.75)
Amphetamine, ecstasy and LSD    224,357 (205,823 - 247,362)                   0.30 (0.27 - 0.33)
Crystal                         439,861 (387,124 - 502,428)                   0.59 (0.52 - 0.67)
Heroin / Crack                  262,344 (235,188 - 296,184)                   0.35 (0.31 - 0.39)
Marijuana / Hashish             352,592 (311,572 - 402,857)                   0.47 (0.41 - 0.54)
Any drug injection              207,722 (182,671 - 238,363)                   0.28 (0.24 - 0.32)

Validation Study: Social Desirability Bias
Mirzazadeh A, Haghdoost AA, Nedjat S, Navadeh S, McFarland W, Mohammad K. Accuracy of HIV-Related Risk Behaviors Reported by Female Sex Workers, Iran: A Method to Quantify Measurement Bias in Marginalized Populations. AIDS Behav.
2013 Feb;17(2):623-31
Mirzazadeh A, Mansournia MA, Nedjat S, Navadeh S, McFarland W, Haghdoost AA, Mohammad K. Bias analysis to improve monitoring an HIV epidemic and its response: approach and application to a survey of female sex workers in Iran. J Epidemiol Community Health 2013;67(10):882-887. Published Online First: 27 June 2013

Key Resources
• Rastegari A, Haji-Maghsoudi S, Haghdoost A, Shatti M, Tarjoman T, Baneshi MR. The estimation of active social network size of the Iranian population. Glob J Health Sci. 2013 Jun 17;5(4):217-27.
• Shokoohi M, Baneshi MR, Haghdoost AA. Size Estimation of Groups at High Risk of HIV/AIDS using Network Scale Up in Kerman, Iran. Int J Prev Med. 2012 Jul;3(7):471-6.
• Bernard HR, Hallett T, Iovita A, Johnsen EC, Lyerla R, McCarty C, Mahy M, Salganik MJ, Saliuk T, Scutelniciuc O, Shelley GA, Sirinirund P, Weir S, Stroup DF. Counting hard-to-count populations: the network scale-up method for public health. Sex Transm Infect. 2010 Dec;86 Suppl 2:ii11-5.
• www.hivhub.ir (publications)

Thank You So Much