Plausible values and Plausibility Range

Prevalence of FSWs in some west African Countries
[Figure: prevalence ranging from 0.1% to 4.3%]

Plausible values
• In west African countries, the prevalence of FSWs ranged from 0.1% to 4.3%.
• Suppose you implement a study in another country in this region and get a prevalence of 10%.
• How plausible is this figure?
• Did you implement the study in high-risk locations?
• What are the potential biases in your study (selection of respondents, data collection, …)?
• What are the main cultural and socioeconomic differences between this country and the others?

Comparison of prevalence across risk zones
• Suppose we have stratified the country into low-, intermediate- and high-risk zones.
• We have selected one province from each zone.
• The prevalence in the low-risk zone was higher than in the high-risk zone.
• How plausible is it?
• Did you implement a standard approach in all provinces?
• Did you train the interviewers of the study?
• Did you use the right criteria to define the risk zones?

Point Estimate vs. plausible range
• One of the aims of statistics is to estimate population parameters from sample statistics
• For example, in a randomly selected sample of 200 prisoners, 25 report sharing injection equipment
• Thus, in the sample, 12.5% of the prisoners share injection equipment
• This value of 12.5% is called a point estimate of the population proportion

Sampling Variation
• A point estimate is a value derived from one randomly selected sample
• We use it as the best guess for the population parameter
• What would happen if we selected another random sample?
• If you repeat the mapping or the NSU survey, do you expect to get the same estimates?
• What is the impact of respondents, locations, and time …

Construction of a Range
• It is preferable to report a range of possible values instead of a single point estimate
• It is conventional to construct a 95% range: over repeated samples, about 95% of ranges constructed this way contain the true value of the parameter of interest
• The width of the range gives some idea of the uncertainty about the unknown parameter
• A very wide interval may indicate that more data should be collected before anything definite can be said about the parameter.

Advantages of Reporting a Range
• Point estimation gives us a single value as an estimate of the population parameter
• Interval estimation gives us a range of values that is likely to contain the population parameter
• At the same confidence level, a narrower interval is more desirable than a wider one because it shows that the population parameter is estimated more precisely

Interpretation of Range
• The upper and lower bounds of the interval tell us how big or small the true parameter might be
• A wide range indicates great uncertainty about the true value of the parameter

Different Names for Range
• Statistical terminology
– Confidence Interval
– Uncertainty Limit
– Credibility Interval
• Non-statistical terminology (in this course)
– Plausibility Range

How to Construct Statistical Ranges?
• Standard formulas based on the normal approximation
• Monte Carlo
• Bootstrapping
– Resample with replacement from the original sample
– Estimate the parameter of interest in each resample
– Use the 2.5th and 97.5th percentiles of these estimates as the lower and upper bounds

Application of available formulas
• To estimate the number of IDUs, a capture-recapture study was implemented:
[Worked example on the following four slides: content not recovered]

How to Construct Non-Statistical Ranges?
• In the following slides we introduce some approaches followed by other researchers
• In addition, we introduce some other approaches based on common sense

Other Countries Experience
• Indonesia applied the following formula:
$$\sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}}$$
– $X_i$ = estimated size in district $i$
– $\bar{X}$ = mean of district sizes
– $n$ = number of districts
• They probably used this statistic as the SE and applied normal approximation theory

Ad Hoc Methods (1)
• In another study, time-varying parameters were assigned uncertainty bounds in the model of up to ± 50% of the best parameter estimates.
• For example, with bounds of ± 20%:
– Parameter estimate: 50000
– 20% × 50000 = 10000
– Uncertainty bounds: 50000 ± 10000
– (40000, 60000)

Ad Hoc Methods (2)
• Ask respondents to provide a range instead of a single value
• For example, in NSU, ask respondents for the minimum and maximum number of FSWs they know
• Analyzing the minimum counts should provide the lower bound of the plausibility range
• Analyzing the maximum counts should provide the upper bound of the plausibility range
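
The range constructions described in these slides can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the method used in any of the studies mentioned: the function names are hypothetical, the normal approximation uses the usual standard error of a proportion with z = 1.96, the bootstrap uses 2000 resamples with a fixed seed, and the inputs are the prisoner example (25 out of 200) and the ad hoc example (50000 ± 20%) from the slides.

```python
import math
import random


def normal_approx_range(successes, n, z=1.96):
    """95% range for a proportion using the normal approximation."""
    p = successes / n
    se = math.sqrt(p * (1 - p) / n)  # standard error of a proportion
    return max(0.0, p - z * se), min(1.0, p + z * se)


def bootstrap_range(successes, n, n_resamples=2000, seed=0):
    """95% percentile bootstrap range for a proportion.

    n_resamples and seed are illustrative choices, not from the slides.
    """
    rng = random.Random(seed)
    # Rebuild the original sample as 1s (shares equipment) and 0s (does not).
    sample = [1] * successes + [0] * (n - successes)
    estimates = []
    for _ in range(n_resamples):
        resample = rng.choices(sample, k=n)  # resampling with replacement
        estimates.append(sum(resample) / n)
    estimates.sort()
    # Simple index-based 2.5th and 97.5th percentiles of the estimates.
    lower = estimates[int(0.025 * n_resamples)]
    upper = estimates[int(0.975 * n_resamples) - 1]
    return lower, upper


def percent_range(best_estimate, fraction):
    """Ad hoc range: best estimate plus or minus a fixed fraction of it."""
    margin = fraction * best_estimate
    return best_estimate - margin, best_estimate + margin


if __name__ == "__main__":
    print("Normal approximation:", normal_approx_range(25, 200))
    print("Bootstrap percentiles:", bootstrap_range(25, 200))
    print("Ad hoc +/- 20%:", percent_range(50000, 0.20))
```

For the prisoner example, the normal approximation gives a range of roughly 7.9% to 17.1% around the point estimate of 12.5%, and the bootstrap percentiles should land close to that; the ad hoc function reproduces the (40000, 60000) bounds.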