Cross-Sectional Studies
Son Hee Jung
2013/03/25

Types of Epidemiological Studies

Type of study         Alternative name    Unit
Experimental
  RCT                 Clinical trial      Individuals
Observational
  Ecological          Correlational       Population
  Cross-sectional     Prevalence          Individuals
  Case-control        Case-reference      Individuals
  Cohort              Follow-up           Individuals

Study Designs & Corresponding Questions
• Cross-sectional: How common is this disease or condition?
• Ecologic: What explains differences between groups?
• Case-control: What factors are associated with having a disease?
• Prospective: How many people will get the disease? What factors predict development?

Contents
• Definition
• Basic approach
• Advantages & disadvantages
• Sampling
• Measures of disease
  – Prevalence
• Bias

Cross-sectional study: definition
• The study is carried out on the study population at a single point in time, collecting information on both exposure to the factor and disease

Cross-sectional study: characteristics

Basic approach
• Include a sample of all persons in a population at a given time, without regard to exposure or disease status
• Exposure and disease are typically assessed at that one time
• Exposure subpopulations can be compared with respect to disease prevalence

Basic approach (continued)
• For some questions, the temporal ordering between exposure and disease is clear, and cross-sectional studies can test hypotheses
  – Example: genotype, blood type
• When the temporal ordering is not clear, they can be used to examine relations between exposure and outcome descriptively and to generate hypotheses
• A cross-sectional study can be combined with follow-up to create a cohort study

Basic approach (continued)
• Issues with addressing etiology
  – Temporal ordering between exposure and outcome cannot be assured
  – Length-biased sampling
    • Cases with long duration will be over-represented

Cross-Sectional Studies: Advantages
• Inexpensive for common diseases
• Should be able to get a better response rate than other study designs
• Relatively short study duration
• Can be addressed to specific populations of interest

Cross-Sectional Studies: Disadvantages
• Unsuitable for rare or short-duration diseases
• A high refusal rate may make accurate prevalence estimates impossible
• More expensive and time-consuming than case-control studies
• No data on the temporal relationship between risk factors and disease development

Why sample?

Sampling from the source population

Non-probability sampling
• Common convenience sampling methods
  – Street surveys
    • Use a convenient place such as a mall or hospital
  – Mail-out questionnaires
    • Most dangerous
    • Respondents tend to be those who feel very strongly about the issue → bias
  – Volunteer call
    • Selection bias

Non-probability sampling: convenience sampling
• Select a sample through an easy, simple, or inexpensive method
• Problems
  – High risk of creating a bias
  – May provide misleading information
  – Can be accepted, but…
    • Be careful in assessing such samples and the results they produce

Basic probability sampling
• Simple random sampling
  – Each sample of the chosen size has the same probability of being selected

Basic probability sampling (continued)
• Systematic sampling
  – Obtain a list of the available population, ordered according to an unrelated factor
  – Pick a number n as the step size
  – Pick every n-th subject on the list

Stratified random sampling
Cluster random sampling
Multistage sampling

The National Health and Nutrition Examination Survey (NHANES)
NHANES Interviews & Examinations
NHANES Sample Design
Analyses of NHANES Data
Weighting in NHANES
NHANES base probability of selection
Oversampling
Sample Weights
Why weight?
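One way to see why weighting is needed is to compute nonresponse weights as the inverse of each group's response rate. The sketch below uses the counts from the simple weighting example in these slides (100 men and 100 women sampled; 80 men and 75 women responding). The function name `nonresponse_weight` and the dictionary layout are assumptions made for illustration; this is not how NHANES or any survey package computes its weights.

```python
def nonresponse_weight(n_sampled: int, n_responded: int) -> float:
    """Weight per respondent = 1 / response rate = sampled / responded."""
    if n_responded <= 0:
        raise ValueError("at least one respondent is required")
    return n_sampled / n_responded


# (sampled, responded) per group; counts taken from the slide example.
sample = {"male": (100, 80), "female": (100, 75)}

weights = {sex: nonresponse_weight(n, r) for sex, (n, r) in sample.items()}

for sex, w in weights.items():
    n_sampled, n_responded = sample[sex]
    # Weighted respondents add back up to the number originally sampled.
    print(f"{sex}: weight = {w:.2f}, weighted total = {w * n_responded:.0f}")
```

Each respondent stands in for slightly more than one sampled person, so the weighted respondents again represent the full sample of 100 in each group.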
Probability weight: simple example

Example of weighting
• Imagine 100 males and 100 females in the sample
• But only 80 males and 75 females respond
• Each male respondent gets a weight of
  – 100/80 = 1/(80/100) = 1.25
• Each female respondent gets a weight of
  – 100/75 = 1/(75/100) ≈ 1.33

Example: sampling method of the Korea National Health and Nutrition Examination Survey (KNHANES)

Multistage sampling
• A method devised to overcome the practical difficulties of simple random sampling
  – Used in nationwide opinion polls
  – A “series” of simple random samples in stages
• Korea National Health and Nutrition Examination Survey
  – [Diagram: country → (random sampling) → province/metropolitan city → (random sampling) → city/county/district → (random sampling) → town/township/neighborhood]

Estimating prevalence: applying weights
• Purpose: weights are used so that the KNHANES sample represents the Korean population

Direct age adjustment: before

Crude mortality
Population   Population size   No. of deaths   Death rate per 100,000
A            900,000           862             96
B            900,000           1,130           126

Age-specific mortality
             Population A                                    Population B
Age group    Population   No. of deaths   Rate per 100,000   Population   No. of deaths   Rate per 100,000
All ages     900,000      862             96                 900,000      1,130           126
30-49        500,000      60              12                 300,000      30              10
50-69        300,000      396             132                400,000      400             100
70+          100,000      406             406                200,000      700             350

Direct age adjustment: after

Age group    Standard population   A rate per 100,000   Expected deaths (A rates)   B rate per 100,000   Expected deaths (B rates)
30-49        800,000               12                   96                          10                   80
50-69        700,000               132                  924                         100                  700
70+          300,000               406                  1,218                       350                  1,050
All ages     1,800,000                                  2,238                                            1,830

Age-adjusted rates per 100,000
• A: 2,238 / 1,800,000 × 100,000 = 124.3
• B: 1,830 / 1,800,000 × 100,000 = 101.7
• After adjustment, the rate for A exceeds the rate for B, reversing the crude comparison (96 vs. 126)

Indirect age adjustment (Standardized Mortality Ratio)
• When to use
  – The number of deaths in each age-specific stratum is not available
  – Studying mortality in an occupationally exposed population
• Definition

$$\mathrm{SMR} = \frac{\text{Observed number of deaths per year}}{\text{Expected number of deaths per year}} \times 100$$

• SMR of 100
  – The observed number of deaths is the same as the expected number of deaths

Sampling, inference, and generalization

If you tell the truth you don't have to remember anything.
by Mark Twain, 1894

Why do we measure disease prevalence?
Measuring burden: prevalence
Prevalence
Person-time at risk: exposed and unexposed
Censored individuals
Censoring
Measuring prevalence
Point and period prevalence: example
Point prevalence at several time points
Period prevalence
Lifetime prevalence
Lifetime prevalence 4/5
Prevalence of diabetes
Utility of prevalence
Sloppy use of risk
Sloppy use of rate
Classic example of a rate that is not a rate
Case fatality (rate?)
Proportional mortality (rate?)
Total deaths, United States, 2004
Deaths, U.S., 2004, ages 20-24 years
What's a possible inferential problem with proportional mortality?
Measuring risk: cumulative incidence
Cumulative incidence is a proportion
Calculating the cumulative incidence
Odds

Odds and probabilities
• The higher the incidence, the greater the discrepancy between odds and probability
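The growing gap between odds and probability can be checked numerically. The sketch below assumes the standard definition odds = p / (1 − p); the helper name `odds` and the example probabilities are illustrative choices, not values from the slides.

```python
def odds(p: float) -> float:
    """Convert a probability (e.g. a cumulative incidence) to odds: p / (1 - p)."""
    if not 0 <= p < 1:
        raise ValueError("p must satisfy 0 <= p < 1")
    return p / (1 - p)


# Illustrative probabilities, from rare to common outcomes.
for p in (0.01, 0.05, 0.10, 0.25, 0.50):
    o = odds(p)
    # The relative gap (odds - p) / p simplifies algebraically to p / (1 - p).
    print(f"p = {p:.2f}  odds = {o:.3f}  relative gap = {(o - p) / p:.1%}")
```

For a rare outcome such as p = 0.01, the odds are almost identical to the probability, whereas at p = 0.50 the odds are 1.0, twice the probability.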
Prevalence, incidence, disease duration
Disease prevalence depends on
Incidence rates can be calculated for each transition in health status
Relationship among prevalence, incidence rate, and disease duration at steady state
Mean duration of disease
What does steady state mean in the context of estimating P from I and D?
Example: varying prevalence, incidence rates, and duration of disease

Cross-sectional bias
• Incidence-prevalence bias
  – A type of selection bias
  – If exposed prevalent cases have a different disease duration than unexposed prevalent cases, the prevalence ratio will be biased
  – E.g., cases with severe emphysema are more likely to smoke and have higher fatality than cases with less severe emphysema, so the prevalence of emphysema in smokers will be underestimated compared with incidence
  – Solution: use incident cases
  – Duration ratio bias
  – Point prevalence complement ratio bias
• Temporal bias
  – Information bias

Incidence-prevalence bias
• Relationship between the prevalence ratio (PR) and the incidence rate ratio (IRR)

$$\mathrm{Prev} = \mathrm{Incidence} \times \mathrm{Duration} \times (1 - \mathrm{Prev})$$

  – Dividing this expression for the exposed group by the same expression for the unexposed group gives

$$\mathrm{PR} = \mathrm{IRR} \times \frac{\mathrm{Duration}_{\text{exposed}}}{\mathrm{Duration}_{\text{unexposed}}} \times \frac{1 - \mathrm{Prev}_{\text{exposed}}}{1 - \mathrm{Prev}_{\text{unexposed}}}$$

  – Second factor (duration ratio): source of duration ratio bias
  – Third factor (point prevalence complement ratio): source of point prevalence complement ratio bias

Duration ratio bias
• A type of selection bias
• For a rare disease, no bias arises if disease duration is the same regardless of exposure status
• Arises when disease duration differs by exposure status
• For chronic diseases, disease duration is tied to survival time, so the bias that arises in this case is survival bias

Point prevalence complement ratio bias
• Even if disease duration is the same, the PR tends to underestimate the IRR
• Prevalence in the exposed group: 0.04; prevalence in the unexposed group: 0.01
  – PR: 4
  – Point prevalence complement ratio = 0.96/0.99 = 0.97
• Prevalence in the exposed group: 0.4; prevalence in the unexposed group: 0.1
  – PR: 4
  – Point prevalence complement ratio = 0.6/0.9 = 0.67
• The larger the PR and the prevalence → the larger the bias

Selection bias: Berkson’s bias
• Admission-rate bias
• Cases and/or controls are selected from hospitals
• Results from differential rates of hospital admission for cases and controls
• If hospital-based cases and controls have different exposures than population-based ones, the OR will be biased
• E.g., if hospital controls are less likely to be exposed, the OR will be overestimated
• E.g., case-control study of pancreatic cancer and coffee drinking: controls were selected from GI patients. However, GI patients are less likely to drink coffee than the general population, so the OR was artificially increased
• Solution: use population-based controls, or controls with a disease not related to the exposure

Temporal bias
• The temporal ordering is ambiguous
  – A decisive drawback when testing risk factors for disease
  – Example: a study of nutritional deficiency and depression
  – Exposures that do not change over time are not subject to this limitation
    • Genetic factors
• Studies in which the temporal ordering is reversed are not recommended
  – Example: hypothesis) dietary factors affect age at menarche; subjects) middle-aged women surveyed on age at menarche and recent dietary habits
• Minimize the drawback by analyzing only the incident cases among all prevalent cases (another bias?)
• Minimize the drawback by using historical information

• Length-time bias: screening is most likely to pick up less aggressive cancers, because they have a longer interval of being visible on scans while remaining asymptomatic
• Lead-time bias: you find out something earlier but don’t actually change the outcome, and therefore the apparent survival after diagnosis is longer without better survival

Simpson’s paradox
[Figure: aggregated vs. disaggregated data]

Simpson’s paradox
• Aggregated and disaggregated data tell two different stories

Treatment type              No. of patients   Success   Failure   Success rate (%)
Total (n=700)
  Open surgery              350               273       77        78
  Percutaneous surgery      350               289       61        83
Stone size < 2 cm (n=357)
  Open surgery              87                81        6         93
  Percutaneous surgery      270               234       36        87
Stone size ≥ 2 cm (n=343)
  Open surgery              263               192       71        73
  Percutaneous surgery      80                55        25        69

Cross-sectional studies: summary
• Sample survey at a specific point in time or over a short period: a “snapshot”
• Advantages
  – Convenient and cost-effective
  – Can study multiple exposures and diseases
  – Can generate hypotheses
  – Representative of the general population
• Disadvantages
  – Temporal ordering is ambiguous
  – Only survivors are studied, so bias is possible
  – Diseases of short duration are underestimated

Any questions?
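The Simpson’s paradox table in this section (open vs. percutaneous surgery, by stone size) can be re-derived with a few lines of Python. This is a minimal sketch: the success and patient counts are copied from that table, while the English treatment labels, the variable names, and the `success_rate` helper are assumptions made for illustration.

```python
# (successes, patients) per treatment within each stone-size stratum.
data = {
    "stone < 2 cm": {"open surgery": (81, 87), "percutaneous": (234, 270)},
    "stone >= 2 cm": {"open surgery": (192, 263), "percutaneous": (55, 80)},
}


def success_rate(successes: int, patients: int) -> float:
    """Success rate as a percentage."""
    return 100 * successes / patients


# Disaggregated view: compare treatments within each stratum.
for stratum, arms in data.items():
    rates = {t: round(success_rate(*counts)) for t, counts in arms.items()}
    print(stratum, rates)

# Aggregated view: pool the strata, then compare treatments again.
pooled = {}
for arms in data.values():
    for treatment, (s, n) in arms.items():
        prev_s, prev_n = pooled.get(treatment, (0, 0))
        pooled[treatment] = (prev_s + s, prev_n + n)

pooled_rates = {t: success_rate(s, n) for t, (s, n) in pooled.items()}
print("all stones", {t: round(r) for t, r in pooled_rates.items()})
```

Open surgery has the higher success rate within each stone-size stratum, yet percutaneous surgery looks better once the strata are pooled, because most open-surgery patients had the larger, harder-to-treat stones.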