Case Control Studies - 6th Summer Course on RMHS 2015


Case Control Studies

Önder Ergönül, MD, MPH

Koç University, School of Medicine

Summer Course on Research Methodology in Health Sciences

June 16, 2015, Istanbul

An Alerting Situation!

National Case-Control Study of Kaposi’s Sarcoma and Pneumocystis carinii Pneumonia in Homosexual Men

Jaffe HW, et al. Ann Intern Med 1983; 99: 145-151.

Objective

To identify risk factors for the occurrence of Kaposi's sarcoma and Pneumocystis carinii pneumonia in homosexual men, we conducted a case-control study in New York City, San Francisco, Los Angeles, and Atlanta.

Methods

Fifty patients (cases) (39 with Kaposi's sarcoma, 8 with pneumocystis pneumonia, and 3 with both) and 120 matched homosexual male controls (from sexually transmitted disease clinics and private medical practices)

Results

Cases reported a larger number of male sex partners than controls.

Cases were also more likely to have been exposed to feces during sex, have had syphilis and non-B hepatitis, have been treated for enteric parasites, and have used various illicit substances.

Italians (HLA-DR5)

2008 Nobel Awards

Françoise Barré-Sinoussi, Luc Montagnier, Harald zur Hausen

The Cohort Study

• Purpose: to identify or designate a group of people (cohort) and follow them over time

– Exposure is typically measured prior to the onset of disease

Population → Cohort → Exposure Assessment 1 → … → Exposure Assessment N → Disease Assessment

Benefits of Case-Control Studies

• Cohort Studies are inefficient in the event of:

1. Significant time between exposure and measurable disease outcome

Exposure Assessment → many, many, many years → Disease Appearance

2. Rare outcomes

Initial Cohort (N=5,000) → Exposure Assessment → Diseased Group Identified (N=30)

• In these cases, cohort studies are often expensive

Case Control Studies

Case-Control Studies: Basic Outline

1. Define source population: group that yielded the cases (people with the disease outcome)

2. Determine which cases were/were not exposed to the variable(s) of interest

3. Sample control group from the source population

• Controls must be sampled independently of exposure

4. Determine which members of the control group were/were not exposed to the variables of interest

5. Compare rates of exposure in the case and control groups

Steps in Case Control Study

1. Define source population: group that yielded the cases

2. Sample cases

3. Sample control group (from the source population)

4. Determine exposure for all participants

5. Compare rates of exposure in cases vs. controls

– Compare using statistics

– Odds ratios, chi-square, etc.
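The comparison in step 5 can be sketched in a few lines. This is a minimal illustration, not the course's own material: the counts and the function names `odds_ratio` and `chi_square` are hypothetical, and the chi-square shown is the plain Pearson statistic without continuity correction.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio for a 2x2 table.

    a = exposed cases,    b = unexposed cases,
    c = exposed controls, d = unexposed controls.
    """
    return (a * d) / (b * c)


def chi_square(a, b, c, d):
    """Pearson chi-square statistic for a 2x2 table (1 degree of freedom)."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


# Hypothetical counts, for illustration only.
a, b, c, d = 40, 60, 20, 80
print(f"OR = {odds_ratio(a, b, c, d):.2f}")
# A statistic above 3.84 corresponds to p < 0.05 with 1 degree of freedom.
print(f"chi-square = {chi_square(a, b, c, d):.2f}")
```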

Crucial Validity Points

1. Selection of controls

2. Obtaining exposure history

Selection of Controls

• Controls provide (an estimate of) the background rate of exposure

1) Controls must be representative of the population that yielded the cases

2) Control selection must be independent of exposure

Basic Concept Graphically

[Figure: a source population of exposed and unexposed people, with the cases and a control group drawn from it. The control group should contain a representative exposure distribution.]

Selection of Controls

Controls must be representative of the source population

• Example: Is a new blood pressure medication associated with myocardial infarction?

– Cases – patients with MI from the cardiology ward of hospital

– Controls – patients without MI from the ER ward of the same hospital

• Problems?

– Cardiology ward is referral center for whole state but ER serves mostly the local city

– BP medication may be available to local city but not the rest of the rural state

• Solutions:

1. Choose controls from the whole state OR

2. Exclude all cases who live outside the city

Selection of Controls

Control selection must be independent of exposure

• Example: Is a new blood pressure medication associated with myocardial infarction?

• Problem:

– What if the drug slows reaction time, causing automobile accidents that lead to ER visits?

– Control sample is not independent of exposure

– Bias is introduced – control group has higher proportion of individuals exposed to the new drug
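A small arithmetic sketch shows the size of the distortion. Every number here is hypothetical: assume the drug has no effect on MI, 10% of the source population takes it, and drug users make up 20% of ER patients because of drug-related accidents.

```python
def odds_ratio(p_cases, p_controls):
    """Odds ratio from the exposure proportions among cases and controls."""
    return (p_cases / (1 - p_cases)) / (p_controls / (1 - p_controls))


# Hypothetical scenario: the drug has no effect on MI, so cases are exposed
# at the same rate as the source population.
p_exposed_cases = 0.10
p_exposed_source = 0.10  # what a representative control group would show
p_exposed_er = 0.20      # ER controls, inflated by drug-related accidents

or_valid = odds_ratio(p_exposed_cases, p_exposed_source)
or_biased = odds_ratio(p_exposed_cases, p_exposed_er)
print(f"OR with representative controls: {or_valid:.2f}")
print(f"OR with ER controls:             {or_biased:.2f}")
```

Under these assumptions the ER control group makes a drug with no effect look protective, because the background rate of exposure is overestimated.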

Control Selection

In density case-control studies, the control group is sampled to represent the person-time distribution of the total exposed and unexposed cohorts

As such (in brief):

• ‘Risk set approach’: select controls from people in the source population who are at risk of becoming a case at the exact time a case is diagnosed

• P (selection into control group) should be proportional to person-time contribution

• e.g. a person at risk for disease for 3 years should have a 3 times higher probability of being selected as a control than a person at risk for 1 year

Controls should be eligible to become cases both at the time of selection, and throughout their time in the study

• Each case in a C-C study should have been eligible to be a control
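The risk set approach can be sketched as follows. This is a minimal illustration under stated assumptions: the function `risk_set_sample`, the record layout (`id`, `entry`, `exit`, `case_time`) and the five-person cohort are all hypothetical, and real studies usually also match on further variables.

```python
import random


def risk_set_sample(people, n_controls=1, seed=0):
    """For each case, draw controls from those at risk at the case's diagnosis time.

    people: list of dicts with keys 'id', 'entry', 'exit' and 'case_time'
            ('case_time' is None if the person never becomes a case).
    Returns a dict mapping each case id to a list of sampled control ids.
    """
    rng = random.Random(seed)
    sampled = {}
    for case in people:
        t = case["case_time"]
        if t is None:
            continue
        # At risk at time t: under follow-up and not yet diagnosed.
        # A person who becomes a case later is still eligible as a control now.
        risk_set = [
            p["id"]
            for p in people
            if p["id"] != case["id"]
            and p["entry"] <= t < p["exit"]
            and (p["case_time"] is None or p["case_time"] > t)
        ]
        sampled[case["id"]] = rng.sample(risk_set, min(n_controls, len(risk_set)))
    return sampled


# Hypothetical cohort, times in years.
people = [
    {"id": "A", "entry": 0, "exit": 2, "case_time": 2},
    {"id": "B", "entry": 0, "exit": 4, "case_time": 4},
    {"id": "C", "entry": 0, "exit": 5, "case_time": None},
    {"id": "D", "entry": 0, "exit": 1, "case_time": None},  # left before any case
    {"id": "E", "entry": 3, "exit": 5, "case_time": None},  # entered late
]
controls = risk_set_sample(people, n_controls=1, seed=0)
print(controls)
```

Person C is at risk for the whole period and so appears in both risk sets, while D appears in none. This is how sampling from risk sets makes the chance of selection track person-time at risk.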

Sources of Controls

Population

• Cases come from clearly defined group

• Controls sampled from this population

• Most feasible in population registry cases

Neighborhood

• If group cannot be fully identified (enumerated)

• Involves sampling homes

• Often employs ‘matched sample’ design

Sources of Controls

Hospitals or Clinics

• Cases are people who received treatment at clinic X when they got the disease

• Controls are people who would be treated at clinic X if they got the disease

• May be difficult to identify this group!

• e.g. How do you define the population who would seek treatment at a regional medical center?

How might you define and seek such controls?

Telephone Recruiting

• Limit case eligibility to people who have telephones in a specific area

• Randomly calling telephones in that area can approximate a random sample of the source population

How many control groups?

– Use two separate control groups (e.g., hospital and community controls); if results are the same, this supports validity

OR

– Select 1 large, well-reasoned control group and invest $$ and effort in this group.

How many controls per case?

• If costs are similar for cases and controls:

– Most efficient: roughly equal numbers

• If number of cases is small and cannot be increased:

– Increasing the number of controls up to a ratio of 4 controls: 1 case improves power of the study

– Beyond 4:1 little power increase

– Increasing number of controls narrows the confidence interval around the odds ratio, but DOES NOT address validity
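The diminishing return from extra controls can be illustrated with the large-sample standard error of the log odds ratio, sqrt(1/a + 1/b + 1/c + 1/d). The study below is hypothetical (100 cases, 40 of them exposed, and controls assumed to be 30% exposed); only the formula is standard.

```python
import math


def se_log_or(a, b, c, d):
    """Large-sample standard error of the log odds ratio."""
    return math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)


# Hypothetical study: the number of cases is fixed, only the controls grow.
exposed_cases, unexposed_cases = 40, 60
for ratio in (1, 2, 4, 8, 16):
    n_controls = 100 * ratio
    exposed_controls = 0.3 * n_controls
    unexposed_controls = 0.7 * n_controls
    se = se_log_or(exposed_cases, unexposed_cases, exposed_controls, unexposed_controls)
    print(f"{ratio:>2} controls per case: SE(log OR) = {se:.3f}")
```

The two terms for the cases do not shrink as controls are added, so they set a floor on the standard error. In this sketch most of the gain is reached by about 4 controls per case.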

Sampling Controls: Density Case-Control Design

• There are many methods of sampling controls; density-based sampling is one of the most common

• Comes from the term incidence density (incidence rate)

• For a given period of time:

For the exposed: $I_1 = a / PT_1$

For the unexposed: $I_0 = b / PT_0$

Where $I_1$, $a$, and $PT_1$ are the incidence rate, number of cases, and total person time accumulated in the exposed group (symbols are equivalent for the unexposed group)

Sampling Controls: Density Case-Control Design

• Goal of density-based sampling: estimate the contribution of the exposed and unexposed cohorts in the source population to total person time

– The odds ratio should estimate the incidence rate ratio

$c / d \approx PT_1 / PT_0$

Where: c = number of exposed people in control sample, d = number of unexposed people in control sample

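Sampling Controls: Density Case-Control Design

• Accordingly

$\frac{I_1}{I_0} = \frac{a / PT_1}{b / PT_0} = \frac{a}{b} \cdot \frac{PT_0}{PT_1} \approx \frac{a}{b} \cdot \frac{d}{c}$

Fundamental assumption: $\frac{d}{c} \approx \frac{PT_0}{PT_1}$

As such, accurately identifying the source population and taking a random sample is essential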

As such, accurately identifying the source population and taking a random sample is essential

Exercise 1: Analyzing Density-Based Case-Control Data

Exposure: Severe sun burn

                                       Yes (%)         No (%)          Total
A. Skin cancer cases                   87 (75.0)       29 (25.0)       116
B. Cohort findings (person-years)      53,697 (59.6)   36,348 (40.4)   90,045
C. Control group (people)              591 (59.1)      409 (40.9)      1,000
D. Rate (cases/10,000 person years)    16.2            8.0             12.9

$OR = \frac{87 \times 409}{29 \times 591} = 2.08$

$|OR - IRR| = 0.05$
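The figures in Exercise 1 can be checked with a short calculation. The sketch below uses only the counts given in the exercise; the variable names are my own.

```python
# Data from Exercise 1 (exposure: severe sun burn).
cases_exp, cases_unexp = 87, 29      # A. skin cancer cases
pt_exp, pt_unexp = 53_697, 36_348    # B. person-years in the full cohort
ctrl_exp, ctrl_unexp = 591, 409      # C. control sample (people)

# Incidence rates per 10,000 person-years, and their ratio (IRR).
rate_exp = cases_exp / pt_exp * 10_000
rate_unexp = cases_unexp / pt_unexp * 10_000
irr = rate_exp / rate_unexp

# Odds ratio from the case-control data alone.
odds_ratio = (cases_exp * ctrl_unexp) / (cases_unexp * ctrl_exp)

print(f"Rate, exposed:   {rate_exp:.1f} per 10,000 person-years")
print(f"Rate, unexposed: {rate_unexp:.1f} per 10,000 person-years")
print(f"IRR = {irr:.2f}, OR = {odds_ratio:.2f}, |OR - IRR| = {abs(odds_ratio - irr):.2f}")

# The OR tracks the IRR because the controls mirror the person-time split.
print(f"c/d = {ctrl_exp / ctrl_unexp:.2f} vs PT1/PT0 = {pt_exp / pt_unexp:.2f}")
```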

Increasing sample size reduces standard error. Standard error estimates the difference between a sample statistic (e.g. proportion of population exposed) and a true value (actual proportion exposed). The quality of a C-C estimate depends on the quality of the control sample!

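Odds ratio significance test, large sample approximation method

$L = \log\left(\frac{a \times d}{b \times c}\right) = \log\left(\frac{87 \times 409}{29 \times 591}\right) = 0.73$

$SE = \sqrt{\frac{1}{a} + \frac{1}{b} + \frac{1}{c} + \frac{1}{d}} = \sqrt{\frac{1}{87} + \frac{1}{29} + \frac{1}{591} + \frac{1}{409}} = 0.22$

95% CI for OR $= [\exp(L - 1.96(SE)), \exp(L + 1.96(SE))] = [\exp(0.73 - 1.96(0.22)), \exp(0.73 + 1.96(0.22))] = (1.35, 3.20)$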
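The same interval can be reproduced in code. This is a sketch using the exercise counts. It carries unrounded intermediate values, so the limits may differ from the slide's in the second decimal place.

```python
import math

# a = exposed cases, b = unexposed cases, c = exposed controls, d = unexposed controls
a, b, c, d = 87, 29, 591, 409

log_or = math.log((a * d) / (b * c))
se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
lower = math.exp(log_or - 1.96 * se)
upper = math.exp(log_or + 1.96 * se)

print(f"L = {log_or:.2f}, SE = {se:.2f}")
print(f"95% CI for OR: ({lower:.2f}, {upper:.2f})")
# The data are inconsistent with OR = 1 at the 5% level if the interval excludes 1.
print("Interval excludes 1:", lower > 1 or upper < 1)
```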

How do we interpret this? Can we say that these data suggest severe sun burn is a risk factor for skin cancer?


Measurement of Exposure

• Recall bias: cases remember exposures more often than controls

• Example: study of oral contraceptive use and breast cancer

– Women with breast cancer have greater reason to search their memories for exposures that might be associated with their disease

– The cases are more likely to remember oral contraceptive use than controls

– Bias – over-estimates the risk
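The direction of this bias can be shown with simple arithmetic. The numbers are hypothetical: assume 30% of both cases and controls truly used oral contraceptives (no real association), that cases recall 90% of true use and controls recall 70%, and that nobody reports use that did not happen.

```python
def odds_ratio(p_cases, p_controls):
    """Odds ratio from the exposure proportions among cases and controls."""
    return (p_cases / (1 - p_cases)) / (p_controls / (1 - p_controls))


# Hypothetical values, for illustration only.
true_exposure = 0.30      # same in cases and controls: no real association
recall_cases = 0.90       # share of truly exposed cases who report it
recall_controls = 0.70    # share of truly exposed controls who report it

reported_cases = true_exposure * recall_cases
reported_controls = true_exposure * recall_controls

or_true = odds_ratio(true_exposure, true_exposure)
or_observed = odds_ratio(reported_cases, reported_controls)
print(f"True OR:     {or_true:.2f}")
print(f"Observed OR: {or_observed:.2f}")
```

Under these assumptions, differential recall alone produces an odds ratio above 1 where no association exists.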

Measurement of exposure

• Interviewer bias: Data gatherers should be blind to case status

– If not possible, keep interviewers blind to main hypothesis

• Problem: If interviewers are not blind, they may (inadvertently) elicit information differently.

– Spend more time data gathering

– Ask more follow-up questions

Measurement of exposure; solutions

• Memory aids (photographs, diaries, timelines etc.)

• Example: study of oral contraceptive use

– photographs of pills with names

– timeline to recall major life events.

Nested Case-Control Studies

• The term typically refers to a CC study nested within a clearly defined cohort (e.g. the Nurses’ Health Study or the ELSPAC groups)

• Rationale:

– You may want more data (specific to your outcome) than is available

– Too expensive to get new information from everyone in the cohort

– Instead, sample controls randomly from the cohort and only collect relevant data from them

• A good (and common) opportunity to employ matched CC designs

The Case-Crossover Design

Crossover studies: an experiment in which two interventions are compared; each subject acts as their own control

– Each subject receives both interventions

• The effects of each intervention are compared for every subject

• There needs to be enough time between interventions to see effects

Intervention 1

Measure effect 1

Intervention 2

Measure effect 2

The Case-Crossover Design

• Case-Crossover Design: all subjects are cases

• Controls are not different people; they are a sample of the cases’ time before disease

• Useful when considering short-term ‘trigger effects’

– e.g. coffee and asthma attack

– Hypothesis: coffee elevates risk of asthma attack for 1 hr

– ‘Case time’: one-hour periods after drinking coffee

– ‘Control time’: a sample of one-hour periods not after drinking coffee

– Calculate IRR
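The IRR in the coffee example is the attack rate during case time divided by the attack rate during control time. The sketch below uses hypothetical pooled counts and a hypothetical helper name. A full case-crossover analysis would normally keep each subject's case and control time together in a stratified (conditional) analysis, which this crude pooled ratio does not do.

```python
def incidence_rate_ratio(events_case, hours_case, events_control, hours_control):
    """Ratio of the event rate in case time to the event rate in control time."""
    return (events_case / hours_case) / (events_control / hours_control)


# Hypothetical pooled data from subjects who all had asthma attacks.
attacks_after_coffee = 12   # attacks within 1 hour after drinking coffee
hours_after_coffee = 200    # total observed 'case time'
attacks_other_time = 30     # attacks in sampled control periods
hours_other_time = 2000     # total observed 'control time'

irr = incidence_rate_ratio(
    attacks_after_coffee, hours_after_coffee, attacks_other_time, hours_other_time
)
print(f"IRR = {irr:.1f}")
```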

Matched Case-Control Studies

• Matching is employed because:

– It addresses confounding in the design phase of a study

– It improves the efficiency of the analysis

• Example of confounding

[Diagram: Cigarette smoking → Heart Attack, with Age as a confounder related to both]

• What happens if cases are older than controls? Bias

Matched Case-Control Studies

• Matched analyses are more difficult

• If matching variables are ‘important’ (related to your variable of interest), they should be included in your analysis

– If matching is used simply to structure sampling (e.g. the neighborhood recruitment example), and the matching variables are not important, they do not have to be included
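One reason matched analyses need care: for 1:1 matched pairs, the standard matched estimate of the odds ratio uses only the discordant pairs (pairs in which exactly one member was exposed). The sketch below uses hypothetical pair counts and function names to contrast that estimate with the crude odds ratio that ignores the matching.

```python
def matched_pair_or(pairs):
    """Matched-pair odds ratio: case-only-exposed pairs / control-only-exposed pairs.

    pairs: list of (case_exposed, control_exposed) booleans, one tuple per pair.
    Undefined (division by zero) if no pair has only the control exposed.
    """
    case_only = sum(1 for case_exp, ctrl_exp in pairs if case_exp and not ctrl_exp)
    control_only = sum(1 for case_exp, ctrl_exp in pairs if ctrl_exp and not case_exp)
    return case_only / control_only


def crude_or(pairs):
    """Odds ratio that ignores the pairing and pools everyone into one 2x2 table."""
    n = len(pairs)
    a = sum(1 for case_exp, _ in pairs if case_exp)   # exposed cases
    c = sum(1 for _, ctrl_exp in pairs if ctrl_exp)   # exposed controls
    return (a * (n - c)) / ((n - a) * c)


# Hypothetical data: 100 matched pairs.
pairs = (
    [(True, True)] * 20      # both exposed
    + [(True, False)] * 30   # only the case exposed
    + [(False, True)] * 10   # only the control exposed
    + [(False, False)] * 40  # neither exposed
)
print(f"Matched-pair OR: {matched_pair_or(pairs):.2f}")
print(f"Crude OR:        {crude_or(pairs):.2f}")
```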

Multivariable Models and Case-Control Studies

• Odds ratios can stand alone only in the absence of confounding and modifying variables: very rare!

• The multivariate extension of a non-matched case control study is logistic regression

$\log\left(\frac{P(case)}{1 - P(case)}\right) = \beta_0 + \beta_1(sunburn) + \beta_2(latitude) + \beta_3(skintone) + \beta_4(skintone * sunburn)$

– sunburn term: Odds (log odds) associated with sunburn

– latitude term: Controlling for potential confounder

– skintone * sunburn term: Allowing effect of sunburn to vary between people with different skin tones
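To make the interaction term concrete, the sketch below evaluates the model with made-up coefficients. Nothing here is fitted to data: the values in `BETA`, the 0/1 coding of sunburn and skin tone, and the function names are all assumptions for illustration.

```python
import math

# Hypothetical coefficients, for illustration only (not fitted to any data).
BETA = {
    "intercept": -3.0,
    "sunburn": 0.7,
    "latitude": -0.02,
    "skintone": -0.5,
    "interaction": 0.4,  # skintone * sunburn
}


def log_odds(sunburn, latitude, skintone):
    """Log odds of being a case under the model, with sunburn and skintone coded 0/1."""
    return (
        BETA["intercept"]
        + BETA["sunburn"] * sunburn
        + BETA["latitude"] * latitude
        + BETA["skintone"] * skintone
        + BETA["interaction"] * skintone * sunburn
    )


def sunburn_odds_ratio(skintone):
    """Odds ratio for sunburn (1 vs 0) at a given skin tone, latitude held fixed."""
    return math.exp(BETA["sunburn"] + BETA["interaction"] * skintone)


for tone in (0, 1):
    print(f"skintone = {tone}: OR for sunburn = {sunburn_odds_ratio(tone):.2f}")
```

Because of the interaction term, the odds ratio for sunburn is not a single number: it equals exp(β1) when skintone = 0 and exp(β1 + β4) when skintone = 1.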

Design Comparison: Cohort and Case-Control Studies

Cohort Study | Case-Control Study
Complete source population | Sampling from source population
Very expensive | Less expensive
Convenient for studying many diseases | Convenient for studying many exposures
Can calculate incidence rates/risks and their differences/ratios | Can usually only calculate the ratio of rates or risks
Can be retrospective or prospective | Can be retrospective or prospective

Advantages of Cohort Studies

Cohort | Case-control
Complete exposure information; less bias for exposure | Recall and selection bias
Can examine temporal relation | Not always
Study multiple outcomes | Only one outcome
Incidence rates and RR | OR
Results easy to understand, straightforward | More difficult interpretation

Disadvantages of Cohort Studies

Cohort | Case-control
Inefficient for rare disease, unless the attributable risk percent is high | Optimal for rare disease
If prospective, extremely expensive and time consuming | Cheaper and quick
If retrospective, requires the availability of the records |
Validity of the results can be seriously affected by losses to follow-up |
