Study design


Cohort, case-control & cross-sectional studies

Kostas Danis

EPIET Introductory course

Menorca, Spain

16/9-12/10/2012

Source: Alain Moren, EPIET Introductory courses

Epidemiological studies

Two types

Observation

Experiment

Experiment

Exposure assigned

Exposed

Not exposed

Unethical to perform experiments on people if exposure is harmful

Disease occurrence

If exposure not harmful

Treatment

Preventive measure (vaccination)

Randomised Controlled Trial (RCT)

Blinded

Doses

Time period

Risk - effect

No bias

If RCT not possible

Left with observation of experiments designed by Nature

Cohort studies

Cross-sectional studies

Case control studies

Cohort studies

marching towards outcomes

What is a cohort?

• One of 10 divisions of a Roman legion

• Group of individuals
  - sharing same experience
  - followed up for specified period of time

• Examples
  - EPIET cohort 2012
  - birth cohort
  - cohort of guests at barbecue
  - occupational cohort of chemical plant workers
  - influenza vaccinated in 2011-12

[Figure: cohort timeline showing the follow-up period and the end of follow-up]

Calculate measure of frequency

• Cumulative incidence
  - incidence proportion
  - attack rate (outbreak)

• Incidence rate

Cohort studies

• Purpose
  - Study if an exposure is associated with outcome(s)
  - Estimate risk of outcome in exposed and unexposed cohorts
  - Compare risk of outcome in two cohorts

• Cohort membership
  - Being at risk of outcome(s) studied
  - Being alive and free of outcome at start of follow-up

Cohort studies

[Figure: exposed and unexposed cohorts followed over time; incidence among exposed is compared with incidence among unexposed]

Presentation of cohort data: 2x2 table

                   ill   not ill   Total
ate ham             a       b       a+b
did not eat ham     c       d       c+d

Risk in exposed = a / (a+b)

Risk in unexposed = c / (c+d)

Incidence rate

Rate = Number of NEW cases of disease / Total person-time of observation

Denominator:

- is a measure of time

- the sum of each individual's time at risk and free from disease

Person-time

[Figure: follow-up lines for subjects A to E on a time axis from 1990 to 1999; -- marks time followed, x marks disease onset]

Subject   Time at risk (years)
A            6.0
B            6.0
C           10.0
D            8.5
E            5.0
Total       35.5

Incidence rate (IR)

(Incidence density)

[Figure: follow-up lines for subjects A to E on a time axis from 1990 to 2000; -- marks time followed, x marks disease onset]

Subject   Time at risk (years)
A            6.0
B            6.0
C           10.0
D            8.5
E            5.0
Total       35.5

Two disease onsets (x) over 35.5 person-years at risk:

IR = 2 / 35.5

= 0.056 cases / person-year

= 5.6 cases / 100 person-years

= 56 cases / 1000 person-years
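The person-time arithmetic on this slide can be checked with a minimal Python sketch. The assignment of years to subjects A to E is read from the diagram and is an assumption, as is the count of two disease onsets (the two x marks); the variable names are illustrative only.

```python
# Years at risk per subject, as read from the slide's diagram (assumed mapping).
years_at_risk = {"A": 6.0, "B": 6.0, "C": 10.0, "D": 8.5, "E": 5.0}
new_cases = 2  # assumed: the two disease onsets marked "x" on the slide

# Person-time is the sum of each individual's time at risk.
person_years = sum(years_at_risk.values())
incidence_rate = new_cases / person_years

print(f"Total person-years at risk: {person_years}")
print(f"IR = {incidence_rate:.3f} cases per person-year")
print(f"   = {incidence_rate * 100:.1f} cases per 100 person-years")
print(f"   = {incidence_rate * 1000:.0f} cases per 1000 person-years")
```

Expressing the same rate per 100 or per 1000 person-years only rescales the number; the underlying rate is unchanged.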

Presentation of cohort data: Person-years at risk

Tobacco smoking and lung cancer, England & Wales, 1951

               Person-years   Cases
Smoke            102,600       133
Do not smoke      42,800         3

Source: Doll & Hill

Presentation of data: Various exposure levels

Daily number of      Person-years   Lung cancer
cigarettes smoked    at risk        cases
> 25                   25,100          57
15 - 24                38,900          54
1 - 14                 38,600          22
none                   42,800           3

Prospective cohort study

[Figure: two timelines. Either the study starts, then exposure, then disease occurrence; or exposure, then the study starts, then disease occurrence]

Retrospective cohort study

[Figure: timeline. Exposure, then disease occurrence, then the study starts]

Recipe: Cohort study

• Identify group of
  - exposed subjects
  - unexposed subjects

• Measure incidence of disease

• Compare incidence between exposed and unexposed group

Cohort studies

[Figure: incidence among exposed compared with incidence among unexposed]

Effect measures in cohort studies

• Absolute measures
  - Risk difference (RD) = I_e - I_ue

• Relative measures
  - Relative risk (RR): rate ratio or risk ratio = I_e / I_ue

I_e = incidence in exposed

I_ue = incidence in unexposed

Cohort study

               Total   Cases   Non cases   Risk %
Exposed         100      50       50        50 %
Not exposed     100      10       90        10 %

Risk ratio 50% / 10% = 5

                   ill   not ill   Total   Incidence
ate ham             49      49       98      50 %
did not eat ham      4       6       10      40 %

Risk difference 50% - 40% = 10%

Relative risk 50% / 40% = 1.25
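The ham example can be reproduced with a minimal Python sketch. The function name `risk_measures` and its argument names are illustrative assumptions, not part of the course material; a, b, c, d follow the 2x2 layout (a = exposed ill, b = exposed not ill, c = unexposed ill, d = unexposed not ill).

```python
def risk_measures(a, b, c, d):
    """Return (risk_exposed, risk_unexposed, risk_difference, risk_ratio)
    for a cohort 2x2 table with cells a, b (exposed) and c, d (unexposed)."""
    risk_exposed = a / (a + b)
    risk_unexposed = c / (c + d)
    risk_difference = risk_exposed - risk_unexposed
    risk_ratio = risk_exposed / risk_unexposed
    return risk_exposed, risk_unexposed, risk_difference, risk_ratio


# Ham example from the slide: 49 of 98 ham eaters ill, 4 of 10 non-eaters ill.
r_exp, r_unexp, rd, rr = risk_measures(49, 49, 4, 6)
print(f"Risk exposed {r_exp:.0%}, risk unexposed {r_unexp:.0%}")
print(f"Risk difference {rd:.0%}, risk ratio {rr:.2f}")
```

The risk difference is on the absolute scale (percentage points), while the risk ratio is a relative measure, which is why the two can give different impressions of the same data.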

Interpretation of Risk Ratios

RR>1   Risk factor
RR=1   No association
RR<1   Protective factor

Does HIV infection increase risk of developing TB among drug users?

Exposure   Population      Cases   Incidence   Relative
           (f/u 2 years)           (%)         Risk
HIV +          215           8       3.7         11
HIV -          298           1       0.3

Vaccine efficacy (VE)

Status         Pop.       Cases   Cases per 1,000   RR
Vaccinated     301,545     150         0.49         0.28
Unvaccinated   298,655     515         1.72         Ref.
Total          600,200     665         1.11

VE = 1 - RR = 1 - 0.28 = 72%
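The vaccine efficacy calculation can be sketched in Python as follows. The function name `vaccine_efficacy` is an illustrative assumption. Note that the sketch works from the unrounded counts, so the RR comes out near 0.29 rather than the 0.28 obtained from the rounded rates on the slide, and VE is therefore about 71% rather than 72%.

```python
def vaccine_efficacy(cases_vacc, pop_vacc, cases_unvacc, pop_unvacc):
    """Return (risk ratio, vaccine efficacy) with the unvaccinated as reference."""
    risk_vacc = cases_vacc / pop_vacc
    risk_unvacc = cases_unvacc / pop_unvacc
    rr = risk_vacc / risk_unvacc
    return rr, 1 - rr  # VE = 1 - RR


# Counts taken from the slide's table.
rr, ve = vaccine_efficacy(150, 301_545, 515, 298_655)
print(f"RR = {rr:.3f}, VE = {ve:.1%}")
```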

Various exposure levels

Exposure level   Population at risk   Cases   Incidence
High                   N_1             a_1      I_1
Medium                 N_2             a_2      I_2
Low                    N_3             a_3      I_3
Unexposed              N_ne            c        I_ue

Various exposure levels

Exposure level   Population at risk   Cases   Incidence   RR
High                   N_1             a_1      I_1       RR_1
Medium                 N_2             a_2      I_2       RR_2
Low                    N_3             a_3      I_3       RR_3
Unexposed              N_ne            c        I_ue      Reference

Cohort study: Tobacco smoking and lung cancer, England & Wales, 1951

Cigarettes   Person-years   Cases   Rate per    Rate ratio
smoked/d     at risk                1000 p-y
> 25           25,100         57      2.27        32.4
15 - 24        38,900         54      1.39        19.8
1 - 14         38,600         22      0.57         8.1
none           42,800          3      0.07        Ref.

Source: Doll & Hill
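The rates and rate ratios in the Doll & Hill table can be recomputed with a short Python sketch. The data structure and the helper name `rate_per_1000` are illustrative assumptions; the person-years and case counts are those shown on the slide, with non-smokers as the reference category.

```python
# (person-years at risk, lung cancer cases) by daily number of cigarettes smoked
strata = {
    "> 25": (25_100, 57),
    "15 - 24": (38_900, 54),
    "1 - 14": (38_600, 22),
    "none": (42_800, 3),  # reference category
}


def rate_per_1000(person_years, cases):
    """Incidence rate expressed per 1000 person-years."""
    return cases / person_years * 1000


ref_rate = rate_per_1000(*strata["none"])
rate_ratios = {}
for level, (person_years, cases) in strata.items():
    rate = rate_per_1000(person_years, cases)
    rate_ratios[level] = rate / ref_rate
    print(f"{level:>8}: {rate:.2f} per 1000 p-y, rate ratio {rate_ratios[level]:.1f}")
```

A rate ratio that rises steadily with the exposure level, as here, is the dose-response pattern the slide is illustrating.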

Disadvantages of cohort studies

Large sample size

Latency period

Cost

Time-consuming

Loss to follow-up

Exposure can change

Multiple exposure = difficult

Ethical considerations

Strengths of cohort studies

• Can directly measure
  - incidence in exposed and unexposed groups
  - true relative risk

• Well suited for rare exposure

• Temporal relationship exposure-disease is clear

• Less subject to selection biases
  - outcome not known (prospective)

Cohort studies

• Can examine multiple effects for a single exposure

Population          Outcome 1   Outcome 2   Outcome 3
exposed (N_e)         I_e1        I_e2        I_e3
unexposed (N_ne)      I_ue1       I_ue2       I_ue3
                      RR_1        RR_2        RR_3

A cohort study allows the calculation of indicators which have a clear, precise meaning.

The results are immediately understandable.

Cross-sectional (prevalence) studies

Cross-sectional studies

• Observation of a cross-section of a population at a single point in time

• Recruitment of study participants
  - Population
  - Population sample

• Observation for the presence of:
  - One or more outcomes
  - One or more exposures

Sampling

[Figure: a sample is drawn from the population, which is itself part of the target population]

Uses of cross-sectional surveys in public health

• Estimate prevalence of diseases or their risk factors

• Estimate burden

• Measure health status in a defined population

• Plan health care services delivery

• Set priorities for disease control

• Generate hypotheses

• Examine evolving trends
  - Before / after surveys
  - Iterative cross-sectional surveys

Potential objectives of a cross-sectional study

• Descriptive
  - Estimate prevalence

• Analytic
  - Compare the prevalence of a disease in various subgroups, exposed and unexposed
  - Compare the prevalence of an exposure in various subgroups, affected and unaffected

Presentation of the data of an analytical cross-sectional study in a 2 x 2 table

              Ill   Non ill   Total
Exposed        a       b       a+b
Non exposed    c       d       c+d

Simultaneous measurement of outcomes and exposures

Cross-sectional study

               Total   Cases   Non cases   Prevalence %
Exposed        1,000    500       500         50 %
Not exposed    1,000    100       900         10 %

Prevalence ratio (PR) 50% / 10% = 5
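The prevalence ratio is computed in the same way as a risk ratio, but from prevalent rather than incident cases. A minimal Python sketch, using the counts from the table on this slide; the function name `prevalence_ratio` is an illustrative assumption.

```python
def prevalence_ratio(cases_exposed, total_exposed, cases_unexposed, total_unexposed):
    """Return (prevalence in exposed, prevalence in unexposed, prevalence ratio)."""
    prev_exposed = cases_exposed / total_exposed
    prev_unexposed = cases_unexposed / total_unexposed
    return prev_exposed, prev_unexposed, prev_exposed / prev_unexposed


# 500 cases among 1,000 exposed, 100 cases among 1,000 not exposed.
p_exp, p_unexp, pr = prevalence_ratio(500, 1000, 100, 1000)
print(f"Prevalence exposed {p_exp:.0%}, not exposed {p_unexp:.0%}, PR {pr:.1f}")
```

Because prevalence depends on both the occurrence of new cases and the duration of disease, the PR should not be read as a risk ratio even though the arithmetic is identical.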

Measuring association in analytical cross-sectional surveys

• Prevalence among exposed / prevalence among unexposed

• Prevalence ratio

• Formula equivalent to risk ratio

• Concept different
  - No incidence
  - Only prevalence: depends on both occurrence of new cases & duration of disease

Prevalence of West Nile virus (WNV) infection by place of residence, Central Macedonia, Greece, 2010

         Infected   Total   Prevalence   Prevalence ratio
Rural       38       491       7.7%           5.9
Urban        3       232       1.3%           Ref

Prevalence of HIV infection by socioeconomic status, African country X, 1999

              Infected   Total   Prevalence   Prevalence ratio
High class       15       235       6.4%           2.6
Low class        11       450       2.4%           Ref

Prevalence of hepatitis C (HCV) infection by quantity of therapeutic injections, Hazabad, Pakistan, 1993

No. of injections   Infected   Total   Prevalence   Prevalence ratio
>10                     9        41       22%            22
1-10                    4        52        8%             8
0                       1        82        1%            Ref

Advantages of cross-sectional surveys

• Fairly quick

• Easy to perform

• Less expensive

• Adapted to chronic diseases

Limitations of cross-sectional surveys

• Limited capacity to document causality (exposure and outcome are measured at the same time, so the time sequence of events is difficult to establish)

• Not useful to study disease etiology

• Not suitable for the study of rare / short diseases

• Not adapted to severe / acute diseases

• Not adapted to incidence measurement

Limitations of causal inference in analytical cross sectional studies

• Prevalent cases

• Exposure and outcome examined at the same time

Principle of case control studies

Our objective is to compare the incidence rate in the exposed population to the rate that would have been observed in the same population, at the same time, if it had not been exposed

[Figure: source population divided into exposed and unexposed; the cases arise from the source population, and the controls are drawn as a sample of it]

Controls:

- Sample of the denominator

- Representative with regard to exposure

Intuitively, if the frequency of exposure is higher among cases than among controls, then the incidence rate will probably be higher among the exposed than among the non-exposed

Case control study

[Figure: starting from disease status (cases and controls), exposure is ascertained looking backwards in time]

Retrospective nature

Distribution of cases and controls according to exposure in a case control study

            Exposed   Not exposed   Total   % exposed
Cases          a           c        a + c    a/(a+c)
Controls       b           d        b + d    b/(b+d)

Distribution of myocardial infarction by oral contraceptive use in cases and controls

                         Oral contraceptives
                         Yes    No    Total   % exposed
Myocardial infarction    693    307   1000      69.3%
Controls                 320    680   1000      32%

Distribution of myocardial infarction by amount of physical activity in cases and controls

                         Physical activity
                         >= 2500 Kcal   < 2500 Kcal   Total   % exposed
Myocardial infarction        190            176         366     51.9%
Controls                     230            136         366     62.8%

Volvo factory, Sweden, 3000 employees, 200 cases of gastroenteritis

Cohort study

            Water consumption
            YES    NO    Total
Cases       150    50     200
Controls     ?      ?     200

Two types of case control studies

Exploratory

New disease

New risk factors

Several exposures

"Fishing expedition"

Analytical

Define a single hypothesis

Dose response

Cohort studies

Rate/risk

Rate/risk difference

Rate Ratio/Risk ratio (strength of association)

Case control studies

No calculation of rates/risks

Proportion of exposure

Any way of estimating measures of association?

Odds

Odds = Probability that an event will happen / Probability that an event will not happen

Odds of exposure = Probability that cases/controls will be exposed / Probability that cases/controls will not be exposed

Case control study

                    Cases      Controls
Exposed               a           b
Not exposed           c           d
Total                a+c         b+d
% exposed          a/(a+c)     b/(b+d)
% unexposed        c/(a+c)     d/(b+d)
Odds of exposure     a/c         b/d

Odds ratio: OR = (a/c) / (b/d) = ad / bc

Case control study

                    Cases     Controls   Odds ratio
Exposed             a = 50     b = 20        4
Not exposed         c = 50     d = 80
Total                100        100
Odds of exposure    50/50      20/80

OR = (a/c) / (b/d)

= ad / bc

= (50x80) / (20x50)

= 4
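The odds ratio calculation can be written as a minimal Python sketch. The function name `odds_ratio` is an illustrative assumption; a, b, c, d follow the case control layout used on these slides (a = exposed cases, b = exposed controls, c = unexposed cases, d = unexposed controls).

```python
def odds_ratio(a, b, c, d):
    """Odds of exposure among cases (a/c) divided by odds among controls (b/d)."""
    odds_cases = a / c
    odds_controls = b / d
    return odds_cases / odds_controls


# Slide example: 50 of 100 cases exposed, 20 of 100 controls exposed.
a, b, c, d = 50, 20, 50, 80
or_from_odds = odds_ratio(a, b, c, d)
or_cross_product = (a * d) / (b * c)  # equivalent cross-product form ad / bc
print(f"OR from odds: {or_from_odds:.1f}, OR from cross-product: {or_cross_product:.1f}")
```

Both forms give the same value; the cross-product form is simply the ratio of odds with the fractions cleared.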

Case control study design

                    Cases   Controls
E (exposed)           a        b
Not E (unexposed)     c        d

Odds ratio = (a/c) / (b/d) = (a x d) / (b x c)

Frequency of chicken consumption in campylobacter cases and controls, Republic of Ireland and Northern Ireland, 2003

                      Cases   Controls   Odds ratio
Ate chicken            181      251         2.1
Did not eat chicken     15       44         Ref

Frequency of contact with a dog in campylobacter cases and controls, Republic of Ireland and Northern Ireland, 2003

Contact with dog   Cases   Controls   Odds ratio
Yes                  29       93         0.40
No                  158      201         Ref

Distribution of myocardial infarction by recent oral contraceptive use in cases and controls

Oral contraceptives   Myocardial infarction   Controls          OR
Yes                          693                 320            4.8
No                           307                 680            Ref.
Total                       1000                1000
Odds of exposure        693/307 = 2.3       320/680 = 0.5

Distribution of myocardial infarction by amount of physical activity in cases and controls

Physical activity   Myocardial infarction   Controls          OR
>= 2500 Kcal               190                 230            0.64
< 2500 Kcal                176                 136            Ref.
Total                      366                 366
Odds of exposure      190/176 = 1.1       230/136 = 1.7

Distribution of cases of endometrial cancer by oestrogen use in cases and controls

Oestrogen use   Cases   Controls   Odds ratio
High             a1       b1       a1d/b1c
Low              a2       b2       a2d/b2c
None             c        d        Reference

Relation of hepatocellular adenoma to duration of oral contraceptive use in 79 cases and 220 controls

Months of OC use   Cases   Controls   Odds ratio
0-12                  7      121        Ref.
13-36                11       49         3.9
37-60                20       23        15.0
61-84                21       20        18.1
>= 85                20        7        49.7
Total                79      220

Source: Rooks et al. 1979

Advantages of case control studies

Rare diseases

Several exposures

Long latency

Rapidity

Low cost

Small sample size

Available data

No ethical problem

Limitations of case-control studies

• Cannot directly compute risk

• Not suitable for rare exposure

• Temporal relationship exposure-disease difficult to establish

• Biases +++
  - control selection
  - recall biases when collecting data

• Loss of precision due to sampling

The cohort study is the gold standard of analytical epidemiology

CASE-CONTROL STUDIES HAVE THEIR PLACE IN EPIDEMIOLOGY, but if a cohort study is possible, do not settle for second best

Thank you!

Back-up slides

Cases and population denominator

                    Cases   Population denominator
E (exposed)           a        P_1
Not E (unexposed)     c        P_0

I_1 = a / P_1

I_0 = c / P_0

I_1 / I_0 = (a / P_1) / (c / P_0)

Cases and population sample

                    Cases   Population sample
E (exposed)           a        P_1 / 10
Not E (unexposed)     c        P_0 / 10

I_1 = a / (P_1 / 10)

I_0 = c / (P_0 / 10)

I_1 / I_0 = (a / P_1) / (c / P_0)

Source population and controls

Source population:

                    Cases   Pop.
E (exposed)           a     P_1
Not E (unexposed)     c     P_0

I_1 = a / P_1

I_0 = c / P_0

I_1 / I_0 = (a / P_1) / (c / P_0)

Controls = sample of the source population:

                    Cases   Controls
E (exposed)           a        b
Not E (unexposed)     c        d

P_1 / P_0 = b / d

From rate ratio to odds ratio

I_1 / I_0 = (a / P_1) / (c / P_0) = (a . P_0) / (c . P_1) = (a . d) / (c . b) = (a / c) / (b / d)

Since d / b = P_0 / P_1
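The identity derived on the back-up slides can be checked numerically with a minimal Python sketch. All numbers below are hypothetical and chosen only for illustration; the key assumption, as in the derivation, is that controls are a sample of the source population, so that b/d = P_1/P_0.

```python
import math

# Hypothetical source population (illustrative numbers, not from the slides).
P1, P0 = 20_000, 80_000  # exposed and unexposed population
a, c = 100, 80           # cases among exposed and unexposed

# Ratio of incidences computed on the full population denominators.
incidence_ratio = (a / P1) / (c / P0)

# Controls drawn as 1 in 100 of the source population, so b/d = P1/P0.
sample_every = 100
b = P1 // sample_every   # exposed controls
d = P0 // sample_every   # unexposed controls

# Odds ratio from the case control table.
odds_ratio = (a * d) / (b * c)

print(f"I1/I0 = {incidence_ratio:.2f}, OR = {odds_ratio:.2f}")
```

In practice the control sample only approximates the exposure distribution of the source population, which is one reason the slides list loss of precision due to sampling among the limitations.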
