Selection of controls in a case-control or case-base study
Robert E. Fontaine, MD, MSc
Bao-Ping Zhu, MD, MS
US Expert Advisors to Chinese FETP
Rules of control selection
• Cases and controls come from the same base (population or population subset).
• Controls are selected to be as similar as possible to cases on all characteristics except the exposures under investigation.
• If a control had developed the illness, that person would have been detected as a case.
Selection of controls: factories, schools, institutions, small villages
• Lists are often available by group.
• Select from the lists by simple random, systematic, or another probability sampling method.
• If attack rates differ by group, balance the number of controls to the number of cases within each group.
A disease in a factory. How do you choose controls?
Group     A    B    C    D    E    F    G    H
Cases     6   12   10   13   14   18    9   10
People   35   80   55   90   95  125   50   65
A disease in a factory. How do you choose controls?
Group     A    B    C    D    E    F    G    H
Cases     0   28    0    3   24   34    2    1
People   35   80   55   90   95  125   50   65
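One way to act on the factory table above is to compute the attack rate in each group and then draw controls group by group in proportion to the cases. The sketch below is illustrative only: it assumes controls are drawn from non-cases, and the function names and the one-control-per-case ratio are hypothetical choices, not part of the original slides.

```python
# Case counts and group sizes copied from the second factory table.
groups = ["A", "B", "C", "D", "E", "F", "G", "H"]
cases = [0, 28, 0, 3, 24, 34, 2, 1]
people = [35, 80, 55, 90, 95, 125, 50, 65]


def attack_rates(cases, people):
    """Attack rate (cases / people) for each group."""
    return [c / p for c, p in zip(cases, people)]


def balanced_controls(cases, people, ratio=1):
    """Controls to draw per group: `ratio` per case,
    capped at the number of non-cases available in that group."""
    return [min(ratio * c, p - c) for c, p in zip(cases, people)]


rates = attack_rates(cases, people)
controls = balanced_controls(cases, people, ratio=1)
for g, r, k in zip(groups, rates, controls):
    print(f"Group {g}: attack rate {r:.1%}, controls to sample {k}")
```

Within each group the actual controls would then be picked from the group list by random or systematic sampling.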
Control selection in an "open" population
• No useful list is available. Examples:
o Large institutions or factories
o Towns
o Cities
o Counties
o Provinces
o Countries
We select from a more limited population defined by eligibility criteria
• Population = a collection of persons defined by:
o Time
o Place
o Person
• Use the eligibility criteria to select from this hypothetical population.
Eligibility criteria should be applied equally to cases and controls.
Example: you have 150 cases in a city of 1,110,000 and want to select controls at random
• How do you select controls?
• What must you think about?
• Suppose we take a random selection of 150 controls.
But in the analysis you find the following distribution of cases and controls
Group      Cases  Controls
Mobile A      50         2
Mobile B      50        15
Local         50       133
All          150       150
Why? Much of the exposure was in a limited population.
Group      Cases     People  Controls
Mobile A      50     10,000         2
Mobile B      50    100,000        15
Local         50  1,000,000       133
All          150  1,110,000       150
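The mismatch follows from simple arithmetic: a simple random sample of the city allocates controls in proportion to group size, while the cases are spread evenly across groups. A minimal sketch of that calculation, using the counts from the table (the function name is illustrative):

```python
# Population sizes and case counts copied from the table.
people = {"Mobile A": 10_000, "Mobile B": 100_000, "Local": 1_000_000}
cases = {"Mobile A": 50, "Mobile B": 50, "Local": 50}
n_controls = 150


def expected_controls(people, n_controls):
    """Expected controls per group under simple random sampling of the whole city."""
    total = sum(people.values())
    return {g: n_controls * n / total for g, n in people.items()}


exp = expected_controls(people, n_controls)
for g in people:
    rate_per_10k = 10_000 * cases[g] / people[g]
    print(f"{g}: attack rate {rate_per_10k:g} per 10,000, "
          f"expected controls {exp[g]:.1f}")
```

The expected counts are close to the 2, 15, and 133 controls actually drawn, so the sample behaved as a random sample should; the problem is the design, not bad luck.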
You risk the same problem as in the earlier example in a closed population.
Imagine that the cases have a limited age distribution.
Age group     0-9   10-19   20-29   30-39   40-49   50-59
Cases           5      35      87      15       1       1
People     11,000  10,000   8,000   7,000   5,000   4,000
A probability survey of the population can serve as a control group
• Strengths
o Less subject to bias
o The sampling fraction is known, so risk can be estimated
• Weaknesses
o Expensive
o Requires complex survey analysis
Household controls are very efficient
• Similar to cases in many factors besides the disease
• Exposure opportunity nearly identical to that of cases
• Each household can be treated as a small cohort
• Logistics are easy and cooperation is usually good
• Can evaluate exposures outside the home
• Weakness: limited numbers available, especially when matching on age or sex
Friends, schoolmates, and workmates are very convenient controls
• Strengths
o Similar to cases in many factors besides the disease
o Logistics are easy and cooperation is usually good
• Weaknesses
o Being named as a control may be related to exposure
o The case influences control selection
o Overlap: the same person may be named by ≥ 2 cases
Selecting neighbors as controls handles potential geographic biases
• Logistics are simpler than for surveys
• A good choice if case households are being visited anyway
• High probability of exposure opportunity
• Best to use a random selection rule inside the neighborhood
• Will require revisits
Persons coming to the same hospital or clinic as a case are possible controls
• Easy to obtain and interview
• If they develop the disease, the probability is high that they would be detected
• Can have different exposures than the study base
• Can have other exposures, and protective effects from treatment and management
Hospital controls can differ from the study base on special risk factors
Risk factor          +     -
Hospital controls   80    20
Study base          50    50
OR                 4.0
These controls are a very special subgroup of the study base. The effect of this selection on the study is not predictable.
Other "disease-dependent" control selections have similar problems
• Persons with negative laboratory tests
• Cases of other diseases detected in surveillance
Time comparability: example
• Case-control study examining risk factors for septicaemia in a hospital
o A central venous line is a suspected risk factor
o If an infected health care worker is the source of infection, the effect of the exposure (central line) will vary over time with the presence of that health care worker
The following situation often arises in foodborne outbreaks
Meal              Cases  Controls    OR
Breakfast 1/10      50%       55%  0.82
Lunch 1/10          85%       40%   8.5
Dinner 1/10         60%       45%   1.8
Breakfast 2/10      35%       45%  0.66
Lunch 2/10          50%       50%   1.0
Dinner 2/10         60%       55%   1.2
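Each odds ratio in the meal table can be recomputed from the two attendance percentages. A minimal sketch, with an illustrative function name and the percentages copied from the table:

```python
def odds_ratio_from_percent(pct_cases, pct_controls):
    """Odds ratio from the percentage of cases and of controls who ate a meal."""
    odds_cases = pct_cases / (100 - pct_cases)
    odds_controls = pct_controls / (100 - pct_controls)
    return odds_cases / odds_controls


# (meal, % of cases who ate it, % of controls who ate it)
meals = [
    ("Breakfast 1/10", 50, 55),
    ("Lunch 1/10", 85, 40),
    ("Dinner 1/10", 60, 45),
    ("Breakfast 2/10", 35, 45),
    ("Lunch 2/10", 50, 50),
    ("Dinner 2/10", 60, 55),
]

for meal, pc, pk in meals:
    print(f"{meal}: OR = {odds_ratio_from_percent(pc, pk):.2f}")
```

Lunch on 1/10 stands out as the only meal with a strongly elevated odds ratio, which is what directs the investigation to the foods served at that meal.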
What if you use the same cases and controls to study a food served at lunch on 1/10?
Ate lunch 1/10   Cases  Controls
+                   85        40
-                   15        60
Total              100       100
• For any individual food at that lunch, 15% of cases and 60% of controls were not present to eat it.
• They had no exposure opportunity!
Case-control study on Food B at lunch on 1/10
Food B                                Cases  Controls
Ate Food B                               34        16
Ate lunch 1/10, did not eat Food B       51        24
Did not eat lunch 1/10                   15        60
Total who did not eat Food B             66        84
OR among those who ate lunch 1/10: 1.0
OR including those who did not eat lunch 1/10: 2.7
• If we include the cases and controls who had no exposure opportunity (did not eat lunch 1/10), the OR will be incorrect and too high.
• If we exclude them, we may not have enough controls.
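The effect of exposure opportunity can be checked directly from the counts for Food B. The sketch below computes the odds ratio twice: once restricted to people who ate lunch on 1/10, and once with those who missed the lunch pooled into the unexposed group. The function and variable names are illustrative.

```python
def odds_ratio(exp_cases, exp_controls, unexp_cases, unexp_controls):
    """Cross-product odds ratio for a 2x2 table."""
    return (exp_cases * unexp_controls) / (exp_controls * unexp_cases)


# Counts copied from the Food B table: (cases, controls)
ate_b = (34, 16)          # ate lunch and ate Food B
no_b_at_lunch = (51, 24)  # ate lunch but not Food B
missed_lunch = (15, 60)   # did not eat lunch on 1/10 (no exposure opportunity)

# Restricted to people who had the opportunity to eat Food B.
or_restricted = odds_ratio(ate_b[0], ate_b[1],
                           no_b_at_lunch[0], no_b_at_lunch[1])

# Pooling those who missed lunch with the unexposed.
or_pooled = odds_ratio(ate_b[0], ate_b[1],
                       no_b_at_lunch[0] + missed_lunch[0],
                       no_b_at_lunch[1] + missed_lunch[1])

print(f"OR restricted to lunch eaters: {or_restricted:.2f}")
print(f"OR with lunch absentees pooled in: {or_pooled:.2f}")
```

Restricting the analysis shows no association for Food B, while pooling inflates the odds ratio well above 1 simply because controls were far more likely than cases to have missed the meal.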
Sequential studies in the assessment of L-tryptophan: eosinophilia-myalgia syndrome, USA, 1989
• Question: What is the cause of the syndrome?
o Study base: the whole population, by age-sex band
• Question: Are all brands of tryptophan implicated?
o Study base: tryptophan users
• Question: Are all lots of brand X implicated?
o Study base: brand X tryptophan users
Should you select > 1 control per case?
• Case numbers are more important than control numbers
• > 1 control per case can increase statistical power
• With ≥ 4 controls per case, further gains in power are very limited
• 5 to 6 controls per case may help when case numbers are very low (≤ 10)
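The diminishing return from extra controls can be illustrated with a commonly used approximation that is not stated on the slide: with r controls per case and a true odds ratio near 1, the statistical efficiency relative to an unlimited number of controls is roughly r / (r + 1). The sketch below assumes that approximation; it is a rough guide, not a power calculation.

```python
def relative_efficiency(r):
    """Approximate efficiency of r controls per case relative to unlimited
    controls, assuming the r / (r + 1) rule (odds ratio near 1)."""
    return r / (r + 1)


previous = 0.0
for r in range(1, 7):
    eff = relative_efficiency(r)
    print(f"{r} control(s) per case: efficiency {eff:.3f} "
          f"(gain {eff - previous:.3f})")
    previous = eff
```

Under this assumption each additional control adds less than the one before, which is consistent with the advice that little is gained beyond about four controls per case.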
Using ≥ 2 control groups to test the same hypothesis is usually a poor idea.
Testing sequential hypotheses is better.
For a disease outbreak in a school, the investigators use students from a nearby, similar school as controls.
• Is this a good choice for a control group?
• What is the proper choice?
• Suppose every student in the school was exposed. What do you do?
Streptococcus suis case definition: what is your control group?
Case definition: clinical disease + exposure to a sick or dying pig
Question: What is the relative risk (OR) of disease for contact with a sick or dying pig?
Here is what your 2x2 table will look like
Pig exposure   Cases  Controls
+               100%      100%
-                  0         0
Streptococcus suis case definition: what is your control group?
Case definition: clinical disease + exposure to a sick or dying pig
Question: What type of exposure is a risk factor for infection?
Cases and controls are examined for the same referent exposure period
• Cases were exposed during the referent exposure period
• The referent exposure period must be identical for cases and controls
o Same duration
o Same time frame
END
Extra Slides
A disease in a school with 40 students per class. How do you choose controls?
Cases per class    0    1    2    3    4    5    6    8   Total cases: 72
Classes           10    8    6    5    2    3    1    1   Total classes: 36
Mean = 2 cases/class
Expected number of classes under a random (Poisson) distribution with mean = 2 cases/class
Cases per class        0    1    2    3    4    5     6      8   Total cases: 72
Classes (observed)    10    8    6    5    2    3     1      1   Total classes: 36
Classes (expected)   4.9  9.7  9.7  6.5  3.2  1.3  0.43  0.031
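The expected row can be reproduced from the Poisson probability mass function: the expected number of classes with k cases is the number of classes times e^(-mean) × mean^k / k!. A minimal sketch using only the standard library, with the observed counts copied from the table (the function name is illustrative):

```python
import math


def poisson_expected(k, mean, n_classes):
    """Expected number of classes with exactly k cases under a Poisson model."""
    return n_classes * math.exp(-mean) * mean ** k / math.factorial(k)


# Observed number of classes by cases per class, from the table.
observed = {0: 10, 1: 8, 2: 6, 3: 5, 4: 2, 5: 3, 6: 1, 8: 1}

n_classes = sum(observed.values())
total_cases = sum(k * n for k, n in observed.items())
mean = total_cases / n_classes

for k, n in observed.items():
    expected = poisson_expected(k, mean, n_classes)
    print(f"{k} cases/class: observed {n}, expected {expected:.2g}")
```

Comparing the two rows shows more classes with zero cases and more classes with five or more cases than a random distribution predicts, which suggests that cases cluster by class and that class should be considered when selecting controls.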