Unit 3. Research Design
Introduction
Research is defined as human activity based on intellectual application in the investigation of matter. The primary purpose of applied research is to discover, interpret, and develop methods and systems for the advancement of human knowledge on a wide variety of scientific matters of our world and the universe. Research can use the scientific method, but need not do so.
Scientific research relies on the application of the scientific method, a harnessing of curiosity. This research provides scientific information and theories for the explanation of the nature and the properties of the world around us. It makes practical applications possible. Scientific research is funded by public authorities, by charitable organisations and by private groups, including many companies. Scientific research can be subdivided into different classifications according to academic and application disciplines.
*The term research is also used to describe an entire collection of information about a particular subject.
________________________________________________________________
Basic research (also called fundamental or pure or theoretical research) has as its primary objective the advancement of knowledge and the theoretical understanding of the relations among variables (see statistics). It is exploratory and often driven by the researcher’s curiosity, interest, and intuition. Therefore, it is sometimes conducted without any practical end in mind, although it may have unexpected results pointing to practical applications. The terms “basic” or “fundamental” indicate that, through theory generation, basic research provides the foundation for further, sometimes applied, research. As there is no guarantee of short-term practical gain, researchers may find it difficult to obtain funding for basic research.
Examples of questions asked in basic research:
Does string theory provide physics with a grand unification theory?
Which aspects of genomes explain organismal complexity?
Is it possible to prove or disprove Goldbach's conjecture (i.e., that every even integer greater than 2 can be written as the sum of two, not necessarily distinct, primes)?
Traditionally, basic research was considered an activity that preceded applied research, which in turn preceded development into practical applications. Recently, these distinctions have become much less clear-cut, and it is sometimes the case that all stages will intermix. This is particularly the case in fields such as biotechnology and electronics, where fundamental discoveries may be made alongside work intended to develop new products, and in areas where public and private sector partners collaborate in order to develop greater insight into key areas of interest. For this reason, some now prefer the term frontier research.
Empirical research (evidence-based research): The word empirical denotes information gained by means of observation, experience, or experiment, as opposed to theoretical. A central concept in science and the scientific method is that all evidence must be empirical, or empirically based, that is, dependent on evidence or consequences that are observable by the senses. It is usually differentiated from the philosophic usage of empiricism by the use of the adjective "empirical" or the adverb "empirically." "Empirical" as an adjective or adverb is used in conjunction with both the natural and social sciences, and refers to the use of working hypotheses that are testable using observation or experiment. In this sense of the word, scientific statements are subject to and derived from our experiences or observations. Empirical data are data that are produced by experiment or observation.
________________________________________________________________
Primary research (also called field research) involves the collection of data that does not already exist.
This can take numerous forms, including questionnaires and telephone interviews, amongst others. This information may be used in such things as questionnaires, magazines, and interviews.
Secondary research (also known as desk research) involves the summary, collation and/or synthesis of existing research rather than primary research, where data is collected from, for example, research subjects or experiments. The term is widely used in market research and in medical research. The principal methodology in medical secondary research is the systematic review, commonly using meta-analytic statistical techniques, although other methods of synthesis, like realist reviews and meta-narrative reviews, have been developed in recent years. Secondary research can come from either internal or external sources.
________________________________________________________________
Social research refers to research conducted by social scientists (primarily within sociology and social psychology), but also within other disciplines such as social policy, human geography, political science, social anthropology and education. Sociologists and other social scientists study diverse things: from census data on hundreds of thousands of human beings, through the in-depth analysis of the life of a single important person, to monitoring what is happening on a street today, or what was happening a few hundred years ago.
Social scientists use many different methods in order to describe, explore and understand social life. Social methods can generally be subdivided into two broad categories. Quantitative methods are concerned with attempts to quantify social phenomena and collect and analyse numerical data, and focus on the links among a smaller number of attributes across many cases.
Qualitative methods, on the other hand, emphasise personal experiences and interpretation over quantification, are more concerned with understanding the meaning of social phenomena, and focus on links among a larger number of attributes across relatively few cases. While very different in many aspects, both qualitative and quantitative approaches involve a systematic interaction between theories and data.
Common tools of quantitative researchers include surveys, questionnaires, and secondary analysis of statistical data that has been gathered for other purposes (for example, censuses or the results of social attitudes surveys). Commonly used qualitative methods include focus groups, participant observation, and other techniques.
Psychology is an academic and applied discipline involving the scientific study of mental functions and behavior. Psychologists study such phenomena as perception, cognition, emotion, personality, behavior, and interpersonal relationships. Psychology also refers to the application of such knowledge to various spheres of human activity, including issues related to everyday life (e.g. family, education, and employment) and the treatment of mental health problems. Psychologists attempt to understand the role of these functions in individual and social behavior, while also exploring the underlying physiological and neurological processes. Psychology includes many sub-fields of study and applications concerned with such areas as human development, sports, health, industry, media, and law. Psychology incorporates research from the natural sciences, social sciences, and humanities.
Epidemiology is the study of factors affecting the health and illness of populations, and serves as the foundation and logic of interventions made in the interest of public health and preventive medicine.
It is considered a cornerstone methodology of public health research, and is highly regarded in evidence-based medicine for identifying risk factors for disease and determining optimal treatment approaches to clinical practice. In work on communicable and non-communicable diseases, the tasks of epidemiologists range from outbreak investigation to study design, data collection and analysis, including the development of statistical models to test hypotheses and the documentation of results for submission to peer-reviewed journals. Epidemiologists may draw on a number of other scientific disciplines, such as biology in understanding disease processes, and social science disciplines including sociology and philosophy, in order to better understand proximate and distal risk factors.
Study Design
The degree to which researchers need to build certain logical arrangements into their studies depends on which purposes are guiding their work. If our purpose is limited to exploring a new area about which little is known, in the hope of generating new insights and hypotheses that will be studied more rigorously later on, it is appropriate to use flexible methods that yield tentative findings. Trying to tightly structure exploratory studies in order to permit conclusive logical inferences and generalizations to be made from the findings would be not only unnecessary but undesirable. An inflexible methodology in an exploratory study would not permit researchers the latitude they need to probe creatively into unanticipated observations or into areas about which they lack the information needed to construct a design that would be logically conclusive.
Exploratory research is a type of research conducted because a problem has not been clearly defined. Exploratory research helps determine the best research design, data collection method and selection of subjects. Given its fundamental nature, exploratory research often concludes that a perceived problem does not actually exist.
Exploratory research often relies on secondary research such as reviewing available literature and/or data, or qualitative approaches such as informal discussions with consumers, employees, management or competitors, and more formal approaches through in-depth interviews, focus groups, projective methods, case studies or pilot studies. The Internet allows for research methods that are more interactive in nature: e.g., RSS feeds efficiently supply researchers with up-to-date information; major search engine search results may be sent by email to researchers by services such as Google Alerts; comprehensive search results are tracked over lengthy periods of time by services such as Google Trends; and Web sites may be created to attract worldwide feedback on any subject.
The results of exploratory research are not usually useful for decision-making by themselves, but they can provide significant insight into a given situation. Although the results of qualitative research can give some indication as to "why", "how" and "when" something occurs, they cannot tell us "how often" or "how many." Exploratory research is not typically generalizable to the population at large.
But when our research purpose is description or explanation, adhering to certain logical principles in the design of our research becomes much more important. Descriptive studies seek to portray accurately the characteristics of a population. These studies usually attempt to make generalizations about the attributes of that population by studying a small part of (a sample drawn from) that population. The more confident we are that the part (sample) that we study is representative of the population, the more confident we can be in the generalizations we make about the population.
Explanatory studies also usually aim to generalize from a sample to a population. But what they seek to generalize focuses on causal processes occurring among variables, not simply on describing specific attributes.
Explanatory studies therefore need to be more concerned with those logical arrangements that permit us to make inferences about causality in the sample we observe, as well as those logical arrangements that enable us to generalize our causal inferences to a larger population.
Two fundamental issues in descriptive and explanatory research, then, are the generalizability of research findings and the logical validity of the causal inferences made about those findings. In fact, these issues are also important in exploratory research, in the sense that we need to be careful not to overgeneralize exploratory findings or draw unwarranted causal inferences from them.
Types of design
Some of the most popular designs are sorted below, with the ones at the top being the most powerful at reducing the observer-expectancy effect but also the most expensive, and in some cases introducing ethical concerns. The ones at the bottom are the most affordable, and are frequently used earlier in the research cycle, to develop strong hypotheses worth testing with the more expensive research approaches.
Experimental
1. Randomized controlled trial
Double-blinding
Single-blinding
Non-blinding
2. Non-randomized controlled trial
Non-experimental
1. Cohort study
2. Case-control study
3. Cross-sectional study
________________________________________________________________
Descriptive Quantitative and Qualitative Methods
Qualitative research methods emphasize the depth of understanding associated with idiographic concerns (understanding personal information). They attempt to tap the deeper meanings of particular human experiences and are intended to generate theoretically richer observations that are not easily reduced to numbers. During the first half of the 20th century, qualitative methods were commonly employed by researchers.
Around the middle of the 20th century, however, the potential for quantitative methods to yield more generalizable conclusions became appealing to social scientists in general. Gradually, quantitative studies were regarded as superior (more “scientific”) and they began to squeeze out qualitative studies.
During the 1980s, qualitative methods enjoyed a rebirth of support in the social sciences generally. Some scientists called quantitative methods obsolete and implored the profession to concentrate on qualitative methods. Many scholars, however, do not believe that the two contrasting types of methods are inherently incompatible. In their view, quantitative and qualitative methods, despite their philosophical differences, play an equally important, complementary role in knowledge building. Indeed, some of our best research has combined the two types of methods within the same study.
Qualitative methods may be more suitable when flexibility is required to study a new phenomenon about which we know very little, or when we seek to gain insight into the subjective meanings of complex phenomena to advance our conceptualization of them and to build theory that can be tested in future studies. Sometimes, therefore, qualitative research can pave the way for quantitative studies of the same subject. Other times, qualitative methods produce results that are sufficient in themselves.
However, it is increasingly recognised that quantitative and qualitative approaches can be complementary. They can be combined in a number of ways, for example:
1. Qualitative methods can be used in order to develop quantitative research tools. For example, focus groups could be used to explore an issue with a small number of people, and the data gathered using this method could then be used to develop a quantitative survey questionnaire that could be administered to a far greater number of people, allowing results to be generalised.
2.
Qualitative methods can be used to explore and facilitate the interpretation of relationships between variables. For example, researchers may inductively hypothesize that there would be a positive relationship between positive attitudes of sales staff and the amount of sales of a store. However, quantitative, deductive, structured observation of 576 convenience stores could reveal that this was not the case, and in order to understand why the relationship between the variables was negative, the researchers may undertake qualitative case studies of four stores, including participant observation. This might abductively confirm that the relationship was negative, but show that it was not the positive attitude of sales staff that led to low sales; rather, high sales led to busy staff who were less likely to express positive emotions at work!
Quantitative methods are useful for describing social phenomena, especially on a larger scale. Qualitative methods allow social scientists to provide richer explanations (and descriptions) of social phenomena, frequently on a smaller scale. By using two or more approaches, researchers may be able to 'triangulate' their findings and provide a more valid representation of the social world.
A combination of different methods is often used within "comparative research", which involves the study of social processes across nation-states, or across different types of society.
Ecological study:
a. Observation of disease and exposure on a group basis. Example: grouped data from cohorts in the Seven Countries Study (from Szklo and Nieto).
Ecological fallacy: The failure of expected ecological effect estimates to reflect the biologic effect at the individual level.
Example: Researchers found that the suicide rate was associated with areas where Catholics lived (1951). Later, other researchers found that many minorities with a very high suicide rate also lived in these areas; in contrast, Catholics had a low suicide rate.
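The ecological fallacy described above can be made concrete with a small numerical sketch. All figures below are invented for illustration (they are not taken from the 1951 study or any other source), and the names `AREAS`, `area_rate` and `pooled_rates` are hypothetical. The sketch assumes three areas of equal size in which Catholics have the same low rate everywhere, while the non-Catholic minority has a rate that is higher in the more Catholic areas.

```python
# Hypothetical illustration of the ecological fallacy.
# All numbers are invented for this sketch; they are not data from any study.

# Each area: (share of residents who are Catholic,
#             rate among Catholics, rate among non-Catholics),
# with rates expressed per 100,000 persons per year.
AREAS = [
    (0.2, 5.0, 10.0),
    (0.5, 5.0, 30.0),
    (0.8, 5.0, 100.0),
]
POPULATION = 100_000  # assumed identical population in every area


def area_rate(share_catholic, rate_catholic, rate_other):
    """Overall rate of one area: the weighted mean of the two subgroup rates."""
    return share_catholic * rate_catholic + (1 - share_catholic) * rate_other


def pooled_rates(areas, population):
    """Individual-level rates per 100,000, pooled over all areas."""
    cath_n = sum(s * population for s, _, _ in areas)
    other_n = sum((1 - s) * population for s, _, _ in areas)
    cath_deaths = sum(s * population * rc / 100_000 for s, rc, _ in areas)
    other_deaths = sum((1 - s) * population * ro / 100_000 for s, _, ro in areas)
    return cath_deaths / cath_n * 100_000, other_deaths / other_n * 100_000


# Group-level view: one overall rate per area, ordered by share of Catholics.
group_level = [area_rate(*area) for area in AREAS]

# Individual-level view: rates by religion, pooled across the three areas.
catholic_rate, other_rate = pooled_rates(AREAS, POPULATION)

print("area rates by increasing share of Catholics:", group_level)
print("Catholic rate:", catholic_rate, "non-Catholic rate:", other_rate)
```

In this constructed example the area-level rate rises with the share of Catholics, yet Catholics themselves have the lower rate, which is the pattern the fallacy warns about: the group-level association says nothing reliable about individuals.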
Summary of Ecologic Studies
Advantages:
Good for generating hypotheses
Inexpensive
Quick and simple
Limitations:
Population information may not apply to individuals
Difficult to derive precise estimates of exposure and/or disease
No control for extraneous factors (confounding)
Cohort Study
Research strategies:
a. Prospective: The investigator collects information on the exposure status of the study subjects at the time the study begins, and identifies new cases of disease (or deaths) from that time on.
b. Retrospective: The investigator identifies the exposure characteristics of a cohort in the past and then reconstructs the subsequent disease experience up to some defined point in the more recent past or up to the present time.
c. Retrospective cohort studies have some distinct advantages over prospective cohort studies; in particular, they can be completed in a much more timely fashion and are therefore considered less expensive.
d. In retrospective cohort studies, the investigator may have problems in determining exposures, while in prospective cohort studies, the investigator can actually measure the exposures thoroughly.
Sources of cohorts:
a. Special exposure groups
1. High occupational exposure
2. Smokers
3. Persons with unusually high or low physiologic measurements, e.g., cholesterol or blood pressure
b. Special resource groups
1. Prepaid medical care plans
2. Physicians
3. Insured persons
4. Veterans
5. College graduates
c. Geographically defined groups
1. Framingham, Massachusetts
Selection of comparison groups:
a. Internal comparison group: One group is initially defined and then divided into exposure categories (usually, relative risk and mortality rate are calculated).
b. External comparison group: When no non-exposed sub-group exists in the cohort, an external population is sought, e.g., the general population, other factories, etc. (usually, mortality rates are compared; rate ratios or SMR).
c.
Multiple: A combination of the above (internal and external comparison groups). Should the comparison groups give different results, the investigator must try to find out why the results are different.
d. Healthy-worker effect: It can be divided into two important parts:
• Healthy worker hire effect (HWHE): healthy workers are more likely to be employed than those who are relatively less healthy in the initial selection process.
• Healthy worker survival effect (HWSE): a continuing selection process such that the probability of an individual remaining employed in a workplace is greater for healthy workers than for unhealthy workers.
e. Careful thought must enter into the choice of the comparison group and of the measurements to be made on all cohort members, so that the unexposed cohort members are firmly believed to be comparable to the exposed cohort members, except for the exposure, in the likelihood of developing the disease.
Summary of cohort studies:
a. Advantages:
1. Direct determination of risk or rate
2. Stronger evidence of exposure and disease association
3. Provides evidence about lag time between exposure and disease
4. Easier to generalize findings
b. Limitations:
1. Take a long time
2. Difficult to implement and carry out
3. Expensive
4. Losses to follow-up
Data analysis:
Risk (rate) difference: By subtracting the smaller from the larger, one may obtain the magnitude of the risk difference.
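A minimal sketch of the rate difference, together with the attributable fraction among the exposed that is derived from it, using the lung cancer mortality rates quoted in this unit (19 and 188 per 100,000 person-years at ages 55-69). The function names are hypothetical and chosen only for this illustration.

```python
def rate_difference(rate_exposed, rate_unexposed):
    """Absolute difference between two rates (larger minus smaller)."""
    return abs(rate_exposed - rate_unexposed)


def attributable_fraction(rate_exposed, rate_unexposed):
    """Rate difference divided by the rate in the exposed."""
    return (rate_exposed - rate_unexposed) / rate_exposed


# Lung cancer mortality per 100,000 person-years, ages 55-69 (figures from this unit).
SMOKERS = 188
NONSMOKERS = 19

rd = rate_difference(SMOKERS, NONSMOKERS)
af = attributable_fraction(SMOKERS, NONSMOKERS)

print(rd)            # 169 per 100,000 person-years
print(round(af, 2))  # 0.9, i.e. about 90%
```

Both inputs must be expressed in the same units (here, per 100,000 person-years); the difference keeps those units, while the fraction is unitless.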
Attributable risk: the difference between two incidence rates, (Rexp − Runexp).
e.g., lung cancer mortality rate
in nonsmokers aged 55-69 = 19 per 100,000 person-years
in smokers aged 55-69 = 188 per 100,000 person-years
Rate difference = 169 per 100,000 person-years
Cigarette smoking produces 169 lung cancer deaths per 100,000 person-years among smokers.
Attributable risk percent (also called etiologic fraction or attributable fraction): the difference between the risks (rates) in the exposed and the non-exposed, divided by the risk (rate) in the exposed, (Rexp − Runexp)/Rexp.
169 per 100,000 person-years / 188 per 100,000 person-years = 169/188 = 0.9
Smoking produced 90% of the lung cancer deaths among smokers. Among smokers, 90% of the lung cancer deaths were attributable to smoking.
Population attributable risk (PAR): Rp − Runexp
Population attributable risk percent (PAR%): (Rp − Runexp)/Rp
E.g., starting in 1980, the smoking histories of 200,000 workers were surveyed; the workers were classified into three groups (current smokers, ex-smokers and never-smokers) and followed up. After ten years the findings were:
Smoking status     Lung cancer cases   Person-years observed
Current smokers    121                 660,000
Ex-smokers         25                  200,000
Never-smokers      62                  1,060,000
Total              208                 1,920,000
What is the risk attributable to smoking among smokers? What is the risk attributable to smoking in the population?
Relative Risk (RR) / Risk Ratio:
               Diseased   Non-diseased   Total
Exposed        a          b              a+b
Non-exposed    c          d              c+d
RR = [a/(a+b)] / [c/(c+d)]
Evaluation:
RR > 1: positive association (risk factor)
RR = 1: no association
RR < 1: negative association (protective factor)
If the 95% confidence interval includes 1, the result is not statistically significant.
Exercise: The following are the results of a cohort study:
Drinking   Liver cancer: yes   Liver cancer: no   Total
Yes        30                  10,000             10,030
No         10                  30,000             30,010
Total      40                  40,000             40,040
(1) Calculate the relative risk and explain its meaning. (5%)
(2) Calculate the risk difference and explain its meaning. (5%)
Sleep Disturbances and Cause-Specific Mortality: Results From the GAZEL Cohort Study
Naja Hulvej Rod,* Jussi Vahtera, Hugo Westerlund, Mika Kivimaki, Marie Zins, Marcel Goldberg, and Theis Lange
Am J Epidemiol. 2011 February 1; 173(3): 300–309.
Abstract: Poor sleep is an increasing problem in modern society, but most previous studies on the association between sleep and mortality rates have addressed only duration, not quality, of sleep.
The authors prospectively examined the effects of sleep disturbances on mortality rates and on important risk factors for mortality, such as body mass index, hypertension, and diabetes. A total of 16,989 participants in the GAZEL cohort study were asked validated questions on sleep disturbances in 1990 and were followed up until 2009, with <1% loss to follow-up. Body mass index, hypertension, and diabetes were measured annually through self-reporting. During follow-up, a total of 1,045 men and women died. Sleep disturbances were associated with a higher overall mortality risk in men (P = 0.005) but not in women (P = 0.33). This effect was most pronounced for men <45 years of age (≥3 symptoms vs. none: hazard ratio = 2.03, 95% confidence interval: 1.24, 3.33). There were no clear associations between sleep disturbances and cardiovascular mortality rates, although men and women with sleep disturbances were more likely to develop hypertension and diabetes (P < 0.001). Compared with people with no sleep disturbances, men who reported ≥3 types of sleep disturbance had an almost 5 times higher risk of committing suicide (hazard ratio = 4.99, 95% confidence interval: 1.59, 15.7). Future strategies to prevent premature deaths may benefit from assessment of sleep disturbances, especially in younger individuals.
Keywords: body mass index, cause of death, diabetes mellitus, hypertension, longitudinal studies, mortality, sleep disorders
The GAZEL cohort study
The GAZEL cohort study was initiated in 1989 and was at baseline composed of a sample of 20,625 employees, aged 35–50 years, of the French national gas and electricity company, Electricité de France–Gaz de France (22). A questionnaire was sent to the participants every year to obtain data on health status, lifestyle, social, and occupational factors.
Electricité de France–Gaz de France employees hold a civil servant-like status that guarantees job stability, and typically employees are hired when they are in their 20s and stay with the company until they retire. About 75% of the questionnaires have been returned annually, and <1% of the participants have been lost to follow-up over 20 years. The vast majority of the participants were white Europeans, and all gave informed consent. The 1990 study questionnaire included questions about sleep disturbances, and this wave was used as the baseline for the present study. The 17,970 participants who responded in 1990 constituted a response rate of 87%. Participants with missing information on any of the covariates (n = 981) were excluded, leaving 4,465 women and 12,524 men for the analyses.
Follow-up
The participants were followed from the date of the 1990 examination until the date of death (n = 1,045) or the end of follow-up on September 25, 2009. Data on total number and causes of death were obtained from the French National Death Index. Cause-specific mortality was coded using the International Classification of Diseases, Ninth Revision (ICD-9), until 1998, and the Tenth Revision (ICD-10) thereafter. We distinguished between deaths due to cancers (ICD-9 codes: 140–208; ICD-10 codes: C00–C97), CVD (ICD-9 codes: 390–459; ICD-10 codes: I00–I99), external causes (ICD-9 codes: E800–E999; ICD-10 codes: V01–X84), and suicide (a subcategory of external causes with ICD-9 codes: E950–E959; ICD-10 codes: X60–X84). BMI trajectories, as well as incidence of diabetes and hypertension, were determined on the basis of annually updated self-reported information on these conditions. Incident cases were defined as first-time reporting of hypertension or diabetes.
Case-control study
a.
Persons with a given disease (the cases) and persons without the given disease (the controls) are selected; the proportions of cases and controls that have certain background characteristics are then determined and compared.
b. Sources of cases:
1. Ideally: All incident cases in a defined population in a specified time period; e.g., a tumor or disease registry, or a Vital Statistics Bureau
2. In the real world: Logistics may restrict case selection to one or a few medical facilities.
c. Caveats in selection of cases:
1. Representativeness of the cases derived from special care facilities
2. Prevalent cases make it difficult to separate characteristics that are causal from those that are consequential; usually, incident cases are used in the study.
d. Selection of controls:
1. Ideally: Should have the same characteristics as the cases, except for the exposure of interest.
2. In the real world: rarely achieved…
e. Sources of controls:
1. Populations: Total population, random sample
2. Patients from the same hospital as the cases
3. Relatives of cases: Spouses, siblings
4. Associates of cases: Friends, co-workers, neighbors
Sources of controls (advantages and limitations):
a. Population:
1. Advantage: More representative
2. Limitation: Expensive, and participation may be low
b. Patients from the same hospital as the cases:
1. Advantage: Cheap, quick, more likely to participate
2. Limitation: May not be representative
c. Relatives of cases:
1. Advantage: Good way to control for other variables, e.g., socioeconomic status, education, ethnic status
2. Limitation: Expensive and time consuming; may end up controlling for an important (unidentified) risk factor
Matching
a. The pairing of one or more controls to each case based on certain characteristics.
b. Matching in a case-control study will introduce bias into the study.
c. Its purpose is to enhance study precision (narrower confidence intervals).
d. Variables selected for matching should be controlled later in the data analysis.
Summary of case-control studies:
Advantages:
a. Quick
b. Easy
c.
Relatively inexpensive
d. More easily repeated
e. Particularly useful for rare diseases
Limitations:
a. Difficult to know the representativeness of the cases and controls
b. In some cases, a direct measure of risk cannot be obtained and only the odds ratio is available; in other cases, if the population served as controls, rates can be obtained.
c. Possibility of introduction of bias: selection bias, follow-up bias, information bias
Odds Ratio: a technique for estimating relative risk
            Exposed   Non-exposed
Case        a         c
Control     b         d
OR = (a/c)/(b/d) = (a*d)/(b*c)
If it is a rare disease, OR is close to RR.
Variety of case-control study design
Case-Based Case-Control Study
1. Cases and noncases are identified at a given point in time among living individuals.
2. Although this study is carried out “cross-sectionally” (i.e., cases and controls are identified at the same time), the cases must necessarily have occurred over a given time period prior to their inclusion in the study.
3. Thus, it is necessary to assume that the cases who survive through that time are representative of all cases with regard to the exposure experience, and that, if exposure data are obtained through interviews, recall or other bias will not intrude with regard to their exposure status.
4. The source of cases is often one or more hospitals or other medical facilities.
5. If a case-based design uses prevalent cases, it is essentially the same as a cross-sectional design.
Case-cohort Study and Nested Case-Control Study
When cases are identified within a well-defined cohort, it is possible to carry out nested case-control or case-cohort studies.
Nested Case-Control Study
1. Controls are a random sample of the cohort selected at the time each case occurs. This study design is called a nested case-control design and is based on a sampling approach known as “incidence density sampling” or “risk-set sampling.”
2.
The idea underlying this sampling scheme is that it allows the comparison of cases with a subset of the cohort members at risk of being cases at the time when each case occurs, that is, a “risk set” of all cohort members under observation at the time of each case’s occurrence.
In a nested case-control study:
1. The probability that any person in the source population is selected as a control is proportional to his or her person-time contribution to the incidence rates in the source population.
2. By this strategy, cases occurring later in the follow-up are eligible to be controls for earlier cases.
3. Incidence density sampling is the equivalent of matching cases and controls on duration of follow-up.
Example:
Human Chorionic Gonadotropin and Alpha-Fetoprotein Concentrations in Pregnancy and Maternal Risk of Breast Cancer: A Nested Case-Control Study
Annekatrin Lukanova, Ritu Andersson, Marianne Wulff, Anne Zeleniuch-Jacquotte, Kjell Grankvist, Laure Dossus, Yelena Afanasyeva, Robert Johansson, Alan A. Arslan, Per Lenner, Göran Wadell, Göran Hallmans, Paolo Toniolo, and Eva Lundin
Am. J. Epidemiol. (2008) 168: 1284-1291.
Abstract
Pregnancy hormones are believed to be involved in the protection against breast cancer conferred by pregnancy. The authors explored the association of maternal breast cancer with human chorionic gonadotropin (hCG) and α-fetoprotein (AFP). In 2001, a case-control study was nested within the Northern Sweden Maternity Cohort, an ongoing study in which blood samples have been collected from first-trimester pregnant women since 1975. Cases (n = 210) and controls (n = 357) were matched for age, parity, and date of blood donation. Concentrations of hCG and AFP were measured by immunoassay. No overall significant association of breast cancer with either hCG or AFP was observed.
However, women with hCG levels in the top tertile tended to be at lower risk of breast cancer than women with hCG levels in the lowest tertile in the whole study population and in subgroups of age at sampling, parity, and age at cancer diagnosis. A borderline-significant decrease in risk with high hCG levels was observed in women who developed breast cancer after the median lag time to cancer diagnosis (≥14 years; odds ratio = 0.53, 95% confidence interval: 0.27, 1.03; P = 0.06). These findings, though very preliminary, are consistent with a possible long-term protective association of breast cancer risk with elevated levels of circulating hCG in the early stages of pregnancy.
Study population
Study subjects were part of the Northern Sweden Maternity Cohort, which is based at the University Hospital in Umeå (Umeå, Sweden). This cohort has been described previously (14). The cohort was established in November 1975 with the purpose of preserving for research use serum samples from pregnant women tested for systemic infections. Virtually all pregnant women from the 4 northernmost counties of Sweden (total population, approximately 800,000) visit one of the maternity health-care clinics in the region and donate a blood sample, mostly during the final weeks of the first trimester of pregnancy or the early weeks of the second trimester (weeks 6–18).
Potential cases were women diagnosed for the first time with invasive breast cancer after their entry into the cohort. Cases were identified through record linkage with the nationwide Swedish Cancer Registry using the unique 10-digit personal identity number assigned to every person born in or legally resident in Sweden. The registration of newly detected cancers in Sweden is based on mandatory reports from all physicians serving outpatient and inpatient departments in all public and private hospitals.
Reporting is also mandatory for all pathologists involved with surgical biopsies, cytologic specimens, and autopsies, including private laboratories. The completeness of cancer registration in Sweden is considered close to 100%. Linkages carried out in 2000 and 2001 led to the identification of 426 potential case subjects. Potential controls were selected among cohort members who were alive and free of cancer at the time of diagnosis of the index case. Controls were matched to the case (1:2 ratio) on parity at the time of blood sampling (primiparous, nonprimiparous), age at blood sampling (±2.5 years), and date of blood sampling (±3 months).

Case-cohort Study
1. Controls are drawn as a random sample of the total cohort at baseline, so some cases that develop during follow-up can belong to both the case group and the control group.
2. Every person in the source population has the same chance of being included as a control, regardless of how much time that person has contributed to the person-time experience of the cohort.

Example:
Bowel Movement and Constipation Frequencies and the Risk of Colorectal Cancer Among Men in the Netherlands Cohort Study on Diet and Cancer
Colinda C. J. M. Simons, Leo J. Schouten, Matty P. Weijenberg, R. Alexandra Goldbohm and Piet A. van den Brandt
Am. J. Epidemiol. (2010) 172 (12): 1404-1414.

Abstract
The authors investigated the associations between bowel movement and constipation frequencies and colorectal cancer (CRC) endpoints among men in the Netherlands Cohort Study on Diet and Cancer (n = 58,279) and explored whether dietary fiber intake may modify associations. After 13.3 years (1986–1999), 1,207 CRC cases and 1,753 subcohort members were available for case-cohort analyses.
Multivariate analyses showed a significantly increased hazard ratio for CRC overall and rectal cancer in men who reported having a bowel movement 1–2 times per day (second-highest category) as compared with once a day (CRC: hazard ratio (HR) = 1.29, 95% confidence interval (CI): 1.09, 1.53 (P for trend < 0.001); rectal cancer: HR = 1.50, 95% CI: 1.15, 1.95 (P for trend = 0.001)). Hazard ratios for CRC overall and rectal cancer were significantly decreased and lowest in men who reported suffering from constipation sometimes or more often versus never (CRC: HR = 0.76, 95% CI: 0.58, 0.98 (P for trend = 0.02); rectal cancer: HR = 0.57, 95% CI: 0.35, 0.90 (P for trend = 0.01)). No trends in the associations with proximal or distal colon cancer risk were observed. Interactions with dietary fiber intake were not significant. In this study, frequent bowel movements were associated with an increased risk of rectal cancer in men, and constipation was associated with a decreased risk.

Study population and design
Within the Netherlands Cohort Study on Diet and Cancer (NLCS), which includes 120,852 men and women, information on bowel movement and constipation frequencies is available for men only. The NLCS has been described in detail elsewhere (20). Briefly, the NLCS includes 58,279 men who were aged 55–69 years at baseline in 1986 when they completed a mailed self-administered questionnaire on diet and cancer. Bowel movement and constipation frequencies were addressed in the questionnaire for men only, by means of the multiple-choice questions “How often do you usually have a bowel movement?” and “Do you ever suffer from constipation?”. We used a case-cohort approach for data processing and analysis for reasons of efficiency, enumerating cases for the entire cohort and calculating the person-time at risk from a random subcohort of 5,000 members, of whom 2,411 were men, who were followed up for vital status.
Incident cases of colorectal cancer were ascertained through the Netherlands population-based cancer registry and the Netherlands nationwide pathology registry (24, 25). The estimated completeness of cancer follow-up was more than 96% (26).

Case-crossover Study
1. This design is the case-control analogue of the crossover study.
2. A crossover study is an experimental study in which two (or more) interventions are compared, with each study participant acting as his or her own control.
3. Each subject receives both interventions in a random sequence, with some time interval between them so that the outcome can be measured after each intervention.
4. A crossover study thus requires that the effect period of the intervention be short enough that it does not persist into the period during which the next treatment is administered.
5. In a case-crossover study, all subjects in the study are cases. The control series does not comprise a different set of people but, rather, a sample of the time experience of the cases before they developed disease.
6. The control information is obtained from the cases themselves.
7. Only certain types of study question can be addressed with a case-crossover design. The exposure must be something that varies from time to time within a person.
8. A case-crossover study is suited to evaluating exposures that trigger a short-term effect, and the disease must have an abrupt onset.
9. How short is "brief"? The duration of the exposure effect should be shorter than the typical interval between episodes of exposure, so that the effect of one exposure is gone before the next episode of exposure occurs.
10. A study hypothesis has to be defined in relation to a specific exposure that causes the disease within a specified time period. Each case is considered exposed or unexposed according to the time relation specified in the hypothesis.

Example:
Chen SY, Fong PC, Lin SF, Chang CH, Chan CC.
A Case-Crossover Study on Transient Risk Factors of Work-Related Eye Injuries. Occup Environ Med. 2009 Aug; 66(8):517-22.

Abstract
Objective: To investigate modifiable risk and preventive factors of work-related eye injuries.
Methods: A case-crossover study was conducted to explore the associations between transient risk factors and work-related eye injuries. Patients seen at seven medical centres in Taiwan with work-related eye injuries over a 4-year period were enrolled in the study. Clinical information was collected from medical charts, and detailed information on exposure to eight potentially modifiable factors during the 60 minutes prior to the occurrence of each injury, as well as during the same time interval on the last work day prior to the injury, was obtained using questionnaire surveys. Matched-pair interval analysis was adopted to assess the odds ratios (ORs) for work-related eye injuries given exposure to the eight modifiable factors.
Results: A total of 283 subjects were interviewed. Most of these injured workers were young, male, and self-employed or small enterprise workers. The most common injury type was photokeratitis (33.2%), mainly caused by welding (30.4%). The OR for a work-related eye injury was increased with the performance of an unfamiliar task (57.0), operation of a faulty tool or piece of equipment (48.5), distractions (24.0), being rushed (13.0), or fatigued (10.0), and a poor work environment (4.3). Wearing eye protection devices was found to have a significant protective effect on workers who might otherwise have been exposed to eye injuries (OR = 0.4; 95% CI 0.2 to 0.7).
Conclusion: Potential modifiable risk and preventive factors for work-related eye injuries were identified using a case-crossover study. This information should be helpful in the development of preventive strategies.

Cross-Sectional Study
a. Includes as subjects all persons in the population at the time of ascertainment, or a representative sample of all such persons.
b. Exposure status and disease status are measured in the study subjects at one point in time or over a short period of time.
c. A cross-sectional study conducted to estimate prevalence is called a prevalence study.

Summary of Cross-Sectional Studies
Advantages:
a. Particularly useful for
   1. Frequent diseases of long duration
   2. Determining characteristics of a population
   3. Estimating the prevalence of a disease
   4. Generating hypotheses about exposure-disease relationships
b. Generalizability, owing to the way subjects are recruited into the study.
Limitations:
a. Not useful for rare diseases or diseases of short duration.
b. The cause-effect relationship is tenuous (temporality); it may be impossible to determine which came first. (For example, do people in low social classes have higher prevalence rates of many mental illnesses than people of higher social classes, or do people migrate down the social class scale once they become mentally ill and are therefore found in the lower social classes at the time a study is done?)
c. Persons whose disease is in remission may be falsely classified as not having the disease.
d. People who either recover from or die of a disease quickly have less chance of being included in the disease group. If the characteristics of persons whose disease is of short duration or rapidly fatal differ from those of persons whose disease is of long duration, then the exposure-disease association observed in a cross-sectional study will misrepresent the association of exposure with incidence.

Example:
Pesticide Use and Thyroid Disease Among Women in the Agricultural Health Study
Whitney S. Goldner, Dale P. Sandler, Fang Yu, Jane A. Hoppin, Freya Kamel, and Tricia D. LeVan
Am J Epidemiol. 2010 February 15; 171(4): 455–464.
Abstract: Thyroid disease is common, and evidence of an association between organochlorine exposure and thyroid disease is increasing. The authors examined the cross-sectional association between ever use of organochlorines and risk of hypothyroidism and hyperthyroidism among female spouses (n = 16,529) in Iowa and North Carolina enrolled in the Agricultural Health Study in 1993–1997. They also assessed risk of thyroid disease in relation to ever use of herbicides, insecticides, fungicides, and fumigants. Prevalence of self-reported clinically diagnosed thyroid disease was 12.5%, and prevalence of hypothyroidism and hyperthyroidism was 6.9% and 2.1%, respectively. There was an increased odds of hypothyroidism with ever use of organochlorine insecticides (adjusted odds ratio (ORadj) = 1.2, 95% confidence interval (CI): 1.0, 1.6) and fungicides (ORadj = 1.4, 95% CI: 1.1, 1.8) but no association with ever use of herbicides, fumigants, organophosphates, pyrethroids, or carbamates.

Study Bias
1. Selection bias
   a. Selection bias is a systematic error in a study that stems from the procedures used to select subjects and from factors that influence study participation.
   b. It comes about when the association between exposure and disease differs between those who participate in the study and those who do not.
   c. Because the association between exposure and disease among non-participants is usually unknown, the presence of selection bias must usually be inferred rather than observed.
2. Information bias
   a. It can arise because the information collected about or from study subjects is erroneous.
   b. Such information is often referred to as misclassified if the variable is measured on a categorical scale and the error leads to a person being placed in an incorrect category.
   c. Recall bias occurs in case-control studies where a subject is interviewed to obtain exposure information after disease has occurred (the effort put into recalling tends to differ between cases and controls).
3. Confounding
   Definition of confounding:
   a. The effect of an extraneous variable that wholly or partially accounts for the apparent effect of the study exposure.
   b. A confounder is a third variable that may artificially create or mask an association between exposure and disease.
   Requirements for confounders:
   a. The confounder must be associated with the exposure.
   b. The confounder cannot be a consequence of the exposure (that is, it must not lie on the causal pathway between exposure and disease).
   c. The confounder must be a risk factor for the disease.

[Figure: diagram relating the confounder, the exposure, and the disease]
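The way a confounder can artificially create an association can be illustrated with a small numerical sketch. The counts below are hypothetical, invented only for illustration, and are not taken from any study cited in these notes. The sketch compares the crude odds ratio (ignoring the confounder) with the stratum-specific odds ratios, and then pools the strata with the Mantel-Haenszel estimator, which is one common adjustment method and is not prescribed by the text above.

```python
# Hypothetical counts, invented purely for illustration (not from any study).
# Each stratum of the confounder is a 2x2 table given as
# (exposed cases, exposed non-cases, unexposed cases, unexposed non-cases).
strata = {
    "confounder present": (80, 40, 20, 10),
    "confounder absent": (10, 40, 40, 160),
}


def odds_ratio(a, b, c, d):
    """Odds ratio of a single 2x2 table: (a * d) / (b * c)."""
    return (a * d) / (b * c)


def crude_odds_ratio(tables):
    """Collapse all strata into one table, ignoring the confounder."""
    a, b, c, d = (sum(cells) for cells in zip(*tables))
    return odds_ratio(a, b, c, d)


def mantel_haenszel_odds_ratio(tables):
    """Pooled odds ratio across strata (Mantel-Haenszel estimator)."""
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in tables)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in tables)
    return numerator / denominator


tables = list(strata.values())
crude = crude_odds_ratio(tables)
adjusted = mantel_haenszel_odds_ratio(tables)

for name, table in strata.items():
    print(f"{name}: OR = {odds_ratio(*table):.2f}")
print(f"crude OR = {crude:.2f}")
print(f"Mantel-Haenszel adjusted OR = {adjusted:.2f}")
```

In these made-up data the confounder meets the requirements listed above: it is more common among the exposed, and it raises the odds of disease among the unexposed. Within each stratum the odds ratio is 1.00, yet the crude odds ratio is about 3.19, so the apparent effect of the exposure is entirely due to the confounder. The adjusted estimate of 1.00 removes it.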