Scientific Theory and Research Terms
NFSC 470

I. The Development of Scientific Theory
   A. Scientific theory as the seat of a stool that needs 3 legs to stand
      1. One leg = epidemiological studies that show an association or a relationship between a food or type of diet & chronic disease, such as diabetes. These types of studies generally cannot show cause & effect (causation).
      2. The second leg is research that explains how a certain food might ward off disease, i.e. it identifies the biological mechanism involved. This type of study is often called basic research.
      3. The third leg includes clinical trials. The gold standard is the double-blind, crossover clinical trial. This type of research, if repeated in different settings with similar results, can be said to prove a cause-and-effect relationship. It is often called applied research.
      4. Generally, all 3 types of research studies are needed for the development of a scientific theory.
   B. Population (epidemiological) studies suggest correlations between external variables, such as diet, and internal responses, such as a disease.
      1. These correlations are then converted into risk factors (NOT causal factors).
      2. Risk factors, therefore, are a public health expression of the correlation between a given characteristic and the presence of a given disease.
      3. They are numerical expressions of chance, i.e. they indicate whether a characteristic has a strong chance of being associated with a health outcome.

II. Hypotheses and Research Questions
   A. Research question(s)
      1. Are questions about the relationship between the dependent and independent variable(s).
      2. They help guide the researcher toward the development of hypotheses.
   B. Hypothesis(es)
      1. Are educated guesses, e.g.
         a. There is a significant difference in (dependent variable) between (subjects) with (independent variable).
         b. Example: There is a significant difference in BMI between diabetic women consuming a low glycemic index (GI) diet vs. those consuming a high-GI diet.
      2. Null hypothesis: There is not a statistically significant difference between variables.
      3. Alternative or research hypothesis: There is a significant difference between variables.

III. Study Designs
   A. Choice of study design depends on the research question being asked.
   B. Descriptive study designs (used to generate hypotheses)
      1. Population studies/nutrition epidemiology
      2. Cross-sectional studies: observation of variables at one point in time, e.g. NHANES. http://www.cdc.gov/nchs/about/major/nhanes/intro_mec.htm
      3. Surveys
   C. Analytical designs (used to test hypotheses)
      1. Case-control studies (retrospective)
      2. Cohort studies (cohort = people sharing some attribute, e.g. age)
      3. Intervention trials
         a. These are true experimental designs. The intervention is under the control of the researcher(s).
         b. Randomized controlled trial: random assignment to intervention or control group.
         c. Crossover trial: subjects serve as their own controls, thereby decreasing error variance.
         d. Knowledge of intervention:
            Blind = subjects are unaware of treatment assignments.
            Double-blind = both subjects & researchers are unaware of treatment assignments. Powerful design because it eliminates expectation bias.

IV. Study Variables
   A. Dependent variable
      1. Outcome variable of interest, e.g. risk for or presence of a diet-related health problem.
      2. Studies are generally conducted to investigate the degree to which the dependent variable is “dependent” on the independent variable(s).
   B. Independent variable(s)
      1. Variable(s) being investigated.
      2. Is under the control of the researcher(s) in intervention studies, but not in observational descriptive studies.
   C. Control (confounding) variables
      1. A confounding variable confuses, or confounds, the relationship between the variables you’re attempting to study.
      2. When the relationship between two observed variables is distorted/contaminated by a third (typically unmeasured) variable, that third variable is said to be a confounding variable.

V. Study Subjects and Selection Process
   A. Terms
      1. Population: All the people to whom the results should be applicable.
      2. Sample: Subset of the population since, in most cases, the population is too large to include everyone in the study.
         a. Generally need 15 or more subjects per group to test a causal relationship.
         b. Generally need 30 or more subjects per group to test correlations.
      3. Comparison/control/placebo group: A group not receiving the intervention, against which one can compare results from the target or intervention group.
      4. Intervention/target/experimental group: A group receiving the intervention.
      5. Matching: selecting a control group that has certain characteristics that match the target or intervention group.
         a. The purpose is to eliminate the effect of specific variables on group differences, e.g. age.
         b. If groups are matched on age, then differences in outcomes between the groups cannot be attributed to the effects of age.
   B. Sample selection
      1. Random sampling (the gold standard): This selection method ensures that the study sample (group of subjects) is representative of a reference population. Only under these conditions can the study results be generalized to the whole population.
      2. Stratified random sampling: improves random sampling by dividing (stratifying) the population into a number of nonoverlapping subpopulations, or strata, and then taking a sample from each stratum.
      3. Cluster sampling: A cluster is an existing unit, such as a state or county. Study clusters should be as heterogeneous as possible, and then can be randomly chosen to be included in the study.
      4. Haphazard sampling (or “sample of convenience”) involves using a sample of subjects that are readily available. Therefore, this type of sample is almost never randomized and the degree to which results can be extrapolated to the population varies greatly.

VI. Statistical Analyses
   A. Descriptive statistics: used to describe the study sample (subjects)
      1. Measures of central tendency
         a. Mean (average), mode (most frequently occurring value), median (the point that divides the ordered data into upper and lower halves).
         b. Frequency distributions (e.g. histogram = graphical display of tabulated frequencies).
      2. Incidence and prevalence
         a. Incidence = number of new cases of a condition over a period of time (trend) = NEW (rate of appearance of disease).
         b. Prevalence = number of cases at a given point in time = ALL (rate of existence of disease). Prevalence depends on incidence and duration.
         c. Example: The prevalence of CHD decreases after age 70, but its incidence continues to increase with age. How can you explain this?
      3. Measures of variability (standard deviation):
         a. Provides a measure of the amount of dispersion of scores around a central value.
         b. The greater the SD, the less the mean is representative of the sample for that variable.
         c. E.g. The mean score on a test = 55 ± 35 (SD). The SD should be less than 1/3 the size of the mean. A large SD indicates the mean is not a good indicator of the test scores and that the scores have a wide range. Data should be presented in another format, such as a frequency distribution, quartiles, or other grouping, or discussed in the article’s narrative.
      4. Correlation analysis can be used to describe relationships between variables.
         a. Direct (positive) correlation, e.g. r = .7
         b. Indirect (inverse or negative) correlation, e.g. r = -.7
         c. The closer the r (correlation coefficient) is to 1 or -1, the stronger the association.
   B. Statistical tests of inference: used to test relationships between or among variables. The type of test used depends on the parametric or nonparametric nature of the variables.
      1. Correlation, regression analysis, analysis of variance (ANOVA), t-tests, & Chi-square (crosstabs using SPSS) are most common.
      2. t-test: The t-test assesses whether the means of two groups are statistically different from each other. This analysis is appropriate whenever you want to compare the means of two groups (ungrouped, parametric data).
      3. Chi-square is used to test for significant differences between groups (non-parametric data).
      4. ANOVA is used to test for significant differences among >2 means (parametric data).
      5. Regression analysis is used to predict values for a dependent variable using values for 2 or more independent variables. It can also be described as characterizing the relationship between several independent variables considered simultaneously and a single dependent variable.
         a. R² is the % of variance for the dependent variable that can be explained by the regression equation.
         b. E.g. R² = .49 means that 49% of the variance in the dependent variable can be explained by the regression equation (model).
   C. Statistical significance
      1. p value (p = probability) = the probability of obtaining results at least as extreme as yours by chance alone, i.e. if the variable you’re testing had no effect.
         a. Statistical significance: p ≤ 0.05.
         b. This threshold is the probability of wrongly rejecting the null hypothesis if it is in fact true. The p value of 0.05 would also suggest a “confidence level” of 95% that the result is not due to chance.
      2. Statistical vs. clinical significance
         a. Clinical significance = significant in practice.
            E.g. Nutrition knowledge test scores are significantly different between groups, but the difference was only 1 or 2 questions.
            E.g. A specific diet resulted in a significantly greater drop in weight for the experimental group, but the difference was only 2 lbs.
         b. Watch for trivial differences that are statistically significant due to large sample size, with no evaluation of clinical significance.

For a review of SI units see: A Dictionary of Units by Frank Tapson http://www.ex.ac.uk/cimt/dictunit/dictunit.htm
For conversions of clinical data see: http://www.unc.edu/~rowlett/units/scales/clinical_data.html
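Illustration for section V.B (stratified random sampling). The sketch below divides a population into nonoverlapping strata and then draws a simple random sample from each one. The roster, the age-group labels, the function name `stratified_sample`, and the choice of 15 subjects per stratum are all hypothetical and were invented for this example; it is one possible way to do it, not a prescribed procedure.

```python
import random
from collections import defaultdict


def stratified_sample(population, stratum_of, per_stratum, seed=0):
    """Draw `per_stratum` subjects at random from every stratum.

    `stratum_of` maps a subject to its stratum label. A fixed seed is
    used only so the illustration is repeatable.
    """
    rng = random.Random(seed)

    # Step 1: divide the population into nonoverlapping strata.
    strata = defaultdict(list)
    for subject in population:
        strata[stratum_of(subject)].append(subject)

    # Step 2: take a simple random sample (without replacement) from each stratum.
    sample = []
    for label in sorted(strata):
        sample.extend(rng.sample(strata[label], per_stratum))
    return sample


# Hypothetical roster of 90 subjects: (subject id, age group).
population = [(i, "40_plus" if i % 3 == 0 else "under_40") for i in range(90)]

sample = stratified_sample(population, lambda subject: subject[1], per_stratum=15)
print(len(sample), "subjects selected")
```

Because each stratum contributes the same number of subjects, the smaller age group is not left under-represented, which can happen by chance with a single simple random draw.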
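Illustration for section VI.A (descriptive statistics). The sketch below computes the mean, median, mode, and standard deviation for a small set of test scores and applies the outline's rule of thumb that the SD should be less than 1/3 of the mean. The scores are hypothetical, chosen only so that the mean comes out at 55 as in the outline's example.

```python
import statistics

# Hypothetical test scores, invented for illustration only.
scores = [10, 20, 20, 45, 60, 75, 90, 120]

mean = statistics.mean(scores)      # arithmetic average
median = statistics.median(scores)  # midpoint of the ordered data
mode = statistics.mode(scores)      # most frequently occurring value
sd = statistics.stdev(scores)       # sample standard deviation (n - 1 denominator)

# Rule of thumb from the outline: the SD should be less than 1/3 of the mean.
mean_is_representative = sd < mean / 3

print(f"mean={mean:.1f}  median={median}  mode={mode}  SD={sd:.1f}")
if mean_is_representative:
    print("The mean is a reasonable summary of these scores.")
else:
    print("Wide spread: report a frequency distribution or quartiles instead.")
```

With these scores the SD is well above one third of the mean, so the mean alone would be a poor summary.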
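Illustration for sections VI.A.4 and VI.B.5 (correlation and explained variance). The sketch below computes a Pearson correlation coefficient by hand and squares it. With a single predictor, the squared correlation equals the share of variance explained by a straight-line regression, which is why r = .7 corresponds to 49% of variance explained. The fiber and LDL values and the helper name `pearson_r` are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squares around the means.
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cross / math.sqrt(ss_x * ss_y)


# Hypothetical data: daily fiber intake (g) and LDL cholesterol (mg/dL).
fiber_g = [10, 15, 20, 25, 30, 35]
ldl_mg_dl = [160, 150, 155, 135, 130, 120]

r = pearson_r(fiber_g, ldl_mg_dl)
r_squared = r ** 2  # share of variance explained when there is one predictor

direction = "inverse (negative)" if r < 0 else "direct (positive)"
print(f"r = {r:.2f} ({direction}); variance explained = {r_squared:.0%}")
```

A strong correlation in data like these is still only an association; it does not by itself show that fiber intake causes the change in LDL.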
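Illustration for section VI.C (statistical vs. clinical significance). The sketch below shows how the same 2 lb difference in weight loss can be non-significant in a small study and highly significant in a very large one. It uses a normal approximation with an assumed common SD rather than the t-test named in the outline, to stay within the Python standard library; the 10 lb SD, the group sizes, and the function name `two_sided_p` are hypothetical.

```python
import math
from statistics import NormalDist

ALPHA = 0.05  # conventional significance threshold


def two_sided_p(mean_diff, sd, n_per_group):
    """Two-sided p value for a difference between two group means.

    Normal (z) approximation assuming both groups share the same SD
    and have the same size. This is a simplification, not a t-test.
    """
    standard_error = sd * math.sqrt(2 / n_per_group)
    z = mean_diff / standard_error
    return 2 * (1 - NormalDist().cdf(abs(z)))


diff_lb = 2.0  # hypothetical extra weight loss in the experimental group
sd_lb = 10.0   # hypothetical SD of weight loss in each group

p_small = two_sided_p(diff_lb, sd_lb, n_per_group=20)
p_large = two_sided_p(diff_lb, sd_lb, n_per_group=2000)

for label, p in (("n = 20 per group", p_small), ("n = 2000 per group", p_large)):
    verdict = "significant" if p <= ALPHA else "not significant"
    print(f"{label}: p = {p:.3g} ({verdict})")
```

The difference is 2 lbs in both cases, so its clinical importance is unchanged; only the sample size moved the p value across the threshold.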