Research Methodology Lecture Notes

Specialist in a particular area
○ E.g., cardiologist, PT
Advantages
○ Have really detailed knowledge within their field
Disadvantages
○ Unsystematic
  ■ “You might try this one time, try another next time. Jump higher one time, lower another”
○ Difficult to determine what caused the outcome
○ Often the results are not shared/communicated

Deductive reasoning (major premise + minor premise, leading to a conclusion)
Inductive reasoning (specific premises, leading to a general idea)
Both types of reasoning are used to:
● Develop a research question
● Answer and interpret a research question

Necessary example:
● Air is necessary for human life
  ○ Without air, we are dead
Sufficient example:
● Air is insufficient for human life
  ○ Air alone does not guarantee our existence
    ■ We also need food, water, etc.

Journal article types
● Primary article (empirical study): a primary study is one that aims to gain new knowledge on a topic through direct or indirect observation and research. These include quantitative or qualitative data and analysis.
  ○ IMRaD = Intro, Methods, Results and Discussion
● Review article: a review article provides a summary of existing research in a field/topic area. There are several common types of review articles:
  ○ Narrative reviews (literature reviews): summarise some of the existing evidence in a field or topic
  ○ Scoping reviews: broad reviews that aim to gather as much evidence as possible and map the evidence into themes
  ○ Systematic reviews: highly structured reviews that use pre-planned methods to include/exclude articles
● Meta-analysis: a type of research study that combines and/or analyzes data from different primary studies (usually) in a new analysis in order to strengthen the understanding of a particular topic.
● Case studies: report specific instances of interesting phenomena. A goal of case studies is to make other researchers aware of the possibility that a specific phenomenon might occur.
  ○ This type of study is often used in medicine to report the occurrence of previously unknown or emerging pathologies.
  ○ Anecdotal, but not an anecdote

Theories, models and the scientific method
● Theory: “Set of interrelated concepts (constructs), definitions, and propositions that specify relationships among variables and represent a systematic view of specific phenomena” - Portney & Watkins (2009)
● The Scientific Method:
  ○ Have preconception (theory)
  ○ Make prediction from preconception (hypothesis)
  ○ Conduct experiment/obtain data to compare
  ○ Update preconception from data (induction/deduction)

Definitions
● Variable - anything you observe or measure
● Construct - a useful “idea” represented by 1 or more variables
  ○ A connection between variables that we measure
● Hypothesis - a prediction of what you think is going to happen
● Theory - systematic synopsis of interrelated constructs/variables
● Model - simplified representation of a complex phenomenon; a model can incorporate multiple theories

Health and Well-being Theory
● Theory - Health and Well-Being Theory
  ↕
● Constructs - health, well-being
  ↕
● Variables - BP, cholesterol, stress, anxiety

Why do we need theories?
1. Summarize existing knowledge/observations
   a. Organize multiple studies and ideas
   b. Induction: specific observations to generalizations
2. Predict future events
   a. That can or cannot be observed
   b. Deduction: general theory to specific instances
3. Generate hypotheses
   a. Tests of hypotheses support or refute theories
   b. Stimulate development of new knowledge

MODULE 2: PICOT APPROACH

The scientific method
● Make an observation
● Formulate a question
● Formulate a hypothesis
● Design an experiment
● Execute the experiment
● Analyse the results
● Draw a conclusion
● Formulate a new hypothesis

Development of a research question
● Identify the problem you wish to examine
● You could come up with the problem based on:
  ○ Clinical problem
  ○ Literature review
  ○ Research theory
  ○ Prior observation(s)
● The answerable question has to be defined in terms of population, intervention, and outcome

A model for developing a research question - PICOT
● A useful model to help structure an answerable question
● Used to formulate clinical or research questions
● Breaks down the question into four/five key elements

POPULATION
  ○ Who is the group who will participate in your research?
  ○ How do you describe the group?
  ○ To whom do you want to apply the research results?
    ■ Can be defined in many ways:
      ● Disease or condition, state of illness (often in clinical research)
      ● Region, demographics, risk factors (often in epidemiological research)
  ○ Target population
    ■ The population that you wish to study and hope to apply the results of your study to
  ○ Accessible population
    ■ The portion of the target population from which you are able to recruit participants
  ○ Sample
    ■ The participants who you recruited and who met your inclusion criteria
      ● VOLUNTEERS (typically what we have to rely on)

INTERVENTION/EXPOSURE
  ○ Intervention refers to the treatment that participants in your study will receive
  ○ What is it that you want to know the effect of?
  ○ Sometimes you don’t have to do anything!
    ■ The intervention group (or exposure group, or treatment group) does NOT have to receive a manipulated intervention. It can refer to self-selected, identifiable groups or other classifications.
● Manipulated
  ○ Surgical procedure
  ○ Pharmaceuticals
  ○ People who exercise vs. sedentary (“force” people to train)
  ○ Rehabilitation technique
● Not manipulated
  ○ People who live near power lines vs. those that do not
  ○ People who exercise vs. sedentary (self-classified or researcher-classified)
  ○ People who eat a ‘Mediterranean’ diet vs. a ‘North American’ diet
  ○ Older adults vs. younger adults
  ○ High school vs. university education

CONTROL GROUP OR COMPARISON GROUP
● Typically, the intervention group(s) and control group(s) are two (or more) levels of the same variable (e.g., people who exercise vs. sedentary).
● In some studies, there is no comparison between groups.
  ○ For example, what is the prevalence of cardiovascular disease in Canada?
    ■ Examines an outcome (cardiovascular disease) in a population (Canadians)
    ■ No intervention or control groups
● Control groups need to be carefully constructed. Sometimes more than one type of control group is necessary. For example:
  ○ What is the mean of the compression force of the LBP group?
    ■ 16 N/Nm
  ○ What is the standard deviation of the compression force of the LBP group?
    ■ ± 6 N/Nm
  ○ The compression force of the asymptomatic group is statistically lower than that of the LBP group (p < 0.05)?
    ■ No

OUTCOME
● Dependent variables
● Discrete/Categorical
  ○ Typical to have subsets
  ○ Hospitalisation rate
  ○ Education level
  ○ IVD degeneration stage
  ○ Surgical requirements
● Continuous
  ○ A wide variety of outcomes; typically have relevant units
    ■ VO2 max
    ■ HR
    ■ Force

TIME
● How long do you want to follow the outcome?
● Can be over time:
  ○ Longitudinal
● Can be at one specific point in time:
  ○ Cross-sectional

Time as an intervention/control
● Time can also be an intervention/control
  ○ E.g., is the prevalence of dementia in Canada higher in 2021 than in 1991?
    ■ Population: Canadians
    ■ Intervention (exposure)/Control: 2021 vs. 1991
    ■ Outcome: Prevalence of dementia
    ■ Time: 30 years

PICOT practise
● You were curious if there was a difference in vaccination rates of COVID vaccine-eligible adults based on geographical location.
  You examined public health records from 7 different health units in the Province spanning March 2021 to March 2022.
  ○ P: Adults in Ontario
  ○ I: Geographical location (7 different health units)
  ○ C: NO CONTROL
  ○ O: Vaccination rates
  ○ T: 1 year (March 2021 to March 2022)
● A researcher was interested in how the pandemic has affected youth outdoor physical activity in the Waterloo Region. They have activity data from 2018, 2019, 2020 and 2021.
  ○ P: Youth in Waterloo Region
  ○ I: Pandemic (2020, 2021)
  ○ C: Pre-pandemic (2018, 2019)
  ○ O: Outdoor physical activity
  ○ T: 4 years, each assessed individually

How do we develop a research question?
● What are the effects of aerobic and light resistance exercise over a 12-week period on adults aged 65+ with acute stroke, and their ability to perform a 6-minute walk, as compared to standard care?
● Does a 10-week hyaluronic acid injection and conservative treatment program decrease shoulder elevator muscle fatiguability for adults >40 with rotator cuff tears compared to conservative treatment?
● Worked example:
  ○ P: resistance-trained women
  ○ I: ketogenic diet
  ○ C: non-ketogenic diet
  ○ O: health parameters (VAT, BMC, BMD, BP)
  ○ T: 8 weeks
    ■ How does a keto/non-keto diet affect health parameters of 21 resistance-trained women over the course of 8 weeks?
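
The PICOT breakdown above can also be written down as a small data structure, which makes it easy to check that no element has been forgotten. This is only a sketch: the class name `PICOT`, its field names and the `summary` method are hypothetical choices for illustration, not part of the course material. The values come from the ketogenic diet example in the notes.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PICOT:
    """One research question broken into its PICOT elements (hypothetical helper)."""
    population: str
    intervention: str
    comparison: Optional[str]  # None when the study has no control group
    outcome: str
    time: str

    def summary(self) -> str:
        """Return the five elements as labelled lines, in the style of the notes."""
        comparison = self.comparison if self.comparison else "no control"
        return "\n".join([
            f"P: {self.population}",
            f"I: {self.intervention}",
            f"C: {comparison}",
            f"O: {self.outcome}",
            f"T: {self.time}",
        ])


# The ketogenic diet example from the notes.
keto = PICOT(
    population="resistance-trained women",
    intervention="ketogenic diet",
    comparison="non-ketogenic diet",
    outcome="health parameters (VAT, BMC, BMD, BP)",
    time="8 weeks",
)
print(keto.summary())
```

Leaving `comparison` as `None` covers questions such as the vaccination example, where there is no control group.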

Research Hypotheses
● Null hypothesis (H0):
  ○ A sample is representative of (equal to) a population
  ○ Intervention (exposure) & control groups will have the same outcome
  ○ No difference between groups, or no effect of an intervention/exposure
● Alternative (research) hypothesis (H1):
  ○ This hypothesis contradicts the null hypothesis
  ○ A sample does not represent (differs from) a population
  ○ The outcome of the intervention & control groups will differ

Research Hypotheses and Directionality
● Non-directional:
  ○ “There is a difference between the groups”
  ○ “Blood pressure will be altered”
  ○ “There will be a change in blood pressure”
    ■ Predicts a change or difference in the outcome measure, but does not specify which way that change will go
● Directional:
  ○ “Quitting smoking will reduce cancer risk”
  ○ “Grade 3s are taller than Grade 1s”
  ○ “Exercise will lower blood pressure”
    ■ Predicts the direction of the change or difference in the outcome measure between the groups

Definitions - Variables
● Independent variable
  ○ This is what a researcher typically manipulates
  ○ It is selected by the researcher to determine its relationship to/effect on some other observed variable
  ○ The independent variable is plotted on the x-axis of a graph
● Dependent variable
  ○ This is what is measured
  ○ This is the outcome of interest as selected by the researcher
  ○ The dependent variable is plotted on the y-axis of a graph

Definitions - Types of variables
Control Variables
● These are variables that are held constant by the researchers
● The goal is to minimise the effects that these variables might have on the dependent variable or other aspects of the study
Confounding Variable
● These are variables other than the independent variable that may have an effect on the dependent variable
● They can lead to erroneous conclusions about the relationship between the independent and dependent variables
Intervening Variable
● Is a conceptual variable
● Difficult to define/measure
● For example: health

Reliability - in relation to measurement
● Reliability is sometimes referred to as repeatability or precision
● A researcher should consistently get the same output when providing the same input or performing the same measurement.
  ○ E.g., you measure the body mass of a participant 3x and get 70.1 kg, 70.1 kg, 70.2 kg
● True reliability occurs when our measurements are consistent and free from random errors

Factors affecting test-retest reliability
● Effects of testing
  ○ Participants ‘learn’ and perform better on subsequent trials
● Effects of test/retest intervals
  ○ Too much rest = boredom
  ○ Too little rest = fatigue (physical, mental)
● Rater bias
  ○ People will perform measurements slightly differently
  ○ The same person should measure the outcome on all participants
● External factors
  ○ Ambient conditions
  ○ Noise
  ○ Temperature
  ○ Distractions

Practice example
P - 89 men and women aged between 30 and 65 years with T2DM and body mass index between 30 and 35 kg·m^-2
I - VLCK diet
C - standard low-calorie diet
O - safety and tolerability
T - 4 months
Independent variable - diet
Dependent variable - weight loss
Control variables - age, sex, BMI
Confounding variables - physical activity, years with disease, smoking

Validity
● In the context of measurement: validity is the extent to which an instrument measures what it is intended to measure
  ○ Determines the ‘believability’ or ‘trueness’ of results
● Can a measurement device be reliable but not valid?
  ○ YES!
  ○ Sometimes an instrument can be reliable but may not be measuring exactly what you are intending to measure

Measurement Reliability and Validity
● Can we have a reliable outcome measure that is not valid? YES
● Can we have a valid measure that is not reliable? NO
[Figure: three diagrams labelled “not reliable + valid”, “reliable but not valid”, “reliable and valid”]

Internal Validity
● Internal validity is the degree to which a study establishes a cause-and-effect relationship between the treatment (independent variable) and the outcome (dependent variable)
● Threats to internal validity
  ○ Selection - the groups are not equivalent
    ■ A difference between groups could have been present at the start
  ○ History - some event/effect during the study, other than the independent variable, that influenced the dependent variable
  ○ Maturation - developmental (physical, mental) changes occur in participants during the study, which may influence the dependent variable
  ○ Testing - if multiple trials are performed, participants might improve the more trials they complete
    ■ Especially important with new/unfamiliar tasks
  ○ Instrumentation - instruments could be unreliable or not valid; observer/rater bias
  ○ Attrition - unequal loss of participants from groups after random assignment has occurred

How to control for/mitigate threats to internal validity
● RANDOM ASSIGNMENT (mitigates) - hallmark of experimental research
● Attrition/withdrawal/drop-out - intention-to-treat analysis (“intention-to-analyze”): all data are analyzed according to the assigned group, regardless of subjects dropping out or receiving a treatment when they should really be in the control group
● Other requirements/considerations
  ○ Control group
  ○ Pre-test and post-test
  ○ Masking

External Validity
● External validity refers to whether causal relationships can be generalised to different measures, persons, settings, and times
● In other words, how generalizable/applicable are the findings to a wider setting?
● Threats to external validity
  ○ Selection of participants - if the sample is not representative of the population from which it was drawn, the generalizability is reduced
  ○ Selection of treatment - if the treatment is not likely to be observed or found in the “real world”
  ○ Multiple treatment effects - if multiple treatments are applied to an individual, a prior treatment might influence the next treatment
  ○ Repeated testing - a pretest (or repeated testing) can affect the participant’s responsiveness to the independent variable
● Ways to mitigate threats to external validity:
  ○ Random sampling: randomly drawing people from a target population to participate in your research
  ○ Selecting an appropriate research design may reduce the multiple treatment and testing threats. Washout periods will help mitigate the multiple treatment effect.

What is critical appraisal of research?
● Critical appraisal is the process of carefully and systematically examining research to judge its trustworthiness, and its value and relevance in a particular context. It is an essential skill for evidence-based medicine because it allows people to find and use research evidence reliably and efficiently.

Modern Day Research Ethics
● Social and clinical value
● Scientific validity
● Fair subject selection
● Favorable risk-benefit ratio
● Independent review
● Informed consent
● Respect for potential and enrolled subjects

Social and clinical value
● Will answering the research question have significant value for society or for present or future patients with a particular illness?
● The answer to the research question should be important or valuable enough to justify some risk
● Only if society will gain useful knowledge (which requires sharing results, both negative and positive) can exposing human subjects to the risk and burden of research be justified

Scientific validity
● Is the question researchers are asking answerable? Are the research methods valid and feasible? Is the study designed with a clear scientific objective, and does it use accepted principles, methods, and reliable practices?
● Statistical power must be sufficient to definitively test the objective. Invalid research is unethical because it is a waste of resources and exposes people to risk for no purpose.

Fair subject selection
● The primary basis for recruiting and enrolling groups and individuals should be the scientific goals of the study, not vulnerability, privilege, or other factors unrelated to the purposes of the study.
● Consistent with the scientific purpose, people should be chosen in a way that minimises risks and enhances benefits to individuals and society

Favourable risk-benefit ratio
● Risks can be physical (death, disability, infection), psychological (depression, anxiety), economic (job loss), or social (for example, discrimination or stigma from participating in a certain trial).
● Has everything been done to minimize the risks and inconvenience to research subjects?
● Do the potential benefits outweigh the risks?

Independent review
● To minimize potential conflicts of interest and make sure the study is ethically acceptable before it even starts, an independent review panel with no vested interest in the study should review the proposal and ask important questions.

Informed consent
● For research to be ethical, most agree that individuals should make their own decision about whether they want to participate or continue participating in research.
● This is done through a process of informed consent in which individuals (1) are accurately informed of the purpose, methods, risks, benefits, and alternatives to the research, (2) understand this information and how it relates to their own clinical situation or interests, and (3) make a voluntary decision about whether to participate.
● There are exceptions to the need for informed consent from the individual: for example, in the case of a child, of an adult with severe Alzheimer’s, of an adult unconscious by head trauma, or of someone with limited mental capacity.

Respect for potential and enrolled subjects
● Individuals should be treated with respect from the time they are approached for possible participation (even if they refuse enrollment in a study), throughout their participation, and after their participation ends. This includes:
  ○ Keeping their private information confidential
  ○ Respecting their right to change their mind, and to withdraw without penalty
  ○ Informing them of changes to the risks and benefits of participating
  ○ Monitoring their welfare and, if they experience adverse reactions, untoward events, or changes in clinical status, ensuring appropriate treatment and, when necessary, removal from the study
  ○ Informing them about what was learned from the research

MODULE 3: RESEARCH COHORT AND CASE CONTROL STUDIES

No design is “superior” to any other...
● but there is often a “most appropriate” design for a specific scenario

Research Types
● Basic research
  ○ Conducted to increase knowledge and fundamental understanding of the physical, chemical, and functional mechanisms of life processes and disease. It is not directed to solving any particular problem in humans or animals.
● Applied research
  ○ Involves the application of existing knowledge, much of which is obtained through basic research, to solve a practical problem.
● Clinical research
  ○ Patient- or end user-oriented research with human subjects. Patient-oriented research includes:
    ■ Mechanisms of human disease
    ■ Therapeutic interventions
    ■ Clinical trials
    ■ Development of new technologies
● Translational research
  ○ Part of a unidirectional continuum in which research findings are moved from the researcher’s bench to the patient’s bedside and to the community

Research Design Definitions
● Descriptive
  ○ Describes an outcome in a population. Characterises who, where, or when in relation to the what (the outcome of interest)
    ■ E.g., “oh, the sky is blue!”
● Analytical
  ○ Examines the relationship between intervention and outcome
  ○ Tests hypotheses
  ○ The “how” and the “why”
    ■ E.g., why is the sky blue?
● Qualitative
  ○ Subjective/interpretive observations
  ○ Identifies themes in observations - forms a narrative/story/essay
  ○ Does not test a hypothesis, but may lead to hypothesis development
● Quantitative
  ○ Objective, measurable, units
  ○ Tests hypotheses
    ■ Requires statistical analysis

Contrasting qualitative and quantitative
● Qualitative strengths
  ○ Generates new ideas, hypotheses
● Quantitative strengths
  ○ Tests hypotheses and allows us to examine cause-and-effect relationships

Quantitative Research Designs
● Observational
  ○ Non-manipulated studies/research
  ○ Researchers do not attempt to influence/manipulate participants or the surroundings
● Experimental
  ○ A manipulated study
  ○ Participants are randomised to receive intervention or control
● Quasi-experimental
  ○ Lacking 1 or more elements of experimental research
    ■ E.g., you want to do an elementary school study with 2 classrooms; you're not randomising, you are splitting them off based on how they are already split

The utility of observational research
● Studying the otherwise un-study-able
  ○ When manipulation of an exposure (independent variable, IV) is not possible, not practical, or too complex
  ○ Cannot manipulate for ethical/logical reasons, e.g., toxin exposure, education level/attainment
  ○ Cannot manipulate the variable of interest, e.g., age, sex, personality
● Prioritising external validity
  ○ Laboratory manipulation does not well-represent real-world phenomena.
    ■ E.g., treadmill walking vs. outdoor walking, “social engagement”
● Generating research questions
  ○ Scientific method - observe, then develop a question

Observational Research Design - Time element (cross-sectional)
● Data on ALL variables are collected at one time
Advantages
● Less expensive
● Participants are less likely to drop out
● Controls for ‘period effects’
Disadvantages
● Do not know whether exposure(s) happened before or after outcome
● Associations identified between variables may be difficult to interpret
● “Snapshot” timing not guaranteed to be reflective of ‘real-world’ settings

Observational Research Design - Time element (longitudinal)
● Data are collected multiple times = repeated measures
● Could be 5 months, 5 years, 10 years, etc.
Advantages
● You may observe patterns in the outcome (dependent variable, DV) over time
● Establishes an order of events
● Reduces recall bias of participants
● May provide insight into causal mechanisms
  ○ Cannot establish them definitively
Disadvantages
● Time consuming and expensive
● Usually requires a large sample size
● Affected by ‘cohort effects’
  ○ E.g., generational cohorts (Gen Z, Gen Y): individuals are affected differently based on when they were born
● Cannot be used to suggest causation - only associations
● Despite temporal aspects, may not know if exposure precedes outcome

Difference between prevalence and incidence
● Prevalence refers to the total number of individuals in a population who have a disease or health condition at a specific point or period in time, usually expressed as a percentage of the population.
  ○ Who has the disease now?
● Incidence refers to the number of individuals who develop a specific disease or experience a specific health-related event during a particular time period (such as a month or year).
  ○ Who will develop the disease over time?
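
The difference between the two measures is easiest to see with numbers. The sketch below uses made-up counts (every figure is hypothetical and chosen only for illustration). It also assumes one common convention for incidence, namely new cases divided by the people who were disease-free at the start of the period; the notes do not specify a denominator, so treat that as an assumption of this example.

```python
def prevalence(existing_cases: int, population: int) -> float:
    """Proportion of the population that has the condition at one point in time."""
    if population <= 0:
        raise ValueError("population must be positive")
    return existing_cases / population


def incidence(new_cases: int, at_risk_at_start: int) -> float:
    """Proportion of initially disease-free people who develop the condition
    during the follow-up period (assumed convention, see lead-in)."""
    if at_risk_at_start <= 0:
        raise ValueError("at_risk_at_start must be positive")
    return new_cases / at_risk_at_start


# Hypothetical town, for illustration only.
population = 10_000          # everyone in the town
existing_cases = 500         # people who already have the condition today
new_cases_this_year = 190    # disease-free people who develop it within a year

prev = prevalence(existing_cases, population)
inc = incidence(new_cases_this_year, population - existing_cases)

print(f"Prevalence (who has it now?): {prev:.1%}")
print(f"One-year incidence (who develops it?): {inc:.1%}")
```

Prevalence counts everyone who currently has the condition, while incidence counts only new cases among those who could still develop it.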

Observational Study Designs Discussed in KIN 232
● Case-control study
  ○ Participants are selected based on an OUTCOME of interest (e.g., hypertension)
● Cohort study
  ○ Participants are selected based on a POPULATION of interest
● Definition of a cohort:
  ○ A collection or sampling of individuals who share a common experience and/or characteristics, such as age, sex, activity level, location, education, etc.
    ■ Examples:
      ● Birth cohort (group born at the same time)
      ● Geographic cohort (residents of the same area)
      ● Historical cohort (group exposed to the same historical event)

Cohort Studies
● Participants are recruited based on cohort
● A cohort study is a longitudinal study.
  ○ It may be:
    ■ Prospective:
      ● Recruit participants and track them forward in time
      ● Outcome is evaluated in the future
    ■ Retrospective:
      ● Recruit your participants and identify past/historical exposures
        ○ ONLY exposure of the particular participant (no family)
      ● Outcome is evaluated at time of recruitment (present day)

Advantages of Cohort Studies
● Longitudinal (time element)
  ○ Can determine temporal sequence of risk factors versus outcome
● Best external validity
  ○ More likely to be representative of a ‘real-life’ scenario/environment
● Representative
  ○ If the population is appropriately sampled, the risk estimates may be generalizable to the population
● Multiple exposures/outcomes
  ○ Often multiple exposures and outcomes are evaluated within one study

Disadvantages of Cohort Studies
● Large sample is required (in order to capture outcomes, e.g., heart attack in FHS)
● Expensive - participant compensation, researchers/staff
● Attrition bias (affects outcomes/results)
  ○ Over time, people will drop out
  ○ Tends to be the people who are most sick
● Measurement bias
  ○ If measurement methods change over time, this may alter rate/risk estimates. Hard to measure certain variables consistently over time.
● Poor internal validity

Recruiting people from a population
● Researchers wish to study individuals with cystic fibrosis (CF)
● Prevalence ≈ 0.033% (a proportion of 0.00033). Thus, 1 CF case for every 3000 participants
● How many people would you need to recruit if you wanted to capture 100 CF cases in your study?
  ■ 100 × 3000 = 300,000

Case-Control Studies
● Deliberately recruit participants based on outcome status (diagnosis). Then, you can assess how exposures affected that outcome.
  ○ Case: has the outcome/disease of interest
  ○ Control: does not have the outcome/disease
● Advantage: will capture even the rarest diseases
  ○ Fewer people
  ○ Cheaper

Recruitment
● Recruited based on having some outcome
● Researchers can recruit in different ways:
  ○ Incident cases
  ○ Prevalent cases

Where can you recruit from?
● Sources of cases:
  ○ Population-based
    ■ Recruit from all cases in the population
      ● Strengths (external validity):
        ○ Cases are representative of the population
        ○ Results are generalizable to the population
      ● Weaknesses:
        ○ More difficult to recruit
        ○ How do we find the cases?
  ○ Hospital-based
    ■ Drawn from cases admitted and treated in a hospital
      ● Strengths:
        ○ Easy to identify cases
        ○ Access to medical records/history
      ● Weaknesses:
        ○ Typically more sick
        ○ May be different in other ways compared to the general population

Control Group Recruitment
● How do we identify appropriate controls?
  ○ Want to match the controls as closely to the cases as possible
    ■ Age, sex, height, weight, smoking/non-smoking
● Often recruit from family or friends
  ○ Advantage: often similar in many factors
  ○ Problems:
    ■ May not generalize to the population
    ■ May be too similar to cases to find differences

Challenges in Case-Control Studies
● Selection Bias
  ○ Selection bias occurs when the subjects studied are not representative of the target population about which conclusions are to be drawn.
  ○ The way that cases and controls are recruited alters the relationship between intervention/exposure & outcome
  ○ Control selection and case selection must NOT be based on exposure history
    ■ E.g., if fish consumption is the exposure and blood pressure is your outcome of interest, then recruiting from a fishing village would be inappropriate
● Recall Bias
  ○ Recall bias is a type of information bias common in case-control studies where the cases (or their families) are more likely to recall a prior exposure than the controls.
    ■ It is all about exposure and outcome
  ○ For example:
    ■ Cancer patients may be more likely to recall exposure to potential toxins/carcinogens
    ■ Diabetic patients may be more likely to remember poor diet/lack of physical activity
● Misclassification
  ○ Non-differential misclassification
    ■ Cases and controls are misclassified equally
    ■ Will make detection of a true effect less likely
  ○ Differential misclassification
    ■ Only one group (cases or controls) is misclassified
    ■ Can alter the magnitude and/or direction of the effect

Case-Control Study Design
● Case-control studies can be:
  ○ Cross-sectional, or
  ○ Longitudinal
    ■ Longitudinal case-control studies are always retrospective
    ■ i.e., exposures were evaluated/reported in the past

Some important results from observational studies
● Absolute risk is the actual risk of some event happening given the current exposure. There is no comparison between groups.
  ○ For example, if 1 in 10 individuals with exposure develop the disease, then the absolute risk of developing the disease with exposure is 10%
  ○ Absolute risk = Outcome / (Outcome + Control)
  ○ Notation used below: O = developed the outcome, C = did not develop the outcome; E = exposed, N = not exposed
  ○ Absolute risk (Exposed) = 983/(983+4467) = 0.180 or 18%
  ○ Absolute risk (Not Exposed) = 85/(85+7941) = 0.0106 or 1.06%

Relative risk
● Example: the relative risk of developing lung cancer (outcome) in smokers (exposed group) versus non-smokers (non-exposed group) would be the probability of developing lung cancer for smokers divided by the probability of developing lung cancer for non-smokers.
● Relative Risk (RR) = [OE/(OE+CE)] / [ON/(ON+CN)]
● [983/(983+4467)] / [85/(85+7941)] = 17.03x higher in the exposed group

Odds Ratios (do not take into account all of the people)
● The odds ratio (OR) is a measure of how strongly an event is associated with exposure. The odds ratio is a ratio of two sets of odds: the odds of the event occurring in an exposed group versus the odds of the event occurring in a non-exposed group. The larger the odds ratio, the higher the odds that the event will occur following exposure. A ratio equal to 1 means there’s no association between exposure and event (disease).
● Odds Ratio (OR) = (OE/CE) / (ON/CN)
● (983/4467) / (85/7941) = 20.56x greater odds

Parkinson’s Study
● Type of study: Case-control, longitudinal - retrospective
● Exposure/event: Appendectomy (yes/no)
● Population: 62.2 million patients
● Absolute Risk (Exposed): 4470/488,190 = 0.00916
● Absolute Risk (Not Exposed): 177,230/61,700,000 = 0.00287
● Relative Risk: 0.00916/0.00287 = 3.19x

Does living in a fishing community affect the risk/odds of developing mercury poisoning?
● Absolute risk (Exposed) = 578/(578+14458) = 0.0384
● Absolute risk (Not Exposed) = 13/(13+15012) = 0.000865
● Relative Risk (RR) = [578/(578+14458)] / [13/(13+15012)] = 44.43x

Do elite athletes have an elevated risk/odds of developing irregular heart rhythms?
● Absolute risk (Exposed) = 5806/(5806+19448) = 0.2299
● Absolute risk (Not Exposed) = 354/(354+748) = 0.3212
● Relative Risk (RR) = [5806/(5806+19448)] / [354/(354+748)] = 0.716x

MODULE 4: EXPERIMENTAL DESIGNS
● Specifically testing hypotheses in a systematic manner
  ○ Deliberate consideration of variables (i.e., DV, IV, control, confounding)
  ○ A major aim is to examine a cause & effect relationship
    ■ (Does x cause y?)
● Three factors associated with experimental designs:
  1. Manipulation of variables
  2. Control group
  3. Random assignment

Manipulation of variables
● Independent variables:
  ○ In experimental research, the independent variable is chosen/manipulated by the researcher, whereas in observational research the independent variable is not manipulated.
● Dependent variables:
  ○ Affected by the independent variable.
● Experimental research:
  ○ Participants are assigned to an interventional group (many possible groups)
  ○ Researcher manipulates the level of the independent variable by group (more on this to come!)

Types of independent variables
● Active variable
  ○ An IV that CAN be manipulated (CAN BE, NOT IS MANIPULATED)
    ■ E.g., drug dosage, exercise intensity
● Attribute variable
  ○ An IV that CANNOT be manipulated
    ■ E.g., genes, geographical location, disease presence, age, sex
  ○ If the primary comparison of interest is an attribute variable, then an experimental design is impossible, as you cannot manipulate the independent variable
● Experimental research must have at least 1 active variable

Control groups
● The group you compare your intervention group to
  ○ Control may be:
    ■ Standard treatment
    ■ Placebo
    ■ No treatment/no intervention
    ■ Sham surgery
  ○ What are the characteristics of a good control group?
● There are potentially some ethical issues with control group design
  ○ By design, experimental research involves withholding the experimental intervention from one group
    ■ Is the experimental intervention known to be beneficial?
    ■ Is the control group required to abstain from standard care?
    ■ Sometimes, the control group receives the experimental intervention after some delay (at the end of the trial), if it is deemed beneficial.

Random Assignment
● Random Assignment = Every participant has the same chance of being assigned to any group in the study
  ○ Does NOT mean that each participant has a match in the other group
  ○ Randomization would guarantee that traits balance out in an infinite population, but not in a finite sample; we assume there is balance (measure attribute variables to confirm)
  ○ Researchers compare groups by important traits & extraneous factors to make sure they are similar
  ○ Chance of being assigned to a given group does not depend on:
    ■ Time of recruitment (early or late in the study)
    ■ Location of recruitment (one site vs. another)

Matching vs. Blocking vs. Random Assignment
● Blocking
  ○ Can randomize within blocks of a trait (sex, age, etc.)
    ■ Considered randomization
● Matching
  ○ Some studies match on key variables (sex, age, etc.)
    ■ Not considered randomization

…But experimental design can’t do everything
● Limitations of experimental designs
  ○ Impossible if you cannot manipulate variables
  ○ Tight control over extraneous variables may limit generalizability
    ■ Relatively poor EXTERNAL VALIDITY
  ○ Often difficult/expensive to manipulate the intervention

Experimental Design
● Within the three factors associated with experimental designs (above), there are several options that can be incorporated into the design.
  ○ 1. Masking (blinding)
  ○ 2. Number of assessments
  ○ 3. Number of experimental groups
  ○ 4. Number of independent variables

Masking
● Certain individuals involved in the experiment are prevented from knowing (i.e., masked) certain information about the experiment.
  ○ Generally refers to being ‘masked’ to group allocation (e.g., control vs.
treatment)
● Masking Participants
  ○ Participants may expect that the experimental group will have a better (or worse) outcome
  ○ This may influence the participant’s behaviour or performance, thus influencing results
● Masking Researcher(s) Administering Intervention
  ○ Avoids bias
  ○ Avoids mistakes that give away group allocation when administering the intervention
    ■ Eg., “oh here is your placebo”
● Masking of Researcher(s) Evaluating Outcome
  ○ The expectation that the experimental group will do better (or worse) may influence measurement, especially where the measure is not entirely objective

Placebo
● Definition:
  ○ A ‘dummy’ treatment administered to the control group to distinguish specific and nonspecific effects of the experimental treatment
  ○ Participants are generally masked to group assignment by administration of a “placebo condition”
● The Placebo Effect
  ○ Definition:
    ■ Observable, measurable, tangible effect that has nothing to do with the actual intervention, because you did not receive it
  ○ Placebos have identified effects, especially in cases of pain, depression, insomnia, anxiety

Number of assessments: Pre/post
● The most basic design is a pre/post evaluation
  ○ Randomised Controlled Trial

Number of assessments: Post-only
● What is the disadvantage of this design?
  ○ Are the groups equivalent?
    ■ Likely, because of random assignment, but not 100% sure because we lack a pre-test
● When would you use it?
  ○ Time
  ○ Learning - how you respond to measurement
    ■ Eg., injury

Number of assessments: Repeated Measures
● A third measurement is added
    R O O/X O
    R O O O

    R O X O O
    R O   O O
  ○ X = intervention
  ○ R = random assignment
  ○ O = observation
  ○ No X = control group
● How long does an effect take to disappear (how long does it linger)?
  ○ Eg., resistance training (how long does the hypertrophy last?)
  ○ (Figure annotations: ↑18 lbs, ↑240 lbs)

Understanding comparisons
Between Group Comparisons
1) Were the groups the same/different at the start of the study (“pre”)?
  a) Compare the red and black bars in the pre-test
  b) The same
2) Were the groups the same/different at the end of the study (“post”)?
  a) Compare the red and black bars in the post-test
  b) Different
Within Group Comparisons
3) Did the control group change pre vs. post?
  a) Red bar = the control
  b) Compare the red bars between pre and post
4) Did the experimental group change pre vs. post?
  a) Compare the black bars between pre and post

    R O   O
    R O X O
    R O X O
● What is the advantage of this design over the standard two-group design?
  ○ Determining whether more of the independent variable is better or worse
    ■ Eg., What is the dose effect?

Repeated Measures Between Subjects Design
● Between subjects design:
  ○ Participants are assigned to and only receive 1 level of the independent variable
  ○ Examines the variability and differences BETWEEN groups

Repeated Measures Within Subject Design
● Within Subject Design:
  ○ Participants receive every level of the independent variable
  ○ Examines the variability and differences WITHIN a person, and of course examines the effect of the independent variable.
  ○ The different levels of the independent variable can be given to the participants in a randomized or fixed order

Crossover Design
● Cross-over design is similar to the repeated measures design, but allows for a wash-out period in between interventions:
  ○ Wash-out period = effect of treatment ‘washes out’ or diminishes to baseline
  ○ Reduces the likelihood of carryover effects
  ○ Preferred when only two interventions are used
Kyle UG, Genton L, Hans D, Karsegard L, Slosman D, Pichard C. Age-related differences in fat-free, skeletal muscle, body cell mass and fat mass between 18 and 94 years of age. Eur J Clin Nutr 2001;55:663–72.

Midterm question on board (you have picture)
● P - 28 males, overweight/obesity, 18-45 yrs
● I - HIIT/MICT
● C - No control
● O - Blood pressure, central aortic stiffness
● T - 6 weeks

MODULE 5: QUASI-EXPERIMENTAL DESIGN
Quasi experiments
● Used under conditions where true experimental designs are not possible (usually clinical conditions) or are too expensive.
● Cannot be used to infer causation as conclusively as true experimental trials
  ○ Poorer internal validity
● Most useful if:
  ○ Control is exerted where possible
  ○ Masking (blinding) procedures are used

Quasi-Experimental: Non-equivalent Pre-test-Post-test control group design
    O X O
    O   O
● A nonequivalent groups design is a between subjects design in which participants have not been randomly assigned to groups
Think about it....
● What is/are the independent variable(s) in this design, and what are the levels of each independent variable?
  ○ Iron supplementation
  ○ Time (pre + post test)
● What are the weaknesses in this design?
  ○ The groups may not be equivalent
  ○ However, there was a pre-test
● Why might you use this design?
  ○ When you have pre-existing groups (eg., grade 3 classrooms)

The independent variable is time
Let’s think about it....
● What is the independent variable in this design?
  ○ Time
● What are the weaknesses in this design?
  ○ No control group - poor internal validity
● Why might you use this design?
  ○ Ethical
  ○ Time frame is really short - pilot studies
● What is the independent variable in this design?
  ○ Time - there are 2 levels -> pre, post
● Is there a “treatment”?
  ○ Yes
● Does everyone in the study receive the same “treatment”?
  ○ Yes
● Let’s revisit the box approach to laying out research designs. When there is a control group (Figure 1), there are two independent variables: time and supplementation. When the control group is removed, supplementation is no longer considered an independent variable, yet it is still a “treatment”. This is driven by the statistical perspective: the statistical comparison is being made within the factor “Time”.

Quasi-experimental Design: Repeated Measures
● These repeated measures designs may include:
  ○ Pre/Post
  ○ Interim assessment(s)
  ○ Follow-up assessment(s)

Comparisons: Internal Validity
● Control vs. single group quasi-experimental designs
● Better internal validity when a control group is included in the design
● What threats to internal validity does incorporation of a control group help rule out?
  ○ History
  ○ Testing
  ○ Instrumentation

Time-Series Design
● Multiple measures to document patterns or trends in behaviour over time
  ○ Purpose is to determine the trend in an outcome over time
● What type of study is this (at the highest level)?
  ○ Observational
● In order to make this into a quasi-experimental design, what needs to happen?
  ○ Manipulate variables, Control

Interrupted Time-Series Design
● Interrupted Time-Series:
  ○ Time-series is “interrupted” by an intervention within the series of measurements
  ○ Used for population level interventions (including policy changes)
    ■ Randomization not generally possible, thus quasi-experimental
    ■ May or may not have a control group
    ■ Repeated measures before and after intervention

Interrupted Time Series Design: Single Group
● Intervention to reduce inappropriate prescription of antibiotics
  ○ Intervention: Tracked antibiotic prescriptions by patient/New policy regarding antibiotic prescriptions
● Drug prescriptions were assessed monthly for 2 years pre and post intervention

Interrupted Time-Series Control Group Design
● Two groups enrolled
● Repeated measures before and after intervention
● One group receives the intervention, the other does not
  ○ Control Group

Interrupted Time-Series Design: Advantages/Disadvantages
● Advantage:
  ○ Time-Series: Multiple measurements enhance confidence that there was a real change
    ■ Versus normal variation
    ■ Versus repeated testing effects
● Disadvantages:
  ○ Lack of randomization and/or control still reduces internal validity relative to an experimental design
  ○ Also, there may be a confounding (e.g., historical) event that occurred at the same time as the intervention
    ■ E.g., Smoking, parent/family member died/cancer

Randomized controlled trials, randomized clinical trials, and clinical trials
● These terms are used interchangeably, which creates confusion.
● Randomized controlled trial: A study design that randomly assigns participants into an experimental group or a control group. As the study is conducted, the only expected difference between the control and experimental groups in a randomized controlled trial (RCT) is the outcome variable being studied.
(George Washington University)
● Randomized clinical trial: A study in which the participants are divided by chance into separate groups that compare different treatments or other interventions. Using chance to divide people into groups means that the groups will be similar and that the effects of the treatments they receive can be compared more fairly. At the time of the trial, it is not known which treatment is best. (National Cancer Institute)
● Clinical trial: A research study in which one or more human subjects are prospectively assigned to one or more interventions (which may include placebo or other control) to evaluate the effects of those interventions on health-related biomedical or behavioral outcomes. (National Institutes of Health)

Phases of clinical trials
● Preclinical research
  ○ Non-humans in laboratory research conditions
  ○ Screens many potential compounds
  ○ If promising, request to move to human trials
● Phase 0 trial
  ○ First trial in humans
  ○ Very small number of healthy people (N = 10-15)
  ○ Very small dose to ensure safety
● Phase 1 trial
  ○ Determines the largest dose without serious side effects
  ○ Larger sample of healthy adults, but still relatively small (N = 50-200)
  ○ Dosage, timing, side-effects
● Phase 2 trial
  ○ Given the dose found to be most beneficial in Phase 1
  ○ Small sample of people with the disease/at risk for the disease (e.g., patients)
    ■ More appropriate for a drug developed to treat a disease like cancer
    ■ In the case of the SARS-CoV-2 vaccine, Phase 2 has been large (thousands)
  ○ Effect should be ≥ standard treatment to continue trials
● Phase 3 trial
  ○ Randomized controlled trial (masked/blinded)
  ○ Large studies with thousands of participants - years
  ○ New therapy vs.
standard treatment/placebo
● Phase 4 trial
  ○ Post-approval
  ○ Alternate populations
  ○ Risk factors, benefits, optimal use
● Mice and rats are generally regarded as the go-to animals for pre-clinical research

Phases of Clinical Trials
● Quasi-experimental (single group)
  ○ Preclinical research
  ○ Phase 0 Trial
  ○ Phase 1 Trial
● Quasi-experimental OR Experimental (drug vs usual care)
  ○ Phase 2 Trial
● Experimental (RCT, placebo, blinded)
  ○ Phase 3 Trial
  ○ Phase 4 Trial

MODULE 6: Descriptive Statistics, Correlation, and Regression
Descriptive statistics
● Descriptive statistics
  ○ What are descriptive statistics? These are tools used to organize and illustrate data. They include, but are not limited to, tables, graphs, measures of central tendency, measures of variability, etc.
  ○ Population: A population is a group of people that share similar traits or characteristics. It is the group of people that you are interested in studying or making inferences about. Target population vs. accessible population.
  ○ Sample: A sample is a subset of people from the target population. It is ideally randomly selected from the population and is used to make inferences back to the population. E.g., randomly select 10 people from class
● Measurement
  ○ Quantify characteristics by assigning numerals to variables according to certain rules, e.g., height in cm; left-handed = 0, right-handed = 1
  ○ Scales of measurement
    ■ Non-parametric statistics (non-normal variables; qualitative)
      ● Nominal - categories with no order
        ○ Left handed vs right handed
      ● Ordinal - categories with order - the difference b/w ranks is meaningless
        ○ Eg.
100m race 1st, 2nd, 3rd or pain scale
          ■ Just interested in the order
    ■ Parametric statistics (normal variables; quantitative)
      ● Interval - equal intervals between categories - no true zero
        ○ Temperature (degrees Celsius)
          ■ A value of 0 does not mean an absence of something
      ● Ratio - equal intervals between categories - true zero
        ○ Temperature (kelvins), mass, height, velocity, distance
  ○ Scores
    ■ What is a score? A score is a value associated with any variable that you have measured.
      ● e.g., height: if you are 180 cm tall then your score is 180 cm
    ■ In this course, we primarily work with interval or ratio data

Data sets or distribution
● A data set or distribution is a collection of scores, typically arranged in a series of columns (individual variables) and rows (an individual’s scores).

Measures of central tendency - Mode
● Possible outcomes: no mode (remaining outcomes to be filled in from slide 4)

Measures of central tendency - Median (M or X̃)
● Median = the score that divides the data set into two equal halves. In other words, it is the score at the 50th percentile.
● There are two approaches to determine the median score; which one applies depends on whether you have an odd or an even number of scores in your data set.

Measures of central tendency - Mean
● Mean = the mathematical centre of all the scores in a data set; the closest value to all data (scores)
● Advantages:
  ○ Considers the magnitude of all scores
  ○ Reflects the total sum of all scores
● Disadvantages:
  ○ Affected by extreme scores

Measures of variability - Range
● Measures of variability provide information about the spread of scores. Said another way, variability values provide information about the extent that scores differ from each other.
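As a rough sketch of the central tendency measures above and the spread measures that follow, the standard-library `statistics` module can be used. The scores below are hypothetical heights chosen for illustration, not course data, and the use of the N - 1 (sample) denominator for variance is an assumption about the convention intended by the lowercase s.

```python
import statistics

# Hypothetical heights in cm (illustrative only, not course data).
heights = [160, 170, 175, 180, 190]

mean = statistics.mean(heights)
median = statistics.median(heights)            # odd N: the middle score
median_even = statistics.median(heights[:4])   # even N: average of the two middle scores
score_range = max(heights) - min(heights)

# statistics.variance / statistics.stdev divide by N - 1 (sample statistics);
# whether the course uses N or N - 1 is an assumption here.
variance = statistics.variance(heights)
sd = statistics.stdev(heights)
cv = sd / mean * 100                           # coefficient of variation, in percent

# One extreme score pulls the mean much more than the median.
with_extreme = heights + [250]
mean_extreme = statistics.mean(with_extreme)
median_extreme = statistics.median(with_extreme)

print(mean, median, median_even, score_range)
print(variance, round(sd, 2), round(cv, 2))
print(mean_extreme, median_extreme)
```

Adding the single extreme score moves the mean from 175 to 187.5 cm while the median only moves from 175 to 177.5 cm, which illustrates why the mean is described as affected by extreme scores.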
Range = maximum (highest) score - minimum (lowest) score
  Min = 160 cm
  Max = 190 cm
  Range = 190 - 160
  Range = 30 cm
● Advantages:
  ○ Very easy to do
● Disadvantages:
  ○ Provides minimal information
  ○ Only considers 2 scores, which are at the extreme ends of the distribution

Measures of variability - Variance
● Variance is an average of the squared deviations about the mean. Variance provides an indication of how much the scores in a data set are spread out around the mean
● This is the theoretical aspect of variance. From a mathematical perspective the calculation is a little bit more complex
  ○ Symbol for variance = s^2
    ■ MAKE SURE IT IS LOWERCASE S (s^2 NOT S^2)

Measures of variability - Standard deviation

Measures of variability - Coefficient of variation
● Height (cm)
  ○ Xbar = 176.3 cm
  ○ s = +/- 9.53 cm
● Weight (kg)
  ○ Xbar = 75.35 kg
  ○ s = +/- 14.65 kg
● Calculate the coefficient of variation (CV = s/Xbar * 100)
  ○ Height: CV = 9.53/176.3 * 100 = 5.41%
  ○ Weight: CV = 14.65/75.35 * 100 = 19.44%

Frequency graphs = frequency polygon and histogram
● (Frequency table/graph: age 16 to 24 years vs. frequency f)
  ○ Mode = 18 years
  ○ Median = 19 years
  ○ Mean = 19.7 years

Quick comment on histograms
● As N (number of scores in the data set) increases, the histogram becomes smoother

Quick comment on skewness
● +ve skew -> tail to the right (mean affected by large scores)
● -ve skew <- tail to the left (mean affected by small scores)
● Xbar = sample mean
● Xtilde (squiggle line) = sample median

Frequency graphs - cumulative frequency polygon
● CF = cumulative frequency

Frequency graphs - percentile polygon (cumulative percent)

Need to start with covariance before talking about correlation/regression
● Covariance directly influences correlation/regression. Covariance is a measure of how two variables vary together. E.g., height and weight
● Units?
  ○ kg·m -> what does this mean?
● Scaling?
  ○ Magnitude of covariance is affected by the magnitude of the x and y scores

Pearson Product Moment Correlation (PPMC)
● rxy indicates the strength and direction of a relationship between two variables (X & Y), e.g., height (X) & weight (Y)
● OR it determines the reliability of a repeated measure of one variable (X1 & X2), e.g., measure height twice
  ○ rxx - same variable measured twice

Correlation and linear regression
● What is it used for?
  ○ Linear regression or correlation is a measure of the strength of a relationship/association between two variables
● What is the difference?
  ○ “The main difference between correlation and regression is that in correlation, you sample both measurement variables randomly from a population, while in regression you choose the values of the independent (X) variable” (Handbook of Biological Statistics, McDonald JH, 3rd ed. 2014)
  ○ In regression, the researcher chooses the values of the independent variable (X)

Linear Regression
● Linear regression is used to study the linear relationship between a dependent variable (Y-axis) and one or more independent variables (X-axis).
● The dependent variable (Y) must be continuous.
● The independent variable(s) may be either continuous (age), dichotomous (yes/no), or categorical (social status).
● The initial judgement of a possible relationship between two continuous variables should always be made on the basis of a scatter plot (scatter graph).
(Notes are missing for one lecture day here.)

The normal curve or normal distribution
● Why is the normal distribution important?
  1. The distributions of many variables are normal (weight, height, blood pressure)
  2. The distribution of sample means produces a normal curve
  3. The normal curve allows determination of relative frequency/proportion and probabilities

Properties of a normal distribution
What inputs determine the appearance/shape of the normal curve?
● μ = population mean
● σ = population standard deviation
● X = a score
● f = frequency
  ○ Properties of a normal curve:
    ■ 1) Symmetrical about the mean
    ■ 2) Area under the curve is a proportion
    ■ 3) Defined by mean and standard deviation
    ■ 4) Asymptotic
    ■ 5) ± 3 s represents where the majority of scores are located
    ■ 6) Point of inflection at ± 1 s

The normal curve - relative frequency and percentiles
● Side note: since the curve is symmetrical, the mode, median, and mean are all the same value
● Group 1: narrow
● Group 2: more spread out
● Group 3: even more spread out
  ○ The means do not change between these groups, only the spread
● Changing the mean shifts the curve along the axis
● Changing the standard deviation changes the spread of the curve

Standardized scores, z scores, and the normal curve
● Standardized scores (e.g., z score) are powerful because we can compare and interpret scores from virtually any normal distribution of interval or ratio scores (data).
● All normal curves can be expressed in standardized terms, referred to as “z scores”.
● A z score is based on the mean and standard deviation of the distribution.
● A z score is the distance a raw score (X) is from the mean relative to the standard deviation
  ○ The Midterm 2 result is the poorer one despite having more marks
    ■ Need more information to see this
      ● Eg., Midterm 1: Mean score = 30 marks, SD = 10 marks
      ● Midterm 2: Mean score = 90 marks, SD = 15 marks

z score determination
● Any known score (X) can be expressed as a z score by knowing the mean and standard deviation of the distribution: z = (X - mean)/SD
  ○ Therefore, all normal distributions can be expressed in terms of z scores (unitless) and thus are standardised for common interpretation and comparison.
  ○ Midterm 1 calculation: z = (45 - 30)/10 = 1.5
  ○ Midterm 2 calculation: z = (75 - 90)/15 = -1.0

Use z scores to determine:
1. What proportion of the normal curve represents a certain score (X)
2. How many people represent a certain proportion of the distribution
3.
What percentile represents a score
4. What score represents a certain percentile

Using z scores
● The normal curve allows determination of the relative frequency or proportion of area under the standard normal curve (using the z-tables posted in KIN232)
  ○ Green area (half of the curve) = 50%
  ○ Area between mean and z = 0.2910 -> 29.1%
  ○ Area beyond z = 0.2090 -> 20.9%
  ○ Total area = 50% + 29.1% = 79.1%, or equivalently 100% - 20.9% = 79.1%
  ○ Total area = 50% - 29.1% = 20.9%

Z score
● Joe: Z = (52 - 40)/7
● Z = 1.71
  ○ Thus, Joe falls on the right side of the curve
Using the Z-value table
● Area from mean to z of 1.71 is 0.4564 -> 45.64%
● Joe’s percentile is 50% + 45.64% = 95.64%
Sam:
● Z = (35 - 40)/7 = -0.714
  ○ Mean to z = 26.11%
  ○ Percentile = 50% - 26.11% = 23.89%
Bill:
● Z = (44 - 40)/7 = 0.571
  ○ Mean to z = 21.57%
  ○ Percentile = 50% + 21.57% = 71.57%
Percentage of people between Sam and Bill is 71.57 - 23.89 = 47.68% OR 26.11 + 21.57 = 47.68%

INFERENTIAL STATISTICS, PROBABILITY & HYPOTHESIS TESTING
Inferential Statistics
● Inferential statistics are used to judge the probability that an observed difference between groups is statistically significant, or that the difference between groups happened simply by chance

Sampling Distribution of Means
Consider 3 different types of distributions:
● 3) Sampling distribution of means: measure an infinite number of samples with 100 undergraduate students in each sample
● The sampling distribution of means is a frequency distribution of all of the infinitely possible sample means from a population
● N = number of scores used to determine each sample mean

Three distributions
1. Population distribution
  a. Data from which a sample is chosen
2. Sample distribution
  a. Assume sigma and s to be equal
3. Sample distribution of means (many samples)
  a.
Theoretical

Sampling distribution of means
● The sampling distribution of means always:
  ○ Forms an approximately normal distribution
  ○ Has a mean ($\mu_{\bar{X}}$) equal to the population mean ($\mu$) from which it was created
  ○ Has a standard error that is a function of the sample standard deviation and the sample size
    ■ Formula for standard error of the mean: $s_{\bar{X}} = s/\sqrt{N}$
● Larger sample size = lower variability

Probability in everyday life:
1. Poker
2. Insurance rates - how much you pay in premiums is based on probability.
3. Weather
4. Flipping a coin
5. Rolling a die

Inferential statistics and probability
● Probability forms the basis for inferential statistics and statistical conclusions
● The proportion of the total area under the curve for particular scores equals the probability of measuring those scores

Hypothesis testing
● Sample data is used to make inferences about the population. For interval/ratio data, the mean is the best representation of this data.
  ○ Hypothesis Testing:
    ■ Allows you to determine if the sample is representative of the population
    ■ Allows you to determine if groups are from the same population or from different populations
    ■ Uses the sampling distribution of means to represent the population

Sampling distribution of means
● As N increases, variability ($\sigma_{\bar{X}}$, the standard error of the mean) about the mean is reduced
● Sampling distribution of means = normal distribution
  ○ Therefore, it possesses the characteristics and properties of a normal curve.
  ○ Remember z scores? See Table C.1 (probabilities)
  ○ Black curve = increased sample size (= reduced variability)

Using sampling distribution of means
● Known:
  ○ population mean ($\mu$) = 40.0 kg
  ○ population standard deviation ($\sigma$) = ± 10.0 kg
● Measure:
  ○ Grip strength of 100 individuals (N = 100)
  ○ Determine the characteristics of the sampling distribution
    ■ $\mu_{\bar{X}} = \mu = 40.0$ kg
    ■ $\sigma_{\bar{X}} = \sigma/\sqrt{N} = 10.0/\sqrt{100} = 1.0$ kg

Using sampling distribution of means
● What is the probability of having a sample mean between 39 – 41 kg?
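A minimal sketch of how this question can be answered numerically rather than from a printed z table, assuming the sampling distribution of means is normal with mean μ and standard error σ/√N. The function names (`normal_cdf`, `prob_sample_mean_between`) are illustrative, and the standard normal area is computed with `math.erf` in place of Table C.1.

```python
import math


def normal_cdf(z: float) -> float:
    """Proportion of the standard normal curve lying below z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def prob_sample_mean_between(low: float, high: float,
                             mu: float, sigma: float, n: int) -> float:
    """Probability that a sample mean falls between low and high."""
    sem = sigma / math.sqrt(n)      # standard error of the mean
    z_low = (low - mu) / sem        # z score of the lower bound
    z_high = (high - mu) / sem      # z score of the upper bound
    return normal_cdf(z_high) - normal_cdf(z_low)


# Grip strength example from the notes: mu = 40.0 kg, sigma = 10.0 kg, N = 100.
p_39_41 = prob_sample_mean_between(39.0, 41.0, mu=40.0, sigma=10.0, n=100)
p_38_42 = prob_sample_mean_between(38.0, 42.0, mu=40.0, sigma=10.0, n=100)

print(f"P(39 <= sample mean <= 41) = {p_39_41:.4f}")
print(f"P(38 <= sample mean <= 42) = {p_38_42:.4f}")
```

With a standard error of 1.0 kg, the 39 to 41 kg interval corresponds to z = -1 to +1 (about 68% of sample means), and 38 to 42 kg corresponds to z = -2 to +2 (about 95%).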
Using sampling distribution of means
● What is the probability of having a sample mean between 38 – 42 kg?
  ○ 38 and 42 kg are z = -2.0 and +2.0, approximately the z scores of -1.96 to +1.96 that bound 95% in Table C.1
    ■ Therefore, the probability of having a sample mean less than 38 kg is <2.5% and the probability of having a sample mean greater than 42 kg is <2.5%.

Using sampling distribution of means
● Using a criterion of 5%, is it possible that a sample with a mean of 36 is from the population of interest? No, it is NOT statistically likely!
● If a sample with a mean of 36 is not from the population of interest, then it must be from a different population.
● We call this a “rare event”: the fact that the sample mean was 36 did not simply happen by chance. There is a reason why the sample mean is 36
  ○ (could be from another population OR, in the case of experimental research, the independent variable/intervention caused an effect).

HYPOTHESIS TESTING
Significance Level:
● Probability is used to define sample means as being too unlikely to represent the underlying raw score population (rare event)
  ○ This sample is NOT from this population.
● Use a probability of 0.05 (5 times out of 100) or 0.01 (1 time out of 100)
● Use the symbol α (alpha) to represent the significance level, i.e. α = 0.05
● As a researcher, you choose the α value (either 5% or 1%). As a KIN232 student, you will always be told/given the α value to use. Working with an α value of 1% is what we call being more conservative, because it will be harder to conclude that a mean is NOT from the population. In other words, it will be harder to find a significant difference between means/groups

Graphically
● One tail question: contains a “higher than” or “lower than” statement
● Two tail question: states that there is a difference
● Converting alpha and 1 or 2 tails into a z score

Hypothesis testing - 7 steps
1. State null hypothesis in symbols and words
2. State alternative hypothesis in symbols and words
3. Use α level and decide if one or two-tailed
4.
State rejection and retain rule
5. Compute appropriate statistic
  a. The one step that is variable
6. Make decision by applying rejection / retain rule
7. Write conclusion in context of study
  a. Scientific method
● NOTE - THIS IS SIMPLY AN EXAMPLE. You will see specific aspects of each step with the various types of analyses we will examine.
  ○ Example: Is the starting salary of University of Waterloo graduates higher than that of other Ontario University graduates, who average $50,000?
    ■ Questions you must ask yourself when reading the description of the study (this one is simple, they will get more complex).
      ● Is there a reference to a population?
      ● Is randomization mentioned?
      ● How many groups are there?
      ● Is there a control group?
      ● Is there an independent variable (or more than 1)? How many levels of the independent variable are there?
      ● Is there directionality (e.g., lower, higher, smaller, larger) or not (altered, difference, change)?
      ● Is there a pre-test and post-test?
      ● Are there repeated measures?
1. State Null
  a. University of Waterloo graduates have the same starting salary as other Ontario University graduates.
    i. μ: University of Waterloo mean salary
    ii. μo: other Ontario Universities mean salary
    iii. Ho: μ = μo
  b. Most stats tests will have hypotheses that involve symbols AND words. For t-tests, it is helpful (i.e. helpful for marking!) to label the means/groups
  c. We will use these 7 steps to perform several different statistical analyses. With each step there are similarities and differences between statistical tests. I will do my best to highlight these similarities and differences to you.
2. State alternate
  a. H1: μ > μo
  b. University of Waterloo graduates have a higher starting salary than the $50,000 average of other Ontario University graduates
3. Alpha level
  a. α = 0.01, one tail (the question asks “higher than”)
  b. You will be told whether to use 0.05 or 0.01. The only decision you need to make is whether it is a one or two tailed test.
4. Rejection Rule
  a. using Table C.1, α = 0.01 / one tail
    i. z critical = + 2.33
    ii.
reject Ho: if the test statistic ≥ z critical 2.33
    iii. retain Ho: if the test statistic < z critical 2.33
  b. How you structure the rejection rule is dependent on several things:
    i. What type of statistical test you are performing
    ii. Is it a one tail or two tail analysis
    iii. What is your data? That is, is it blood pressure (reduction is good), is it hypertrophy (more muscle is better), is it memory loss (more is bad)?
5. Calculate
  a. Compute the test statistic using the appropriate test
6. Decision
  a. Make a decision by applying the rejection rule:
    i. Possibility 1: If the test statistic is ≥ z critical (+ 2.33) then reject the null hypothesis
    ii. Possibility 2: If the test statistic is < z critical (+ 2.33) then retain the null hypothesis
7. Conclusion
  a. Conclusion must reflect the decision
    i. Possibility 1: If our decision was to reject Ho
      1. University of Waterloo graduates have a significantly higher starting salary (> $50 000) than other Ontario University graduates (p < 0.01)
    ii. Possibility 2: If our decision was to retain Ho
      1. University of Waterloo graduates have statistically the same starting salary ($50 000) as other Ontario University graduates (p > 0.01)
  b. The conclusion must include:
    i. Mention of the dependent variable;
    ii. The comparison being made (i.e. control vs. exercise group);
    iii. The statistical result (i.e. statistically different, no statistical difference);
    iv. The probability.

T-Tests
● First described in 1908 by William Sealy Gosset
● Degree in mathematics and chemistry
● Worked for Arthur Guinness Son and Company
● Invented the t-test to help with the quality control of small samples

What is a T-Test
● Ratio that quantifies how significant the difference is between the means of two groups
● Considers the variance or distribution
● Used when the population standard deviation is unknown
● Uses the t-statistic and compares it to a t-distribution (Table C.2)
● More than one mean?
  ○ Then an ANOVA (analysis of variance) is used - discussed later
● df = degrees of freedom

3 Versions of the T-Test
● Ratio/Interval Data → parametric statistics
● One sample mean:
  ○ Compare Waterloo male body fat % to Canadian male body fat %
● Two sample means:
  ○ Two different groups/Independent t-test - compare Waterloo KIN students’ body fat % to Guelph KIN students’ body fat %
  ○ Same individuals (or people who are very closely matched; e.g., twins)/Paired sample t-test - compare body fat % before and after an exercise program

The $t_{\bar{X}}$ test
1) You would use the $t_{\bar{X}}$ test when the population standard deviation is not known and thus you must estimate the population standard deviation.
2) In the t test, the sampling distribution is described by the Student t distribution or t distribution (Table C.2)
3) There are many t distributions; the shape changes with sample size (N), as determined by degrees of freedom (df = N-1)

Degrees of freedom (df)
● The number of values in a set of scores that are free to vary
● Defined for the $t_{\bar{X}}$ test as: df = N-1
Example
● You have measured the age of 3 UW students and Xbar = 20 years
  ○ Recall that Xbar = ΣX/N
  ○ What is the sum of X (ΣX)? ΣX = Xbar × N = 20 × 3 = 60
  ○ Possible values for X3
    ■ X1 = 21
    ■ X2 = 25
    ■ X3 = ?
  ○ Xbar = ΣX/N
  ○ 20 = (X1 + X2 + X3)/N
  ○ 20 = (21 + 25 + X3)/3
  ○ 20 = (46 + X3)/3
  ○ 60 - 46 = X3
  ○ 14 = X3
    ■ Note: the X1 and X2 values are free to vary (i.e., 2 degrees of freedom). X3 cannot vary: it is a defined value that is determined by X1 and X2.

The $t_{\bar{X}}$ test
● As sample size (N) increases, the t distribution becomes closer to the z distribution
● Comparison of t score vs.
z score: as N increases (df ↑), the t value becomes closer to the z value

The tXbar test formula
● Recall: z = (X − Xbar)/s
● tXbar = (Xbar − μo)/sXbar

Review of SD and SEM
● Standard deviation (s/SD/σ)
  ○ Amount of variability from the individual data values to the mean
● Standard error of the mean (sXbar/SEM/σXbar)
  ○ Amount of discrepancy likely in the sample mean compared to the population mean
  ○ Estimated from the sample as sXbar = s/√N
    ■ Defining variables
    ■ s/σ – standard deviation (sample/population)
    ■ N/n – the sample size
    ■ xi – each value from the population
    ■ μ – population mean

Premise of hypothesis testing
● Use the sampling distribution of means to establish the range of sample means that would represent the population from which the sample came
● Calculate one value (tXbar) that represents the sample mean
● Compare the tXbar value to a critical/criterion t value from Table C.2
  ○ The critical value is determined by the degrees of freedom and the chosen α
● If tXbar lies beyond the critical value (within the tails of the distribution), the sample came from some other population (in other words, a treatment effect exists)

Example tXbar test: using the sampling distribution of means
A local Waterloo elementary school has introduced a 6-month resistance exercise training program to grade 5 students. The school board is interested in whether the program changes the weight of the students. You measure the weight of 20 grade 5 students involved in the program. The mean weight for these students was 32.4 kg with a standard deviation of ±4.2 kg. The reported average weight for Ontario grade 5 students is 30.7 kg. Does a resistance training program change the weight of Grade 5 students when compared to the provincial average? Use α = 0.05
● N = 20
● Xbar = 32.4 kg
● s = ±4.2 kg
● μo = 30.7 kg
● df = N − 1 = 19
Step 1 (null hypothesis, Ho):
● Resistance training did not change the body mass of grade 5 students
Step 2 (alternate hypothesis, H1):
● Resistance training caused a change in the body mass of grade 5 students
Step 3:
● α = 0.05
  ○ How many tails? One or two?
    ■ Two: the question asks whether there is a difference, in either direction
Step 4:
● Retain Ho: if −2.093 < tXbar < 2.093
● Reject Ho: if tXbar ≥ 2.093 OR tXbar ≤ −2.093
Step 5:
● Calculating the test statistic: tXbar = (Xbar − μo)/sXbar
  ○ Xbar = sample mean
  ○ μo = population mean
  ○ sXbar = standard error of the mean = s/√N
  ○ sXbar = 4.2/√20 = 0.94
  ○ tXbar = (32.4 − 30.7)/0.94 = 1.81 (the full calculation is on the slides)
Step 6 and 7:
● Decision
  ○ Since tXbar = 1.81 < 2.093, retain the null hypothesis
● Conclusion
  ○ There was no statistical difference in body mass of a sample of Grade 5 students when compared to Ontario Grade 5 students (p > 0.05)
    ■ p = probability
● When reporting NO STATISTICAL DIFFERENCE, use p > 0.05 or p > 0.01 (depending on α)
● When reporting a STATISTICAL DIFFERENCE, use p < 0.05 or p < 0.01 (depending on α)

tXbar1−Xbar2: t test for two independent samples
● When would I use this test? When reading a study description, ask yourself, “how many observations or measurements were made?” In other words, how many means are being generated and compared based on the design? We are looking for 2 means!
● Cohort study – highly unlikely, why?
● Case-control – possibly, e.g., you identify a group of heart failure patients (cases) and a similar group of healthy controls and measure their VO2 max. This is an example of a cross-sectional case-control study
  ○ An observation (“O”) will generate a mean value. Thus, if you have 4 observations you will have 4 means
Example:
● You are working on the KIN 204 flexibility portion of the course and a new and exciting flexibility program was just introduced. You are interested in answering the question: does this new flexibility training program increase flexibility compared to an old/traditional flexibility program? You recruited 18 students from the 2B KIN 204 class and randomly assigned them to a group (see table below). The students participated in their respective programs for 1 month and then flexibility was assessed.
  ○ Use α = 0.05

T-TEST PART 2:
● REPEATED MEASURES: typically a pre/post test
● D stands for “differences” (tD)
● There are 2 means in this example: pre and post
Null hypothesis: the drug had no effect on tumour size.
● μD ≥ 0
Alternate hypothesis: the new drug reduced the size of the tumour
Conclusion:
● Tumour size was statistically decreased post-treatment when compared to pre-treatment (p < 0.01)
  ○ When should it be <, and when >?
    ■ Statistical difference: p <
    ■ NO statistical difference: p >

Ho: there will be no difference in height between males and females
● Ho: μ1 − μ2 = 0
H1: there will be a difference in height between males and females
● H1: μ1 − μ2 ≠ 0
● α = 0.01, 2 tails
● df = N1 + N2 − 2 = 8
● t critical = ±3.355
● Reject Ho if t ≤ −3.355 or t ≥ 3.355
● Retain Ho if −3.355 < t < 3.355

Module 10: Error, Power and Confidence Intervals

STATISTICAL CONCEPTS
● Statistical difference (or no difference)
● Type I & type II errors
● Power
● Confidence intervals

Statistical difference or no statistical difference
● Takeaway: significance does NOT equal importance

Type I and Type II Errors
● Possible conclusions:
  ○ “Reject Ho: sample is not from population” or “the treatment group was statistically different from the control group”. However, it is possible that we made the wrong decision: the sample is in fact from the population. We have committed a Type I error.
    ■ Type I error (also called α error): we rejected Ho but Ho is true
  ○ “Retain Ho: sample is from population” or “the treatment group was not statistically different from the control group”. However, it is possible that we made the wrong decision: the sample is in fact from a different population. We have committed a Type II error.
    ■ Type II error (also called β error): we retained Ho but Ho is false

Type I error: compare α levels
● Using a larger α level (0.05 vs 0.01) has two effects:
  ○ 1) 5% makes it easier to reject Ho vs.
1%
  ○ 2) 5% carries a greater risk of making a type I error
● Using α = 0.01 decreases the chance of a type I error (false claim)
  ○ BUT it increases the probability of a type II error (failure of detection)

Type II error: retain Ho but should have rejected Ho
● When discussing type II error, we need to consider two different populations
  ○ Example: a researcher develops a new drug treatment for reducing blood cholesterol.
    ■ Individuals with high cholesterol (control group)
      ● mean = 250 mg/dl
    ■ Individuals with drug treatment (experimental group)
      ● mean = 240 mg/dl
● Conclusion: drug treatment does not work
  ○ But the drug actually does decrease blood cholesterol!

Type II error
● If the drug is effective, there exist two populations:
  ○ 1) People with high blood cholesterol (control group)
  ○ 2) People using the drug who have lower blood cholesterol (treatment group)
● Type II error: β error

Power
● Probability of correctly identifying a statistical difference if one exists (correctly rejecting Ho)
● The researcher who did not find that a drug treatment was effective (when in truth the drug did work) may have run an experiment with low power.
● A study with low power may fail to detect a difference when one does exist
● Factors affecting power
  ○ 1) Significance level chosen
    ■ Selecting 0.05 vs 0.01 will increase power
    ■ A one-tail analysis will increase power compared to two-tail
  ○ 2) Sample size: N affects the sampling distribution of means
    ■ Large N: increases power, e.g., power = 1 − β = 1 − 0.2 = 0.8
    ■ Small N: decreases power, e.g., power = 1 − β = 1 − 0.4 = 0.6
  ○ 3) Variability (σ) changes the distribution in a similar way to N
    ■ Large σ: wider distribution, thus more overlap and decreased power
    ■ Small σ: tighter distribution and increased power
  ○ 4) Magnitude of difference (μ1 – μo)
    ■ The greater the difference between means, the larger the power
    ■ The magnitude is determined by the treatment effect
    ■ Does drinking cola cause stomach ulcers?
      ● drink 1 cola per week vs.
drink 20 colas per day
● Beta is smaller due to the larger sample
● The mean is placed around the middle of the two graphs

Confidence Intervals
● Definition:
  ○ A confidence interval (CI) is a range of scores with defined boundaries that should contain the population mean.
  ○ The boundaries are calculated from the sample mean and the standard error of the mean.
  ○ The wider the CI, the more confident you can be that the population mean is within the boundaries.
    ■ 95% CI: α = 0.05
    ■ 99% CI: α = 0.01
    ■ For example, the 95% confidence interval of the mean:
      ● The defined interval will include the population mean with 95% certainty. Said another way, the probability of finding the population mean within the confidence interval is 95%.

How to calculate a confidence interval
● We require a sample mean, a standard error of the mean and a t value to calculate a confidence interval (CI): CI = Xbar ± (t × sXbar). The t value will depend on the probability and degrees of freedom. Typically, CIs are 95% and 99%; thus, we require a t value that encompasses 95% or 99% of the distribution.
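The calculation described above can be sketched in a few lines of Python. This is only an illustration: the function name `confidence_interval` is made up for these notes, and the critical t value is not computed but assumed to be looked up in a t table (2.080 for df = 21, two-tailed α = 0.05). The sample values are the example values given for this section (Xbar = 40 cm, s = 8.8 cm, N = 22).

```python
import math


def confidence_interval(mean, sd, n, t_critical):
    """Return (lower, upper) bounds of the CI: mean +/- t_critical * SEM."""
    sem = sd / math.sqrt(n)        # standard error of the mean, s / sqrt(N)
    margin = t_critical * sem      # half-width of the interval
    return mean - margin, mean + margin


# Example values from the notes: Xbar = 40 cm, s = 8.8 cm, N = 22, alpha = 0.05.
# Assumption: t critical = 2.080 (df = N - 1 = 21, two-tailed) from a t table.
lower, upper = confidence_interval(40.0, 8.8, 22, 2.080)
print(f"95% CI: {lower:.2f} cm to {upper:.2f} cm")
```

Under these assumptions the interval runs from roughly 36.1 cm to 43.9 cm. Swapping in the 99% critical value from the same table would widen the interval.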
  ○ Use the example: Xbar = 40 cm, s = ±8.8 cm, N = 22, and α = 0.05

χ² one-way test
Chi-square test:
● Non-parametric test
● Nominal scale of measurement
● Frequencies of more than 2 categories (observations)
● χ² tests whether the frequency in each category in the sample data represents the expected frequency based on previous research (or some known data)
● χ² = Σ (O − E)²/E
  ○ O – observed frequency (sample)
  ○ E – expected frequency (based on population/previous research/known data)
  ○ Used with categories describing one variable
  ○ Also called the “goodness of fit” test: it tests how “good” the “fit” is between our data and the Ho
    ■ Data = observed frequency
    ■ Ho = expected frequency

χ² distribution
● The distribution is dependent on degrees of freedom
● df = K – 1, where K is the number of categories (options)

χ² test
● If you know the expected proportion for each category:
  ○ Burgers 35%, pizza 25%, subs 20%, other 20%
    ■ From previous experience, knowledge, research etc.
● If you don’t know the expected proportion for each category:
  ○ Burgers 25%, pizza 25%, subs 25%, other 25%
    ■ Evenly distribute the proportion based on the number of categories
● Decision: since χ² = 18.78 > 11.34, we will reject Ho
● Conclusion: university students have a statistically different preference for fast food when compared to the general population (p < 0.01)

OUR OWN STUDY (FAVOURITE CHIP)
● χ² = 7.07
Decision:
● Since χ² = 7.07 < 9.49, retain Ho
Conclusion:
● KIN232 students do not have a statistically significant preferred flavour of chips (p > 0.05)

χ² two-way test
● You can think of independent and dependent like a correlation. Does knowing whether someone studied help us determine the likelihood that they will pass? In the example on the left there is no relationship between studying and the grade. In Scenario B there is a relationship between whether you studied and whether you passed, so these two variables are dependent.
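Both chi-square tests compare observed frequencies (O) with expected frequencies (E), so one small function covers the arithmetic. The sketch below is illustrative only: the function names are invented for these notes, all observed counts are hypothetical, and only the expected proportions (35/25/20/20%) and the critical value 11.34 come from the notes. For the two-way case it uses the standard rule that the expected frequency of a cell is (row total × column total) / N.

```python
def chi_square(observed, expected):
    """Sum of (O - E)^2 / E over all categories or cells."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def expected_two_way(table):
    """Expected cell counts for a two-way table: row total * column total / N."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


# One-way (goodness of fit). Proportions are from the notes;
# the observed counts are HYPOTHETICAL (100 students).
proportions = [0.35, 0.25, 0.20, 0.20]          # burgers, pizza, subs, other
observed = [20, 30, 30, 20]
expected = [p * sum(observed) for p in proportions]
chi2_one_way = chi_square(observed, expected)
df_one_way = len(observed) - 1                  # df = K - 1
decision = "reject Ho" if chi2_one_way >= 11.34 else "retain Ho"  # alpha = 0.01
print(f"one-way: chi2 = {chi2_one_way:.2f}, df = {df_one_way}, {decision}")

# Two-way (independence). HYPOTHETICAL counts:
# rows = studied / did not study, columns = passed / failed.
table = [[30, 10], [10, 30]]
exp_table = expected_two_way(table)
chi2_two_way = chi_square(
    [o for row in table for o in row],
    [e for row in exp_table for e in row],
)
df_two_way = (len(table) - 1) * (len(table[0]) - 1)  # (rows - 1)(columns - 1)
print(f"two-way: chi2 = {chi2_two_way:.2f}, df = {df_two_way}")
```

The two-way statistic is then compared with the critical χ² value for its own degrees of freedom, exactly as in the one-way case.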
Example:
● A health researcher is investigating whether there is a relationship between smoking and alcohol consumption. Individuals were classified for both variables as low, moderate or high.
● Example classification:
  ■ Low → 0–1 drinks/week
  ■ Moderate → 1–3 drinks/week
  ■ High → 3–5 drinks/week
● What are the expected frequencies?
  ○ Assume “equal” probability for each cell based upon the observed row and column totals: E = (row total × column total)/N
● df = (rows − 1) × (columns − 1) = (3 − 1) × (3 − 1) = 4
Decision:
● Since χ² = 2.62 < 9.49, retain Ho
Conclusion:
● Alcohol consumption and smoking are not statistically related (p > 0.05) OR alcohol consumption and smoking are statistically independent of each other (p > 0.05).

MODULE 12: One-way ANOVA
● ANOVA → ANalysis Of VAriance
● Ratio/interval data
  ○ ANOVAs are used when you have more than 2 means to analyze
1 factor:
● Different groups – compare undergraduate body fat % at University of Waterloo, University of Guelph and University of Toronto
● Same individuals – compare body fat % before treatment and at 1 month, 6 months and 1 year after starting an exercise program

ANOVA Theory
● Example: Examine the effects of exercise on blood pressure.
  ○ Group #1: no exercise
  ○ Group #2: exercise 1X per week
  ○ Group #3: exercise 2X per week
  ○ Group #4: exercise 3X per week
  ○ Group #5: exercise 4X per week
    ■ The independent variable is exercise frequency – 5 levels
    ■ This research design produces 5 means
● If t tests were used, it would require 10 separate analyses, which increases the potential for Type I error

ANOVA
● One of the more common inferential statistical procedures used in Kinesiology research
● Keeps the error equal to α
● Examines the ratio of variances – there are 2 general sources of variance
  ○ 1) Within group variance
    ■ Describes the variability within the conditions
    ■ An individual value is compared to its group mean
      ● Variability is due to sampling error
      ● Individual differences
      ● Measurement error
  ○ 2) Between group variance
    ■ Describes the variability between conditions
    ■ The mean of each group is compared to the mean of all scores
      ● Within group variability
      ● Group is from a different population (treatment caused change)

ANOVA Theory
● We need the means, but we are not necessarily comparing them directly
● ANOVA – we calculate an F ratio
  ○ F ratio = between group variability/within group variability
● Between group variability = treatment effect + within group variability.
Thus:
  ○ F ratio = (treatment effect + within group variability)/within group variability
● If groups are from the same population, then:
  ○ The treatment effect = 0
  ○ F ratio = (0 + within group variability)/within group variability
  ○ Thus, the F ratio = 1 if there is no treatment effect
● The greater the F ratio (> 1.0), the greater the between group variance (due to a larger treatment effect)
  ○ A statistical difference is present when: Fcalculated ≥ Fcritical

One-way ANOVA
ANOVA Theory & Calculation
● Steps required:
  ○ 1) Determine the sum of squares (SS) *These will be given in KIN 232*
  ○ 2) Determine the mean square (MS)
    ■ Divide SS by df
  ○ 3) Calculate the F ratio
    ■ Divide MSbetween by MSwithin
      ● F = between group variance/within group variance
● One-way ANOVA formulae
  ○ MS = SS/df
  ○ F = MSbetween/MSwithin
● Example
  ○ 0x/week, 3x/week, 5x/week
    ■ 20 people in each program
  ○ dfbetween = K − 1 = 3 − 1 = 2
  ○ dfwithin = N − K = 60 − 3 = 57
  ○ dftotal = N − 1 = 59

One-way ANOVA
Example: A health researcher was interested in answering the question, “what is the optimal frequency of aerobic exercise per week to lower mean arterial pressure (MAP) in hypertensive individuals?” To answer this question, the researcher recruited 25 individuals with hypertension (age 52±8 years, height 168±12 cm, weight 87±5 kg). These individuals were randomly assigned to one of five groups. The details of the aerobic exercise and groups are provided in the table.
The study period lasted 12 weeks and mean arterial pressure was assessed at the end of the study period.
● μ denotes the mean
  ○ Null hypothesis: aerobic exercise training will have NO effect on MAP
  ○ Alternate hypothesis: aerobic training will have an effect on MAP
● dfbetween = K − 1 = 5 − 1 = 4
● dfwithin = N − K = 25 − 5 = 20
  ○ Use the table of critical values of F: the F-tables (Table C.5)
    ■ Fcritical = 2.87
● Reject Ho if F ratio ≥ 2.87
● Retain Ho if F ratio < 2.87
Decision: since F = 49.74 > 2.87, reject Ho
Conclusion: aerobic training resulted in a statistical difference in MAP in individuals with hypertension (p < 0.05)
● ***Notice how non-descript this conclusion statement is?*** That’s because ANOVAs DO NOT provide any information regarding which specific means/groups are different from each other. If Ho is rejected in an ANOVA, then all we know is that 2 or more means are different from each other
● How do we know which means are different from each other (0, 1X, 2X, 3X or 4X per week)? Examine the means for the different treatments.
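The ANOVA steps in this module (SS → MS → F ratio → decision) can be sketched as follows. This is a minimal illustration, not the course's official calculation: the function name `one_way_anova_f` and the sums of squares are hypothetical, while K = 5, N = 25 and Fcritical = 2.87 are taken from the MAP example.

```python
def one_way_anova_f(ss_between, ss_within, k, n_total):
    """Return (F ratio, df_between, df_within) from sums of squares."""
    df_between = k - 1                 # K - 1
    df_within = n_total - k            # N - K
    ms_between = ss_between / df_between   # MS = SS / df
    ms_within = ss_within / df_within
    return ms_between / ms_within, df_between, df_within


# HYPOTHETICAL sums of squares (in KIN 232 these would be given).
# k = 5 groups and N = 25 participants match the MAP example.
f_ratio, df_b, df_w = one_way_anova_f(
    ss_between=800.0, ss_within=100.0, k=5, n_total=25
)

f_critical = 2.87  # from the notes: Table C.5, df = 4 and 20, alpha = 0.05
decision = "reject Ho" if f_ratio >= f_critical else "retain Ho"
print(f"F({df_b}, {df_w}) = {f_ratio:.2f} -> {decision}")
```

With these made-up values MSbetween = 200 and MSwithin = 5, so F = 40 and Ho is rejected. As noted above, this says only that at least two means differ, not which ones.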
