Hypothesis Testing (Chapter 8)

Definition: Hypothesis testing is an inferential procedure where sample data is used to evaluate the credibility of a hypothesis about a population.

Important assumption: the effect of a treatment is to add or subtract a constant from each individual's score. This implies that after the treatment, the population has:
  The same shape
  The same standard deviation

Sometimes the effect of a treatment is obvious and sometimes it is not (example).

Because of the possibility of researchers making errors when judging experimental effects, a standardized procedure for hypothesis testing has been developed.

Hypothesis testing procedure: General overview

1) The researcher states 2 opposing hypotheses:
  The null hypothesis, or Ho
  The scientific, working or alternate hypothesis, H1

For example:
  A researcher is examining the effects of increased handling (stimulation) on the physical development of children.
  The national data shows a population mean of 26 lbs. for 2 year olds.
  The researcher must compare the sample data (children who have experienced increased handling) with the population mean.

Hypotheses:
  Ho states that the treatment has no effect: additional handling has no effect on the body weight of the population of 2 year olds.
    Ho: μ infants handled = 26 lbs.
  In an experimental context, Ho predicts that the IV (independent variable or treatment variable) has no effect on the dependent variable (DV) for the population, in this case weight.
  H1 states the opposite of Ho: that the treatment (IV) does produce a change in the DV for the population.
    H1: μ infants handled ≠ 26 lbs.
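Both hypotheses are statements about the population mean only. This follows from the assumption stated earlier that a treatment adds a constant to every score. The short sketch below illustrates that assumption with made-up numbers: the weights and the 2 lb. effect are hypothetical and do not come from the handling study.

```python
from statistics import mean, pstdev

# Hypothetical body weights (lbs.) for a tiny illustrative population.
before = [22.0, 24.0, 26.0, 28.0, 30.0]

# Assumed treatment effect: add the same constant to every individual's score.
effect = 2.0
after = [weight + effect for weight in before]

# The mean shifts by the constant; the spread of the scores does not change.
mean_shift = mean(after) - mean(before)
sd_before = pstdev(before)
sd_after = pstdev(after)

print("shift in mean:", mean_shift)
print("standard deviation before:", sd_before)
print("standard deviation after:", sd_after)
```

Because every score moves by the same amount, the distances between scores are untouched, which is why only the mean appears in Ho and H1.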
Note: H1 does not specify the direction of change created by the treatment; directional hypothesis tests are discussed later. When not specifically asked to carry out a directional hypothesis test, always carry out a non-directional test. Non-directional tests are generally the most conservative and, therefore, the most appropriate.

2) The question the researcher has to answer is whether differences between the sample statistics and the population parameters are the result of the treatment or the result of sampling error. Standardized criteria are set to answer this question in an objective manner.

3) Collect sample data. Example:
  Sample of infants obtained objectively
  Parents trained to provide additional handling
  Infants' body weights measured when the children reach 2 yrs

4) Evaluate Ho: data from the sample, after treatment, are compared with Ho. There are 2 possible outcomes:
  Reject Ho: the data indicate that the treatment has an effect
  Fail to reject Ho: the data do not provide sufficient evidence that the treatment has an effect

Types of errors that can be made in hypothesis testing:
  Type 1 error: rejecting Ho when the treatment in fact has no effect (falsely finding an effect)
  Type 2 error: failing to reject Ho when the treatment in fact has an effect, in other words, failing to reject a null hypothesis that is really false (failing to find an effect)

Table:
                       Treatment has no effect    Treatment has an effect
                       (Ho true)                  (Ho false)
  Reject Ho            Type 1 error               Correct decision
  Fail to reject Ho    Correct decision           Type 2 error

We can't be certain whether our decision to reject or fail to reject Ho is correct, but we can figure out the probability of being right or wrong.

The hypothesis testing procedure is structured so that the researcher can specify and control the probability of making a Type 1 error (falsely finding an effect). This probability is always kept very low.

We need to determine which sample means are likely if Ho is true and which are unlikely. The term significance refers to the probability value used to define 'unlikely'. It is generally set at 0.05 (5%) and is also called the alpha level.

For example:
  0.05: a 5% probability of obtaining a result this extreme by chance alone when Ho is true. This is the most commonly used criterion for publication (set by Ronald Fisher, 1925).
  0.01: a 1% probability of obtaining such a result by chance. Criterion for publishing in some prestigious scientific journals.
  0.001: a 0.1% probability of obtaining such a result by chance. Criterion for publishing in some prestigious scientific journals.
  0.1: described as a trend

The alpha level is used to divide the distribution of sample means into 2 parts:
  Sample means compatible with Ho
  Sample means significantly different from Ho

We reject Ho if the sample mean after the treatment is in the extreme tails of the distribution of sample means. Therefore, the alpha level defines the probability of making a Type 1 error (at most 5% when alpha = 0.05).

Steps to evaluating a hypothesis:
1) State the hypotheses (Ho and H1) and define the alpha level
2) Use the alpha level to define data that would reject Ho
3) Analyze sample data
4) Make a decision about Ho

Note: For non-directional tests, alpha is divided evenly between the 2 tails of the distribution of sample means. These areas in the tails are called critical regions or regions of rejection (rejection of Ho).

Test statistic: various test statistics can be used to evaluate a hypothesis from sample data. One which we can use is Z:

$$Z = \frac{\bar{x} - \mu}{\sigma_{\bar{x}}} = \frac{\text{obtained difference}}{\text{difference due to chance}}$$

Here $\bar{x}$ is the sample mean, $\mu$ is the population mean stated in Ho, and $\sigma_{\bar{x}} = \sigma / \sqrt{n}$ is the standard error of the mean.

Failure to reject Ho:
  We don't prove that Ho is true, because a sample provides limited information about a population.
  Researchers therefore don't say that they 'accept' Ho but instead that they 'fail to reject' Ho.
  By having a low alpha (0.05) we are actually increasing the chance of a Type 2 error (failing to find an effect when one is really there), but this is generally viewed as a less serious error than a Type 1 error (falsely finding an effect when one isn't really there).
  Alpha defines the risk of a Type 1 error, but the chance of a Type 2 error (beta) cannot be specified as a single value, because it depends on the size of the actual treatment effect.

Critical values of Z for non-directional tests:
  alpha = 0.05     Z = +/- 1.96
  alpha = 0.01     Z = +/- 2.58
  alpha = 0.001    Z = +/- 3.30

In the literature:
  Example: The results indicate that increased handling has a significant effect on the weight of 2 year olds, Z = __, p < 0.05

Assumptions for hypothesis testing:
  Sample data is obtained randomly
  Observations are independent (orthogonal): there is no consistent relationship between observations. This is usually met by random sampling.
  The standard deviation is assumed to remain unchanged by the treatment
  The distribution of sample means must be normal (stated in the question, or n at least 30)

Directional tests (one tailed):
  Use of directional tests is sometimes warranted but not recommended.
  [Figure: example of distribution]
  Ho and H1 are phrased slightly differently. Note: it is usually easier to start with H1.
    H1: μ infants handled > 26 lbs.
    Ho: μ infants handled ≤ 26 lbs.
  Some say a 1 tailed test makes it too easy to reject Ho and, therefore, too easy to make a Type 1 error.
  Some like to use a 1 tailed test for exploratory research (pilot studies), to generate new research possibilities.
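The four evaluation steps in this chapter can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed procedure. Only the population mean of 26 lbs. comes from the notes. The sample mean, standard deviation and sample size are hypothetical values chosen for the example, and the function names are made up here. The one-tailed branch assumes the predicted effect is an increase.

```python
from math import sqrt
from statistics import NormalDist


def z_statistic(sample_mean, pop_mean, pop_sd, n):
    """Obtained difference divided by the standard error (difference due to chance)."""
    standard_error = pop_sd / sqrt(n)
    return (sample_mean - pop_mean) / standard_error


def critical_z(alpha, two_tailed=True):
    """Positive boundary of the critical region for the chosen alpha level."""
    # A non-directional test splits alpha evenly between the 2 tails.
    tail_area = alpha / 2 if two_tailed else alpha
    return NormalDist().inv_cdf(1 - tail_area)


def decide(z, alpha=0.05, two_tailed=True):
    """Decision about Ho; the one-tailed case looks at the upper tail only."""
    cutoff = critical_z(alpha, two_tailed)
    in_critical_region = abs(z) >= cutoff if two_tailed else z >= cutoff
    return "reject Ho" if in_critical_region else "fail to reject Ho"


# Hypothetical sample: n = 36 handled infants with a mean weight of 28 lbs.,
# and an assumed population standard deviation of 6 lbs.
z = z_statistic(sample_mean=28.0, pop_mean=26.0, pop_sd=6.0, n=36)

print("Z =", round(z, 2))
print("alpha = 0.05:", decide(z, alpha=0.05))
print("alpha = 0.01:", decide(z, alpha=0.01))
```

With these made-up numbers the same Z leads to different decisions at alpha = 0.05 and alpha = 0.01, which shows how the alpha level alone sets the boundary of the critical region. Calling `decide` with `two_tailed=False` puts all of alpha in one tail and lowers the cutoff, which is the reason some consider 1 tailed tests too lenient.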