Experimental Psychology

Validity and Reliability

replication: sufficient detail in the procedure section to duplicate the experiment with the same materials and instructions
reliability: the ability to produce consistent and stable scores or results
validity: the ability of a measuring instrument or experiment to measure what it is intended to measure
internal validity: the degree to which changes in the dependent variable can be attributed to the independent variable rather than to uncontrolled influences such as prior influence, maturation processes, order effects, or the subjects’ history with the task
threats to internal validity include: unplanned events; differences between subjects; statistical regression; testing problems; changes in a measuring instrument; natural changes in subjects over time; experimenter bias; and misuse of statistical tests
external validity: the degree to which the results generalize beyond the study to other subjects, settings, and times
threats to external validity include: poor selection of subjects; limited characteristics of the subjects; limited operational definitions; multiple treatment interference; subject awareness of the study; too limited a setting for the experiment; and a limited time frame in obtaining results

Variables

independent variable: what the experimenter manipulates; its levels commonly define the experimental and control groups
dependent variable: what is being measured; should be measurable with numbers
confounding variables: unintended variables that throw off the experimental results
participant variables: relate to the subjects’ individual characteristics and can include personal background, mood, level of anxiety, intellectual level, and awareness of the experiment
situational variables: relate to the experimental environment and can include room temperature, noise, distractions, and unplanned interruptions

Samples

independent subjects design: a design in which groups of subjects experience different experimental conditions; comparing a control group with an experimental group is a common independent samples design; it measures two distinct groups
matched pairs design: a design where subjects are matched on one
variable, and then one subject is given one condition of the independent variable and the other subject is given the other condition (usually experimental and control conditions)
repeated measures design (within-subjects design): a design in which the same group is measured before and then after administration of a variable; typically this involves a pretest and a posttest
single participant/subject design: a design in which one subject’s performance over time or across experimental conditions is tracked and recorded; this is not the same as a case study
population: the entire group of people or things from which a sample is drawn
sample: a smaller group of people or things selected to represent the target population
random selection of participants and random assignment to groups: helps to increase the validity of the results
random sampling: selecting a sample from the population purely at random, so that every member has an equal chance of being chosen
stratified sampling: occurs when the population is divided into subpopulations (strata) and then a random sample is taken from each subpopulation
representative sampling: a sample that matches the overall characteristics of the population from which it is drawn
systematic sampling: a sample that is drawn from the population using a fixed rule or criterion, such as every 10th person
opportunity sample: a sample taken from an available, pre-existing group

Experimental Methods

experimental group: a group that receives the experimental condition; the group that is affected by the independent variable
control group: a group that does not receive the experimental condition; the group that is not affected by the independent variable
placebo group: a control group that receives a placebo to minimize subject bias (a single-blind experiment)
research bias and expectancy (researcher and participant effects): bias that occurs when the expectations of the researcher or the participants influence the results
demand characteristics: cues in the procedure that subtly signal to subjects in an experiment what behavior is expected of them (e.g.
compassionate behavior, aggressive behavior, etc.); if deception about the purpose of the experiment is used, subject bias arising from demand characteristics should be reduced
response bias: responding in the way the subject thinks the experimenter wants rather than according to the subject’s own beliefs or cognitions; can be a result of demand characteristics; related to the Hawthorne effect, in which subjects change their behavior because they know they are being observed
single-blind techniques: an experimental design in which subjects do not know which group they are in, typically an experimental or control group; this reduces subject bias
double-blind techniques: an experimental design in which neither the subjects nor the experimenter know which group is which; this reduces both subject and experimenter bias

Descriptive Statistics

descriptive statistics: statistics that are used to describe a set of data, including averages, most frequent responses, and the range of scores from high to low

levels of measurement:
  nominal scale: a unit of measurement using named categories such as eye color, gender, voting status, etc.; there is no logical order and no indication of how the groups differ; this is the least refined of the four measurement scales (e.g. Ford or Chevrolet, or male or female)
  ordinal scale: a unit of measurement in which the values of a variable can be rank ordered from highest to lowest, such as class rank, percentile ranks, ordering ideas from best to worst, etc. (e.g. good, better, best)
  interval scale: a unit of measurement similar to the ordinal scale but in which the difference between each unit is equal or constant; the difference between 5 and 6 is the same as between 17 and 18; there is no natural starting point at zero (e.g. temperature in degrees Celsius or Fahrenheit)
  ratio scale: a unit of measurement that builds on the ordinal and interval scales but also allows comparisons using ratios, such as saying one number is two or three times greater than another; this is the most refined of the four measurement scales; zero is the natural starting point (e.g.
age, weight, height)

measures of central tendency:
  mean: the average score from a group of scores
  mode: the most frequent score that occurs in a group of scores
  median: the middle score in a group of rank-ordered scores, separating the top half from the bottom half
  outlier: a value far away from the other values in a set of data

measures of dispersion:
  range: the difference between the highest and lowest score
  standard deviation: a numerical index that tells, on average, how far the scores fall from the mean; the larger the standard deviation, the greater the spread of scores
  variance: the second moment around the mean; the expected value of the square of the deviations of a random variable from its mean value; the standard deviation is its square root
  quartiles and semi-interquartile range: quartiles divide ranked data into four equal parts; the semi-interquartile range is half the distance between the first and third quartiles

normal distribution of data:
  standard scores: scores converted from raw score distributions; the two most common are z-scores and T-scores
  frequency: the number of raw scores that fall within a class of scores
  frequency distribution: a summary of how often different scores appear within a set of scores
  normal bell curve: a frequency curve in which most of the scores fall in the middle and gradually taper off to the sides

Variables

operational definition of variables: an independent variable is operationally defined according to the events used to produce it (e.g.
what constitutes “low anxiety” or “high anxiety”); a measured operational definition applies to dependent variables that are defined in operational terms
response rate: in survey research, the percentage of responses returned from the sample population
research (or experimental) hypothesis: a detailed explanation of a predicted relationship between certain conditions or variables; this hypothesis is not subject to change
null hypothesis: a detailed statement indicating there is no relationship between certain conditions or variables

Inferential Statistics

inferential statistics: statistics used to test a hypothesis and to make inferences from the collected data
probability: an estimate of how likely a certain event is to happen
levels of confidence: the level of certainty that an inferential statistic is not due to chance; in experimental research, .05 is the accepted minimum level; in other words, there is less than a 5% chance that a result this large would arise by chance alone

the appropriate choice of statistical tests and limitations upon their use:
most of these tests are used to measure the relationship between two sets of scores:
  Pearson product-moment (Pearson r) correlation coefficient: shows the linear relationship between two variables
  independent samples t-test: used for independent groups to determine whether the mean of one group differs from the mean of the other group; this can support a predicted direction in a hypothesis
  dependent samples t-test: used for matched groups, or when a pretest and posttest have been administered to the same group
  non-parametric tests: tests that do not assume the data fall into a normal distribution
  Mann-Whitney U test: a test for use with two independent samples; the basis of this test is that if all the data from the two samples are ranked together, the high and low ranks should be evenly distributed between the samples if the samples are equal
  chi-squared test: a statistical procedure for use with frequency counts of data in nominal or ordinal categories

Graphical Techniques

bar chart: a graph using bars to denote numerical counts
histogram: a bar chart that indicates the frequency of scores
line graph: a graph using lines to connect dots that denote numerical counts
frequency polygon: a line graph that indicates the frequency of scores

Ethical Considerations

there are six basic ethical considerations you need to address in any sort of research study:

informed consent: subjects should be told briefly what will be involved in the psychological experiment; you do not necessarily need to explain your hypothesis, but you should explain what sort of tasks will be required of the subject during experimentation
justification for any discomfort or deception: you need to justify why you would cause a subject any physical or mental discomfort, or deceive them in some way
  some experiments by their very nature may cause the kind of mental frustration that is encountered day-to-day (e.g. challenging math or verbal problems, or making choices between possible selections)
  some topics by their very nature may cause distress (e.g. cognitive dissonance or social conformity)
  you must justify why the experimental design you have developed warrants these minor discomforts
  Could this topic be tested any other way?
  Ensure accurate participation of your subjects. Provide them with the appropriate environment in which to complete their task.
  Be sure to conduct yourself professionally; the more professional and serious you are, the better your results will be
right of withdrawal: at all times, subjects have a right to withdraw from the experiment
anonymity: the assurance that the data and results cannot be traced back to any individual subject
findings are confidential: while you can ask for demographic information such as age, grade level, sex, or GPA, you cannot record subjects’ names in connection with their results
  you cannot refer to subjects by name in your report
  you should make sure all data is collected anonymously, and assure your subjects that the results will only be used for this experiment and then discarded
participants are debriefed: all subjects should be debriefed at the conclusion of your experiment
  you should explain to them at the time they participate, if possible, what you are testing and how their results will be compared to others
  you should briefly explain what theories support the behavior that they displayed in the experiment
  if they are debriefed at the conclusion of the experiment, you can share with them the results and any conclusions you've made based on all the data you've collected

Experimental Methods

participant and researcher expectancies: based on the idea that what the researcher expects will alter the subject’s performance
  this is known as the Pygmalion effect
  this was shown by Rosenthal’s study, in which experimenter expectancies altered the performance of children in a classroom; it is also called the Rosenthal effect
  this highlights the need to control experimenter bias

Questionnaires/Surveys

large-scale and small-scale surveys: the scale of the survey depends on the number of surveys collected, either a lot (large scale) or a few from a select group (small scale)
use of Likert scale: a rating scale developed by R. Likert in which respondents are asked to indicate where they fall along some dimension; the response is then converted into a numerical score (e.g.
strongly agree = 1, agree = 2, neither agree nor disagree = 3, disagree = 4, strongly disagree = 5)
advantages include: flexibility in asking questions; less time to collect data; large amounts of data can be collected at once
disadvantages include: question bias; self-report bias; erroneous memories of the subjects; social desirability bias

Naturalistic Observation

participant observation: the observer is part of the group being observed
non-participant observation: the observer remains detached from the group; sometimes called a complete observer

methods of recording data, including time, event and point sampling:
  duration recording: the observer records the length of time a particular behavior lasts (e.g. talking to another student; being out of one’s seat)
  frequency-count method: counting the number of times (frequency) the behavior occurs
  interval recording: a single subject is observed for a set amount of time and the subject’s behavior is recorded
  continuous observation: after observing the subject, the observer gives a narrative account of the observed behavior; it is up to the observer to determine which behaviors are important to report
  time sampling: the observer randomly selects time periods in which to observe the subject; this is used in conjunction with duration recording, the frequency-count method, interval recording and continuous observation

advantages of observations include: lessening self-report bias and social desirability bias; information is not limited to what the subject can recall
disadvantages include: difficulty in measuring complex behavior; expense
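
Worked example (descriptive statistics)

The measures of central tendency, measures of dispersion, and standard scores defined under Descriptive Statistics can be illustrated with a short Python sketch. This is only an illustration under stated assumptions: the eight scores are made-up data, the function names (describe, z_scores, t_scores) are hypothetical, and the variance and standard deviation use the population formulas (dividing by N) rather than the sample formulas (dividing by N - 1). T-scores are taken to be the usual rescaling of z-scores to a mean of 50 and a standard deviation of 10.

```python
import statistics


def describe(scores):
    """Return common descriptive statistics for a list of numeric scores."""
    return {
        "mean": statistics.mean(scores),
        "median": statistics.median(scores),
        "mode": statistics.mode(scores),
        "range": max(scores) - min(scores),
        # Population formulas (divide by N); this is an assumption.
        "variance": statistics.pvariance(scores),
        "sd": statistics.pstdev(scores),
    }


def z_scores(scores):
    """Convert raw scores to z-scores: distance from the mean in SD units."""
    mean = statistics.mean(scores)
    sd = statistics.pstdev(scores)
    return [(x - mean) / sd for x in scores]


def t_scores(scores):
    """Convert raw scores to T-scores (mean 50, standard deviation 10)."""
    return [50 + 10 * z for z in z_scores(scores)]


# Hypothetical raw scores, invented for illustration only.
scores = [2, 4, 4, 4, 5, 5, 7, 9]
summary = describe(scores)

print(summary)           # mean 5, median 4.5, mode 4, range 7, variance 4, sd 2.0
print(z_scores(scores))  # the lowest score (2) is 1.5 SDs below the mean
print(t_scores(scores))  # the highest score (9) has a T-score of 70.0
```

With these scores the mean (5) and median (4.5) are close together, while the high score of 9 sits two standard deviations above the mean. A value like that is one you might inspect as a possible outlier.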