ITEC6310 Research Methods in Information Technology
Instructor: Prof. Z. Yang
Course Website: http://people.math.yorku.ca/~zyang/itec6310.htm
Office: Tel 3049

Functions of a Research Design
• Two activities of scientific study
  – Exploratory data collection and analysis
    • Classifying behavior
    • Identifying important variables
    • Identifying relationships among variables
  – Hypothesis testing
    • Evaluating explanations for observed relationships
    • Begins after enough information has been collected to form testable hypotheses

Descriptive Methods
• Observational methods
• Case study method
• Archival method
• Qualitative methods
• Survey methods

Example
Imagine that you want to study cell phone use by drivers. You decide to conduct observations at three locations: a busy intersection, an entrance/exit to a shopping mall parking lot, and a residential intersection. You are interested in the number of people who use cell phones while driving. How would you recommend conducting this study? How would you recommend collecting data? What concerns do you need to take into consideration?

Example
Your research will investigate the following hypotheses:
1. Software design is a highly collaborative activity in which team members frequently communicate.
2. Team members frequently change their physical location throughout the day.
3. Team members frequently change the ways in which they communicate.
How would you recommend conducting this study? How would you recommend collecting data?

Example
Function points (FP) and source lines of code (SLOC) are two common software measures for estimating software size and monitoring programmer productivity. Developers and managers are known to differ in how they perceive the benefits and costs of software measurement programs. Your research proposes to determine whether this perception gap exists for FP and SLOC. How would you recommend conducting this study?

Example
Extreme Programming (XP) is a new, lightweight software development process for small teams dealing with vague or rapidly changing requirements. Your research proposes to provide observations about the key practices of XP, in order to offer guidelines for those who will implement XP. How would you recommend conducting this research?

Experimental Research
• The most basic experiment consists of an experimental group and a control group.
• Control is exercised over extraneous variables by
  – Holding them constant
  – Randomizing their effects across treatments
• A causal relationship between the independent and dependent variables can be established.

Example
The research goal was to evaluate whether the use of architecturally significant information from patterns (ASIP) improves the quality of scenarios developed to evaluate software architecture. Of the 24 subjects, 21 were experienced software engineers who had returned to university for postgraduate studies; the remaining 3 were fourth-year undergraduate students. All participants were taking a course in software architecture. The participants were randomly assigned to two groups of equal size. Both groups developed scenarios for architecture evaluation; one group was given the ASIP information and the other was not. The outcome variable was the quality of the scenarios produced by each participant working individually.
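As a rough illustration of the random assignment used in this example, here is a minimal Python sketch. The participant IDs are invented for illustration; the original study's actual procedure is not documented beyond what is stated above.

import random

# Hypothetical participant pool for the ASIP experiment described above:
# 24 subjects, randomly split into two groups of equal size.
participants = [f"P{i:02d}" for i in range(1, 25)]

random.seed(42)               # fixed seed so the assignment is reproducible
random.shuffle(participants)

asip_group    = participants[:12]   # receives the ASIP information
control_group = participants[12:]   # develops scenarios without ASIP

print("ASIP group:   ", asip_group)
print("Control group:", control_group)

Shuffling the whole pool and splitting it in half guarantees equal group sizes while still giving every subject the same chance of landing in either group.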
Strengths and Limitations of Experimental Research
• Strength
  – Identification of causal relationships among variables
• Limitations
  – The experimental method cannot be used if you cannot manipulate variables
  – Tight control over extraneous variables limits the generality of results
• A trade-off exists between tight control and generality

Internal Validity
• INTERNAL VALIDITY is the degree to which your design tests what it was intended to test.
  – In an experiment, internal validity means showing that variation in the dependent variable is caused only by variation in the independent variable.
  – In correlational research, internal validity means that changes in the value of the criterion variable are solely related to changes in the value of the predictor variable.
• Internal validity is threatened by CONFOUNDING and EXTRANEOUS VARIABLES.
• Internal validity must be considered during the design phase of research.

Factors Affecting Internal Validity
• History – Events may occur between multiple observations.
• Maturation – Participants may become older or fatigued.
• Testing – Taking a pretest can affect the results of a later test.
• Instrumentation – Changes in instrument calibration or observers may change results.
• Statistical regression – Subjects may be selected based on extreme scores.
• Biased subject selection – Subjects may be chosen in a biased fashion.
• Experimental mortality – Differential loss of subjects from groups in a study may occur.

External Validity
• EXTERNAL VALIDITY is the degree to which results generalize beyond your sample and research setting.
• External validity is threatened by the use of a highly controlled laboratory setting, restricted populations, pretests, demand characteristics, experimenter bias, and subject selection bias.
• Steps taken to increase internal validity may decrease external validity, and vice versa.
• Internal validity may be more important in basic research; external validity, in applied research.

Factors Affecting External Validity
• Reactive testing – A pretest may affect reactions to an experimental variable.
• Interactions between selection biases and the independent variable – Results may apply only to subjects representing a unique group.
• Reactive effects of experimental arrangements – Artificial experimental manipulations, or the subject's knowledge that he or she is a research subject, may affect results.
• Multiple treatment interference – Exposure to early treatments may affect responses to later treatments.

Types of Experimental Designs
• Between-Subjects Design
  – Different groups of subjects are randomly assigned to the levels of your independent variable.
  – Data are averaged for analysis.
• Within-Subjects Design
  – A single group of subjects is exposed to all levels of the independent variable.
  – Data are averaged for analysis.
• Single-Subject Design
  – A single subject, or a small group of subjects, is exposed to all levels of the independent variable.
  – Data are not averaged for analysis; the behavior of single subjects is evaluated.

Example
If a researcher wants to conduct a study with four conditions and 15 participants in each condition, how many participants will be needed for a Between-Subjects Design? For a Within-Subjects Design? (A worked calculation appears below.)

Example
A researcher is interested in whether doing assignments improves students' course performance. He randomly assigns participants to either an assignment condition or a non-assignment condition. Is this a Between-Subjects Design or a Within-Subjects Design?
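The participant-count example above reduces to simple arithmetic; a minimal Python sketch of the reasoning:

conditions = 4
per_condition = 15

# Between-subjects: each participant serves in exactly one condition,
# so a separate group is needed for every level.
between_n = conditions * per_condition   # 4 * 15 = 60 participants

# Within-subjects: the same group is exposed to all levels,
# so a single group of 15 suffices.
within_n = per_condition                 # 15 participants

print(between_n, within_n)               # prints: 60 15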
Example
The research goal was to evaluate whether the use of architecturally significant information from patterns (ASIP) improves the quality of scenarios developed to evaluate software architecture. All participants first developed scenarios for architecture evaluation without the ASIP information. Then the participants were provided the ASIP information and developed new scenarios. The outcome variable was the quality of the scenarios produced by each participant before and after the ASIP information was provided.

The Problem of Error Variance
• Error variance is the variability among scores not caused by the independent variable.
  – Error variance is common to all three experimental designs.
  – Error variance is handled differently in each design.
• Sources of error variance
  – Individual differences among subjects
  – Environmental conditions that are not constant across levels of the independent variable
  – Fluctuations in the physical/mental state of an individual subject

Handling Error Variance
• Take steps to reduce error variance
  – Hold extraneous variables constant by treating subjects as similarly as possible
  – Match subjects on crucial characteristics
• Increase the effectiveness of the independent variable
  – Strong manipulations yield less error variance than weak manipulations.

Handling Error Variance
• Randomize error variance across groups
  – Distribute error variance equivalently across levels of the independent variable
  – Accomplished with random assignment of subjects to levels of the independent variable
• Statistical analysis
  – Random assignment tends to equalize error variance across groups but does not guarantee that it will.
  – You can estimate the probability that observed differences are caused by error variance by using inferential statistics.

Between-Subjects Designs
• Single-Factor Randomized-Groups Designs
  – The randomized two-group design
  – The randomized multiple-group design
    • The multiple control group design
• Matched-Groups Designs
  – The matched-groups design
  – The matched-pairs design
  – The matched multigroup design

Single-Factor Randomized-Groups Designs
• Subjects are randomly assigned to treatment groups.
• Two groups (EXPERIMENTAL and CONTROL) are needed to constitute an experiment.
• The TWO-GROUP DESIGN is the simplest experiment to conduct, but the amount of information yielded may be limited.

Single-Factor Randomized-Groups Designs
• Additional levels of the independent variable can be added to form a MULTIGROUP DESIGN.
• If the levels of the independent variable represent quantitative differences, the design is a PARAMETRIC DESIGN.
• If the levels of the independent variable represent qualitative differences, the design is a NONPARAMETRIC DESIGN.

Conducting a Two-Group Matched-Groups Experiment
• Obtain a sample of subjects.
• Measure the subjects on a characteristic (e.g., intelligence) that you feel may relate to the dependent variable.
• Match the subjects according to that characteristic (e.g., pair subjects with similar intelligence test scores) to form pairs of similar subjects.
• Randomly assign one subject from each pair to the control group and the other to the experimental group.
• Carry out the experiment in the same manner as a randomized-groups experiment.
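The matched-pairs procedure above lends itself to a short sketch. The following Python snippet uses invented participant IDs and intelligence scores purely for illustration: subjects are sorted on the matching characteristic, adjacent subjects are paired, and one member of each pair is randomly assigned to each group.

import random

# Hypothetical data: (participant_id, intelligence_score) pairs.
# Scores are invented for illustration only.
subjects = [("P01", 112), ("P02", 98), ("P03", 121), ("P04", 105),
            ("P05", 99),  ("P06", 118), ("P07", 104), ("P08", 111)]

random.seed(1)

# Sort on the matching characteristic, then pair adjacent subjects
# so that the members of each pair have similar scores.
subjects.sort(key=lambda s: s[1])
pairs = [subjects[i:i + 2] for i in range(0, len(subjects), 2)]

experimental, control = [], []
for a, b in pairs:
    # Randomly assign one member of each pair to each group.
    first, second = random.sample((a, b), 2)
    experimental.append(first[0])
    control.append(second[0])

print("Experimental:", experimental)
print("Control:     ", control)

Because each pair contributes one subject to each group, the groups start out matched on the measured characteristic, which removes that source of error variance from the between-group comparison.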