THE EFFECTS OF TACT TRAINING IN THE DEVELOPMENT OF ANALOGICAL REASONING

A Thesis

Presented to the faculty of the Department of Special Education, Rehabilitation, School Psychology, and Deaf Studies
California State University, Sacramento

Submitted in partial satisfaction of the requirements for the degree of

MASTER OF ARTS
in
Education (Special Education)

by
Sarah Elizabeth Dickman

SPRING 2012

Approved by:
__________________________________, Committee Chair
Dr. Jean Gonsier-Gerdin
__________________________________, Second Reader
Dr. Caio Miguel
____________________________
Date

Student: Sarah Elizabeth Dickman

I certify that this student has met the requirements for format contained in the University format manual, and that this thesis is suitable for shelving in the Library and credit is to be awarded for the thesis.

__________________________________, Graduate Coordinator
Dr. Bruce Ostertag, Department Chair
__________________
Date

Department of Special Education, Rehabilitation, School Psychology, and Deaf Studies

Abstract

Analogical reasoning refers to one’s ability to derive the relation between stimuli, a process necessary for completing Aristotle’s proportional analogies, A:B::C:D. The ability to engage in complex reasoning of this nature would benefit many students with disabilities, given that they have appropriate prerequisite skills. However, the means to establish reasoning skills of this nature with students with specialized learning needs has yet to be addressed within experimental research. The current study set out to develop a procedure to teach analogical reasoning that could be used by educators across a variety of curricula. To ensure experimental control, arbitrarily related abstract figures were tested initially.
Two three-member classes of abstract figures (A1-B1-C1 and A2-B2-C2) were presented to 12 adult participants via computer software. Adult participants were tested first so that inconclusive findings could not be attributed to impaired language skills, as could be the case with students with autism. Participants were trained to vocally label AB and BC pairs from within the same class as “same” and pairs from different classes as “different.” Vocal label and analogy tests with these relations followed. During the analogy tests, selecting the comparison with “same” terms was correct when the sample had “same” terms, and vice versa for “different.” This testing sequence was repeated with variations of the trained compounds (BA, CB, AC, and CA). Equivalence class formation was tested with the figures presented individually in a matching task (e.g., selecting B1 in the presence of A1 when told “Select same”). Six of the 12 participants passed all presented tasks, supporting the viability of this procedure. Had all the participants displayed the proficiency of these six, the work would have been immediately extended to children five years of age and then to children with autism. Once a reliable procedure is established, subsequent studies should follow this trajectory to ensure that the protocol is effective with learners with disabilities. The failures of the other six participants give rise to a number of conjectures that should be examined in future research. The participants may have failed to use appropriate strategies to form the analogical relations. Future studies should examine modifications to the procedures used within this study to establish useful learning strategies, such as naming the individual figures or naming categories of figures. Ultimately, this work contributes to the analogical reasoning literature as the first experimental study to include verbal response topographies.
The successes of six participants indicate that the developed procedure has the potential to be useful to special educators working to train learners to display analogical relations between stimuli.

Approved by:
__________________________________, Committee Chair
Dr. Jean Gonsier-Gerdin
_______________________
Date

ACKNOWLEDGMENTS

I would like to acknowledge my advisors, colleagues, research assistants, and lastly my family. Dr. Jean Gonsier-Gerdin has guided my studies and awareness of special education throughout my time in graduate school. Her example of excellence in academics and desire to learn has been inspiring. Dr. Caio Miguel was the impetus for my decision to complete a thesis. I am eternally grateful to him for awakening my interest in research and, of course, applied behavior analysis. I could not have completed this work without Dr. Nassim Elias’ technical support and troubleshooting.

My colleagues have been a constant source of stress relief and brainstorming. I do not know if I would have maintained my sanity without Jonathan Fernand, Kathryn Lee, and Candice Bright. Danielle LaFrance, Charisse Lantaya, and Shannon Medved have been wonderful data collectors and made the entire experience fun.

To my family, you are literally the reason I am who I am as you have shaped my behavior throughout my entire life. My parents have shown me love in some way every day of my life. Their continued support has given me confidence and taken away my fear of failure, or most of my fear of failure. My brothers bless me with camaraderie and acceptance for who I am, allowing me to just enjoy life and take each day as it comes. Their support means more to me than I can say. Lastly, their prayers and reliance on faith encouraged me to trust in the Lord.
Philippians 4:13, “I can do all things through Christ who strengthens me.”

TABLE OF CONTENTS

Acknowledgments
List of Tables
List of Figures

Chapter

1. INTRODUCTION
    Background of Problem
    Statement of the Research Problem
    Purpose of the Study
    Theoretical Framework
    Definitions of Terms
    Justification
    Limitations
    Organization of the Thesis

2. REVIEW OF THE LITERATURE
    Experimental Research in Analogical Reasoning
    Analogies as Equivalence-Equivalence Responding
    Extensions of Equivalence-Equivalence Models of Analogies
    Naming Account of Stimulus Equivalence and Emergent Categorization
    Separable Compounds Account of Equivalence Classes
    Summary

3. METHODS
    Participants and Setting
    Materials
    Experimental Design, Dependent Variable, and Data Collection
    Experiment 1 Procedure
    Experiment 2 Procedure

4. RESULTS
    Experiment 1 Participants’ Demographic Information
    Experiment 1 Participants’ Task Performance
    Experiment 2 Participants’ Demographic Information
    Experiment 2 Participants’ Task Performance
    Summary of Results

5. DISCUSSION
    Equivalence Class Formation
    The Role of the Verbal Repertoire
    Future Research
    Implications for Practice
    Conclusion

References

LIST OF TABLES

1. Experiment 1 Task Summaries
2. Level 1 Compound Stimuli Designations and Responses
3. Compound Stimuli Designations by Level
4. Component Stimuli Designations and Example Trials
5. Pre-Training Task Order in Experiment 2
6. Experiment 1 Participant Characteristics
7. Experiment 1 Results - Component Matching
8. Experiment 1 Results - Trials to Criteria During Training Tasks
9. Experiment 1 Results - Vocal Label Tests
10. Experiment 1 Results - Analogy Tests
11. Experiment 2 Participant Characteristics
12. Experiment 2 Results - Pre-Training
13. Experiment 2 Results - Trials to Criteria During Training Tasks
14. Experiment 2 Results - Vocal Label Tests
15. Experiment 2 Results - Analogy Tests
16. Experiment 2 Results - Component Matching
17. Experiment 2 Results - Self-Reported Strategies

LIST OF FIGURES

1. Experimental Stimuli
2. Same and Different Pretest Stimuli
3. PT-2’s and PT-3’s Performance Across Test Conditions
4. PT-4’s and PT-5’s Performance Across Test Conditions
5. PT-6’s and PT-7’s Performance Across Test Conditions
6. Component Matching Error Correlations in Experiment 1
7. Level 3 Analogy Error Correlations in Experiment 1
8. PT-8’s and PT-9’s Performance Across Test Conditions
9. PT-10’s and PT-11’s Performance Across Test Conditions
10. PT-12’s and PT-13’s Performance Across Test Conditions
11. Level 3 Analogy Error Correlations in Experiment 2
12. Component Matching Error Correlations in Experiment 2

Chapter 1

INTRODUCTION

Special educators face difficult challenges as they seek to teach individuals with disabilities to engage in advanced academic or reasoning skills. These skills are important for satisfactory performance on standardized testing and for independent living, yet they are largely understudied within experimental research.
Among the existing studies, many exclusively tested the age of emergence of the skill (Goswami & Brown, 1989, 1990; Freeman & Goswami, 2001) or the proposed internal processes that may give rise to the skill (Molen, 2010; Danielsson, Henry, Ronnberg, & Nilsson, 2010). Exploratory studies that seek to understand the phenomenon as it occurs naturally are important; however, their pragmatic benefit is limited for individuals for whom the skill does not emerge as a product of typical development and for those responsible for instilling the skill, such as special educators. Without thorough procedural research, practitioners will be forced to use methodology that may not be appropriate for individuals with disabilities, create their own protocols, or, worst of all, give up on establishing higher-order skills.

While working in the field of applied behavior analysis (ABA) both inside and outside school settings, the current author has personally experienced the pressure that parents place on educators to ensure life-long best outcomes for their children. As they advocate for their children to be helped to achieve their greatest potential, they are inadvertently advocating for advancement in instructional approaches. For example, the parent asking for individualized education plan (IEP) goals to target his or her child’s understanding of the value of money is an impetus for the IEP team to review protocols that have been successful in establishing coin equivalence and perhaps do further research if satisfactory procedures cannot be found. Special education and its related field of ABA need to rise to meet this challenge and develop teaching protocols that will allow able students to access advanced curriculum, perform to their best ability when assessed, and support their life-long pursuit of intellectual growth.
Background of Problem

Philosophers, psychologists, and educators have long viewed analogical reasoning as a critical element of advanced cognitive abilities (Sternberg, 1985). Analogical reasoning refers to one’s ability to derive the relation between stimuli, a process necessary for completing Aristotle’s four-term proportional analogies, A:B::C:D, and less formal, language-based comparisons (Goswami & Brown, 1989, 1990; Inhelder & Piaget, 1958). This study will focus exclusively on the former, four-term analogies, as they are widely used as a means of measuring intelligence and developmental progress across learners of various ages (Sternberg, 1977; Voress & Maddox, 2002; Johnson-Martin, Attermeier, & Hacker, 2004). Typically in these tasks, learners are presented with three terms, A, B, and C, then asked to identify an appropriate fourth term, D, to complete the analogy. Four-term analogies can be used to assess a learner’s awareness of a variety of relations between stimuli including, but not limited to, categorical, functional, mathematical, proportional, and associative. Success within these tasks is believed to be “fundamental and ubiquitous for human cognition” (Schwering, Kuhnberger, & Kokinov, 2009, p. 175).

Within a basic four-term analogy problem, a learner is believed to display mastery of several skills related to deriving relations among stimuli. Given the A and B terms, an individual must identify all the different ways the stimuli could be considered related to one another. The number of possible relations between the stimuli would vary depending on the terms within the problem and the respondent’s familiarity with the stimuli. For example, if the A term were “black” and the B term were “white,” the possible relations could include “opposites” or “colors.” Next, given the C term, the individual must identify the D term that, when combined with the C term to form CD, will most closely correspond to the AB relation.
There may be several possible relations that could be generated between the C and D terms, but the learner will be working to create “sameness” between the pairs of terms. Taking the previous example, given that the C term is “on,” the learner should select the D term “off” as it has the same relation, “opposites,” as the A and B terms. Across analogies, the types of relations that are assessed will vary, but a correct response will always include constructing similar relations across the two-term pairs.

These types of stimulus-stimulus relations are of particular importance in displaying mastery of challenging functional and academic concepts. Consider the mathematical problem of 0.5 = ____ with possible multiple-choice selections of 1/2, 3/4, or 1/3. The student would be looking to create “sameness” on both sides of the equation, and should correctly select 1/2. Extended into a proportional analogy, the student may be given the A, B, and C terms of 0.5 : 1/2 :: 0.75 : _______ and asked to fill in the blank with the missing D term. Money concepts could also be assessed in this manner, as mastery of the subject would allow the individual to treat different representations of currency as substitutable for one another. For example, given a picture of a dime : .10 :: picture of a nickel : ________, the individual should complete the analogy with the value of the nickel (Keintz, Miguel, Kao, & Finn, in press). Many different academic subjects, such as money, reading, history, and science, to name a few, could be assessed with an analogy design. Yet across these diverse curricula, the analogies could all be solved with the strategy of seeking to create “sameness” among the presented pairs.

Teaching learners to respond to analogies is of critical importance as it simultaneously develops novel, generative responding and logical reasoning skills.
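
The “sameness” strategy applied to the fraction analogy above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names relation and complete_analogy and the encoding of terms as numeric strings are hypothetical and are not part of the study's procedure or software. The sketch labels a pair “same” when both terms denote the same quantity and then keeps the candidate D terms whose relation to C matches the relation between A and B.

```python
from fractions import Fraction


def relation(x, y):
    """Label a pair of numeric strings (hypothetical encoding) as "same" or "different"."""
    # Fraction parses both decimal strings ("0.5") and ratio strings ("1/2").
    return "same" if Fraction(x) == Fraction(y) else "different"


def complete_analogy(a, b, c, choices):
    """Return the D terms whose relation to C matches the A:B relation."""
    target = relation(a, b)
    return [d for d in choices if relation(c, d) == target]


# 0.5 : 1/2 :: 0.75 : ___ with the multiple-choice options from the text.
answers = complete_analogy("0.5", "1/2", "0.75", ["1/2", "3/4", "1/3"])
print(answers)
```

Because the A:B pair is labeled “same,” only a D term equal in value to C satisfies the analogy. The coin example would work the same way if each picture were first mapped to its monetary value.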
Following the principles behind the “sameness” strategy, individuals would not be forced to memorize basic relations between pairs, but encouraged to look at a multitude of possible relations between stimuli. The ability to respond to novel exemplars and even generate novel examples is typically considered consistent with “creativity” and “flexibility,” both desirable intellectual traits. Additionally, analogy skills may be related to the formation of logical arguments, construction of metaphors, and appreciation of sophisticated humor (Skinner, 1957; Stewart, Barnes-Holmes, Roche, & Smeets, 2001; Stewart, Barnes-Holmes, Roche, & Smeets, 2002). The establishment of these skills allows students to more richly access and engage in their culture. The importance of these skills is not limited to typical learners, but must be extended to include learners with disabilities.

To date, the behavior analytic literature concerning analogical reasoning has utilized only typically developing participants (Stewart & Barnes-Holmes, 2009; Barnes, Hegarty, & Smeets, 1997; Carpentier, Smeets, & Barnes-Holmes, 2002; Carpentier, Smeets, & Barnes-Holmes, 2003; Carpentier, Smeets, Barnes-Holmes, & Stewart, 2004; Stewart et al., 2001). Yet, a chapter within a recently published book focusing on instructional methods for children with autism and other disabilities includes a protocol for training analogical responding (Stewart & Barnes-Holmes, 2009). The authors of the chapter based their recommendations exclusively on studies with typical learners, most of whom were adults (of the 110 total participants across the referenced studies, only 47 were age 12 or younger). Therefore, these studies are limited in their application to special education, as they have not included children or adults with disabilities.
Procedures that are effective for adults cannot be assumed to be effective for children, particularly when only 21 of the 47 child participants passed the analogy tests (Barnes et al., 1997; Carpentier et al., 2002; Carpentier et al., 2003). Children with attending deficits and stimulus over-selectivity, as is common with children with autism, would likely show even less success (Lovaas, Schreibman, Koegel, & Rehm, 1971). Thus, research with participants with disabilities is needed to strengthen recommendations for the implementation of these procedures for students in that population.

Elements of the analogy teaching protocols recommended through behavior analytic research may limit their usefulness within special education settings or in-home programs due to the extensive teaching and training required for performance to emerge. The protocol recommended by Stewart and Barnes-Holmes (2009) is based on the Relational Frame Theory (RFT) approach (Hayes, Barnes-Holmes, & Roche, 2001) and does not make use of existing verbal skills. Modifications based on the naming literature (Horne & Lowe, 1996; Miguel & Petursdottir, 2009) may expedite training and testing by promoting vocal-verbal behavior. This suggestion is supported by previous research, which demonstrated that some children with autism were unable to match stimuli according to derived relations until their vocal-verbal repertoires were utilized to train a common response (Eikeseth & Smith, 1992). The use of language in testing and training should establish bi-directional relations between stimuli and increase the likelihood of transfer across repertoires. Essentially, instead of only being able to match the stimuli to one another, as with the RFT studies, the participants may be able to vocally label the stimuli, select the stimuli when given a vocal label, and match the stimuli.
This efficiency is of importance as special educators who are dealing with a variety of competing responsibilities may be more likely to implement procedures shown to be adaptable for a variety of learners across a variety of lesson contents (McBride & Schwartz, 2003; Peters & Heron, 1993). Educators and practitioners need resources based on rigorous research within their target population in order to address the special needs of their students and consumers.

Statement of the Research Problem

Substantial gaps in the literature exist that force special educators and practitioners in applied behavior analysis (ABA) to choose between forgoing targeting analogies altogether or developing their own protocols without the guidance of appropriate research. This study represents the initial steps in the development of an evidence-based protocol designed to be adaptable across a variety of subject matters and effective with varying groups of participants, while minimizing instructional time. Specifically, this study attempts to answer the following questions: Can the completion of basic A:B :: C:D analogies be established through teaching individuals to label the relations between pairs of terms as either “same” or “different”? Will the language training with pairs of stimuli establish distinct classes when the stimuli are presented singly? Will participant age and cognitive ability significantly affect acquisition?

Purpose of the Study

To answer the research questions included above, a modified version of the analogy protocol recommended by Stewart and Barnes-Holmes (2009) was used. However, research from the naming account of verbal behavior (e.g., Miguel, Petursdottir, Carr, & Michael, 2008; Mahoney, Miguel, Ahearn, & Bell, 2011) was employed to maximize transfer across repertoires and build upon existing skills.
Participants were trained to engage in the vocal response “same” in the presence of stimuli from within the same class and “different” in the presence of terms from different classes. Essentially, they were overtly trained to identify a “sameness” or “difference” relation between terms. Following training, they were presented with a matching task, as used in the Stewart and Barnes-Holmes (2009) protocol, and had the opportunity to directly relate “same” pairs to “same” pairs and “different” pairs to “different” pairs. This performance required the transfer from the speaker to the listener repertoire, as they had been taught to vocalize in the presence of the stimuli (e.g., saying “same” or “different”), but had not been asked to select them by name or match them (e.g., “Select same”). Additionally, to successfully complete these tasks the participants needed to display emergent responding across novel stimuli. The participants were never exposed to the stimuli in the arrangements presented in these later phases (e.g., BA, CB, CA, and AC compounds and the stimuli presented individually), but had to derive the relations of “same” or “different” based on the stimulus relations that were explicitly taught (e.g., AB and BC).

To examine the effects of the vocal training on the formation of analogies and supporting equivalence classes, a steady-state strategy was employed within a single-subject design (Sidman, 1960). To minimize the participants’ exposure to experimental stimuli prior to training, a multiple-probe design was employed (Johnson & Pennypacker, 1993). Participants were grouped into 12 dyads and their exposure to the experimental conditions arranged, non-concurrently, to support the logic of the steady-state strategy. In the Pretest phase, when one participant within each dyad showed stable responding below mastery level, the prediction was made that if they were to continue to be exposed to this condition no change would occur.
This was an especially strong prediction as the classes of abstract stimuli used in the study have been previously experimentally determined as being sufficiently arbitrary that participants should not relate them on the basis of visual properties or previous history (Markham & Dougher, 1993; Debert, 2007; 2009). Once one member of the participant dyad was exposed to the training condition, the independent variable, the second participant continued with baseline trials to verify the prediction that the first participant’s responding would not have improved given further exposure. When the experimental effect, mastery level during Analogy Testing, was repeated, the second participant was then exposed to Vocal Label Training. These results met the criteria of the steady-state strategy with prediction, verification, and replication.

The multiple-probe design was selected so as to create the strictest possible internal control in order to support the recommendation for implementation across learners of a variety of backgrounds. This same design could be implemented with the use of nonarbitrary stimuli, as the baseline measures would demonstrate the participant’s lack of familiarity with the test stimuli or failure to relate them as members of equivalence classes.

The secondary purpose of the study was to assess the effect of verbal competency upon acquisition of the analogical responding. Previous literature has shown that substantial procedural modifications are needed for children age five to respond correctly at the proficiency of children nine years of age (Carpentier et al., 2002). The proposed procedure had never been experimentally validated; thus it may have proved to be ineffective or extremely time consuming. To ethically extend the use of these procedures to more vulnerable populations, such as children, they would first need to be conducted with adult participants with intact and sophisticated verbal skills, as expected in undergraduate students.
If favorable results were achieved with minimal retraining and in a reasonable amount of training time, the protocol would be run with children with and without disabilities serving as participants. It was desired that any differences in responding be analyzed across participant groups and compared to previous research. However, as the results of the study will show, the current procedure was not deemed appropriate for child participants due to the high percentage of failures by the adults. Thus, this intended purpose was not addressed within the current work. Nevertheless, recommendations are outlined to suggest means to extend this research line to benefit individuals of those target populations (e.g., children and then children with autism).

Theoretical Framework

Behavior analytic approaches to analogy formation have been based on existing literature on equivalence (Sidman & Tailby, 1982) and Relational Frame Theory (RFT) (Hayes et al., 2001). The current study also made use of the naming account of verbal behavior (Horne & Lowe, 1996). The research findings and philosophical approaches that have shaped previous work in analogical reasoning and the current study are explained below.

Equivalence

The establishment of classes composed of a variety of stimuli, through training only a few responses between members of the class, represented a significant step forward in technology related to instruction for learners with disabilities (Sidman & Tailby, 1982; Sidman, 2000). Following the basic logic that if A is the same as B, and B is the same as C, then A is the same as C, Sidman and Tailby (1982) designed procedures to test if individuals with learning disabilities could derive relations according to this formula. Not only were participants successful in displaying the ability to relate the A term to the C term, they were also able to display novel behavior in the form of relating B to A, C to B, and C to A.
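
The logic of derived relations described above can be illustrated with a short Python sketch. It is a minimal model offered under stated assumptions, not a description of the procedures used by Sidman and Tailby (1982) or in the current study: the function name derived_relations and the use of letter labels for stimuli are hypothetical. The sketch merges trained pairs into classes and then lists every ordered pair of distinct class members, so that the relations that emerge without direct training can be counted.

```python
from itertools import permutations


def derived_relations(trained):
    """Return every ordered pair of distinct stimuli that share a class."""
    classes = []  # disjoint sets of stimuli treated as equivalent
    for x, y in trained:
        merged = {x, y}
        untouched = []
        for cls in classes:
            if cls & merged:
                merged |= cls  # a shared member joins the two classes
            else:
                untouched.append(cls)
        classes = untouched + [merged]
    relations = set()
    for cls in classes:
        relations.update(permutations(sorted(cls), 2))
    return relations


trained = [("A", "B"), ("B", "C")]      # the two directly trained relations
all_relations = derived_relations(trained)
emergent = all_relations - set(trained)  # relations never directly trained

print(len(all_relations), sorted(emergent))
```

Under this model, training AB and BC yields six ordered relations in total, four of which (BA, CB, AC, and CA) are emergent.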
These findings suggested that for a total of six stimulus-stimulus relations to be established, one need only train two of these relations. Results of this nature have been replicated widely to establish stimulus classes based on money concepts (Keintz et al., in press), reading (Lane & Critchfield, 1998), and even geography (LeBlanc, Miguel, Cummings, Goldsmith, & Carr, 2003).

Within the analogy literature, researchers have developed their procedures based on the assumption that the pairs of terms that make up a proportional analogy function as their own equivalence class (Barnes et al., 1997). Within equivalence classes, stimuli can be thought of as representing one another or as interchangeable with one another. Within a classic A:B :: C:D analogy, A would be substitutable for B and C for D, and vice versa. For analogies based on math concepts, this logic is supported. Consider that in an analogy of 1/2 : .50 :: 1/3 : .33, .50 and 1/2 could be switched and the validity of the analogy would be maintained.

There are three leading explanations behind the phenomenon of stimulus equivalence: reinforcement contingency (Sidman, 2000), RFT, and verbal naming (Horne & Lowe, 1996). RFT and verbal naming are of particular interest, as the analogy studies to date have been conducted with RFT as the guiding principle and a verbal naming protocol was used in the current study.

Reinforcement Contingency

Sidman (2000) theorized that equivalence classes are formed based on the participation of the terms within a reinforcement contingency. All of the elements of a matching-to-sample protocol (the sample stimulus, the response, the comparison stimulus, and the reinforcer) come to participate as members of the equivalence class based on the effect of the contingent reinforcer. The relations between these elements would give rise to reflexivity, symmetry, and transitivity as they share common terms.
For example, if, given a sample A1, the response of touching the comparison B1 is reinforced with praise, then when presented with the sample B1, the individual will be likely to touch the comparison A1 because of the previous reinforcement contingency. This theoretical approach has several weaknesses. Specifically, in order for equivalence classes to be discriminated from one another when a common response (e.g., touching) or reinforcer (e.g., praise) is used in training, these common elements must be assumed to drop out of the equivalence classes to prevent cross-class responding (Sidman, 2000). In the current study, the selection responses and reinforcement provided in Phase II are the same across both Class 1 and Class 2 stimuli. Without the suggested “dropping out” phenomenon, Class 1 and Class 2 would become blended into one large equivalence class because they share common elements. This theoretical perspective on equivalence does not serve as the basis for the literature related to analogical reasoning. The recent behavior analytic research in analogy responding has grown out of an RFT approach to equivalence (Stewart & Barnes-Holmes, 2004).

Relational Frame Theory

Analogical reasoning, and any other form of complex verbal behavior, is proposed to be a product of arbitrarily applicable relational responding according to RFT’s proponents (Lipkens & Hayes, 2009). It is hypothesized that the manner in which an individual relates stimuli to one another, as in an analogy protocol, is brought under the control of contextual cues based on a substantial history of training embedded in natural interaction with caregivers. For example, if a language-competent individual is told that stimulus A “goes with” stimulus B, the individual would derive that B also “goes with” A. In the previous example, the phrase “goes with” functions as a contextual cue of “sameness,” which may be arbitrarily applied to the stimuli by the caregivers.
Consider the relation of a picture to a written word; in essence those relations are arbitrary, as no element of the picture necessitates its correspondence with the text. Those relations are applied to stimuli, acquired through natural interaction with caregivers, and eventually give rise to the relational frame. RFT accounts for other forms of relatedness including “different than,” “more than,” “less than,” etc. In a mutually entailed relation, an individual trained that Stimulus A is more than Stimulus B will respond to Stimulus B as less than Stimulus A without additional training (Lipkens & Hayes, 2009). It is from this theoretical perspective that procedures to establish analogical reasoning have been developed and recommended for learners with disabilities (Stewart & Barnes-Holmes, 2009).

The RFT account of analogical reasoning interprets correct responses to a four-term analogy as the product of several relational frames operating simultaneously. There is a relational frame operating between the A and B terms, between the C and D terms, and between the AB and CD pairs. Within a “sameness” analogy, Cat is to Dog as Apple is to Orange, the individual is believed to be deriving the relation of similarity within each pair and then a frame of coordination between the pairs. Should the analogy be based on “difference,” Cat is to Apple as Dog is to Orange, the individual is believed to be deriving the relation of dissimilarity within each pair and then a frame of distinction between the pairs. Several studies to date have shown that once the contextual cues of sameness and difference are established, individuals can come to form equivalence classes among single terms and then, on the basis of these relations, engage in correct same-same and different-different responding (see Stewart & Barnes-Holmes, 2004, for a review).
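
As a rough illustration only, and not part of the procedures of the current study, the logical structure of same-same and different-different analogies described above can be sketched in Python. The category labels ("animal," "fruit") and the names CLASS_OF, pair_relation, and completes_analogy are hypothetical and were chosen for this sketch; the code assumes class membership is already established and models only the relation between relations, not the behavioral processes hypothesized to produce it.

```python
# Hypothetical class membership for the example terms used in the text.
# The category labels are assumptions made for illustration.
CLASS_OF = {
    "Cat": "animal",
    "Dog": "animal",
    "Apple": "fruit",
    "Orange": "fruit",
}


def pair_relation(first, second):
    """Return "same" when both terms share a class, otherwise "different"."""
    return "same" if CLASS_OF[first] == CLASS_OF[second] else "different"


def completes_analogy(sample_pair, comparison_pair):
    """A comparison pair completes the analogy when the relation within it
    matches the relation within the sample pair."""
    return pair_relation(*sample_pair) == pair_relation(*comparison_pair)


if __name__ == "__main__":
    # Same-same: Cat is to Dog as Apple is to Orange.
    print(completes_analogy(("Cat", "Dog"), ("Apple", "Orange")))
    # Different-different: Cat is to Apple as Dog is to Orange.
    print(completes_analogy(("Cat", "Apple"), ("Dog", "Orange")))
    # Mismatch: a "same" sample paired with a "different" comparison.
    print(completes_analogy(("Cat", "Dog"), ("Cat", "Apple")))
```

In this sketch both the same-same and the different-different examples count as correct because the relation within the sample pair equals the relation within the comparison pair, whereas a "same" sample with a "different" comparison does not.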
Verbal Naming

The verbal naming account suggests that the development of many relations among members of a class, following the training of only a few, is due to the individual’s behavior as both a speaker and a listener in the presence of the training stimuli (Horne & Lowe, 1996). Thus, any reinforcement made contingent upon responding would strengthen both repertoires and give rise to performance not directly trained. It is hypothesized that this type of performance is dependent upon verbal imitation, which facilitates the transfer across repertoires. Take the child trained to select a cup when the word “cup” is spoken aloud. If the child also says “cup” in the presence of the cup, he/she will have behaved as a speaker and any reinforcement received will promote both future listener and speaker responding. These types of circular relations could also occur when a child is asked to vocally label a cup: the child will hear himself or herself say “cup” and likely orient towards the stimulus, completing a listener relation. Equivalence classes are hypothesized to be established when a learner is trained to relate to a group of stimuli with a common name such as “fruit,” “vegetables,” “vehicles,” “countries,” “show dogs,” etc. Research has supported the effectiveness of such protocols across a wide variety of different stimulus classes (Lowe, Horne, Harris, & Randle, 2002; Miguel et al., 2008).

An interpretation of analogical reasoning per the naming account would suggest that an individual must behave as both a speaker and a listener in the presence of the four terms. When presented with the A and B stimuli, the individual may vocally label the relation between the stimuli, which would naturally vary according to the construction of the analogy. The individual would then behave as a listener to his or her own speaker behavior and select the pair that corresponds with the originally labeled relation.
Upon correct responding, the CD pair of terms should evoke the same relational label as the AB pair; that is, the individual should be able to vocally label the relation between the two relational labels as “same.” For example, when presented with Apple: Orange, the individual could label their relation as “same.” The individual will respond to this verbal stimulus and select the C:D pair of Cat: Dog. Upon completion, the relations between the pairs will evoke the vocal label of “same” as well. An analogy based on difference would evoke the vocal label of “different” among the terms, but the vocal label of “same” between the relations. As such, Cat: Apple evokes “different” in the same manner that Dog: Orange would. However, since both pairs are “different,” the relation between the relations would be “same.” It is reasonable to suppose that a protocol based on this interpretation would give rise to analogy responding, as it bears great similarity to procedures proven to be successful in establishing categorization.

Definitions of Terms

Stimulus

The environment functions to evoke or elicit instances of behavior and subsequently affects its future strength. Stimuli are considered the specific elements of the environment that can be perceived by the organism. Stimuli are not necessarily static, but can be considered changes in the environment that can be noticed by the organism. These changes can take a variety of forms: auditory, tactile, proprioceptive, kinesthetic, gustatory, olfactory (Michael, 2004).

Response

Behavior “is the portion of an organism’s interaction with its environment that is characterized by detectable displacement in space through time of some part of the organism and that results in the measurable change in at least one aspect of the environment” (Johnston & Pennypacker, 1993a, p. 23). A particular occasion of behavior can be considered a response.
A group of responses that are controlled by the same antecedent and consequence stimuli can be considered a class of responses. For example, for a young child, the responses of asking for a cookie from a parent, reaching into the cookie jar to get a cookie, or hitting a sibling to get the sibling’s cookie are all controlled by the same antecedent event, hunger, and reinforced by the same consequence event, getting the cookie.

Reinforcement

Reinforcement is defined as a consequence event, in the form of a stimulus change, which functions to increase the future frequency of a class of responses (Michael, 2004). For example, if the frequency of the behavior of asking for a cookie increases following the delivery of a cookie, the stimulus change from not having a cookie to having a cookie can be considered reinforcing. Within the current study, praise was used to strengthen occurrences of the behavior.

Discriminative Stimulus

A discriminative stimulus refers to a stimulus correlated with the availability of reinforcement for a particular class of responses. Responses within the class will occur more frequently in the presence of the discriminative stimulus than in its absence (Skinner, 1938). For example, if a child’s mother has reinforced requests for cookies by allowing access to cookies, given that the child is hungry, the frequency of the behavior of asking for a cookie will increase in her presence. Within the current study, compound abstract stimuli and components of those compounds served as discriminative stimuli for the responses of vocally labeling “same” or “different” and matching.

Non-Discriminative Stimulus

A non-discriminative stimulus is correlated with the absence of reinforcement for a particular class of responses. Responses within the class will occur less frequently in the presence of the non-discriminative stimulus than in its absence (Skinner, 1938).
Extending the prior example, if the child’s father does not reinforce requests for cookies, even when the child is acutely hungry, the child will be unlikely to request a cookie from the father based on the prior history. The compound stimuli used within the current study functioned as discriminative and non-discriminative stimuli dependent upon their class membership and relatedness. For the response of “same,” compound stimuli in which both terms came from Class 1 functioned as a discriminative stimulus. However, compound stimuli made up of terms from Class 1 and Class 2 functioned as the non-discriminative stimulus for the response “same,” as no reinforcement was provided following this incorrect label.

Speaker Behavior

The term speaker behavior refers to several types of verbal behavior characterized by a reinforcement history mediated by a trained listener (Skinner, 1957). Different speaker behaviors are delineated by their function, rather than their form, and by the variables controlling their occurrence. For example, vocalizing “cookie” would be classified as a tact, echoic, mand, or intraverbal dependent upon the variables present in the environment and the history of reinforcement for the response. The current study employed speaker behavior in the form of tacts and echoics.

Tact

The tact response is evoked by a non-verbal stimulus and maintained by generalized conditioned reinforcement (Skinner, 1957). More specifically, the stimulus that occasions the behavior will not be one emitted by a speaker, such as a question or statement. In the current study, the non-verbal stimuli presented were the compound abstract stimuli. The response form can vary substantially, from a description of the appearance of the non-verbal stimulus to a comment regarding the individual’s past experience with the stimulus.
For example, upon seeing a cookie, a child may say, “Mom, that’s a cookie,” or “I had a cookie yesterday for snack”; both could be considered tacts as long as the only reinforcement that follows is the mother’s acknowledgement of the statement. Were the mother to give the child a cookie, this response would be classified as a mand because the child’s desire for the cookie itself may have controlled the emission of the response. In the presence of the compound abstract stimuli, the participants of the current study were trained to emit the tact of “same” or “different.”

Echoic

The echoic is defined as a verbal response evoked by a verbal stimulus with a history of generalized conditioned reinforcement. This verbal response will have formal similarity and point-to-point correspondence with the verbal stimulus that occasioned it (Skinner, 1957). Essentially, the response will be evoked by another speaker vocalizing and the form of the response will correspond closely to the initial vocalization. For example, the child may vocalize “cookie” upon hearing an older sibling say “cookie.” This type of vocal response is often used to train the tact response, as was the case in the current study. The experimenter vocalized, “Say same” in the presence of an abstract compound made up of stimuli from within the same class and reinforced the participant when they emitted the echoic “same.” Over time, the echoic prompt, “Say same,” was faded such that the participant emitted the vocal response, “same,” in the presence of the compound alone, consistent with a tact.

Listener Behavior

Behavior that functions to reinforce the behavior of a speaker is referred to as listener behavior (Skinner, 1957). Consider a mother who asks her child to “Give me a cookie.” The child will respond to the mother’s speaker behavior and reinforce the mother’s behavior by his or her compliance.
For the purposes of this study, this behavior took the form of selecting components of the compound stimuli when instructed by the experimenter in Phase III, “Select same,” or “Select different.”

Naming Operant

The naming operant refers to a higher order operant consisting of a bidirectional relation between stimuli that, once established, allows for the emergence of untrained verbal behavior across repertoires (Horne & Lowe, 1996). Essentially, naming is a type of response that, once learned, facilitates responding across a wide variety of environmental interactions. Returning to the example of the mother asking her child to “Give me a cookie,” this response may be facilitated by the child behaving as both a speaker and a listener in the presence of that cookie. In response to the mother’s initial vocal statement, the child may covertly or overtly engage in echoic responding. As the child is prompted to locate a cookie, the child will echo “cookie” in the presence of the actual cookie, which is consistent with a tact response. When the child gives the cookie to the mother, a listener behavior, the mother’s praise may simultaneously reinforce the child’s speaker behavior as well. Thus, in a future interaction the mother may ask the child, “What is this?” while holding the cookie and the child may correctly respond without prior training. The reverse could occur following speaker training. The mother may hold up the cookie and prompt the child to say “Cookie,” a speaker behavior. At the same time, the child may hear his or her own vocalization of “cookie” and orient to the cookie, a listener behavior.
The praise provided by the mother for saying “cookie” would inadvertently strengthen both repertoires and facilitate a future response to her instruction of “Find the cookie.” Within the current study, the naming operant is hypothesized to allow participants who were trained to act as speakers in the presence of the abstract compounds, labeling them as either “same” or “different,” to later select these stimuli as listeners without specific training.

Stimulus Equivalence

Eikeseth and Smith (1992) described demonstrations of equivalence among stimuli as “when a subject correctly matches any member of a class of stimuli with any other member of that class, despite having been trained on only a subset of the possible matches” (p. 123). Essentially, having been trained to relate A to B and B to C, the individual would demonstrate the ability to relate B to A, C to B, A to C, and C to A. This performance allows for the rapid development of complex classes of stimuli with significant educational relevance such as numbers to quantities, words to pictures, letters to sounds, etc.

Reflexivity

Reflexivity is a type of performance necessary for a demonstration of stimulus equivalence according to Sidman and Tailby (1982). Responding of this type is characterized by the relating of a stimulus to an identical stimulus (e.g., A to A). Thus, given an abstract form, such as those used in the current study, the individual could relate that stimulus to an identical stimulus.

Symmetry

Symmetry is a type of relation considered necessary for a demonstration of stimulus equivalence according to Sidman and Tailby (1982). For this type of performance, the individual would display the ability to relate stimuli in the opposite manner as originally taught, without additional training. For example, having been trained to relate A to B, the individual could relate B to A without additional training.
Within a reading task, this could include a child matching the written word cookie to a picture of a cookie after having been trained to match the picture to the word. Consistent with previous literature (Debert et al., 2007; Debert et al., 2009), the current study will treat untrained correct performance in the presence of an abstract compound whose stimuli have been switched from the positions used in training as a demonstration of symmetry. Specifically, having been trained to vocalize “same” in the presence of an AB compound, saying “same” in the presence of a BA compound will be considered symmetry.

Transitivity

The third component necessary for a demonstration of stimulus equivalence, per Sidman and Tailby (1982), is referred to as transitivity. In this type of performance, an individual would demonstrate the ability to relate stimuli, without additional training, that have never been presented together, but share a common term. For example, having been trained to relate A to B and B to C, the individual could relate A to C. Taking the reading example further, having learned to match a picture of a cookie to the word cookie (A to B) and the word cookie to the word oven (B to C), the individual would demonstrate transitivity by matching the picture of the cookie to the word oven without additional training (A to C). Based on previous literature (Debert et al., 2007; Debert et al., 2009), transitivity within the current study would be demonstrated by the individual vocalizing “different” in the presence of an AC compound having only been trained to do so in the presence of AB and BC compounds.

Compound Stimulus

Maguire, Stromer, Mackay, and Demis (1994) described a complex stimulus as “A stimulus that consists of multiple components or elements, each of which may exert stimulus control over the same behavior” (p. 753). This term will be used interchangeably with “compound stimulus” to be consistent with current literature.
The current study made use of abstract figures presented side-by-side as compound stimuli.

Equivalence-Equivalence

Performance consisting of relating compound stimuli with terms that are members of the same equivalence classes is considered equivalence-equivalence responding (Barnes, Hegarty, & Smeets, 1997). Essentially, if the A and B terms presented within a compound stimulus are both from Class 1, they are considered “same” or equivalent to one another. Selecting a “same” compound when presented another compound with “same” terms would be a demonstration of “same”-“same” matching, or equivalence-equivalence responding.

Non-equivalence-Non-equivalence

Relating compound stimuli with terms that are members of different equivalence classes is considered non-equivalence-non-equivalence responding (Barnes et al., 1997). In the reverse of the relations described above, if the A and B terms of a compound are from different classes, Class 1 and Class 2, they will be considered “different” or non-equivalent to one another. Selecting a “different” compound when presented another “different” compound would be a demonstration of “different”-“different” matching, or non-equivalence-non-equivalence responding.

Justification

Cognitive psychology has most widely studied analogical reasoning and explained the emergence of this skill as a product of accessing concept schema from memory, mapping between schema, and construction of mental relations (Schiff, Bauminger, & Toledo, 1999). However, this account of responding is based on models of cognition that have not and cannot be proven or specifically trained (Zuriff, 2003). Methodological issues like these can give rise to circular reasoning and subsequent insufficient investigation. For example, analogical reasoning is said to be an indication of intelligence, while only intelligent individuals are said to be able to engage in analogical reasoning. Models of this nature hinder scientific inquiry and keep practice from improving (Skinner, 1938).
To date, the majority of the research has focused on the age of emergence of analogical reasoning in typical learners, rather than on factors controlling the acquisition of the skill itself (Goswami & Brown, 1989, 1990; Freeman & Goswami, 2001). This gap in research led to the perpetuation of a flawed finding that responding to analogical tasks is impossible or rare prior to age 11 or 12 (Inhelder & Piaget, 1958). Some research findings contradicted this early finding, demonstrating the emergence of basic analogical reasoning in children as young as three years (Goswami & Brown, 1989, 1990). However, more recent studies have shown that responding to the analogies presented in those studies could have been accomplished on the basis of associations alone without the application of more complex reasoning skills (Carpentier et al., 2004). Focus on the age of emergence of the skill is misguided; instead, studies should examine methodologies that are effective in establishing the skill, as these findings will provide a pragmatic benefit to the participants themselves and to educators working to develop these skills in learners. Further, without a thorough examination of the types of procedures that can be used to establish this reasoning skill in typical learners, special educators will be presented with even steeper challenges when working with their students to teach these skills. Analogical reasoning is widely viewed as an indication of intelligence; therefore, it is crucial that individuals with disabilities be given every opportunity to respond successfully.

Limitations

The current study did not examine variables controlling acquisition of analogical reasoning as it occurs in typical development. The procedures suggested in the methods section were based on previous research in the equivalence and analogy literature that has been effective in establishing the desired performance of matching compound stimuli.
These procedures may have no relation to the manner in which analogy responding emerges naturally in typically developing children, but this was outside the scope of the current investigation. Additionally, the outcomes of this study give no indication of the prior knowledge or intelligence of the participants, though this is a common use of analogy problems. All the test stimuli were arbitrary and abstract, having no connection to actual, useful stimulus relations. Once sufficient research has demonstrated the effectiveness of these teaching and testing procedures with abstract stimuli, they should be validated with real, educationally significant stimulus relations. Additionally, the form of analogy testing utilized in this study could be used independent of the training protocol to examine an individual’s current knowledge of mastered stimulus relations. Finally, the analogies trained in this protocol do not encompass the wide range of possible stimulus-stimulus relations that could be tested with the use of an analogy problem. Only same-same and different-different analogies were examined within this protocol. These would include relations such as “50% is to half as 100% is to whole,” same-same, or “100% is to half as 50% is to whole,” different-different. This neglects a multitude of other relations such as part: whole, object: function, concept: opposite, etc. Future research should investigate parameters needed to establish analogical reasoning consistent with those more complex stimulus-stimulus relations.

Organization of the Thesis

This thesis is organized into five chapters. Chapter 1 introduces the use of analogical reasoning within education, presents the conceptual framework that gave rise to the current work, and outlines the parameters of this investigation.
Chapter 2 examines the literature on analogical reasoning, the RFT approach to analogical reasoning, extensions of this account, the naming account of verbal behavior as it relates to categorization, and the separable compounds account of stimulus equivalence. Chapter 3 provides a description of the participants, the research methods used, and a rationale for using such methods. Chapter 4 presents the findings using both quantitative and qualitative methods based on the research methods discussed in Chapter 3. Finally, Chapter 5 provides a synopsis of the findings and a discussion of the limitations of the study and implications for future practice and research.

Chapter 2

REVIEW OF THE LITERATURE

The procedures utilized within the current study were developed based on findings from research related to analogies, equivalence-equivalence, compound stimuli, and verbal naming. The dearth of experimental research on classical analogies, particularly with individuals with disabilities, establishes the need for thorough investigation of this skill area. Furthermore, these studies served as the basis for the equivalence-equivalence research lines. The methods of the current study were developed largely from the equivalence-equivalence model of analogy (Barnes et al., 1997); thus a thorough review of the scope of these studies is included below. From the basic equivalence-equivalence model, several variations and extensions have been experimentally investigated. These findings are reported to show the existing boundaries of this line of research. Studies specifically examining the use of compound stimuli are included to give foundation to the mechanisms hypothesized to be responsible for the success of the equivalence-equivalence model of analogy and the methods within the current study. Additionally, the adaptations made to the basic equivalence-equivalence model were derived from the research in the use of compound stimuli and the verbal naming repertoire.
Finally, results from verbal naming studies served as the basis of the use of a vocal response within the procedures of the current study. These studies do not relate directly to analogical reasoning, but show the effectiveness of a vocal response in establishing equivalence classes, a necessary component of the equivalence-equivalence approach.

Experimental Research in Analogical Reasoning

The absence of clearly defined, verifiable hypotheses within the early studies relating to analogical reasoning contributed to controversial conclusions and limited treatment applications. Piaget, Montangero, and Billeter (1977) developed an analogy task based on categorizing images. Children aged 5 to 12 years were given a variety of images and instructed to match them into pairs and then match the pairs with other pairs. Some of the analogies to be formed were based on notable features, such as ‘bicycle : handlebars :: ship : wheel.’ Other analogies were based on functional relationships, including ‘nurse : syringe :: barber : scissors.’ Findings suggested that young children, 5-7 years, could not form analogies, as they often failed to identify the D term the experimenters had designated as correct. Based on these findings, Piaget hypothesized that children of this age were only capable of forming associations between pictures, what he termed ‘first-order’ relations and what behavior analytic research would refer to as equivalence. However, his analysis excluded the vocal-verbal behavior of the participants as they stated their intentions to locate images that would have completed the analogies appropriately. Results showed that children aged 7 to 12 years were somewhat successful in forming some analogies, but typically failed to take all four terms into account in their responses.
Piaget asserted that children at this age had not yet developed the ability to derive relations between relations, what he called ‘second-order’ relations and what behavior analytic research has termed equivalence-equivalence (Barnes, Hegarty, & Smeets, 1997). This work gave no recommendations related to teaching children to form analogies, but only concluded that young children were incapable of the skill.

Goswami and Brown (1990) developed a task they believed would give a more thorough assessment of young children’s proficiency with formal analogies. The children were shown the A and B terms of the analogy, two related images, and the C term, a single image. They were then allowed to select the D term from a field of four images. The relationships between the stimuli were based on where an animal lived (e.g., spider : web and bee : hive), where items belong (e.g., gloves : hands and shoes : feet), or where things come from (e.g., cow : milk and hen : egg). Their findings indicated that while the 9-year-old children were most successful, 4- and 5-year-olds responded above chance levels. They determined that the children showed little evidence of reliance on ‘first-order’ relations, contradicting Piaget, Montangero, and Billeter’s findings (1977). When asked if there could be another solution to the analogy task, the majority of the children denied this possibility. Goswami and Brown (1990) hypothesized that this performance indicated an awareness of the interrelations between all four terms in the analogy. Further, they suggested that the failure observed in Piaget and colleagues’ (1977) study was due to the participants’ lack of familiarity with the items used in testing rather than cognitive deficits.

Goswami and Brown (1989) developed a second task to attempt to specifically assess the effect of increased relational knowledge on proficiency with analogies.
The terms were all represented with pictures of familiar items in various states, such as a loaf of bread shown whole or sliced. Of the response choices, only one could be considered a correct response, as it consisted of the same type of object as the C term and had the same relationship to the C term as the B term did to the A term. The three other choices could be images that were visually similar, images with a similar function, an image similar to that of the B term, or an image similar to the C term but with a relationship not analogically related to the B and A terms. Results showed that increased success was correlated with age, though even the 3-year-olds responded at levels above chance. Importantly, the experimenters had little control over the participants’ existing knowledge of the images used in the study and possible contamination from other sources of input such as school. As with the Piaget et al. (1977) study, findings from the study did not yield specific methodologies useful for teaching children to engage in analogical reasoning.

Recent research has questioned the validity of the Goswami and Brown (1990) study in demonstrating children’s actual proficiency with formal analogies (Carpentier, Smeets, Barnes-Holmes, & Stewart, 2004c). The authors hypothesized that the participants may have been able to solve the presented tasks based only on the relationship of the B term to the D term. The experimenters reformatted the analogy task from the Goswami and Brown (1990) study but replaced the A and C terms with an X and Y respectively. Thus, participants had to select a D term from four choices based only on the shown B term. The 12 adult participants selected the correct D term on 97% of the trials, strongly undermining the validity of these tasks as a test for classical analogical reasoning, as only one term of the analogy was necessary for the correct image to be selected.
These results also undermine the construct validity of the Goswami and Brown (1989; 1990) studies as measures of analogical reasoning altogether and highlight the need for strongly internally controlled procedures.
In one of the few experimental studies related to analogy performance with individuals with disabilities, Schiff (2009) compared performance on analogical reasoning tasks across different population groups to assess the role of verbal abilities in mediating performance. Of the participants, 25 were typically developing, while 40 had learning disabilities. Among the participants with learning disabilities, 20 were categorized as having verbal learning disabilities and 20 were categorized as having nonverbal learning disabilities. In Stage 1, experimenters read two stories aloud to participants. The stories contained several common elements: a protagonist, a goal, an obstacle, and a solution. Participants were tested on their ability to recall these elements from the previously read stories in Stage 1, their ability to relate the common elements across stories in Stage 2, and finally their ability to solve a similar problem in a real-world scenario in Stage 3. Across all three tests, the children without disabilities significantly outperformed both groups with learning disabilities, though notably only 44% formed analogies across stories. Responding was similar among the verbal and nonverbal learning disability populations, except on the physical task. Only one participant in the group with nonverbal learning disabilities correctly responded, while nine in the group with verbal learning disabilities solved the presented problem. The author suggested that these data indicate that verbal recall has little relationship to physical problem solving and that the processes giving rise to analogical reasoning are separate from both recall and physical problem solving.
However, these results are limited as there was no pretest for the physical problem-solving task, the performance of individuals was not assessed, and the failure rate of the analogical reasoning task suggests the procedure was flawed. Without assessing performance on the physical task prior to exposure to the stories, the mediation of verbal recall or abstraction cannot be analyzed. Participants may have been able to solve the physical task at the beginning of the study; thus, no verbal mediation may have taken place. Additionally, without analysis across individuals, no meaningful analysis regarding the relationship between recall and abstraction can take place. Finally, since research suggests that children as young as three years can solve analogy tasks (Goswami & Brown, 1989; 1990), the failure rate of the control group suggests that the procedure was unclear to the participants.
In summary, experimental research related to analogical reasoning suggests that correct responding is possible for young children (Goswami & Brown, 1989; 1990) and individuals with disabilities (Schiff, 2009), but the true sophistication of these skills has yet to be thoroughly examined. Across studies, little effort was made to control for prior knowledge or related skills, which limits the application of these studies in an educational setting as a teaching tool. The influence of these uncontrolled variables particularly affects the usefulness of these protocols with learners with deficits in language and cognition, as their age alone gives a poor measure of their skill competency. Notably, these early studies attempted to translate a conceptual, cognitive skill into a testable format, which contributed to the design of the equivalence-equivalence model (Barnes, Hegarty, & Smeets, 1997).
Analogies as Equivalence-Equivalence Responding
Barnes, Hegarty, and Smeets (1997) developed a model of analogical reasoning drawing from the RFT account of equivalence relations: “We take the view that equivalence-equivalence responding is an example of a relational network as defined by relational frame theory” (p. 3). The authors hypothesized that analogies would be formed on the basis of equivalence relations within distinct contexts. In their protocol, participants were first trained to relate A to B stimuli and A to C stimuli across four classes. In Experiment 1, the participants were exposed to an equivalence test to assess for the emergence of symmetry and transitivity prior to testing for equivalence-equivalence responding. The same procedures were used in Experiment 2 with the exception that the order of the testing was reversed. All the participants successfully passed the equivalence-equivalence tasks, including two children aged 12 and 9. This was the first behavior analytic study to address analogical reasoning and the basis for the line of research to follow.
Carpentier, Smeets, and Barnes-Holmes (2002) conducted a series of experiments to assess the variables affecting the emergence of equivalence-equivalence responding in children. In Experiment 1, three participant groups of different ages (adults, nine-year-olds, and five-year-olds) were exposed to the protocol described by Barnes et al. (1997). The participants were trained to relate A and B terms and A and C terms, then tested for symmetry and equivalence. Equivalence-equivalence and non-equivalence-equivalence testing followed with the use of BC compounds. All participants displayed equivalence responding in the form of relating B to A, C to A, B to C, and C to B, as is consistent with symmetry and transitivity. However, only three adults and three nine-year-olds displayed correct equivalence-equivalence responding, relating BC to BC compounds; no five-year-olds passed these tests successfully.
To troubleshoot this failure, in Experiment 2, the experimenters ran the protocol again with the addition of remedial sample-comparison compound construction training. In this training, the B term was presented with a field of three C term comparisons and with a field of three BC compounds. The experimenter pointed to a BC compound then assisted the participant in selecting the C term that, when combined with the B sample, formed the matching compound. Once participants met criteria for this task, they were exposed again to equivalence-equivalence testing. Results showed that all adults and all nine-year-olds demonstrated the tested relations while, again, no five-year-olds did. In Experiment 3, the sample-comparison compound training was implemented following equivalence testing to assess if this sequencing of tasks facilitated correct responding. Findings showed no improvements, as all four five-year-old participants still failed equivalence-equivalence testing. In the final experiment, four five-year-olds were successfully trained to demonstrate equivalence-equivalence responding with a final modification to the original protocol. As soon as A-B and A-C training was complete, the participants were exposed to (sample-comparison) compound training with the use of only AB and AC compounds. The compounds were considered “baseline compounds” as they were composed only of stimuli presented within equivalence training. Relating these compounds would not require the participants to derive relations via symmetry or transitivity, as they would with “emergent compounds.” Results showed that the participants all successfully passed equivalence-equivalence testing with baseline compounds (AB and AC compounds) and subsequent equivalence-equivalence testing with emergent compounds (BC compounds).
The authors hypothesized that the five-year-old children had difficulty deriving relations among the compound stimuli, but that exposure to baseline compounds facilitated performance on the more difficult task.
Carpentier, Smeets, and Barnes-Holmes (2003) further examined the procedural modifications from the Carpentier et al. (2002) study, which resulted in four out of four five-year-old children successfully passing the equivalence-equivalence testing. In the Carpentier and colleagues (2003) study, eight five-year-old children served as participants and were exposed to A to B and A to C training, sample-comparison construction training with baseline compounds, and then equivalence-equivalence testing with baseline compounds as in the previous Carpentier and colleagues (2002) study. In the Carpentier and colleagues (2002) study, the equivalence testing was initiated immediately following the baseline compounds equivalence-equivalence testing, and it was suggested by the authors that this task sequence may have facilitated responding on the emergent compounds equivalence-equivalence testing. To examine this sequencing effect, sample-comparison construction training and equivalence-equivalence testing with emergent compounds immediately followed equivalence-equivalence testing with baseline compounds for the eight five-year-olds in the Carpentier et al. (2003) study. With this modification, only one participant passed the emergent compound equivalence-equivalence testing on the first exposure. On the following equivalence testing, all but two participants passed. The disparity in effect across studies suggests that the order of training and testing contributed to the success of the four children in the Carpentier et al. (2002) study. Notably, 75% of the children that passed the baseline compounds equivalence-equivalence test on first exposure eventually passed the emergent compounds equivalence-equivalence test.
Failure on the baseline compounds equivalence-equivalence test was 100% correlated with failure on the same task with emergent compounds.
Experiments 2, 3, and 4 of the Carpentier and colleagues (2003) study examined additional parameters designed to facilitate correct responding on the equivalence-equivalence test with baseline compounds in an effort to positively affect success on the equivalence-equivalence test with emergent compounds. In the course of the experiments, five of eight participants passed the baseline compounds equivalence-equivalence testing without specific training, and of these only one went on to pass with emergent compounds prior to equivalence testing. This finding demonstrated that the A to B and A to C training along with the sample-comparison construction training was not universally effective in establishing responding on equivalence-equivalence testing, as it was in the Carpentier and colleagues (2002) study. Four more participants passed the equivalence-equivalence testing with emergent compounds after at least one exposure to equivalence testing, supporting findings from Experiment 1. Throughout Experiments 2, 3, and 4, seven out of 12 participants never demonstrated equivalence-equivalence with emergent compounds, even with substantial procedural modifications in place. To confirm that these failures were not due to errors in the procedure, adult participants were used in Experiment 5. Three of four adults passed the equivalence-equivalence test with emergent compounds without equivalence testing, and the fourth passed after equivalence testing. Taken together, these findings in combination with Carpentier et al.’s (2002) results show that the analogy teaching protocol has only yielded correct responding in 43% of the five-year-olds who have participated.
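The distinction between baseline and emergent compounds in the Carpentier et al. (2002; 2003) protocols can be illustrated with a short sketch. The code below is only an illustration under stated assumptions, not the authors' software: the set `TRAINED_RELATIONS` and the function `compound_type` are hypothetical names, and the stimulus sets are reduced to the letters A, B, and C. A compound is treated as "baseline" when its two terms were directly paired in training (A-B or A-C) and as "emergent" when relating them would require symmetry or transitivity.

```python
from itertools import permutations

# Relations trained directly in the A-B / A-C protocol.
# (Hypothetical labels; the reviewed studies used abstract stimuli.)
TRAINED_RELATIONS = {("A", "B"), ("A", "C")}


def compound_type(first, second, trained=TRAINED_RELATIONS):
    """Return 'baseline' if the ordered pair was directly trained,
    otherwise 'emergent' (the relation must be derived through
    symmetry, transitivity, or both)."""
    return "baseline" if (first, second) in trained else "emergent"


# Classify every ordered two-term compound built from A, B, and C.
for first, second in permutations("ABC", 2):
    print(first + second, compound_type(first, second))
```

Under these assumptions only AB and AC count as baseline compounds, while BC, the compound used in the original equivalence-equivalence tests, is emergent.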
Ruiz and Luciano (2011) assessed the strength of the equivalence-equivalence model by establishing several fully distinct classes of stimuli and then testing for analogies across these domains. The 12 adult participants were first trained to form three equivalence classes, Class A, Class B, and Class C, composed of three stimuli consisting of random syllables. Next, they were tested for the emergence of equivalence-equivalence and non-equivalence-non-equivalence responding with compound stimuli from within those three classes. In the next phase, participants were trained to form two three-member classes composed of abstract figures, Series 1, and two three-member classes composed of Greek letters, Series 2. To verify the formation of the distinct classes, the participants were tested for the emergence of transitive relations. Finally, an analogy task was presented in which the sample stimuli were compounds from Series 1 and the comparison stimuli were compounds from Series 2. Important to note, the participants had never been trained in any way to relate the abstract figures to the Greek letters. Correct performance on this task would indicate a highly generalized form of equivalence-equivalence responding, yet to be demonstrated in experimental research. Results showed that 11 of 12 participants passed the initial analogy task; however, most required three or more attempts. On the cross-domain analogy tasks, the 10 participants who attempted the test passed, and six of the 10 did so on their first attempt. Notably, difficulty on the initial analogy task was correlated with difficulty on the cross-domain task. In Experiment 2, the participants were not exposed to a test for transitive relations prior to the cross-domain analogy test, and findings showed that seven of the 10 participants passed.
The three participants that did not pass on their first attempt were exposed to the transitivity test then re-tested on the cross-domain analogies until mastery was demonstrated. As in Experiment 1, these three participants also required multiple attempts to pass the initial analogy task. This study (Ruiz & Luciano, 2011) was the first to explore the extent of participants’ identification of relatedness and distinction across a broad range of categories. These findings support the equivalence-equivalence model of analogy by showing that it may be effective in teaching individuals to relate a variety of stimuli in a highly generative manner.
Overall, the analogy protocol developed by Barnes, Hegarty, and Smeets (1997) has been highly successful with cognitively typical adult participants. The basic components of the procedure, establishing equivalence classes followed by testing for equivalence-equivalence, served as the foundation for the current study, though the methods of training and order of testing were changed to incorporate verbal naming research (Horne & Lowe, 1996). The current study also included “baseline compound” testing prior to testing with “emergent compounds,” as this procedural variation was most successful in facilitating correct responding for the child participants (Carpentier et al., 2002; 2003). It was hypothesized that this procedure would be effective, despite the differences in the training methods, as the equivalence-equivalence testing procedures share several characteristics with the earlier works. Findings of the studies reviewed above contributed strongly to the current study; however, research in analogical reasoning has extended to include a variety of other stimuli, testing methods, and dependent measures. These works are described below to define the boundaries of investigation in the equivalence-equivalence phenomena.
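The two basic components named above, establishing equivalence classes and then testing for equivalence-equivalence, can be sketched in a few lines. This is a minimal illustration under stated assumptions rather than any published procedure: the stimulus labels (A1, B1, and so on) and the function names `build_classes`, `relation`, and `matching_comparisons` are hypothetical. Classes are derived from trained pairs by treating each trained relation as symmetric and transitive, and a comparison compound counts as correct when its internal relation ("same" or "different") matches that of the sample compound.

```python
def build_classes(trained_pairs):
    """Map each stimulus to a class representative, merging stimuli
    that are linked (directly or indirectly) by trained pairs."""
    parent = {}

    def find(stimulus):
        parent.setdefault(stimulus, stimulus)
        while parent[stimulus] != stimulus:
            stimulus = parent[stimulus]
        return stimulus

    for first, second in trained_pairs:
        root_first, root_second = find(first), find(second)
        parent[root_first] = root_second
    return {stimulus: find(stimulus) for stimulus in list(parent)}


def relation(compound, classes):
    """Return 'same' if both terms share a class, else 'different'."""
    first, second = compound
    return "same" if classes[first] == classes[second] else "different"


def matching_comparisons(sample, comparisons, classes):
    """Return the comparison compounds whose internal relation
    matches the sample's relation (the analogical response)."""
    target = relation(sample, classes)
    return [c for c in comparisons if relation(c, classes) == target]


# A-B and A-C training for two classes (hypothetical labels).
trained = [("A1", "B1"), ("A1", "C1"), ("A2", "B2"), ("A2", "C2")]
classes = build_classes(trained)

# Sample B1C1 is an equivalent compound, so B2C2 is the match.
print(matching_comparisons(("B1", "C1"), [("B2", "C2"), ("B2", "C1")], classes))
```

Note that B1 and C1 are never paired in training here; their "same" relation follows only from their shared link to A1, which is what makes BC compounds emergent.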
Extensions of Equivalence-Equivalence Models of Analogies
Research in the equivalence-equivalence model has progressed beyond the basic protocol established by Barnes and colleagues (1997) to include a variety of different elements and measures. Stewart, Barnes-Holmes, Roche, and Smeets (2001) based their experimental procedures on the logic that many analogical relations have some element of non-arbitrary relations. Depending upon the terms within an analogy, respondents may be deriving extremely complex relations that could only be established through language (e.g., loquacious is to talkative as reticent is to taciturn) or ones simply based on shared visual properties such as sizes, colors, shapes, etc. Stewart and colleagues (2001) utilized an equivalence-equivalence procedure to assess if adult participants could select compounds based on abstractions from trained relations with non-arbitrary visual stimuli. Participants were first trained to relate arbitrary stimuli, nonsense syllables in written form, to blue and red colored shapes in order to form two distinct equivalence classes. Stimuli from Class 1 were all related to a blue shape while stimuli from Class 2 were all related to a red shape. The nine participants were broken into two groups; five were exposed to the equivalence test following training while the other four were immediately exposed to equivalence-equivalence testing. In the equivalence test, the participants were tested for equivalence relations based on abstraction of color when presented only with the arbitrary stimuli (e.g., D1 to A1 or D2 to A2). During equivalence-equivalence testing the participants were presented with compound sample and comparison stimuli consisting only of nonsense syllables. If the compound stimulus was made up of terms from the same color-based class, it was considered equivalent. If the terms were from different classes, the stimulus was considered non-equivalent.
Of the five participants exposed to the equivalence test immediately following training, four passed both tests while one failed both. Of the four participants exposed to the equivalence-equivalence test immediately following training, two passed while two passed only after passing the equivalence test.
In Experiment 2, the authors (Stewart et al., 2001) trained an additional stimulus relation to assess if the non-arbitrary stimulus relations, color or shape, could be abstracted from within an even more complex equivalence network. Four new adult participants were exposed to the training procedure described in Experiment 1 with the addition of a new phase consisting of only arbitrary stimuli. Participants were trained to relate the nonsense syllables to the colored shapes, tested for equivalence, and if they passed they were tested for equivalence-equivalence. Following success on this initial phase, the participants were trained to relate the nonsense syllables associated with the colored shapes to novel nonsense syllables. Lastly, participants were tested for the emergence of equivalence and equivalence-equivalence relations with the use of the novel nonsense syllables. Three of the four participants passed all four tests, while one participant did not pass the equivalence-equivalence test from the first phase and thus was discontinued from the study.
In a final experiment, the authors (Stewart et al., 2001) replicated the previous procedure with the use of non-arbitrary stimuli that were visually more complex to more closely approximate analogies of a more challenging nature. Visual stimuli consistent with the concept of “new” and “old” were used in place of the red and blue shapes for Experiment 3. Pre-training conditions were added to teach the participants to relate the various “new” images to “new” images and “old” images to “old” images.
In a second pre-training task, nonsense syllables were superimposed over the images such that participants would incidentally be relating these arbitrary stimuli. “New” and “old” images were each paired with two different arbitrary stimuli (e.g., E1 and F1 for “new” and E2 and F2 for “old”) and the participants were tested on their ability to relate these arbitrary stimuli in the absence of the images (e.g., E1 to F1, E2 to F2) as the final step in the pre-training condition. Once this performance was established, participants were exposed to the two phases used in Experiment 2. All three adult participants passed the two equivalence tests and two equivalence-equivalence tests.
These findings extended the use of equivalence-equivalence testing; however, no language-based responding was utilized in training and testing, which is in glaring contrast to the typical context of analogy teaching in an educational setting. In fact, language in the form of covert naming may have been responsible for the success within these tests, though it was never explicitly required. In Experiments 1 and 2, the colored stimuli should have been sufficient to evoke the tact of “red” or “blue,” as the participants were all verbally competent adults. Training the participants to relate arbitrary stimuli to the color stimulus served to expand the equivalence class such that all responding evoked by the non-arbitrary stimuli should now be evoked by the arbitrary stimuli. Specifically, once these color-based equivalence classes were intact, all stimuli within the class should have been effective in evoking the vocal tact of either “red” or “blue.” Based on the naming account, participants should have been able to select the corresponding arbitrary stimulus based on the agreement between the tact evoked by the sample and the tact evoked by the comparison.
Thus, completion of the equivalence tests could be explained by the participant tacting the color of the sample then selecting the comparison that evoked the same color label. Further, success on the equivalence-equivalence tests could follow the same model, as the participants could tact an equivalent compound stimulus as “red-red” or “blue-blue.” This interpretation could still explain responding observed in Experiment 3, differing only in the response form of the tact evoked (e.g., “old” and “new”). Any procedure using non-arbitrary stimuli will be open to this interpretation; thus, procedures that make use of verbal skills should be favored in these instances rather than procedures that ignore verbal skills.
Following up on their 2001 study, Stewart, Barnes-Holmes, Roche, and Smeets (2002) trained participants to derive relations along the dimensions of shape and color. In the first experiment, seven adults were trained using an MTS procedure to relate a colored shape (Stimulus A) to comparisons made up of nonsense syllables (Stimuli B and C). The samples consisted of a red circle, a red square, a blue circle, and a blue square so that if the procedure was successful two classes would have a common color (either red or blue) and two different classes would have a common shape (either circle or square). Testing indicated that four three-member classes (A-B-C) were formed via emergent relations across all seven participants. Following equivalence testing, participants were divided into three groups (color, shape, and control) and exposed to equivalence-equivalence trials. Across all three groups, no non-equivalence-non-equivalence trials were conducted, meaning all correct relations involved stimuli from within the same experimenter-designated classes. In the color-based group, the correct comparison was related to the sample by color, while in the shape-based group the correct comparison was related by shape.
The control group contained trials that had both shape- and color-based relations. Testing showed that all seven participants displayed correct equivalence-equivalence responding. Finally, these participants were exposed to a task in which they were given the opportunity to relate stimuli either on the basis of color or shape, with the hypothesis being that participants from the color group would favor matching by color and participants from the shape group would favor matching by shape. Results confirmed this hypothesis, with control group participants displaying both color and shape matches.
The authors (Stewart et al., 2002) hypothesized that the observed equivalence-equivalence responding was supported by the formation of relational frames related to the actual physical dimensions of the original sample term. This conclusion is not supported by the data from Experiment 1, as the equivalence-equivalence relations formed could have been completed on the basis of their class relation alone. As the sample and correct comparison both contained related terms and the incorrect comparison always contained unrelated terms, additional control from derived color or shape relations may not have been necessary. It is possible that responding could even have been controlled by the exclusion of the unrelated compound. For discrimination based on physical properties to be demonstrated, four additional stimulus classes would need to be generated and included in equivalence-equivalence testing with a field of three. Two new shapes would be introduced in both red and blue, such that in equivalence-equivalence testing the participant would have to choose between a comparison with unrelated terms, a comparison with related terms with no shared properties with the sample, and a comparison with related terms with one shared property with the sample. Stewart et al. (2002) tried to address this limitation in Experiment 2.
In Experiment 2, four participants were exposed to the same procedure as in Experiment 1 with the addition of a block-sorting task and two additional class members (D and E). Participants were trained to relate established equivalence class members to the new terms, and then tested for emergent relations. These new class members would be used in the equivalence-equivalence testing differently among pairs of participants. For two participants, stimulus compounds were presented in a field of three comparisons such that the correct comparison had both related terms and derived formal similarity to the sample, one incorrect comparison had unrelated terms but derived formal similarity to the sample, and the other incorrect comparison had unrelated terms and no derived relations to the sample. Participants were hypothesized to relate stimuli on the basis of equivalence-equivalence and the derived formal properties of either shape or color. The other two participants were exposed to the D and E stimuli within the presented samples and comparisons without any modification to the Experiment 1 protocol. Correct responding would suggest a more complex composition of the equivalence-equivalence class, as the D and E terms were not established with a many-to-one training. Results showed that all four participants passed the equivalence and equivalence-equivalence tests. Transformation of function was tested with the use of a block-sorting task with stimuli with common color and shape properties. Results showed that following equivalence-equivalence testing with stimuli with derived color relations, participants sorted the blocks according to color. The reverse was also true when the derived relation was shape, showing the transformation of function of the blocks according to the relational frame established through testing.
The results of Experiment 2 suggest that the participants derived the formal property relations among the stimulus compounds, based on the reversals in block sorting, but the formal similarity was still not definitively shown to affect equivalence-equivalence responding. The inclusion of the third comparison ruled out responding controlled by exclusion of unrelated terms, but did not eliminate the suggestion that responding was based solely on “sameness.” The participants did not need to derive color or shape relations to choose the correct comparison; they could match based on the terms in the sample participating in the same equivalence class and the terms in the comparison participating in the same equivalence class. Additionally, while these findings (Stewart et al., 2002) are a robust demonstration of the effectiveness of MTS procedures in establishing equivalence relations and equivalence-equivalence relations without additional training, the participants were never tested on relating compounds composed of completely different derived relations. In Experiment 1, all the testing was conducted with BC compounds and in Experiment 2, all testing was conducted with DE compounds. Based on subsequent research with use of compound stimuli (Debert et al., 2007; Debert et al., 2009), correct equivalence-equivalence responding should have emerged when presented with related compounds based on symmetrical relations (BA, CA, CE, and BD) and transitive relations (AD, AE, BE, CD, and ED). Finally, as noted by the authors (Stewart et al., 2002), these results are limited as all participants were adults with extensive verbal histories.
Carpentier, Smeets, Barnes-Holmes, and Stewart (2004) extended the equivalence-equivalence literature to address possible inappropriate sources of stimulus control within the testing and training procedures. Results from Carpentier et al. (2003b) suggested that participants related some compound stimuli based on their discriminative functions, as demonstrated by matching compounds with equivalent terms to a Happy Face and compounds with non-equivalent terms to a Sad Face. To control for this possible source of spurious stimulus control, the experimenters tested analogies made up of only equivalent compounds. Three five-member equivalence classes consisting of three abstract forms (X, Y, and Z), a color, and a shape were trained using basic matching-to-sample (MTS) procedures on a computer: Class 1 (X1-Y1-Z1-red-triangle), Class 2 (X2-Y2-Z2-green-circle), and Class 3 (X3-Y3-Z3-blue-square). The four adult participants were trained to relate the X and Y stimuli to a color and the X and Z stimuli to a form, and then tested for the emergence of equivalence relations among these arbitrary figures. In the analogy task, participants were shown a sample consisting of equivalent terms related by either color or shape (e.g., X3Y3, both related to blue, or X3Z3, both related to square) and allowed to select from two equivalent comparisons, one with terms related via color and one with terms related via shape (e.g., X2Y2, both related to green, or X2Z2, both related to circle). Matching the sample and comparison sharing the same basis for relatedness, either shape or color, was considered a correct response and a demonstration of equivalence-equivalence responding. All participants passed the equivalence-equivalence tests, though one participant reported that her responding was controlled by the relatedness only of the last term of the sample to the last term of the comparison.
Performance of this nature could be interpreted as equivalence responding, rather than equivalence-equivalence responding, based on the formation of overlapping equivalence classes based only on relatedness to “color” or “shape” stimuli (e.g., relating Y1 to Y2 or Y3 as they are all related to “color” or Z1 to Z2 or Z3 as they are all related to “shape”). In Experiment 2, the procedures from Experiment 1 were repeated with an additional test to determine if these overlapping equivalence classes were formed. Adult participants were shown a single Y or Z stimulus as the sample and two comparisons from a different numeric class (e.g., Y1 with comparisons Y2 and Z2). Participants correctly related stimuli based on relatedness to color (e.g., Y to Y) and shape (e.g., Z to Z), supporting the hypothesis that the successful performance on the analogy tasks in Experiments 1 and 2 was possibly based on this type of control, rather than true equivalence-equivalence. To address this limitation, in Experiment 4, two additional comparisons consisting of non-equivalent terms were added to the array. Of these additional comparisons, one would share a term from the “overlapping” stimulus class with the sample to assess if the participants would select based on this source of control (e.g., with a sample of X1Y1, the distracter stimulus would be X2Y3). All five adult participants passed the equivalence-equivalence test, rarely selecting the distracter stimulus.
In a further examination of equivalence-equivalence responding, Barnes-Holmes, Regan, Barnes-Holmes, Commins, Walsh, Stewart, Smeets, Whelan, and Dymond (2005) measured mean response times across different sample and comparison arrangements.
In accordance with the RFT interpretation of analogical reasoning, it was hypothesized that analogies composed entirely of equivalent compounds would be solved more rapidly than analogies composed of entirely non-equivalent compounds because the number of relational frames necessary is reduced. Equivalent-equivalent compound analogies require only the relational frame of coordination to solve, as the relation between the terms that make up the compounds is “same” and the relation between the compounds themselves is “same.” In contrast, non-equivalent-non-equivalent compound analogies require the relational frames of distinction and coordination to solve, as the relation between the terms that make up the compounds is “different” while the relation between the compounds themselves is “same.” Thus, the authors posited that the mean reaction times would be higher while participants are solving the non-equivalent-non-equivalent analogies.
To test this hypothesis, 24 adult participants were first exposed to a delayed MTS procedure in order to establish A to B and A to C relations among four different experimenter-designated classes of arbitrary stimuli. Equivalence testing was completed to assess for the emergence of untrained relations (B to A, C to A, C to B, and B to C) and finally, equivalence-equivalence testing was employed with the use of BC compounds. Correct responding as well as response time was measured across four different trial types in the equivalence-equivalence testing: “same-same,” “different-different,” “same-same with foil,” and “different-different with foil.” Trials “with foil” were those in which the incorrect comparison stimulus had one term consistent with the sample. Results across all four trial types were also assessed in a speed contingency condition in which a response was required within three seconds for it to be counted as correct.
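The four trial types and the speed contingency described above can be expressed as a small sketch. It is an illustration only, resting on assumptions that are not stated in the source: stimuli are labeled by letter and class number (e.g., B1, C2), a "foil" is read as an incorrect comparison that shares at least one term with the sample, and the names `relation`, `trial_type`, and `scored_correct` are hypothetical.

```python
def relation(compound):
    """'same' if both terms carry the same class number, else 'different'.
    Assumes labels such as 'B1' or 'C2' (letter plus class number)."""
    first, second = compound
    return "same" if first[1:] == second[1:] else "different"


def trial_type(sample, incorrect_comparison):
    """Name the trial type from the sample's internal relation and from
    whether the incorrect comparison shares a term with the sample."""
    base = "same-same" if relation(sample) == "same" else "different-different"
    shares_term = bool(set(sample) & set(incorrect_comparison))
    return base + " with foil" if shares_term else base


def scored_correct(selected_correct, response_time_s,
                   speed_contingency=False, limit_s=3.0):
    """Under the speed contingency a response only counts as correct
    if it is both accurate and made within the time limit."""
    if speed_contingency and response_time_s > limit_s:
        return False
    return selected_correct


print(trial_type(("B1", "C1"), ("B3", "C4")))   # same-same
print(trial_type(("B1", "C1"), ("B1", "C2")))   # same-same with foil
print(trial_type(("B1", "C2"), ("B1", "C1")))   # different-different with foil
```

Separating trial classification from scoring keeps the accuracy measure and the response-time criterion independent, which mirrors how the two were reported separately.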
Findings (Barnes-Holmes et al., 2005) showed that mean response time was significantly shorter in the speed contingency condition, regardless of trial type, but this condition had a slightly negative effect on accuracy. The “same-same” trials, with and without foils, had significantly shorter response times than did “different-different” trials, supporting the authors’ hypothesis. No difference in accuracy was observed across the four trial types in either speed condition. Importantly, these findings could support a naming approach to analogical reasoning, as well as the RFT interpretation. It stands to reason that if the individuals were covertly tacting the relations between the terms of the compounds and the relations between the compounds themselves, trials in which only one response was occasioned (e.g., “same”) would have a shorter response time as compared to trials with two responses (e.g., “same” and “different”). Lipkens and Hayes (2009) extended the equivalence-equivalence literature by using several different forms of analogy tasks. Seventeen adult participants were first trained to respond to arbitrary relational cues for same and different via multiple exemplar training with a variety of non-arbitrary visual stimuli serving as samples and comparisons. The contextual cue was shown directly below the sample and above an array of comparisons. Selection responses were differentially reinforced dependent upon the presence of either the “same” or “different” cue. Once established, these cues were used to establish three-member, arbitrary stimulus classes.
Several training and testing protocols were employed: one in which the individuals were required to select a comparison when presented with the relational cues and a sample, a second in which the individual was required to construct the comparison by typing the corresponding letters when shown the relational cue and the sample, and a third in which the individual was presented with the sample and comparison and had to select the corresponding relational cue. Using these procedures, the individuals were trained to respond to Y as equivalent to X and Z as the opposite of X. Correct derived relational responding consisted of responding to Y and Z as opposites, based on their relations to the X term. Once participants were able to display these equivalence and non-equivalence relations, they were systematically trained to respond to compound stimuli consisting of two terms presented side by side. The individuals were presented with a unitary term paired with a relational cue and a second unitary term, and then prompted to match to a compound stimulus consisting only of two terms next to one another. Following this extensive training, the 11 participants who were proficient with the prior tasks were exposed to the actual experimental conditions. In Experiment 1, four participants were trained to relate stimuli such that A1 was the same as B1, A2 was the same as B2, A3 was the opposite of B3, and A4 was the opposite of B4, with the same three procedures used in the pre-training condition. Participants were then exposed to two kinds of equivalence-equivalence tests with the use of AB and BA compound stimuli. In one test, the participant was to select a compound comparison when shown a compound sample based on the relatedness of the terms. If successful, the participant was exposed to the second test in which they were shown a compound sample with a contextual cue and were required to produce the correct comparison by typing the letters on the keyboard.
In the next phase, the participants were trained to relate A to C stimuli in the presence of the contextual cues in the same manner as A to B training and testing. However, findings revealed that one participant related A to C as “same,” two participants related A to C as “opposite,” while the last related A to C as “same” for stimuli from Classes 1 and 3 but “opposite” for Classes 2 and 4. The selection and production tests were used to test for the emergence of BC and CB analogies, and three of the four participants passed all equivalence-equivalence tests. In Experiment 2, the 11 remaining participants from the pre-training condition were exposed to the three procedures from that condition to establish A1 as the same as B1 and A3 as the opposite of B3. The remaining relations were established by training the participant to select the corresponding relational cue given two compounds. Thus, compounds of A2-B2 were trained as same as A1-B1 and A4-B4 were trained as same as A3-B3. These relations and expected emergent relations were then tested by presenting a component sample and a component comparison and allowing the participant to choose between the two relational cues. Once the participant was proficient with this task, A to C training was initiated. As in Experiment 1, participants were trained and tested on varying A and C relations then exposed to BC and CB analogy tasks. Of the seven participants, only four passed all the analogy tasks and only two did so upon first exposure. In two additional experiments, the authors (Lipkens & Hayes, 2009) assessed performance on varying analogy tasks with various forms of trained relations. In Experiment 3, three participants received similar pre-training as in Experiments 1 and 2 except with the contextual cues of “same,” “smaller than,” and “larger than.” Training and testing were conducted in a similar manner as in the previous experiments, differing only in the types of relations established between the stimuli.
Two of the three participants passed the final analogy test. In Experiment 4, the two remaining participants were trained to relate stimuli as either “same” or “smaller than” then tested as in the previous experiments. Both participants displayed correct responding on the final analogy test. Taken together, these findings were the first demonstrations of emergent analogical reasoning on the basis of relations more complex than equivalence and non-equivalence. Overall, the behavior analytic interpretation of analogical reasoning has advanced investigation by developing hypotheses related to performance that can be experimentally validated and tested in the absence of abstract mental constructs. This progression is especially notable for children with disabilities as assumptions related to their cognitive development and prior learning cannot be made based solely on their age and school experience. For example, in the Goswami and Brown (1989, 1990) studies, the participants’ ages were taken as representative of their relational knowledge. This same assumption could not be made for a child with autism and thus, their performance on these analogy tasks could not be readily predicted. However, in an effort to exclude unverifiable, intermediary responses (e.g., formation of mental schema), the behavior analytic literature has failed to overtly include the verbal repertoire in its procedures. Researchers studying naming have developed several experimentally sound protocols that have been used to successfully teach categorization skills to young children (Horne & Lowe, 1996). The current study was developed drawing on both these lines of research in an effort to create a procedure effective for children as young as 5 years of age or with equivalent functioning.
Naming Account of Stimulus Equivalence and Emergent Categorization
The current study deviated from the existing analogical reasoning work by explicitly including verbal naming to establish equivalence classes. The naming account of stimulus equivalence was developed based on the foundational principles of verbal behavior as described by Skinner (1957), particularly relating to the development of speaker and listener behavior (Horne & Lowe, 1996). The naming operant encompasses an individual’s simultaneous responding as both a speaker and a listener, which allows an individual to demonstrate emergent stimulus equivalence relations (Horne & Lowe, 1996). Eikeseth and Smith (1992) examined the effectiveness of MTS procedures in teaching children with autism to form equivalence classes of arbitrarily related stimuli, and results showed that additional types of training were necessary for novel behavior to emerge. In Phase I, the participants were trained to relate A to C and A to B with MTS procedures and then tested for transitive relations. If transitive relations were not displayed, the participants were tested for symmetrical relations and then retested for transitive relations. Finally, the trained relations were retested to assess if the AB and AC relations had been extinguished. Following training, which ranged from 360 to 1,546 trials, no participants were able to display transitive relations between stimuli and only two were able to demonstrate symmetry. The AB and AC relations were maintained in all the participants, so the lack of maintenance of these relations cannot explain the lack of emergent behavior in the participants with autism. To address the deficits in responding, the participants were trained to emit a common name in the presence of each class of stimuli and then retested in the baseline and transitive relations. Vocal naming training required 86 to 234 trials to establish correct responding in the presence of all stimuli.
Following this training, two of the four participants proficiently formed transitive stimulus-stimulus relations. Participants were exposed to several other verbal naming procedures, after which they were presented with another MTS training procedure. Results showed that two participants were now able to form transitive relations via conditional discrimination where previously all participants had failed these equivalence tests. These findings (Eikeseth & Smith, 1992) suggest that proficiency in verbal naming may facilitate the formation of equivalence classes, even when the training procedures used do not include a vocal response. However, these findings are limited as only two of four participants showed proficiency and the participants were never tested for the emergence of bi-directional relations in the form of listener behavior. In the first of a series of studies on the role of the naming operant in establishing novel responding within stimulus equivalence training, Lowe, Horne, Harris, and Randle (2002) trained typically developing children to vocally tact stimuli and then probed for the emergence of untrained listener behavior and categorization. In Experiment 2, the participants were presented with wooden blocks of varying shapes that had arbitrarily been grouped by the experimenters into categories. The shapes and category labels were abstract, with no relation to functional items that children two to four years old would encounter. The participants were trained to tact “vek” or “zog” in the presence of the stimuli with the use of echoic prompts, reinforcement for correct responding, and error correction. Once mastery criteria were met, the participant was presented with an array of all six stimuli used in training and told to give the experimenter “the rest” when one member of a category was held up.
Two forms of this categorization test were conducted: one in which the participant was required to simply look at the sample before giving the rest of the stimuli and one in which the participant was required to tact the sample prior to engaging in the categorization response. On both tests, correct responding consisted of giving only the two remaining members of the category, “vek” or “zog.” A listener behavior test was also conducted in which the participant was presented with pairs of the stimuli and asked to point to the “vek” or “zog.” In the testing phases, no reinforcement or corrective feedback was provided contingent upon responding, to assess if the behaviors emerged in the absence of specific training. Results showed that all participants were able to categorize the arbitrary stimuli without training and engage in correct listener behavior. These findings were notable in that the relations were established with as few as 34 training trials and at most 240 trials. Tact training was sufficient to establish both listener behavior and the formation of stimulus classes in children ages two to four years old, providing support for the naming hypothesis regarding the emergence of novel responding within stimulus equivalence training (Lowe, Horne, Harris, & Randle, 2002). In the second of the series of experiments regarding the role of naming in the establishment of equivalence classes, Horne, Lowe, and Randle (2004) taught nine typically developing children to engage in a listener response in the presence of arbitrary shapes and tested for the emergence of untrained categorization and speaker behavior. Arbitrary shapes similar to those used in the studies by Lowe, Horne, Harris, and Randle (2002) were utilized and the same arbitrary category names, “vek” and “zog,” were employed within the study.
Following listener training, in which the participant was presented with one stimulus from each category and asked to find either “vek” or “zog,” participants were tested for the development of equivalence classes within a categorization test requiring them to look at the sample then give the remaining two stimuli from within the class. After the categorization testing, tact testing was conducted to assess for the emergence of bidirectional relations between stimuli. In the tact test, arrays of all six shapes were presented and the experimenter pointed to one shape and asked, “What’s this?” All nine children failed the initial categorization test and seven of nine failed the tact test. Tact training was then added and, once it was mastered, a total of six children passed the categorization tests. These results strongly suggest that listener behavior alone was not sufficient to establish categorization behavior; speaker and listener relations were necessary for proficient responding. It was noted by the experimenters that one participant, only 21 months of age at the start of the experiment, engaged in overt echoic responding during the listener training though not instructed to do so. This report follows the theorized behavioral process responsible for the emergence of the naming operant, and thus, emergent bidirectional responding (Horne, Lowe, & Randle, 2004). Miguel, Petursdottir, Carr, and Michael (2008) assessed the effectiveness of establishing categorization responding in typically developing children, ages three to five, after training them to emit either speaker or listener behavior. Stimuli used in the study were shapes of American states, designated “North” and “South” based on their actual geographical locations.
Two categorization tests were used in both experiments, following procedures used by Lowe, Horne, Harris, and Randle (2002): one in which the participant was instructed to look at the sample stimulus and then give the rest of the stimuli within that category, and one in which the participant was required to tact the sample stimulus then give the rest of the stimuli within that category. Both tests were administered prior to training and after training. The first tact training procedure in Experiment 1 consisted of the experimenter presenting the stimuli in a pair-wise fashion, pointing to one and vocally stating, “What is this?” In the second tact training procedure, all the stimuli were presented in a row of six while the experimenter pointed to one in the array and stated, “What is this?” Praise was provided contingent upon correct responding in both these training conditions on varying schedules and a correction procedure was implemented following incorrect responding. Two listener responding tests were used before and after tact training. In one, stimuli were presented in a field of three, with a picture of a Canadian province serving as a third distracter stimulus. The experimenter instructed the participant to “Give me North [South].” In the second listener test, all six North and South stimuli were presented in a row and the participant was instructed to “Give me all the North [South].” All participants failed the categorization pretests and did not emit any correct tacts in the presence of the stimuli. After the first tact training, two participants were able to successfully categorize the stimuli within both posttests and tact the stimuli in the second posttest. The other two participants showed accurate tact responses in the second posttest, but no correct category sorts in either posttest. After the second tact training, they both demonstrated correct tact responses and category sorts in the second posttest.
However, one of these participants continued to fail categorization tests in the first posttest condition. Interestingly, this participant displayed continued inaccurate listener behavior after the two tact trainings, while the other participants showed proficiency above 80%. Taken together, these findings (Miguel et al., 2008) suggest that accurate category sorts require proficiency in both listener and speaker behavior. All participants showed evidence of the naming relation as listener behavior emerged after speaker training. To further examine the roles of speaker and listener repertoires in categorization, a second experiment was conducted in which the participants were trained in listener behavior and tested for emergent speaker behavior and category sorts (Miguel et al., 2008). The listener tests used in Experiment 1 served as the training procedures in Experiment 2, with the addition of praise contingent upon correct responding and a correction procedure contingent upon incorrect responding. The tact training procedures from Experiment 1 served as the tact tests for Experiment 2, absent any consequences for responding. Categorization tests were identical to those used in Experiment 1. None of the participants displayed proficiency in tacting the stimuli or correctly categorizing them prior to the listener training. In the first posttest, in which the participants were not required to tact the stimuli, categorization responding was variable or not present in three of the four participants. In the second categorization posttest, in which the participants had to tact the stimuli, the percentage of correct category sorts was significantly higher and less variable for three of the four participants. These participants also displayed proficiency in tacting the stimuli. The participant who failed to categorize the stimuli in the second posttest also displayed inconsistent, chance-level responding when asked to tact the stimuli.
After undergoing a second listener training, the participant displayed correct category sorts in both posttests and an increased proficiency in tacting. These findings (Miguel et al., 2008) suggest that categorization behavior did not emerge consistently until the participants were required to tact the stimuli. All participants showed evidence of the naming relation as speaker behavior emerged after training in listener behavior. These experiments represent a highly efficient form of teaching as each participant was trained to engage in six behaviors, but eight additional behaviors were displayed without specific training. In Experiment 1, between 448 and 880 trials were required to train the six tact responses. The number of trials was inversely related to the participants’ ages. The eight emergent behaviors included six listener responses, categorization of “North” stimuli, and categorization of “South” stimuli. In Experiment 2, between 162 and 1,116 trials were required to train the six listener responses. There was no relationship between the ages of the participants and the number of trials required. The eight emergent behaviors included six tact responses, categorization of “North” stimuli, and categorization of “South” stimuli. These findings were consistent with the other research related to the naming operant in that listener responding alone was not sufficient for categorization to emerge and speaker training produced more reliable bidirectional relations than listener training (Miguel et al., 2008; Lowe et al., 2002; Horne et al., 2004; Lowe et al., 2005; Horne et al., 2006). Combined, these studies supported the inclusion of the Phase I vocal response as a means of establishing equivalence classes. Notably, the current study required the participants to emit a tact in the presence of compound stimuli and then display several emergent responses with both compound and individual stimuli in Phases II and III.
Research relating specifically to the use of compound stimuli gave rise to this element of the procedure, as naming research has yet to widely make use of compound stimuli.
Separable Compounds Account of Equivalence Classes
The current study’s assumption that the Level 1 Tact Training would be sufficient to give rise to the analogy and component matching performances tested with all three levels of stimuli was formed on the basis of the body of research with the use of compound stimuli. Markham and Dougher (1993) conducted a study with 11 undergraduate students in which they were trained to relate unitary comparison stimuli to compound sample stimuli. A total of nine relations between compound stimuli and unitary stimuli were trained in a matching-to-sample procedure (A1B1-C1, A1B2-C3, A1B3-C2, A2B1-C3, A2B2-C2, A2B3-C1, A3B1-C2, A3B2-C1, and A3B3-C3). All the stimuli were composed of abstract forms, presented side by side when serving as compounds, and in a field of three when serving as comparisons. Stimuli were arbitrarily assigned to classes by the experimenter; the forms shared no common physical properties with one another to prevent stimulus generalization. The procedure was effective in establishing 18 untrained behaviors (A1C1-B1, A1C3-B2, A1C2-B3, A2C3-B1, A2C2-B2, A2C1-B3, A3C2-B1, A3C1-B2, A3C3-B3, B1C1-A1, B2C3-A1, B3C2-A1, B1C3-A2, B2C2-A2, B3C1-A2, B1C2-A3, B2C1-A3, and B3C3-A3) in all participants. These findings (Markham & Dougher, 1993) suggest that stimuli within conditional discrimination procedures are interchangeable between sample and comparison functions as the separable compounds account suggested (Stromer, McIlvane, & Serna, 1993). The designations of stimuli as “samples” and “comparisons” were essentially irrelevant as all terms participated within equivalence classes without discriminated responding related to how the stimuli were originally presented within training.
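The relation between the nine trained AB-C relations and the 18 untrained AC-B and BC-A relations listed above can be checked mechanically. The following is a minimal Python sketch, not part of the original study materials; the names TRAINED and derived_relations are hypothetical and introduced only for illustration. It assumes that each untrained relation is obtained by exchanging the unitary C term with one term of the trained compound.

```python
# Trained AB-C relations as listed in the text: (A index, B index) -> C index.
TRAINED = {
    (1, 1): 1, (1, 2): 3, (1, 3): 2,
    (2, 1): 3, (2, 2): 2, (2, 3): 1,
    (3, 1): 2, (3, 2): 1, (3, 3): 3,
}


def derived_relations(trained):
    """Return the AC-B and BC-A relations implied by the trained AB-C set.

    Assumption (illustrative only): an untrained relation is formed by
    swapping the unitary comparison (C) with one term of the compound.
    """
    ac_b = [f"A{a}C{c}-B{b}" for (a, b), c in trained.items()]
    bc_a = [f"B{b}C{c}-A{a}" for (a, b), c in trained.items()]
    return ac_b + bc_a


relations = derived_relations(TRAINED)
print(len(relations))   # number of derived relations
print(relations[:3])    # first few AC-B relations
```

Under this assumption the sketch yields one AC-B and one BC-A relation for each trained relation, which matches the count of 18 reported in the text.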
It was unclear whether these responses were a product of stimulus equivalence relations as described by Sidman and Tailby (1982) because demonstrations of reflexivity, symmetry, and transitivity were not present. Observed emergent behavior could be attributed to simple discrimination because the stimuli were all presented together during training; no derived responding was demonstrated (Markham & Dougher, 1993). In Experiment 2, the experimenters designed the procedures to test for symmetrical and transitive stimulus relations to unequivocally test for responding consistent with stimulus equivalence (Markham & Dougher, 1993). Six participants received the same training as in Experiment 1, but were tested for symmetrical relations with the unitary stimulus serving as the sample and the compound serving as the comparison (C-AB). Five out of the six participants demonstrated the emergence of symmetrical responding by displaying nearly 100% accuracy on all test trials. To assess for transitivity, another six participants in the study were trained in the same nine relations used in Experiment 1 with the addition of three unitary-to-unitary relations (C1-D1, C2-D2, and C3-D3). After training, the participants were tested for the emergence of relations between the original compound sample (AB) and the unitary stimulus (D) that had been trained in relation to the original unitary comparison. Correct responding would be considered transitive as the stimuli were never presented together, but would be members of the same equivalence class based on relations to the original unitary stimulus (C), which functioned as a common term in all the equivalence relations. All six participants demonstrated nearly 100% accuracy across the test trials. In Experiment 3, the same 12 relations were trained, AB-C and C-D, but the participants were tested for the emergence of what were considered equivalence relations, D-AB, and the relations AD-B and BD-A.
Three of the five participants demonstrated responding over 80% when tested for D-AB relations. Four out of five participants demonstrated responding over 80% when tested for AD-B and BD-A relations. Results from these studies (Markham & Dougher, 1993) supported the separable compounds account of stimulus equivalence, further demonstrating that the designations of “sample” and “comparison” stimuli are topographical only with no relation to their function. The studies had several limitations, notably that the participants were all adults without disabilities. Therefore, it is unknown whether these procedures would be effective with developmentally typical children or children with disabilities. Additionally, as the responses that were trained were non-verbal, no speaker or listener behaviors would be expected to emerge as a product of the teaching. This limits the usefulness of these procedures, as students rarely need to display purely non-verbal skills in academic settings. In a series of experiments, Maguire, Stromer, Mackay, and Demis (1994) utilized MTS procedures to train adults with disabilities and typically developing children to establish stimulus classes with the use of compound stimuli. In Experiments 1A and 1B, participants were trained to touch a unitary comparison stimulus (D) when presented with a compound sample stimulus (AB). After training, the participants’ responding to individual components of the compounds was tested. The participants displayed the ability to select the correct comparison when presented with a single element of the compound (A-D and B-D), relate the elements of the compounds to one another (A-B and B-A), and select the related element of the original compound when presented with the original comparison as the sample (D-A and D-B).
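The tested relations described above (A-D, B-D, A-B, B-A, D-A, and D-B) amount to every ordered pair of distinct unitary stimuli within a class. The following is a minimal Python sketch for illustration only; it assumes two stimulus classes numbered 1 and 2, and the helper name unitary_relations is hypothetical rather than drawn from Maguire et al. (1994).

```python
from itertools import permutations


def unitary_relations(members="ABD", class_ids=(1, 2)):
    """List within-class sample-comparison pairs of distinct unitary stimuli.

    Assumption (illustrative only): two classes, with A, B, and D stimuli
    in each, and a relation written as 'sample-comparison' (e.g., 'A1-D1').
    """
    return [
        f"{sample}{k}-{comparison}{k}"
        for k in class_ids
        for sample, comparison in permutations(members, 2)
    ]


emergent = unitary_relations()
print(len(emergent))  # number of within-class unitary relations
print(emergent)
```

Because the only trained relations used a compound sample (AB-D), every pair produced here was untrained. Passing "ABCD" as the members argument would, under the same assumption, list the pairs that become testable once a fourth stimulus is added to each class.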
These findings demonstrated the efficiency of these training procedures as only two behaviors were trained (A1B1-D1 and A2B2-D2) and twelve behaviors emerged (A1-D1, A2-D2, B1-D1, B2-D2, A1-B1, A2-B2, B1-A1, B2-A2, D1-A1, D2-A2, D1-B1, and D2-B2). Alone, these findings did not meet the requirements established by Sidman and Tailby (1982) for stimulus equivalence as no transitive relations could be demonstrated. In Experiment 2A, two additional responses were trained, C1-D1 and C2-D2, and then participants were tested for the presence of equivalence responding. After correct responding was exhibited in training, participants demonstrated an additional 10 novel behaviors in testing (A1-C1, A2-C2, C1-A1, C2-A2, B1-C1, B2-C2, C1-B1, C2-B2, D1-C1, and D2-C2). As the C stimuli were only presented with D stimuli in training, correct relations between the C stimuli and A or B stimuli would be considered derived transitive relations. In Experiment 2A, the typically developing children who served as participants showed all stimulus relations necessary for stimulus equivalence: reflexive responding (A-B and B-A), symmetry (D-A, D-B, and D-C), and transitivity (A-C, B-C, C-A, and C-B). As the experimenters (Maguire et al., 1994) did not use verbal behavior to train the responses, no speaker or listener behavior would be expected to emerge as a product of the teaching procedures. This was a limitation of the study because, had the behaviors been trained with either a listener or speaker behavior, the number of behaviors expected to emerge, based on the naming hypothesis, would have been substantially higher (Miguel et al., 2008; Lowe et al., 2002). Additionally, whether these procedures would be effective with children with disabilities was not demonstrated. To further study the use of complex stimuli in the establishment of equivalence classes, Debert, Matos, and McIlvane (2007) developed a simple discrimination procedure using compound stimuli.
The compound stimuli were composed of two figures (after Markham & Dougher, 1993) presented side by side on a computer screen. The experimenters randomly assigned the figures to three stimulus classes; the six participants were never made aware of the designations. Compounds composed of two stimuli from within the same class were considered “related” and compounds made up of two stimuli from different classes were considered “unrelated.” When presented with related compounds, the participants received points for clicking the mouse on the figure, while mouse clicks in the presence of unrelated compounds were not reinforced. Once the participants were accurately discriminating between the related and unrelated compounds (of the AB and BC form), they were tested for the emergence of untrained stimulus equivalence relations. Symmetry tests used compounds made up of the same components as the trained related and unrelated compounds, but with their positions reversed from left to right (BA and CB forms). Transitivity and equivalence tests required responding correctly to related and unrelated compounds made up of components that were never presented together and whose relations could only be derived (AC forms and CA forms, respectively). All six participants showed 100% correct responding with related and unrelated compounds on the first test for symmetry. Four participants displayed 100% correct responding in the equivalence and transitivity tests by the fifth test session, while the other two never displayed responding that was 100% correct within the six test sessions. These findings extended the research in the use of compound stimuli and supported the separable compounds hypothesis (Stromer, McIlvane, & Serna, 1993). The procedure described in the Debert et al.
(2007) study was replicated in a study by Debert, Huziwara, Faggiani, Mathis, and McIlvane (2009) with compound stimuli of a more complex nature to further assess the supposed functions of compound and sample stimuli. During Experiment 1, participants were presented compounds made up of a figure on a background during the training phase (AB and BC forms). After training the undergraduate participants to respond correctly to the figure-ground forms, the participants were tested for demonstrations of transitivity and equivalence relations (AC and CA forms, respectively). As the B component was the background in the training phase, the participants were exposed to two side-by-side figures for the first time when tested for transitivity and equivalence. All four participants showed the emergence of transitive and equivalent relations, correctly responding with 99-100% accuracy within one test session. In Experiment 2, the participants were presented with stimulus compounds consisting of one figure (the A and C components) and positioning on the left or right of the screen (the B component). As in Experiment 1, compounds in training always consisted of one figure (AB and BC forms), while in testing for transitivity and equivalence (AC and CA forms), two figures were presented side by side. All six typically functioning adults correctly responded within the testing phases with 94-100% accuracy within two test sessions. These results further supported the hypothesis that once equivalence relations are established all terms have equal function within the class (Debert et al., 2009).
Summary
The current investigation represents the convergence of several distinct lines of research. The research in analogies established the need for systematic experimentation with children with and without disabilities; essentially the target that equivalence-equivalence research still needs to reach.
The equivalence-equivalence research served as the framework for the methods of the current study, with the limitations of these works inspiring the procedural modifications utilized. Naming research suggested alternate training procedures to those already used in the equivalence-equivalence work. Finally, separable compounds literature formed the conceptual groundwork to support the appropriateness of the methodological changes made to the equivalence-equivalence basic procedures.

Chapter 3

METHODS

Participants and Setting

The participants for the study included a total of 12 typically developing adults, six in Experiment 1 and six in Experiment 2. Participants were recruited through their Psychology courses at California State University, Sacramento, personal contacts of the experimenter, and personal contacts of research assistants. Participants were required to speak English as their first language, be naïve to the purpose of the study, and have between two and six hours of availability to complete the experiment. Additional demographic information, including age, undergraduate area of study or degree, and G.P.A., is included in Chapter 4. All participants were given a $10 gift card at the completion of the study, as well as snacks throughout the testing condition. Sessions were conducted in a small room with several chairs, a table, and a computer. Each participant was seated in front of the computer with the experimenter beside him or her to avoid cuing. During sessions with a second observer collecting data, this individual sat off to the side, in clear view of the computer with no visual access to the data collected by the experimenter.

Materials

Stimuli presented in all conditions except the Pre-Training condition consisted of abstract shapes. Previous experimental research has made use of these same stimuli (Markham & Dougher, 1993; Debert et al., 2007), thus validating their appropriateness for inclusion in this study.
While the use of abstract stimuli limits immediate application in a classroom setting, they are necessary in order to rule out interference of previous learning history and to validate the usefulness of the analogy protocol. Their use allows for comparisons to previous work that also made use of abstract stimuli with adults and children serving as participants (Carpentier et al., 2002, 2003; Barnes et al., 1997). The individual abstract shapes were designated A1, B1, C1, A2, B2, and C2 for the experimenter’s use only; the participants were never made aware of the designations. The number assigned to each stimulus indicated the class membership, either Class 1 or Class 2. The letter assigned to each stimulus was used to differentiate terms within the classes to facilitate systematic training and testing (see Figure 1). Combinations of these figures yielded a total of 12 compounds that have both terms from the same class and 12 compounds that have terms from different classes, across two stimulus classes (Class 1 and Class 2). Compounds made up of figures from within the same numerical class were considered “same” and compounds made up of stimuli from two numerical classes were considered “different.” Across both experiments, participants were systematically exposed to side-by-side combinations of the abstract shapes, called compounds, based on the types of relations between the images. Compounds used in the Vocal Label Training condition, AB and BC, were referred to as Level 1. Compounds consisting of stimulus combinations that were composed of switched positions of the terms, BA and CB, were considered Level 2 compounds. Compounds consisting of AC and CA terms were referred to as Level 3.

Figure 1. Experimental Stimuli (columns A, B, and C; rows Class 1 and Class 2). Designations of the stimuli used in the present study, developed by Markham and Dougher (1993).
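The compound counts given above can be checked with a short sketch. This is an illustration only, not the software used in the study: the names `LEVELS`, `relation`, and `compounds_for_level` are hypothetical, and the figures are represented by their designation strings rather than images.

```python
# Letter pairs (left term, right term) for each level, as described in the text.
LEVELS = {
    1: [("A", "B"), ("B", "C")],  # trained compounds
    2: [("B", "A"), ("C", "B")],  # positions reversed (symmetry)
    3: [("A", "C"), ("C", "A")],  # never presented together (transitivity, equivalence)
}


def relation(left, right):
    """Return 'same' if both figures carry the same class number, else 'different'."""
    return "same" if left[1] == right[1] else "different"


def compounds_for_level(level):
    """List (left, right, relation) for every compound at the given level."""
    result = []
    for first, second in LEVELS[level]:
        for class_left in "12":
            for class_right in "12":
                left, right = first + class_left, second + class_right
                result.append((left, right, relation(left, right)))
    return result


all_compounds = [c for level in LEVELS for c in compounds_for_level(level)]
same = [c for c in all_compounds if c[2] == "same"]
different = [c for c in all_compounds if c[2] == "different"]
print(len(same), len(different))  # 12 12
```

Each level contributes four “same” and four “different” compounds, which gives the totals of 12 and 12 stated in the text.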
Correct responding with the Level 2 and Level 3 compounds was expected to emerge, in the absence of specific training, as consistent with symmetry, transitivity, and equivalence as originally described by Sidman and Tailby (1982) and later tested using compound stimuli by Debert and colleagues (2007). Previous research has shown that participants were more successful within analogy testing when tested with the use of compounds used in initial training and that initial testing with familiar compounds improved performance with emergent compounds (Carpentier et al., 2002, 2003). All trials were pre-ordered and randomized by the current author with several constraints described here. In Vocal Label Training and Testing conditions, the “same” and “different” compounds were randomly interspersed across trials with the restriction that neither type be presented on three consecutive trials. In the Component Matching Baseline Pretest and Posttest conditions, neither Class 1 nor Class 2 stimuli were presented on three consecutive trials, but were interspersed randomly. Across all phases, the experimenter controlled for positional bias by balancing correct responses on left and right sides. Once the trials were pre-ordered and randomized, the author created each task in the Mestre software created by Elias and Goyas (2010). To create the tasks, the author selected the format of the trial presentations as either visual or textual and determined the number of sample and comparison stimuli. Once these constraints were set, the author entered the sample and comparison stimuli desired for each trial. The tasks and phases were presented in the order shown in Table 1.

Experimental Design, Dependent Variable, and Data Collection

To directly examine the training parameters responsible for each individual’s performance during the testing phases, a steady-state strategy was employed within a single-subject design (Sidman, 1960) across both experiments.
Specifically, a nonconcurrent multiple-probe design (Johnson & Pennypacker, 1993) was employed to analyze the effects of the training condition across participants. Participants were grouped into three dyads in each experiment and their exposure to the experimental conditions was arranged nonconcurrently to support the logic of the steady-state strategy. Each participant continued to be exposed to a given condition until the criteria for a stable pattern of responding were met. In Experiment 1, the Component Matching Baseline Pretest served as the baseline such that when one participant within each dyad showed stable responding, a prediction was made that if they were to continue to be exposed to this condition, no change would occur. By the same logic, for Experiment 2, the Level 3 Pretest served as the baseline. These were especially strong predictions as the stimuli and tasks are unfamiliar and arbitrarily arranged and thus not likely to be encountered by the participant outside of the control of the experimenter.

Table 1
Experiment 1 Task Summaries
Stimuli Level | Task | Trials Per Block | Mastery Criterion
Common Images | Pre-Training | 12 | One trial block of 100% correct responding
All Abstract Images | Component Matching Baseline Pretest | 24 | Two trial blocks of 92% correct responding
Level 1 Compounds | Vocal Label Training | 16 | Two consecutive trial blocks of 100% correct responding
Level 1 Compounds | Vocal Tact Test | 16 | One trial block of 100% correct responding
Level 1 Compounds | Analogy Test | 16 | One trial block of 93% correct responding
Level 2 Compounds | Vocal Label Test | 16 | (none)
Level 2 Compounds | Analogy Test | 16 | One trial block of 93% correct responding
Level 3 Compounds | Vocal Label Test | 16 | (none)
Level 3 Compounds | Analogy Test | 16 | One trial block of 93% correct responding
All Abstract Images | Component Matching Posttest | 24 | Two trial blocks with 92% correct responding
Continued baseline measurement of the second participant offered the possibility of verifying the prediction made for the first participant in the dyad. When no change in responding was observed for the second participant while he or she continued to be exposed to pretesting, the prediction regarding the first participant’s performance was verified even as the first participant was exposed to Level 1 Vocal Label Training. The effects of training, the independent variable, were inferred based on the first participant’s improved responding and the lack of change in the other participant’s behavior. This method was repeated with all participant dyads. If the behavior of the second participant then changed in the same way as observed in the first participant, replication of the effects of the independent variable was achieved. A multiple-probe design was used to limit the number of exposures to the testing materials in baseline while maintaining the logic of the steady-state strategy. In both Experiment 1 and Experiment 2, the dependent variables of interest were the percentage of correct selection responses with the components of the compounds in the Component Matching tests, vocal labels in the presence of the compound stimuli during the Vocal Label Training and Test conditions, and equivalence-equivalence responding in the Analogy Test conditions. Data were also collected on the number of Level 1 Vocal Label Training trials needed before the participants met mastery criteria. Data were collected during remedial training on trials to criterion during the Component Matching Training and Level 3 Vocal Label Training as well. The computer program, Mestre, created by Elias and Goyas (2010), recorded data as the participants engaged in selection responses via mouse click in the Component Matching and Analogy Test conditions. During the Vocal Label Training and Vocal Label Testing conditions, the experimenter collected the data by hand.
All data were transferred to hard copies at the completion of the testing sessions for storage and analysis. In Experiment 2, data were also collected on the participants’ vocalizations when asked to describe their performance in the testing phases. Any time a vocalization occurred that the experimenter or research assistant judged to be related to a task strategy or to the stimuli themselves, the vocalization was written down, word for word. These data were reviewed at the completion of the study to assess for the emergence of any performance strategies or stimulus names developed by the participants without direction from the experimenter. Additionally, at the completion of the study, the experimenter asked the participants open-ended questions regarding their performance and the participants’ responses were recorded and transcribed. Specifically, the experimenter said the following: “Now that you’ve completed the study I’d like to hear about any strategies you used throughout the procedure. If you would rather not answer or do not know how to explain your strategy you can say, ‘I’d rather not say.’” The questions asked of the adult participants were: “What strategy did you use to match the cards with the shapes presented side-by-side?” “What strategy did you use to match the cards with the single shapes?” “Did you have any additional strategies to complete the tasks you’d like to share?” These data were analyzed and grouped according to themes; results can be found in Chapter 4. In Experiment 2, a second observer collected interobserver agreement (IOA) data on 79.6% of the trial blocks during tasks in which the computer software did not record the responses (e.g., Vocal Label Training and Vocal Label Tests). These were the only phases in which the experimenter directly collected data. Agreement or disagreement between the data generated by the experimenter and the second observer’s data was scored on a trial-by-trial basis.
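Trial-by-trial agreement of this kind is conventionally summarized as a percentage: agreements divided by agreements plus disagreements, multiplied by 100. The sketch below illustrates that arithmetic under stated assumptions; the function name `percent_agreement` and the two short records are hypothetical and are not data from the study.

```python
def percent_agreement(primary, secondary):
    """Trial-by-trial agreement: agreements / (agreements + disagreements) * 100."""
    if len(primary) != len(secondary):
        raise ValueError("both observers must score the same trials")
    if not primary:
        raise ValueError("at least one trial is required")
    agreements = sum(p == s for p, s in zip(primary, secondary))
    return 100.0 * agreements / len(primary)


# Hypothetical records for four trials (not data from the study).
experimenter_record = ["same", "different", "same", "same"]
observer_record = ["same", "different", "different", "same"]
ioa = percent_agreement(experimenter_record, observer_record)
print(ioa)  # 75.0
```

The same calculation applies to treatment integrity if each trial is coded “correct” or “incorrect” and compared against a record in which every trial is “correct.”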
The number of agreements on all tasks was divided by the total number of agreements and disagreements, then multiplied by 100% to calculate the IOA coefficients. The IOA across tasks averaged 99.4% (range 99.1-100%) for PT-8, 99.5% (range 98.3-100%) for PT-9, 100% for PT-10, 100% for PT-11, 100% for PT-12, and 99.6% (range 98.8-100%) for PT-13. The second observer also collected treatment integrity (TI) data on 79.3% of the trial blocks during all conditions of Experiment 2. The experimenter’s behavior was coded as either “correct” or “incorrect” on a trial-by-trial basis. Performances were considered “correct” if they adhered to the methods exactly as written. Performances were considered “incorrect” if they deviated from the methods in any way. The number of correctly performed trials was divided by the total number of correct and incorrect trials, then multiplied by 100% to calculate the TI coefficients. In Experiment 2, the TI across tasks averaged 99.5% (range 99.1-100%) for PT-8, 99.9% (range 99.2-100%) for PT-9, 99.8% (range 99.0-100%) for PT-10, 100% for PT-11, 99.9% (range 99.4-100%) for PT-12, and 98.5% (range 95.3-100%) for PT-13.

Experiment 1 Procedure

Pre-Training

To ensure that the participants had a basic understanding of similarity and distinction across classes of stimuli and to establish familiarity with the computer-based instruction methods, a task was developed in which the individuals were required to select stimuli dependent upon their visual characteristics and the auditory instructions played by the computer software. The stimuli presented consisted of six different pictures: three animals and three fruits (a banana, an apple, and an orange; see Figure 2). Of the six images, three participate in the class “animals,” while the remaining three participate in the class “fruit.”

Figure 2. Same and Different Pretest Stimuli (panels: Fruit, Animals). Images to be used during pretest, taken from Microsoft Clip Gallery ©.
The participants were expected to associate the images within these classes based on their own learning history; no instructions were given by the experimenter to specifically evoke category-based selection. It was hypothesized that if participants failed to relate these common images based on their class membership, they would struggle to relate the abstract images into arbitrary classes within the experiment proper. Thus, failure on this task would preclude their inclusion in the study. Further, the presentation of the images and the use of the auditory instruction played by the computer were identical to the Component Matching Baseline Pretest and Posttest conditions. At the beginning of the session, the experimenter read the participants the following script:

Thanks for coming today. When you tell me you are ready to start we will begin. You will see a picture on the screen. When you click the picture you will see a white square appear and hear the words “Select same” or “Select different.” When you click the white square two pictures will appear. Click on one of these pictures. I won’t tell you if you’re right or wrong and I can’t help you. The harder you try, the faster this will go. Can you tell me what I just said?

Once the participant repeated the instructions with reasonable accuracy, they were allowed to begin the testing. Following a standard matching-to-sample (MTS) preparation, the field consisted of two comparison stimuli and one sample stimulus. The sample stimulus was a member of the same class as one of the comparison stimuli (e.g., two animal images, one fruit image). The sample stimulus was presented in the top left corner of the computer screen. Once the participant clicked on the stimulus using a computer mouse, the instruction of either “Select same” or “Select different” was given in a voice recording. A blank white square simultaneously appeared to the direct right of the sample.
Once the participant clicked the blank square, the two comparison stimuli appeared directly below. All stimuli stayed present on the screen for the duration of the trial. A correct response consisted of selecting the comparison stimulus from within the same class as the sample when “same” was targeted, and selecting the image from the other class when “different” was targeted. Targets and the positions of the stimuli varied randomly on each trial. A block of 12 trials was conducted with praise (e.g., “Keep up the good work!”) interspersed following every four test trials. The participant started the Component Matching Baseline Pretest once 100% correct responding was displayed within a trial block. A maximum of three trial blocks was conducted; if mastery was never displayed within that time, the participant did not progress in the study.

Component Matching Baseline Pre-test

As consistent with a multiple-probe design, each participant was exposed to the testing procedures used in the Component Matching Post-test prior to Vocal Label Training to demonstrate their prior lack of familiarity with the stimuli. Component relation trials were selected, rather than compound-compound relation trials, as success on these trials would be most indicative of the formation of equivalence classes. Additionally, previous research has indicated that many participants fail analogy tasks unless they have demonstrated the formation of equivalence classes; thus, the absence of analogy formation can be inferred from the absence of equivalence classes (Carpentier et al., 2002, 2003). The first participant in each dyad was exposed to the pretesting until they demonstrated responding meeting the failure criteria, below 92% for two consecutive trial blocks. When these criteria were met, they progressed to Level 1 Vocal Label Training while the second participant in each dyad continued to be exposed to pretesting.
As consistent with a multiple-probe design, even after the second participant met the failure criteria, they were exposed to an additional trial block to verify the steady-state prediction related to the first participant’s performance. If the participant’s performance improved, the participant continued to be exposed to additional trial blocks until stabilization occurred. If the scores stabilized below mastery criteria, the participant continued to Level 1 Vocal Label Training. If the scores stabilized at a level meeting mastery criteria, the participant no longer continued in the study and a new participant was found to complete the dyad. During the pre-testing trials, the participants were directed to sit in front of the computer. When the adult participants were seated in front of the computer, they were read the following instructions:

You will see an image on the screen, when you press it you will hear either “Select same” or “Select different” and see a blank square appear. When you click the blank square, two new images will appear below. Click one of these images that goes best with the one above. I will not tell you if you’re correct or incorrect. Every now and then you’ll see the pictures from an earlier task. Respond to those in the same manner as you did before. The harder you try, the faster this will go. Please repeat back these instructions to me.

When the participants repeated the instructions with reasonable accuracy, they were allowed to begin the testing. In the Component Matching Baseline Pre-test condition, and in the Post-test, the components of the compounds for each of the two classes were presented individually on the computer screen as the sample and comparison stimuli. The sample stimulus appeared on the top left corner of the screen.
When the participant clicked the figure, they heard a vocal instruction, recorded previously and played automatically by the computer, to “Select same” or “Select different.” At the same time, a blank square appeared to the right of the sample. Selecting the blank square served as an additional observing response to indicate that the participant was attending to the task. Upon clicking the blank square, the comparisons appeared below. When the instruction “Select same” was given, selecting the comparison from within the same numerical class was considered correct. When the instruction “Select different” was given, selecting the comparison from the other numerical class was considered correct. Each unitary stimulus served as the sample on four trials with Class 1 (A1, B1, and C1) and Class 2 (A2, B2, and C2) stimuli randomly mixed within the 24-trial block. The position of the correct comparison was counterbalanced to prevent position bias. Responding at 91% or higher was considered mastery and performance below this criterion was considered failure. After four test trials, a Pre-Training trial was interspersed and reinforced to maintain responding to the task and keep a consistent rate of reinforcement across testing conditions.

Level 1 Vocal Label Training

The purpose of this condition was to establish the relations between the stimuli as either “same” or “different” with the use of AB and BC compounds. There are eight possible stimulus configurations of AB and BC compounds between the two numerical classes (see Table 2); each was presented twice within a block of 16 trials. Based on previous research, once these two relations are established, the participants should correctly respond in the presence of all combinations of these stimuli (BA, CB, AC, and CA) as consistent with symmetry and transitivity (Debert et al., 2007).
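A 16-trial block of this kind, ordered under the restriction stated in the Materials section that neither “same” nor “different” compounds appear on three consecutive trials, could be produced as in the following sketch. The trials in the study were pre-ordered by the author and entered into Mestre, so this is an illustration under assumptions rather than the procedure actually used: the function names, the seed, and the shuffle-and-reject approach are all hypothetical.

```python
import random

# Level 1 compounds and their trained vocal labels (from Table 2).
LEVEL1 = {
    "A1B1": "same", "A1B2": "different", "A2B1": "different", "A2B2": "same",
    "B1C1": "same", "B1C2": "different", "B2C1": "different", "B2C2": "same",
}


def max_run(labels):
    """Length of the longest run of identical consecutive labels."""
    longest = run = 0
    previous = None
    for label in labels:
        run = run + 1 if label == previous else 1
        previous = label
        longest = max(longest, run)
    return longest


def make_block(compounds, repeats=2, max_consecutive=2, seed=None):
    """Shuffle until no trial type occurs more than max_consecutive times in a row."""
    rng = random.Random(seed)
    trials = [name for name in compounds for _ in range(repeats)]
    while True:
        rng.shuffle(trials)
        if max_run([compounds[t] for t in trials]) <= max_consecutive:
            return list(trials)


block = make_block(LEVEL1, seed=1)
print(len(block))  # 16
```

Shuffle-and-reject is used here only because it is easy to verify; with eight trials of each type a valid order is found after a handful of shuffles on average.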
At the beginning of the session, the experimenter read the following script to the adult participants:

You will see a blue square on the screen. When I click on it another image will appear. I will tell you to “Say same” or “Say different” and praise you once you say it. You are free to guess at any time, but soon I will not tell you what to say right away. I will correct you if your answer is wrong and praise you if your answer is correct. The harder you try, the faster this will go. Please repeat back these instructions to me.

Once the participant had repeated the instructions with satisfactory accuracy, they were allowed to initiate trials. As described to the participants, a blue square was shown in the top left corner of the screen. The experimenter held the mouse in this phase to control the pacing of the trials. The experimenter clicked on the blue square, then a compound stimulus appeared on the top right of the screen. Initially, a zero-second delay was utilized until the participant correctly echoed the experimenter on 100% of the trials within a block of 16. Once a correct, unprompted response was emitted, unprompted responses were differentially reinforced following the procedures suggested by Karsten and Carr (2009) to promote independent responding. Unprompted correct responses were reinforced with praise on 100% of the trials, while prompted responses received no praise. A progressive prompt delay (Touchette, 1971) was instituted such that each time the participant responded correctly, prior to the prompt or with the prompt, on 80% of the trials, the prompt was delayed one second. When the participant emitted an incorrect response prior to the prompt the experimenter stated, “This one is same [different]. Repeat same [different].” No praise was provided for these corrected responses. When participants
When participants 81 Table 2 Level 1 Compound Stimuli Designations and Responses Stimulus Relation Vocal Response A1B1 Equivalent “Same” A1B2 Non-Equivalent “Different” A2B1 Non-Equivalent “Different” A2B2 Equivalent “Same” B1C1 Equivalent “Same” B1C2 Non-Equivalent “Different” B2C1 Non-Equivalent “Different” B2C2 Equivalent “Same” responded correctly on 100% of the trials, in the absence of the prompt, for two consecutive blocks, they progressed. Level 1 Vocal Label Test To ensure that the responding observed in the Level 1 Vocal Label Training could be sustained under a thinner schedule of reinforcement, the participants were exposed to the same stimuli in a test format. In the final two trial blocks of the Level 1 Vocal Label Training, the participants were receiving reinforcement on 100% of the trials, as they would continue in this phase until 100% correct responding occurred. However, in later tasks, reinforcement would only be delivered on only maintenance trials making up a maximum of 20% of the total trials. Thus, the participants were exposed to the Level 1 82 Vocal Label Test condition to assess for stabilization of responding when reinforcement was only delivered contingent upon responding to maintenance trials interspersed after four test trials. The participants were told the following at the beginning of the task: You will see a blue square on the screen. When I click on it another image will appear. You will say if the image is “Same” or “Different” but I will not tell you if you’re right or wrong. Every now and then you will see a common image from the previous task. Label those images by name. The harder you try, the faster this will go. Can you repeat that back to me? Once the participant repeated the instructions with sufficient accuracy, they were allowed to begin the test. The images appeared in the same manner as in Level 1 Vocal Label Training. 
The only difference between the conditions was the inclusion of maintenance trials, consisting of the common images used in the Pre-Training, and the modified provision of reinforcement. If responding was below 100% within a trial block, the participant returned to the Level 1 Vocal Label Training condition. They were required to meet the mastery criterion of this condition prior to attempting the Level 1 Vocal Label Test again. This process continued as many times as was necessary for 100% correct responding to occur within the condition. Once this criterion was met, the Level 1 Analogy Test began.

Level 1 Analogy Test

This phase was designed to assess if the prior vocal label training transferred across repertoires and if these relations were sufficient to produce matching responding consistent with the completion of formal analogies. The naming account predicts that once the vocal response is trained, the selection response should emerge without additional training (Horne & Lowe, 1996). Thus, the participants should be able to look at, point to, or otherwise select the corresponding compound in response to a verbal stimulus of “same” or “different.” Research examining the role of the naming operant has shown that intact labeling and selection responding are both needed for participants to successfully categorize stimuli by their common name (e.g., Miguel et al., 2008). In this way, the response topography required within the analogy testing can be considered a similar performance to categorization as it is hypothesized that the participants will need to both vocally label the sample “same” or “different,” and then select the corresponding comparison that is also “same” or “different.” The experimenter read the following instructions to the participants prior to allowing them to begin:

You will see a blue square in the corner of the screen. When you click on this square an image will appear next to it.
When you click on this image, two more images will appear below. Click on the one below that goes best with the one above. I will not tell you if you’re right or wrong. Sometimes when you click on the image, only blue squares will be shown below. When that happens, say whether the image you see is “Same” or “Different” then click any of the squares. The harder you try, the faster this will go. Please repeat back these instructions to me.

Once the participant repeated the instructions with satisfactory accuracy, they were allowed to begin testing. As described above, a blue square appeared in the upper left corner of the screen. Selecting this square served as the observing response to ensure that the participant was attending to the task. Once the participant clicked on the square, a compound sample stimulus appeared in the upper right corner of the computer screen. When the participant clicked on this image, two comparison compounds appeared below. Of the two comparison stimuli, one had terms from within the same class and one had terms from different classes. Matching the sample and corresponding comparison was considered a correct response, though no reinforcement or feedback was provided. More specifically, when presented with a sample consisting of same terms, a correct response consisted of selecting the comparison with same terms. Conversely, when presented with a sample consisting of different terms, a correct response consisted of selecting the comparison with different terms. These performances were considered demonstrations of analogy responding with the sample acting as the A:B component and the comparison functioning as the C:D component. Essentially, the participants were forming analogies of either same:same or different:different. Importantly, the sample stimulus had no terms in the same positions as the comparisons to prevent non-arbitrary relations between the stimuli from interfering.
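The scoring rule for an analogy trial can be expressed compactly. The sketch below is illustrative only: compounds are represented as pairs of designation strings, the function names are hypothetical, and the position check reflects one reading of the constraint above (no figure occupies the same left or right position in both the sample and a comparison). The example trial is constructed for illustration and is not taken from the study’s trial lists.

```python
def relation(compound):
    """Return 'same' if both terms share a class number, else 'different'."""
    left, right = compound
    return "same" if left[1] == right[1] else "different"


def correct_comparison(sample, comparisons):
    """Pick the comparison whose relation matches the relation of the sample."""
    matches = [c for c in comparisons if relation(c) == relation(sample)]
    if len(matches) != 1:
        raise ValueError("exactly one comparison must share the sample's relation")
    return matches[0]


def shares_position(sample, comparison):
    """True if a figure sits in the same (left or right) position in both compounds."""
    return sample[0] == comparison[0] or sample[1] == comparison[1]


# Hypothetical Level 1 trial: a 'same' sample with one 'same' and one 'different' comparison.
sample = ("A1", "B1")
comparisons = [("B2", "C2"), ("B1", "C2")]
answer = correct_comparison(sample, comparisons)
print(answer)  # ('B2', 'C2')
print(any(shares_position(sample, c) for c in comparisons))  # False
```

Because correctness depends only on the relation within each compound, the same rule scores same:same and different:different trials at every level.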
Participants were exposed to all possible stimulus combinations within those sets (see Table 3) with the positions of the correct comparisons randomized so as not to favor either the left or right. Each block consisted of 16 test trials with the order randomized and with a maintenance trial interspersed after every four test trials. Maintenance trials consisted of Level 1 Vocal Label Training trials; correct responses were reinforced to create a possible overall rate of reinforcement of 20% within each trial block. The mastery criterion was 95% correct responding or higher within one trial block, while the failure criterion was less than 95% correct responding in three consecutive trial blocks. Upon mastery, the participant was exposed to the Level 2 Vocal Label Test.

Table 3
Compound Stimuli Designations by Level
Designations | Same | Different
Level 1 | A1B1 A2B2 B1C1 B2C2 | A1B2 A2B1 B1C2 B2C1
Level 2 | B1A1 B2A2 C1B1 C2B2 | B1A2 B2A1 C1B2 C2B1
Level 3 | A1C1 A2C2 C1A1 C2A2 | A1C2 A2C1 C1A2 C2A1

Level 2 Vocal Label Test

This task was included to assess for the emergence of vocal labeling in the presence of the novel compounds, BA and CB, which was hypothesized to facilitate analogy responding. The task was presented in the same manner as in the Level 1 Vocal Label Test, except with the use of the compounds related via symmetry and the inclusion of Level 1 stimuli as the maintenance trials. Progression through the experiment was not contingent upon responding in this phase. However, response patterns were used to generate remedial training conditions on a participant-by-participant basis. Following one exposure to this task the participant moved on to the Level 2 Analogy Test.

Level 2 Analogy Test

This task was identical to the Level 1 Analogy Test, but with the inclusion of Level 2 stimuli. This condition assessed if the Level 1 Vocal Label Training was sufficient in producing emergent analogy responding with symmetry compounds, BA and CB.
It was hypothesized that failures in the Level 2 Vocal Label Test would be correlated with errors in the Analogy Test as the ability to identify the images as either “same” or “different” would be foundational to the completion of the analogies. The participant was allowed up to three attempts on the task before he/she was exposed to remedial training. Once mastery criteria were met the participant progressed to the Level 3 Vocal Label Test.

Level 3 Vocal Label Test

This task was included to assess for the emergence of vocal labeling in the presence of CA and AC compounds as consistent with transitivity. The ability to vocalize whether these compounds were “same” or “different” was assessed along with the completion of analogies with the same stimuli in the Level 3 Analogy Test. Error patterns were compared across the tasks to develop remedial tasks for each participant. The task was presented in the same format as the Level 2 Vocal Label Test. Following one exposure to this task the participant moved on to the Level 3 Analogy Test.

Level 3 Analogy Test

This task was identical to the Level 2 Analogy Test, but with the inclusion of Level 3 stimuli. The participant was allowed up to three attempts to achieve a passing score. The participants progressed to the Component Matching Posttest regardless of failure or passing; however, if failure occurred the participants would later be exposed to remedial training.

Component Matching Post-test

The purpose of this phase was to determine if the Level 1 Vocal Label Training was sufficient to also establish relations among the individual terms of the compounds as is consistent with the separable compounds account (Stromer, McIlvane, & Serna, 1993). The separable compounds account as described by Stromer et al. (1993) suggests that stimulus-stimulus relations of this nature could emerge as a by-product of the teaching procedures utilized within the current study.
This procedure tested for the emergence of equivalence among the components of the compound stimuli; essentially, whether the participants derived the separate classes of stimuli based on the differential tact procedure. It is important to note that the protocol recommended by Stewart and colleagues (2009) began with teaching equivalence among components and ended with testing for equivalence-equivalence. Thus, the current protocol sought to establish the same skills as the previous literature through a different course of training and testing. When the participants were seated in front of the computer, they were read the following instructions:

You will see an image in the top left corner of the screen, when you click it you will hear “Select same” or “Select different” and a blank square will appear. When you click the blank square two more images will appear. Pick the one that goes best with the one above. I will not tell you if you’re correct or incorrect. Sometimes there will be trials with the familiar images, respond to these as you did before. The harder you try, the faster this will go. Please repeat back these instructions to me.

When the participants repeated the instructions with reasonable accuracy, they were allowed to begin the testing. Testing in the Component Matching Post-test was identical to the Component Matching Baseline Pre-test. The components of the compounds for each of the three classes were presented individually on the computer screen as the sample and comparison stimuli. The sample stimulus appeared in the top left corner of the screen. When the participant selected the figure, the participant heard a vocal instruction, previously recorded and played automatically by the computer program, to “Select same” or “Select different.” At the same time, a blank square appeared next to the sample stimulus. Following an observing response to this blank square, the comparison stimuli appeared below.
When the instruction “Select same” was given, a response of selecting the comparison from within the same numerical class was considered correct. When the instruction “Select different” was given, a response of touching the comparison from the other numerical class was considered correct. Each unitary stimulus served as the sample on four trials with Class 1 (A1, B1, and C1) and Class 2 (A2, B2, and C2) stimuli randomly mixed within the 24-trial block. Examples of these trials are shown in Table 4. The position of the correct comparison was counterbalanced across sample presentations to prevent position bias. One trial block conducted with responding 91% or higher was considered mastery. The participant could attempt a maximum of three trial blocks before remedial training began. A maintenance trial with the common images was interspersed every four trials to promote attending within the testing session. Only correct responses on these trials were reinforced.

Table 4
Component Stimuli Designations and Example Trials
Sample | Instruction: “Select…” | Comparisons
A1 | “same” | B1(+) B2(-)
A1 | “different” | B1(-) B2(+)
A1 | “same” | C1(+) C2(-)
A1 | “different” | C1(-) C2(+)
B1 | “same” | A1(+) A2(-)
B1 | “different” | A1(-) A2(+)
B1 | “same” | C1(+) C2(-)
B1 | “different” | C1(-) C2(+)
C1 | “same” | A1(+) A2(-)
C1 | “different” | A1(-) A2(+)
C1 | “same” | B1(+) B2(-)
C1 | “different” | B1(-) B2(+)

Remedial Component Matching Training
When a participant failed any of the Analogy tests or the Component Matching Post-test, they were exposed to the Remedial Component Matching Training (RCMT). The failure to derive the Level 3 relations was hypothesized to be due to the failure to view the components of the trained compounds as distinct from one another, leading to failures to form equivalence classes as supposed from the separable compounds account (Stromer et al., 1993). The Component Matching Test was developed to address these difficulties by presenting the A-B and B-C relations individually.
The components of the compounds were presented as in the Component Matching tests; however, this training included only the A-B and B-C relations, as these were the Level 1 relations. Following the observing response to the blue square, a component stimulus, either A or B, appeared on the screen. Once the component was selected, two comparison stimuli appeared below on the screen. The experimenter prompted the participant to select the comparison from within the same numerical class as the sample by pointing from the sample to the comparison. The prompting, error corrections, and reinforcement were provided in the same manner as in the Level 1 Vocal Label Training. As in that condition, participants were required to achieve 100% correct responding across two consecutive blocks of 16 trials before they could return to the Level 3 Vocal Label Test. Participants were re-exposed to the Level 3 Vocal Label Test, as it was hypothesized that improvements in the speaker behavior repertoire would be correlated with improvements in the selection-based tasks. Following this test, the participants were exposed to the tasks that they had previously failed in the same manner as before.

Remedial Level 3 Vocal Label Training
When a participant continued to display failure following the first remedial training, they were directly trained to emit the correct vocal labels for the Level 3 stimuli in the Remedial Level 3 Vocal Label Training (RL3VLT). It was hypothesized that this training would produce the Level 3 Analogy responding as consistent with naming (Miguel et al., 2008) and the Component Matching as consistent with the separable compounds account (Stromer et al., 1993). The Level 3 compounds were presented in the same manner as in the Level 1 Vocal Label Training, with the same prompting, error correction, and reinforcement procedures as in that condition.
Following two consecutive blocks of 16 trials with 100% correct unprompted responding, the participants were exposed to the previously failed tasks. If failure continued to be displayed, the individual’s participation in the study would be considered over, and they would receive their gift card and be debriefed.

Experiment 2 Procedure
Based on analysis of Experiment 1’s data, a second experiment was developed to address limitations of the procedure. Specifically, the remedial task presentation was not identical across PT-6 and PT-7, but tailored to their reports and analysis of their data. However, for this procedure to be applied in a variety of settings and by educators of various backgrounds, these tasks should be presented in a systematic manner. Additionally, during testing many of the participants vocalized surprise when the Level 3 compounds were presented due to initial unfamiliarity with the images (e.g., “These are different!” “Where are the other ones?”). It was hypothesized that increased exposure to the software with familiar images would prepare participants for this element of the experimental procedure. The participants could generalize the strategies used with the familiar images to the arbitrary stimuli. Lastly, PT-3 and PT-5 required several exposures to the Component Matching Post-test before mastery was demonstrated, which was correlated with their additional exposure to the stimuli in the Baseline Pre-test. These participants were exposed to double the number of component matching trials before starting the Level 1 Vocal Label Training, which could have led to the development of spurious stimulus-stimulus relations. More explicitly, during the initial Pre-testing, the participants may have been forming rules or relations regarding the class membership of the images, which endured despite the training and testing conditions.
These faulty, participant-generated relations could have influenced responding during the Post-testing, thus increasing the number of exposures to the task needed for the correct, trained relations to be displayed. These faulty relations may then have influenced their ability to display correct class relations in the Post-test. To address this issue, the baseline condition was changed. The tasks and order of presentation for Experiment 2 are shown in Table 5.

Pre-Training
As in Experiment 1, the images presented in this task consisted of common images related based on actual category membership in order to familiarize the participants with the procedure without having to train arbitrary relations (see Figure 2 for images and classes). As opposed to Experiment 1, the Pre-Training condition was extended to approximate every condition in the experiment proper. Thus, compounds of the images were presented with the same level designations as the abstract stimuli. Level 1 compounds consisted of the AB and BC relations, Level 2 compounds consisted of the BA and CB relations, and Level 3 compounds consisted of the AC and CA relations. The stimuli were trained and tested in the same sequence as the abstract stimuli in Experiments 1 and 2, with the same mastery criteria across all phases. Once a participant passed all tasks, they immediately began the experiment with abstract images. If a participant failed to master the tasks with the common images, they would not continue in the study.
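As an illustrative aid (not part of the study's software), the level designations described above can be enumerated programmatically. The following is a minimal Python sketch; the names `LEVEL_PAIRS` and `compounds_for_level` are hypothetical, and the sketch assumes only what the text states: two classes, three members per class, and a compound designated “same” when both of its terms come from the same class.

```python
from itertools import product

# Two stimulus classes (1 and 2), each with members A, B, and C.
CLASSES = (1, 2)

# Ordered letter pairs by level, following the text:
# Level 1 = AB and BC (trained), Level 2 = BA and CB (symmetry),
# Level 3 = AC and CA (transitivity and its reversal).
LEVEL_PAIRS = {
    1: [("A", "B"), ("B", "C")],
    2: [("B", "A"), ("C", "B")],
    3: [("A", "C"), ("C", "A")],
}


def compounds_for_level(level):
    """Return the compounds of one level, grouped by designation.

    A compound such as "A1B2" is "same" when both terms share a class
    number and "different" otherwise.
    """
    grouped = {"same": [], "different": []}
    for first, second in LEVEL_PAIRS[level]:
        for class_1, class_2 in product(CLASSES, repeat=2):
            name = f"{first}{class_1}{second}{class_2}"
            key = "same" if class_1 == class_2 else "different"
            grouped[key].append(name)
    return grouped


if __name__ == "__main__":
    for level in (1, 2, 3):
        print(level, compounds_for_level(level))
```

Each level yields four “same” and four “different” compounds, which matches the eight compounds per level listed in Table 3.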
Table 5
Pre-Training Task Order in Experiment 2
Stimuli | Level | Task | Trials Per Block | Mastery Criterion
Common Images |  | Pre-Training |  |
Abstract Images | Level 3 Compounds | Vocal Label Pretest | 16 |
Abstract Images | Level 1 Compounds | Vocal Label Training | 16 | Two consecutive trial blocks of 100% correct responding
Abstract Images | Level 1 Compounds | Vocal Tact Test | 16 | One trial block of 100% correct responding
Abstract Images | Level 1 Compounds | Analogy Test | 16 | One trial block of 93% correct responding
Abstract Images | Level 2 Compounds | Vocal Label Test | 16 |
Abstract Images | Level 2 Compounds | Analogy Test | 16 | One trial block of 93% correct responding
Abstract Images | Level 3 Compounds | Vocal Label Test | 16 |
Abstract Images | Level 3 Compounds | Analogy Test | 16 | One trial block of 93% correct responding
Abstract Images | All | Component Matching Test | 24 | Two trial blocks with 92% correct responding

Level 3 Vocal Label Pretest
As in Experiment 1, a multiple probe design was utilized to demonstrate each participant’s prior lack of familiarity with the stimuli and the validity of the independent variable. The first participant in the dyad was required to complete a minimum of one block of 16 trials. If the participant displayed responding above chance level (60% or higher), they were required to complete additional blocks to assess for stabilization of this performance. The second participant in the dyad was required to complete one more block than the first participant to demonstrate the strength of the prediction that continued exposure to the task would not lead to mastery of the relations. If either participant’s responding trended upward or stabilized above 75%, they did not continue in the study. The task was conducted in a manner nearly identical to the Level 3 Vocal Label Test in Experiment 1, with the exception that the maintenance trials were composed of the common image compounds from Pre-Training. Previously the maintenance trials had required a different response type (e.g., component matching) than the tested response (e.g., vocal labeling).
This switch in repertoires was hypothesized to inhibit generalization from the Pre-Training. Following this condition, the participants were immediately exposed to the Level 1 Vocal Label Training.

Level 1 Vocal Label Training
This condition was the same as Experiment 1.

Level 1 Vocal Label Test
This condition was the same as Experiment 1.

Level 1 Analogy Test
This condition was the same as Experiment 1.

Level 2 Vocal Label Test
This condition was the same as Experiment 1.

Level 2 Analogy Test
This condition was the same as Experiment 1.

Level 3 Vocal Label Test
To allow for a more thorough examination of error patterns, this condition was modified slightly from Experiment 1. If participants accurately vocally labeled “same” and “different” on fewer than 75% of the trials, the test was repeated a second time. This repetition allowed the experimenters to examine whether errors made on the first attempt continued on the second attempt and, furthermore, whether these errors were related to those made in later testing. Secondly, if the initial failure was due to surprise evoked by the AC and CA relations, the additional practice could create the opportunity for improved performance to occur. The participants could habituate to the new images, and then engage their trained repertoires to emit correct vocal labels. The 75% cut-off was determined based on results from Experiment 1; all participants who scored at or above this proficiency passed the Level 3 Analogy Test and Component Matching tests. Following the second exposure to the condition, the participants proceeded to the Level 3 Analogy Test as in Experiment 1.

Level 3 Analogy Test
This condition was the same as Experiment 1.

Component Matching Test
This condition was the same as Experiment 1.

Remedial Component Matching Training
This condition was the same as Experiment 1.

Remedial Level 3 Vocal Label Training
This condition was the same as Experiment 1.
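The trial-block structure shared by the test conditions (16 test trials in random order, with a maintenance trial interspersed after every four test trials) can be sketched as follows. This is a hypothetical illustration, not the software used in the study: the function name `build_test_block`, its parameters, and the choice to present each of the eight Level 3 compounds twice are assumptions made for the example.

```python
import random


def build_test_block(test_trials, maintenance_trials, every=4, seed=None):
    """Shuffle the test trials and add a maintenance trial after every
    `every` test trials.

    Returns a list of (trial_type, stimulus) tuples.
    """
    rng = random.Random(seed)
    order = list(test_trials)
    rng.shuffle(order)  # randomize test-trial order within the block

    block = []
    for count, stimulus in enumerate(order, start=1):
        block.append(("test", stimulus))
        if count % every == 0:
            # Maintenance stimuli are drawn at random (an assumption).
            block.append(("maintenance", rng.choice(maintenance_trials)))
    return block


if __name__ == "__main__":
    # Assumption: each of the eight Level 3 compounds is presented twice.
    level_3 = ["A1C1", "A2C2", "C1A1", "C2A2",
               "A1C2", "A2C1", "C1A2", "C2A1"] * 2
    # Hypothetical names standing in for familiar maintenance stimuli.
    maintenance = ["familiar_1", "familiar_2"]

    block = build_test_block(level_3, maintenance, seed=1)
    n_maintenance = sum(1 for kind, _ in block if kind == "maintenance")
    print(len(block), n_maintenance)
```

With 16 test trials, the block contains 4 maintenance trials and 20 trials in total, so reinforcement is available on at most 4 of 20 trials, which is the 20% rate described for the test blocks.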
Chapter 4
RESULTS
In this chapter, findings from Experiment 1 and Experiment 2 are presented in sequential order. First, quantitative data on demographic information collected from the participants are presented, including age, gender, degrees held, and degree area. Second, quantitative data related to participants’ performances on tasks are presented, including pretesting, component matching, trials to criteria, vocal label tests, and analogy tests. Lastly, qualitative data related to participants’ verbal self-reports are presented and organized according to the problem-solving strategies employed.

Experiment 1 Participants’ Demographic Information
Information related to the participants’ ages, gender, degree completion, area of study, and grade point average (GPA) is shown in Table 6.

Table 6
Experiment 1 Participant Characteristics
Demographic Information | PT-2 | PT-3 | PT-4 | PT-5 | PT-6 | PT-7
Age | 25 | 31 | 23 | 23 | 24 | 31
Gender | M | M | F | F | F | F
Degree Progress | B.A. B.A. B.A. 2.5 years A.A. 2.5 years B.A. M.A.
Degree Area | Psychology | English | Psychology | General Education | Psychology | Behavior Analysis
GPA | 3.72 | 2.75 | 3.0 | 2.61 | 3.2 | 3.75

Experiment 1 Participants’ Task Performance
Participants’ performances across test conditions are displayed in Figures 3, 4, and 5. PT-2’s results are displayed in the top panel of Figure 3; PT-3’s results are displayed in the bottom panel of Figure 3. PT-4’s results are shown in the top panel of Figure 4; PT-5’s results are shown in the bottom panel of Figure 4. PT-6’s results are displayed in the top panel of Figure 5; PT-7’s results are displayed in the bottom panel of Figure 5. Performances on individual tasks are discussed in detail in the sections below.

Figure 3. PT-2’s and PT-3’s Performance Across Test Conditions. PT-2’s data are displayed in the top panel; PT-3’s in the bottom panel. Stimuli levels are differentiated by shape: squares for Level 1, triangles for Level 2, circles for Level 3, and diamonds for Component Matching.
Filled shapes for analogy tests, open for vocal label (VL) tests.

Figure 4. PT-4’s and PT-5’s Performance Across Test Conditions. PT-4’s data are displayed in the top panel; PT-5’s in the bottom panel. Stimuli levels are differentiated by shape: squares for Level 1, triangles for Level 2, circles for Level 3, and diamonds for Component Matching. Filled shapes for analogy tests, open for vocal label (VL) tests.

Figure 5. PT-6’s and PT-7’s Performance Across Test Conditions. PT-6’s data are displayed in the top panel; PT-7’s in the bottom panel. Stimuli levels are differentiated by shape: squares for Level 1, triangles for Level 2, circles for Level 3, and diamonds for Component Matching. Filled shapes for analogy tests, open for vocal label (VL) tests.

Pre-Training
All participants demonstrated success on the Pre-Training within one to two attempts. PT-2, PT-5, PT-6, and PT-7 achieved 100% correct upon first exposure to the test. PT-3 and PT-4 made one error on first exposure, yielding a score of 92% correct. Upon their second attempt, both participants responded correctly on 100% of trials.

Component Matching
Table 7 displays results from the Component Matching Pre-tests and Post-tests while Figure 6 shows the errors made during the Post-tests.

Table 7
Experiment 1 Results: Component Matching
Component Matching Pretest PT-2 PT-3 PT-4 PT-5 PT-6 PT-7 38% 42% 29% 33% 38% 58% 38% Component Matching Post-test 100% 83% 100% Post-Remedial Component Matching Training 33% 100% 79% 50% X 92% 80% X 67% 100% Post-Remedial Level 3 Vocal Label Training 96%
Note: The X indicates that the trial block was not conducted due to the participant stating they did not believe their performance would improve and did not wish to repeat the task.

Pre-test. No participant displayed sustained responding above chance level, nor were improvements in percentage of correct responses observed when the test was repeated.
PT-3, PT-5, and PT-7 received an additional trial block as compared to their dyad members (PT-2, PT-4, and PT-6, respectively). Upon first exposure to the task, the participants demonstrated percentages at or below chance level: PT-2 with 38%, PT-3 with 42%, PT-4 with 29%, PT-5 with 33%, PT-6 with 38%, and PT-7 with 58%. No improvements were observed on the second exposure to the task, with PT-3 achieving 38%, PT-5 achieving 33%, and PT-7 achieving 50%. Their failure to increase proficiency upon repeated exposures to the stimuli confirmed the prediction that practice alone would not lead to acquisition of the relations in the absence of specific training.

Post-test. A correlation between length of pre-test exposure and the number of blocks needed to meet mastery criteria of the condition was observed with both PT-3 and PT-5. Both participants were exposed to two blocks of 24 test trials prior to the Level 1 Vocal Label Training, and both of these participants required two attempts to score above 91% correct on the post-test; PT-3 scored 83% upon first exposure (100% on second) and PT-5 scored 79% upon first exposure (92% on second). The errors made in the Post-test had a strong correlation with errors from the Pre-test (gray columns), as shown in Figure 6. Of PT-3’s five errors during the Post-test, four occurred on relations that were failed in the Pre-tests (80%). All seven of PT-5’s errors (100%) were made on relations that were also failed during the Pre-tests. This error correlation was the strongest for both participants, though the correlation with relation type (white columns) was also notable. Level 3 relations accounted for three (60%) of PT-3’s errors and seven (100%) of PT-5’s errors. Failures during the Level 3 Vocal Label Test could account for two of PT-3’s errors (40%) and five of PT-5’s errors (71%).
Their dyad counterparts, PT-2 and PT-4, passed the Post-test with correct responses on all trials, having only been exposed to these stimuli for one block in the initial pre-test. PT-6 was not exposed to the Component Matching Post-test immediately following failure on the Level 3 Analogy Test, but should have been to assess whether the Level 1 and Level 2 relations were demonstrated when the figures were presented alone.

Figure 6. Component Matching Error Correlations in Experiment 1. Total errors (y-axis, Frequency of Errors) on Component Matching Post-test(s) shown across participants (x-axis) in Experiment 1. From left to right: black columns are total errors, white columns are errors on Level 3 relations, charcoal columns are errors on Level 1 and Level 2 relations, light gray columns are errors correlated with vocal label errors, and gray columns are errors correlated with Component Matching Pre-test errors.

The Component Matching Post-test was conducted with PT-6 following the Remedial Component Matching Training (RCMT); an overall score of 67% was achieved with 100% of the errors occurring on Level 3 relations. Only after the Remedial Level 3 Vocal Label Training (RL3VLT) did PT-6 demonstrate A-C and C-A relations, scoring 96% overall. Similarly, PT-7 scored 80% on the Post-test, with all errors occurring on Level 3 relations. However, following RCMT, PT-7 demonstrated emergence of these untrained, transitive relations as evidenced by the score of 100% on the task. The errors on the Post-test were also strongly correlated with failures during the Vocal Label Test for PT-6 and PT-7. PT-6 failed to vocally label 100% of the relations that were later failed during the Post-test, and PT-7 failed to label 80% of the relations. Errors made during the Pre-test accounted for only 60% of the Post-test errors for PT-6, but 80% of the errors for PT-7.
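The error comparisons reported above express, for each participant, the share of Post-test errors that fell on relations also failed in an earlier condition. A minimal sketch of that calculation is given below; the function name `percent_accounted_for` and the relation labels are hypothetical and do not reproduce any participant's actual trial data.

```python
def percent_accounted_for(posttest_errors, earlier_failures):
    """Percentage of post-test errors made on relations that were also
    failed in an earlier condition (e.g., the Pre-test or a Vocal Label
    Test).

    Returns None when there are no post-test errors to account for.
    """
    if not posttest_errors:
        return None
    failed_before = set(earlier_failures)
    overlap = sum(1 for relation in posttest_errors
                  if relation in failed_before)
    return 100 * overlap / len(posttest_errors)


if __name__ == "__main__":
    # Hypothetical example: five post-test errors, four of which fall on
    # relations that were also failed earlier.
    posttest = ["A1-C1", "C1-A1", "A2-C2", "C2-A2", "A1-B1"]
    pretest = ["A1-C1", "C1-A1", "A2-C2", "C2-A2", "B2-C2"]
    print(percent_accounted_for(posttest, pretest))
```

In this hypothetical case four of five errors overlap, giving 80%, the same form of figure as the "four of five errors (80%)" comparison reported for the Post-test.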
Training Trials to Criteria
Table 8 displays the total number of trials required to meet mastery criteria across the Level 1 Vocal Label Training, RCMT, and Level 3 Vocal Label Training.

Table 8
Experiment 1 Results: Trials to Criteria During Training
Tasks | PT-2 | PT-3 | PT-4 | PT-5 | PT-6 | PT-7
Level 1 Vocal Label Training Blocks | 144 | 176 | 336 | 240 | 176 | 208
Remedial Component Matching Training |  |  |  |  | 64 | 48
Remedial Level 3 Vocal Label Training |  |  |  |  | 48 |

Level 1 Vocal Label Training. All participants passed the Level 1 Vocal Label Training within 144-336 trials. The number of trials to meet mastery criteria in this task did not correlate with success or failure on later testing. PT-4 required the most trials (336), but passed all later tests in the absence of retraining. PT-2 required the fewest trials (144), followed by both PT-3 and PT-6 (176), PT-7 (208), and PT-5 (240). While PT-6 required the second lowest number of trials to meet criteria, the participant required two additional types of training to demonstrate analogy and component matching responding with all class members. PT-7 also required remedial training to pass all tests, but required only an average number of trials, 208, in the Level 1 Vocal Label Training.

Component Matching Training. PT-6 and PT-7 were the only participants to be exposed to the RCMT. Both passed with relative ease, requiring only 64 and 48 trials, respectively. PT-7 went on to pass the analogy and component matching tests following this training, while PT-6 continued to fail.

Remedial Level 3 Vocal Label Training. PT-6 was the only participant in Experiment 1 to be exposed to the RL3VLT. PT-6 required 48 training trials to meet criteria on the eight relations, substantially fewer than to master the Level 1 relations.

Vocal Label Tests
Table 9 displays results from the Level 1, Level 2, and Level 3 Vocal Label Tests.

Level 1. All participants passed the Level 1 Vocal Label Test with 100% correct responding on their first attempt.
This was the only Vocal Label Test to require 100% correct responding for progression to further tasks, but all participants met this criterion on first exposure.

Level 2. All participants passed the Level 2 Vocal Label Test with 100% correct responding on their first attempt.

Level 3. Responding differed substantially on the Level 3 Vocal Label Test, as shown in Table 9. PT-4 achieved the highest score of 94%, followed by PT-5 with 88%, PT-2 with 81%, PT-3 with 75%, and PT-6 and PT-7 with only 38% each. PT-7’s score improved dramatically following the RCMT, reaching 94% correct. PT-6’s score worsened after the RCMT, falling to 13%. During both exposures to the test, PT-6 made errors with all eight stimuli, but the highest number of errors occurred with the A1C1 and C1A1 relations (5 and 4, respectively). PT-7 never made errors with the C1A2 and A2C1 stimuli, but did so on all others.

Table 9
Experiment 1 Results: Vocal Label Tests
Test | PT-2 | PT-3 | PT-4 | PT-5 | PT-6 | PT-7
Level 1 | 100% | 100% | 100% | 100% | 100% | 100%
Level 2 | 100% | 100% | 100% | 100% | 100% | 100%
Level 3 | 81% | 75% | 94% | 88% | 38% | 38%
Level 3, Post-Remedial Component Matching Training |  |  |  |  | 13% | 94%

Analogy Test
Table 10 displays results from the Level 1, Level 2, and Level 3 Analogy Tests. Figure 7 shows how errors on the Level 3 Analogy Test corresponded to various error patterns and to Vocal Label Test failures.

Table 10
Experiment 1 Results: Analogy Tests
Test | PT-2 | PT-3 | PT-4 | PT-5 | PT-6 | PT-7
Level 1 | 94% | 100% | 100% | 100% | 100% | 100%
Level 2 | 94% | 100% | 100% | 100% | 100% | 100%
Level 3 | 88%, 81%, 100% | 100% | 88%, 75%, 100% | 94% | 56%, 44%, 50% | 63%, X, X
Level 3, Post-Remedial Component Matching Training |  |  |  |  | X | 94%
Level 3, Post-Remedial Level 3 Vocal Label Training |  |  |  |  | 100% |
Note: The X indicates that the trial block was not conducted due to the participant stating they did not believe their performance would improve and did not wish to repeat the task.

Level 1. All participants passed on first exposure. PT-3, PT-4, PT-5, PT-6, and PT-7 scored 100% correct, while PT-2 made one error and scored 94%.
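The reported percentages follow directly from the number of correct trials in a block; for example, one error in a 16-trial block is 15/16 = 93.75%, reported as 94%. The sketch below shows this conversion. The function name `percent_correct` is hypothetical, and rounding halves upward is an assumption inferred from the reported values (e.g., 10/16 = 62.5% appearing as 63%).

```python
def percent_correct(correct, total):
    """Whole-number percentage of correct trials, with halves rounded up.

    Integer arithmetic is used because Python's built-in round() rounds
    halves to the nearest even number (round(62.5) == 62), which would
    not reproduce a reported score of 63% for 10 of 16 trials.
    """
    if total <= 0:
        raise ValueError("total must be a positive number of trials")
    if not 0 <= correct <= total:
        raise ValueError("correct must be between 0 and total")
    # floor(100 * correct / total + 0.5), computed without floats
    return (200 * correct + total) // (2 * total)


if __name__ == "__main__":
    for correct, total in [(15, 16), (14, 16), (10, 16), (22, 24), (20, 24)]:
        print(f"{correct}/{total} -> {percent_correct(correct, total)}%")
```

Under this assumption, 16-trial blocks can only yield scores in steps of 6.25 percentage points and 24-trial blocks in steps of about 4.17, which is why values such as 94%, 88%, 92%, and 83% recur in the tables.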
Level 2. All participants passed the Level 2 Analogy Test upon first exposure. PT-2 scored the lowest with 94% correct, while all other participants scored 100%.

Level 3. PT-3 and PT-5 passed on first exposure, scoring 100% and 94%, respectively. PT-2 required three attempts before achieving mastery criteria (88%, 81%, then 100%), consistently failing trials in which the correct comparison was a reversal of the sample (e.g., A1C1 to C1A1). Johnson and Sidman (1993) described this pattern of behavior as “responding away” from a comparison in the presence of a certain sample. For example, in the presence of A1C1, the participant responds away from the C1A1 to select the C1A2. This error pattern, to be called the visual reject relation, accounted for 100% of PT-2’s failures during the task, while errors during the Vocal Label Test were correlated with 43% of PT-2’s errors.

PT-4 demonstrated a different error pattern, called the visual select relation, by consistently selecting the comparison with a term in common with the sample (e.g., A1C2 with C2A2). Notably, this strategy would lead to correct responding on all trials when the correct comparison was a reversal of the sample (e.g., C1A1 to A1C1). The visual select relation accounted for all six errors made during the Analogy Tests, though failure during the Vocal Label Test was also correlated with five of those errors. PT-4’s scores fluctuated from 88% to 75%, then reached 100%.

PT-6 initially scored 56%, then 44%, and finally 50% before being exposed to the RCMT and RL3VLT. The Analogy Test was not repeated following the RCMT, as PT-6’s Level 3 Vocal Label Test score worsened following this task. It was hypothesized that scores would show no improvement and that the test would take a significant amount of time to complete. The Analogy Test was repeated following the RL3VLT, at which time a score of 100% was achieved.
PT-6 made a total of 23 errors during this task, all of which could be accounted for by the visual select relation described above.

PT-7 scored 63% on first exposure to the task, with the visual select relation error pattern accounting for all seven errors made. Vocal label failures accounted for only six of the errors during this task. Based on the consistency of the error pattern and the participant’s self-report of chance responding, the full failure criteria were not required before progressing to later conditions. Following the first failure, rather than the third, the participant went on to the Component Matching Post-test. Once the RCMT was conducted, the participant was able to correctly form the analogies on 94% of the trials.

Figure 7. Level 3 Analogy Error Correlations in Experiment 1. Total errors (y-axis, Frequency of Errors) on Level 3 Analogy Test(s) shown across participants (x-axis) in Experiment 1. From left to right: black columns are total errors, white columns are errors consistent with visual select relation, charcoal columns are errors consistent with visual reject relation, and light gray columns are errors correlated with vocal label errors.

Experiment 2 Results

Experiment 2 Participants’ Demographic Information
Information related to the participants’ ages, gender, degree completion, area of study, and grade point average (GPA) is shown in Table 11.

Table 11
Experiment 2 Participant Characteristics
Demographic Information | PT-8 | PT-9 | PT-10 | PT-11 | PT-12 | PT-13
Age | 21 | 26 | 21 | 22 | 27 | 21
Gender | F | F | F | F | M | F
Degree Progress | 3 years 3 A.A.s, 4 years 4 years B.A. 3 years
Degree Area | Psychology | Psychology | Sociology | Psychology | Art & Design | Psychology
GPA | 3.3 | 3.5 | 2.75 | 3.5 | 3.5 | 4.0

Experiment 2 Participants’ Task Performance
Participants’ performances across test conditions are displayed in Figures 8, 9, and 10.
PT-8’s results are displayed in the top panel of Figure 8; PT-9’s results are displayed in the bottom panel of Figure 8. PT-10’s results are shown in the top panel of Figure 9; PT-11’s results are shown in the bottom panel of Figure 9. PT-12’s results are displayed in the top panel of Figure 10; PT-13’s results are displayed in the bottom panel of Figure 10. Performances on individual tasks are discussed in detail in the sections below.

Figure 8. PT-8’s and PT-9’s Performance Across Test Conditions. PT-8’s data are displayed in the top panel; PT-9’s in the bottom panel. Stimuli levels are differentiated by shape: squares for Level 1, triangles for Level 2, circles for Level 3, and diamonds for Component Matching. Filled shapes for analogy tests, open for vocal label (VL) tests.

Figure 9. PT-10’s and PT-11’s Performance Across Test Conditions. PT-10’s data are displayed in the top panel; PT-11’s in the bottom panel. Stimuli levels are differentiated by shape: squares for Level 1, triangles for Level 2, circles for Level 3, and diamonds for Component Matching. Filled shapes for analogy tests, open for vocal label (VL) tests.

Figure 10. PT-12’s and PT-13’s Performance Across Test Conditions. PT-12’s data are displayed in the top panel; PT-13’s in the bottom panel. Stimuli levels are differentiated by shape: squares for Level 1, triangles for Level 2, circles for Level 3, and diamonds for Component Matching. Filled shapes for analogy tests, open for vocal label (VL) tests.

Pre-Training
Results from each Pre-Training task for all participants are displayed in Table 12. Participants passed the Level 1 Vocal Label Training with the common images after a range of 48-160 trials. Across participants, PT-8 and PT-9 passed after 48 trials and PT-12 after 64, while PT-11 and PT-13 required 80 trials.
Notably, PT-10 required the most trials (160) and resisted following the experimenter’s prompts to vocally label “same” in the presence of compounds from within the same class. For example, the participant would state, “Fine, but they are not same. An apple and an orange are not same.” The experimenter provided the directions a second time for the condition and added that during “training” conditions, the answers that are prompted will always be correct. After this discussion, the participant vocalized, “Oh they’re fruit and animals,” referring to the stimuli by their class names. From that point on, all of PT-10’s responses were correct during Pre-Training. Following the Level 1 Vocal Label Training, all participants passed all tasks.

Table 12
Experiment 2 Results: Pre-Training
Task | PT-8 | PT-9 | PT-10 | PT-11 | PT-12 | PT-13
Level 1 Vocal Label Training Trials | 48 | 48 | 160 | 80 | 64 | 80
Level 1 Vocal Label Test | 100% | 100% | 100% | 100% | 100% | 100%
Level 1 Analogy Test | 100% | 100% | 100% | 100% | 100% | 100%
Level 2 Vocal Label Test | 100% | 100% | 100% | 100% | 100% | 100%
Level 2 Analogy Test | 100% | 100% | 100% | 100% | 100% | 100%
Level 3 Vocal Label Test | 100% | 100% | 100% | 100% | 100% | 100%
Level 3 Analogy Test | 100% | 100% | 100% | 100% | 94% | 100%
Component Matching Test | 100% | 100% | 100% | 100% | 100% | 100%

Training Trials to Criteria
Table 13 shows the number of trials required for participants to meet mastery criteria for the Level 1 Vocal Label Training, Remedial Component Matching Training (RCMT), and Remedial Level 3 Vocal Label Training (RL3VLT).

Level 1 Vocal Label Training. The number of trials ranged from 84 to 320, with limited correlation between length of training and mastery of tasks. PT-10 required 84 trials to meet mastery criteria and was the only participant to pass all tests in the absence of remedial training. PT-13 required the second fewest training trials, 96, and was the only participant not to pass every presented task.
PT-8 required 192 trials to learn the Level 1 relations but later passed all tasks following only the RCMT. Although PT-11 and PT-12 showed emergence of some relations in later testing, they required 320 and 208 training trials respectively. PT-9 required both remedial trainings to demonstrate all tested relations and required 304 training trials.

Table 13
Experiment 2 Results- Trials to Criteria During Training Tasks

                                        PT-8   PT-9   PT-10  PT-11  PT-12  PT-13
Level 1 Vocal Label Training Trials     192    304    84     320    208    96
Remedial Component Matching Training    48     48     -      48     80     64
Remedial Level 3 Vocal Label Training   -      96     -      80     96     64

Remedial Component Matching Training. PT-8, PT-9, and PT-11 required only 48 trials, three blocks of 16, to pass. However, PT-13 required an additional block, 64 trials total, and PT-12 an additional two, 80 trials total. PT-8 was the only participant to go on to master all previously failed tasks.

Remedial Level 3 Vocal Label Training. PT-11 passed after 80 trials, while PT-9 and PT-12 required 96 to meet mastery criteria. Notably, the number of trials to master these relations was nearly half what was required to train the Level 1 relations. PT-13 required the fewest trials, 64, to master the Level 3 stimuli, but engaged in a unique error pattern with the maintenance stimuli throughout later conditions. During the training, PT-13 vocalized, “Oh they’re all opposite” and proceeded to pass all test trials. This vocalization and error pattern indicate that PT-13 developed a rule of opposition during this task such that she would emit the response opposite of what she believed was correct (e.g., she believed A1C2 was “same” so she would give the opposite response and say “different”). This led to her mastering the task while failing all maintenance trials, as she emitted opposite responses from this point on during testing.

Vocal Label Tests

Results from the Level 1, Level 2, and Level 3 Vocal Label Tests are displayed in Table 14 below.
Table 14
Experiment 2 Results- Vocal Label Tests

                                        PT-8   PT-9   PT-10  PT-11  PT-12  PT-13
Level 1                                 100%   94%    94%    100%   100%   100%
                                               100%   100%
Level 2                                 100%   100%   100%   94%    100%   81%
Level 3                                 38%    50%    100%   50%    25%    31%
                                        44%    50%           50%    19%    31%
Post-Remedial Component Matching        100%   50%           50%    75%    0%
Training                                       50%           50%           0%
Post-Remedial Level 3 Vocal Label              100%          100%   100%   100%
Training

Level 1. PT-8, PT-11, PT-12, and PT-13 passed the Level 1 Vocal Label Test upon first exposure. PT-9 and PT-10 each made one error and were required to return to the Level 1 Vocal Label Training and meet mastery criteria again. On second exposure, both participants achieved 100% correct responding and continued to the next task.

Level 2. All participants demonstrated proficiency in vocally labeling the Level 2 stimuli, as shown in Table 14. PT-11 made one error, leading to a score of 94%, while PT-13 made three errors, leading to a score of 81%. All other participants achieved 100% correct responding.

Level 3. Only PT-10 demonstrated proficiency labeling the Level 3 stimuli upon first exposure to the task, scoring 100% correct. PT-8, PT-9, PT-11, PT-12, and PT-13 all scored below 75% on their first attempt, so they were exposed to the task a second time to assess for stabilization. PT-8 labeled 38% then 44% of the stimuli correctly; however, PT-8 labeled all images correctly following the RCMT. Of PT-8’s errors, 84% occurred on same compounds, as PT-8 favored the response “different.” Similarly, PT-9 and PT-11 labeled 50% of the stimuli correctly on both attempts, favoring the response “different.” PT-9’s and PT-11’s responding remained unchanged following the RCMT; they continued to correctly identify 50% of the stimuli. This pattern, labeling “different” on same stimuli, accounted for 91% of PT-9’s errors and 100% of PT-11’s errors. They both displayed 100% accuracy following the RL3VLT. PT-12 labeled 25% then 19% of the stimuli correctly without clear trends in the response pattern.
Following the RCMT, PT-12 correctly labeled 75% (the minimum criterion) such that the test did not have to be repeated. However, 100% accuracy was only exhibited after the RL3VLT. PT-13 labeled 31% of the stimuli correctly on both attempts, then 0% of the images on both attempts after the RCMT. This was achieved by PT-13 labeling all same compounds as “different” and all different stimuli as “same.” Following the RL3VLT, PT-13 scored 100% correct.

Analogy Test

Table 15 displays results from the Level 1, Level 2, and Level 3 Analogy Tests across participants. Figure 11 shows the errors that occurred on those tasks as correlated with various strategies.

Table 15
Experiment 2 Results- Analogy Tests

                                        PT-8   PT-9   PT-10  PT-11  PT-12  PT-13
Level 1                                 100%   100%   88%    81%    94%    88%
                                                      100%   75%           100%
                                                             100%
Level 2                                 100%   100%   100%   100%   100%   100%
Level 3                                 50%    50%    100%   50%    69%    81%
                                        44%    44%           44%    69%    100%
                                        50%    50%           50%    50%
Post-Remedial Component Matching        100%   63%           63%    63%
Training                                       50%           50%    31%
                                               50%           50%    38%
Post-Remedial Level 3 Vocal Label              94%           100%   100%
Training

Level 1. PT-8, PT-9, and PT-12 required only one exposure to meet mastery criteria, scoring 100%, 100%, and 94% respectively. PT-10 and PT-13 failed on their first attempts with 88% correct. However, upon second exposure they passed with 100% correct. PT-11 required three attempts to reach mastery criteria, scoring 81%, 75%, then 100%.

Level 2. All participants passed the test on first exposure with 100% correct.

Level 3. The majority of the participants failed the Level 3 Analogy Test on their first attempt. PT-10 was the only participant to pass the test on first exposure, and did so with no errors. PT-13 did not meet mastery criteria during the initial Level 3 Analogy Test with a score of 81%, but passed on second exposure with a score of 100%. All of PT-13’s errors could be accounted for by a visual select relation error pattern or by correlation with prior failure on the Vocal Label Test.
PT-8, PT-9, and PT-11 demonstrated an identical pattern of responding during all three exposures to the task prior to remedial training, scoring 50%, 44%, then 50%. The visual select relation error pattern accounted for 100% of PT-8’s errors, while prior failures during the Vocal Label Test could account for up to 90% of the errors. No errors occurred following the RCMT. Following the RCMT, PT-9 and PT-11 again had identical responses on all trials based on the visual-select relation described above (e.g., selecting the comparison with a term in common with the sample); they each matched 63%, 50%, then 50% of the corresponding comparisons to the samples. However, 71% of PT-9’s errors occurred with stimuli that were failed during the Vocal Label Test, while this correlation was only 49% for PT-11. Following the RL3VLT, PT-9 correctly related 94% of the images and PT-11 made no errors at all. PT-12 failed the task with scores of 69%, 69%, then 50%, and the RCMT led to lower overall scores (63%, 31%, then 38%). Only after the RL3VLT did PT-12 reach 100% correct. Of PT-12’s errors, 40% could be accounted for by the visual select relation and the remaining 60% by the visual reject relation. Prior failure on the Vocal Label Test was correlated with 96% of the errors.

Figure 11. Level 3 Analogy Error Correlations in Experiment 2. Total errors (y-axis, Frequency of Errors) on Level 3 Analogy Test(s) shown across participants (x-axis) in Experiment 2. From left to right: black columns are total errors, white columns are errors consistent with visual select relation, charcoal columns are errors consistent with visual reject relation, and light gray columns are errors correlated with vocal label errors.

Component Matching

Table 16 displays the performance of the Experiment 2 participants on the Component Matching task. Figure 12 shows the error patterns observed across participants.
PT-10, PT-11, and PT-12 demonstrated mastery with relations between the single images by scoring 100%, 96%, and 96% respectively.

Table 16
Experiment 2 Results- Component Matching

                                        PT-8   PT-9   PT-10  PT-11  PT-12  PT-13
Component Matching                      88%    75%    100%   96%    96%    63%
                                        88%    80%                         67%
Post-Remedial Component Matching        100%   88%                         67%
Training                                       80%                         67%
Post-Remedial Level 3 Vocal Label              100%                        29%
Training                                                                   29%

PT-8 failed with two consecutive scores of 88%. PT-8’s errors all occurred with Level 3 relations, and a strong correlation was observed with failures on the Vocal Label Test (83%). Following the RCMT, PT-8 related all the images correctly, scoring 100% on the task. PT-9’s overall scores were 75% then 80%, then 88% and 80% following the RCMT. PT-9 made errors with all relations of stimuli, but most frequently with Level 3 stimuli. Sixteen of the 19 errors occurred with Level 3 stimuli (84%); these 16 errors were 100% correlated with failure on the Vocal Label Test. No errors occurred after the RL3VLT.

Figure 12. Component Matching Error Correlations in Experiment 2. Total errors (y-axis, Frequency of Errors) on Component Matching Post-test(s) shown across participants (x-axis) in Experiment 2. From left to right: black columns are total errors, white columns are errors on Level 3 relations, charcoal columns are errors on Level 1 and Level 2 relations, and light gray columns are errors correlated with vocal label errors.

PT-13 correctly related 63% then 67% of the images. No changes in responding were observed following the RCMT, with scores of 67% across two blocks. PT-13’s responding changed substantially following the RL3VLT, correctly relating only 29% of the images across two consecutive trial blocks. PT-13’s errors were distributed across the stimulus levels; however, all errors with the Level 1 and 2 relations occurred following the Level 3 Vocal Label Training.
The errors with the Level 3 stimuli were 100% correlated with failure to correctly vocally label the images. As PT-13 met failure criteria on the task, her participation in the study ended and she proceeded to debriefing in the same manner as the other participants.

Participants’ Verbal Self-Reports

After each participant completed all tasks within the study, they were asked to describe the strategies used during their performance. These responses were transcribed by the experimenter and combined with vocalizations made by the participants during their actual work on the various tasks. Following review of these data, the experimenter grouped the responses into themes related to the various strategies the participants stated that they employed. The strategies are listed by participant in Table 17 below.

Table 17
Experiment 2 Results- Self-Reported Strategies

Strategy Type        PT-8   PT-9   PT-10  PT-11  PT-12  PT-13
Visual Properties           X             X      X      X
Visual Imagining                   X
Named Images         X      X      X      X      X      X
Named Relations      X             X
Named Classes        X
Verbal Rules         X                           X      X

Visual Properties. PT-9, PT-11, PT-12, and PT-13 reported using strategies related to the visual properties of the experimental stimuli. PT-11 stated that she memorized the combined figures exactly as presented, to the point that the Level 2 compounds were challenging with the switched positions. PT-9 stated that she believed C1, A2, and C2 were related based on the shapes bearing similarity to the buttons on a VCR. She also stated that at one point she guessed that the images with sharp edges (e.g., B1, C1, B2, and C2) were related while the images with rounded edges were related (e.g., A1 and A2). The participant reported that the size of the figures also influenced her decisions to relate the images; figures of similar size were believed to be same (e.g., A2C1). PT-13 also reported this type of visual similarity (i.e., size) as influencing her guesses at the relations between the images.
PT-12 noted that he believed the direction that the “triangle” (C2) was pointed was related to the task because it reminded him of an arrow. When the triangle pointed to a shape, they were “same,” which could only occur when C2 was shown on the right.

Visual Imagining. PT-10 was the only participant in Experiment 2 to report using a visual imagining strategy. PT-10 stated that she imagined separating the compound images then reassembling them to identify which were in the same classes. Furthermore, PT-10 reported that upon seeing a simple image she could imagine the two images that went with it, which functioned to guide her determinations of “same” or “different.”

Named Images. Across all participants, A1 was reported to be called “cloud,” “flower,” “clover,” and “swimmer’s head.” B1 was called “negative,” “L-shape,” “upside-down L,” “angle,” “90-degree angle,” “right-angle,” “corner,” and “seven.” C1 was consistently referred to as “rectangle” and “square,” while C2 was always called “triangle.” A2 was dubbed “oval,” “circle,” “oblong,” and “bald head.” B2 was named “positive,” “plus,” and “plus sign.” These were the most frequent vocalizations for the participants, who often stated their desire to identify the correct relationship for the images (e.g., “Oh it’s the L-shape with the clover.”).

Named Relations. PT-8 and PT-10 stated that they utilized strategies in which they covertly verbalized the relations between the presented images. PT-8 built upon her strategy of labeling B2 as “positive” and B1 as “negative” by extending these labels to the novel AC and CA compounds. PT-8 stated that in the presence of A2C2, for example, she would think about the relations those images had with B2 (e.g., “That one goes with positive, that one goes with positive.”) and use this to determine if the images were “same” or “different.” This label differed slightly from naming the class membership, as she fully described the stimulus’ relationship with the B-term.
PT-10’s reports were similar; however, PT-10 omitted individual names for the stimuli in her description of her performance. PT-10 said that she was covertly stating the relations to herself, “This one goes with this one- so if it’s not there I know they’re different.”

Named Classes. PT-8 was the only participant to identify the stimuli by class membership. Building upon the names PT-8 gave to the B1 and B2 stimuli and the verbal behavior related to the relationship between those terms, the participant extended those names to the actual stimuli. Thus, A1 and C1 were referred to as “negative” based on their relationship to B1, while A2 and C2 were referred to as “positive” based on their relationship to B2. PT-8 stated that as she grew familiar with the images she would not use the image-specific names or describe the relationship to the B term; instead, she would immediately label the class name to facilitate her responding. Thus, during the analogy task she would label the sample to herself, then seek the comparison that evoked the same label (e.g., “same-same” or “different-different”).

Verbal Rules. PT-8, PT-12, and PT-13 reported several rules they derived related to their performance on the various tasks. PT-8 and PT-12 both specifically stated that when presented with tasks for a second or third time they inferred that they had been previously responding incorrectly. Thus, they would adjust their behavior according to this observation by changing the strategy that they had been employing, regardless of whether they believed the strategy to be viable. PT-13 indicated that she was constantly assessing for rules related to the testing conditions. During the RL3VLT, she stated aloud, “Oh, they’re opposite,” and proceeded to correctly label all AC and CA compounds, going on to match their components correctly as well.
However, in correspondence with her rule that all the relationships she’d previously learned were opposite, she failed all AB, BA, BC, CB, and Maintenance trials. During the debriefing period, PT-13 stated that she believed she had learned the correct AC and CA relations (considering A1C2, A2C1, C2A1, and C1A2 as “same”) such that when the actual class relations were trained she deduced that the experimenter had completely changed the testing conditions. Thus, she aligned her responding with the “opposite” rule, even to the point of selecting an animal when shown a fruit and told to “Select same.”

Summary of Results

The current procedure was successful with the majority of the 12 participants; only PT-13 failed to display all emergent relations. Five participants, PT-2, PT-3, PT-4, PT-5, and PT-10, successfully passed all tests for emergent relations following only Level 1 Vocal Label Training. Only one remedial training was necessary for PT-7 and PT-8 to demonstrate mastery level responding. The remaining participants, PT-6, PT-9, PT-11, and PT-12, passed following two remedial trainings. These participants required a range of 288 to 448 cumulative training trials to achieve mastery level responding. This range led to the decision to not expose children to the current procedure. It was hypothesized that learners with less sophisticated verbal repertoires would require more extensive training. Thus, the number of training sessions could not be reasonably predicted. Without knowing the likely duration of participation in the experiment, informed consent from caregivers could not be obtained. Thus, the experimenter utilized only adult participants within the current study. Once consistent findings with adult participants can be achieved, child-aged participants will be appropriate. Once consistent findings with children can be achieved, participants with disabilities will be appropriate.
Recommendations for future research that may lead to a more viable procedure can be found in Chapter 5.

Chapter 5

DISCUSSION

In Experiments 1 and 2, training was provided to 12 adult participants in order to establish two distinct, arbitrarily related classes of abstract stimuli (e.g., A1-B1-C1 and A2-B2-C2). Pre-testing results indicated that none of the participants could relate the images according to the experimenter-arranged classes at levels above chance prior to training. Across both studies, training consisted of prompting the participants to label “same” in the presence of compounds consisting of related images and “different” in the presence of compounds consisting of unrelated images. These relations were initially trained with the Level 1 stimuli consisting of the AB and BC compounds only. Once training was complete, emergent relations were tested in analogy responding (compound-compound matching) and component relations (matching individual images to individual images). Level 2 (BA and CB) and Level 3 (AC and CA) relations were also tested to assess for the actual formation of equivalence classes.

The performances of PT-2, PT-3, PT-4, PT-5, and PT-10 support the viability of the current protocol in establishing equivalence-equivalence responding and equivalence class formation through purely language-based training. PT-7 and PT-8 required only brief supplemental training with the Level 1 relations (A1-B1, B1-C1, A2-B2, and B2-C2) before analogy and component-matching demonstrations emerged. However, it is the errors made across conditions by PT-6, PT-9, PT-11, PT-12, and PT-13 that give rise to the most compelling conceptual analyses. This chapter will provide a conceptual perspective of these results, suggest practical implications, and recommend directions for future research.
Specifically, the interaction between equivalence class formation, analogy responding, and verbal repertoire will be discussed, as these processes were foundational to the success or failure of the participants. Further, learning by exclusion, rule governance, and naming research findings will be applied to these results as possible explanations of participants’ failures to form distinct equivalence classes. These conceptual discussions will support the recommendations for future research questions.

Equivalence Class Formation

In the current protocol, PT-6’s and PT-9’s results suggest failure to form distinct equivalence classes, as they did not respond accurately to the analogy and component-matching tests until specifically trained during the RL3VLT. These participants were likely only successful with the Level 2 tasks (BA and CB) due to generalization from the trained Level 1 (AB and BC) stimuli. Stimulus generalization could explain these successes, as these compounds were highly visually similar (i.e., identical except for the switched positions of the components). Thus, the specific training to label “same” in the presence of the A1B1 compound could yield a correct vocal label of “same” in the presence of the B1A1 compound based solely on the visual characteristics of the images, rather than derived class membership (Skinner, 1953). PT-6 and PT-9 likely relied on alternative strategies (e.g., visual select relation) when presented with Level 3 stimuli, as their prior training failed to give rise to distinct equivalence classes. Specifically, when presented with an array of sample and comparison stimuli, rather than relating the compounds according to the emergent “same” or “different” relations, these participants selected the compound with an image in common with the sample. For example, the comparison C1A2 was selected, rather than C1A1, in the presence of the sample A2C2 based on the common term of A2.
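The contrast between relational responding and the visual select relation can be illustrated with a minimal sketch. The code below is not the software used in the study; the function names and the dictionary of class memberships are hypothetical and serve only to show how the two strategies diverge on the example trial described above (sample A2C2 with comparisons C1A1 and C1A2).

```python
# Hypothetical illustration only; names are assumptions, not the study's software.
# Class membership of each abstract figure (Class 1: A1-B1-C1, Class 2: A2-B2-C2).
CLASS_OF = {"A1": 1, "B1": 1, "C1": 1, "A2": 2, "B2": 2, "C2": 2}


def split(compound):
    """Split a compound such as 'A2C2' into its two component figures."""
    return compound[:2], compound[2:]


def label(compound):
    """Return 'same' if both components share a class, otherwise 'different'."""
    left, right = split(compound)
    return "same" if CLASS_OF[left] == CLASS_OF[right] else "different"


def relational_choice(sample, comparisons):
    """Select comparisons whose relation ('same'/'different') matches the sample's."""
    target = label(sample)
    return [c for c in comparisons if label(c) == target]


def visual_select_choice(sample, comparisons):
    """Select comparisons that share at least one figure with the sample."""
    elements = set(split(sample))
    return [c for c in comparisons if elements & set(split(c))]


sample = "A2C2"
comparisons = ["C1A1", "C1A2"]
print(label(sample))                              # same
print(relational_choice(sample, comparisons))     # ['C1A1']
print(visual_select_choice(sample, comparisons))  # ['C1A2']
```

Under these assumptions, the relational strategy selects C1A1 (a “same” compound, like the sample), whereas the visual select strategy selects C1A2 solely because it shares the figure A2 with the sample.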
Failure to form equivalence classes could be attributed to Sidman’s (2000) reinforcement contingency hypothesis or learning via exclusion. The inter-relation of verbal behavior will also be examined below.

Reinforcement contingency. Findings from the vocal label, analogy, and component matching tests were highly consistent with undifferentiated stimulus classes. In other words, participants did not learn that stimuli which appeared together and were labeled as same belonged to two different response classes, namely 1 and 2 (A1B1C1 and A2B2C2). This outcome is consistent with Sidman’s (2000) account of the role of reinforcement contingency in equivalence class formation. Sidman’s account emphasizes the participation of all elements of the contingencies used to establish the relations within the actual classes; thus, the stimuli, the responses, and the reinforcers would all participate within the equivalence classes. When classes contain common elements such as the reinforcers (e.g., social praise in the current work), these terms should drop out of the contingency to prevent the formation of one large equivalence class. Applying this interpretation to the current work, the provision of the same reinforcement to within class (“same”) and across class (“different”) relations may have yielded a large, blended class of stimuli. Essentially, the participants contacted the same consequences for relating stimuli from within the same classes (e.g., A1 to C1) as they did for relating stimuli from within different classes (e.g., A1 to C2). If the reinforcement failed to “drop out” as Sidman (2000) predicted it should, the individual would end up deriving a single class of all the terms from the contingencies (e.g., A1-B1-C1-A2-B2-C2-reinforcement) rather than two distinct classes (e.g., A1-B1-C1 and A2-B2-C2). Furthermore, the participants were trained to emit the differential vocal labels of “same” or “different” in the presence of both Class 1 and Class 2 stimuli.
Thus, in the presence of A1B1 the response “same” was considered correct just as in the presence of A2B2. In the absence of class-distinct verbal responses, the participants may have been more likely to blend these relations (Sidman, 2000). Previous research procedures (Debert et al., 2007, 2009; Carpentier et al., 2002, 2003; Ruiz & Luciano, 2011) would not have been vulnerable to these phenomena because only within class selection responses were trained and reinforced. These participants were never exposed to across class relations before testing; thus, these stimuli would have never been paired with reinforcement and could not participate in a blended response class (e.g., A1-C1-reinforcement, A1-C2-no reinforcement). The procedures used by Stewart and colleagues (2001, 2002) would have been even less sensitive to blending due to the use of non-arbitrary relations to establish the equivalence classes. In the 2002 study, Class 1 stimuli were all related to a red square while all Class 2 stimuli were related to a red circle. Thus, despite uniform reinforcement for all responses, the distinct, common images would have likely prevented any blending from occurring (e.g., Red Square-B1-C1 and Red Circle-B2-C2). Further, upon testing Sidman’s (2000) hypothesis, Minster, Jones, Elliffe, and Muthukumaraswamy (2006) found that as long as the participants had the opportunity to relate according to class membership, no blending should occur. For example, in the presence of the A1 sample, the selection of the C1 comparison should occur, though no reinforcement was ever provided for relating A1 to C2. These procedural elements should be incorporated in future revisions of the current protocol to address the noted limitations. Future studies should replicate the procedures, but omit the vocal label “different” to prevent across class responding from influencing class formation.
Only the response “same” would be trained in the presence of the related compounds, and participants would be trained to emit no response in the presence of unrelated compounds. Related compounds may become conditioned as reinforcers based on this correlation and yield stronger transitive stimulus-stimulus relations. Additionally, if saying “same” in the presence of unrelated compounds was error corrected (e.g., “wrong”), the participants may come to respond to these figures as conditioned aversive stimuli and avoid across class relations when presented as compounds (Skinner, 1938). This preference for related compounds and avoidance of unrelated compounds may facilitate equivalence-equivalence relations during the analogy and component matching tests.

Exclusion. Johnson and Sidman (1993) examined the outcomes of learning stimulus-stimulus relations via rejection rather than selection as a possible cause of equivalence class failures. When a participant acquires relations during training via exclusion, the participant indirectly forms relations between stimuli (Sample to S+) based on their rejection of the other comparison (not S-). Thus, in the presence of the sample A1, participants may not actually learn to select B1, but to reject B2. This response pattern would initially be effective for participants as they master the trained relations and even display symmetrical relations (e.g., reject A2 in the presence of the sample B1). However, when tested on transitive or equivalence relations, the participants would consistently fail due to the blending of classes (A1-rejectB2-C1; A2-rejectB1-C2). In the current protocol, the participants may have learned to vocally label the compounds as “same” or “different” based on a rejection strategy.
For example, in the presence of A1B1 the participant may overtly label “same,” but do so based on learning that A1B2 is “different” and therefore A1B1 is “not-different.” Thus, when presented with tasks composed of Level 3 stimulus-stimulus relations, the participants would demonstrate chance responding or the use of non-trained strategies (e.g., selecting A1C2 in the presence of C2A2 as they share a common term), as faulty equivalence relations would have been established during training.

Future research should address this limitation by manipulating the response effort required to relate stimuli via the reject relation. Johnson and Sidman (1993) found that participants’ responding within equivalence protocols was sensitive to the response effort required to form the discriminations between stimuli. For example, when multiple comparisons with a possible select control were present, participants favored the single, consistent reject relation. Thus, future research could bias responding to the select relation by increasing the field of comparisons during analogy and component matching testing. With a field of two comparisons, the participants may display relations based on the reject-relation as easily as they might with the select-relation, as they have a 50% chance of contacting reinforcement. However, if the field were expanded to three comparisons, the participants would have to learn to reject two comparisons rather than select one (e.g., not A1, not A2, yes A3). The increased response effort for the reject-relation would likely establish conditions to favor the select-relation, leading to correct class formation (Johnson & Sidman, 1993).

Naming.
Randell and Remington (2006) examined the interactions between participants’ verbal behavior and their performance on equivalence tasks and suggested that as often as sophisticated repertoires promote “rapid and extensive stimulus-class expansion,” they may also hinder “the intended effects of experimental procedures,” (p. 353). The current protocol relied on the participants’ use of overt and covert verbal behavior to yield the desired performances; however, failure to engage in the desired repertoires or the inclusion of untrained stimulus-stimulus relations may have interfered with the effectiveness of the procedure.

As discussed in Chapter 2, Horne and Lowe (1996) described two verbal response patterns that could yield equivalence class formation: common naming and intraverbal naming. Common naming would occur when a participant emitted a distinct vocal label in the presence of all stimuli within each class. This common term would function to partition each class from one another. For example, PT-8 reported covertly referring to all Class 2 images as “positive” and Class 1 images as “negative.” The current experiment was not designed to explicitly produce common naming, rather it was intended that the participants learn the relations between the images. This version of intraverbal naming was believed to be a viable means for the desired equivalence classes to form, as all stimuli from within the same classes would evoke the same relational labels. For example, A1 and C1 would both be “same” with B1, but “different” with B2. PT-5 was observed to overtly state these interrelations during testing, though PT-8 reported doing so covertly. These participants both successfully responded in accordance with equivalence class formation, though as Arntzen (2004) noted, “there is no way to prove that the two classes of behavior correlate,” (p. 287).
But to extend the correlational analysis, the participants that failed to form equivalence classes did not report the use of either type of naming strategy.

Another interpretation of these findings implicates the role of sophisticated forms of naming. It was assumed that when the participants learned to label the relation between the stimuli as either “same” or “different,” they would behave as listeners to their own speaker behavior, as consistent with the naming hypothesis (Horne & Lowe, 1996). Listener behavior would take the form of simple orientation to the stimuli or more complex visual imagining strategies (e.g., organizing or manipulating the images). As suggested by previous research, the speaker behavior may not have evoked the “separation” of the compound stimulus such that the participant responded differentially to the elements (Wulfert, Dougher, & Greenway, 1991). Thus, without engaging with A1, B1, and C1 as if they were separate but related terms, stimulus-stimulus relations between these terms would not be formed and no equivalence class would be established. The participant would simply display the ability to say “same” in the presence of the A1B1 and B1C1 compounds. Conversely, without engaging with A1 and B2 as if they were truly “different” from one another, the participant would have no reason to favor C1 over C2 when later presented in the Component Matching task. Notably, PT-6, PT-7, PT-8, PT-9, and PT-11 reported attempting to simply memorize each individual stimulus during Level 1 Vocal Label Training. This form of rote memorization may have impeded the more complex naming and hindered the formation of equivalence classes.

Alternate sources of stimulus-stimulus relations may also have acted to interfere with the formation of equivalence classes.
Wulfert, Dougher, and Greenway (1991) acknowledged the broad variety of spurious relations that can come to exert control over responding during experimentation and suggested the analysis of the participants’ descriptions of their own performance. The authors stated that this might be the best means to access the covert behavior that influenced performances throughout the experimental process. The verbal reports of the participants in Experiment 2 indicate that all individuals developed names for the presented images that were not directly trained by the experimenter. For example, PT-9 reported that the “oval,” “rectangle,” and “triangle” stimuli were related because they were similar to VCR buttons. These spurious relations could have hindered the formation of equivalence classes consistent with what the experimenters had designated as “correct.” Arntzen (2004) designed stimulus-stimulus relations for which the occurrence of covert verbal behavior would facilitate the formation of equivalence classes. When classes included stimuli that were easily “nameable,” more participants were able to form the desired classes. Thus, the intentional inclusion of familiar, easily recognizable stimuli may have facilitated the formation of the desired equivalence classes and reduced the occurrence of spurious stimulus-stimulus relations. However, findings from PT-11 and PT-12 indicate that the establishment of equivalence classes in the absence of mediation from the verbal repertoire does not yield equivalence-equivalence responding.
To address these limitations, future research should examine the inclusion of even more abstract, nonsensical relations. The use of more abstract images or sequences of letters may prevent previously learned relations from interfering with responding by making the stimuli more difficult to name. Alternatively, the inclusion of one highly recognizable, common image within each class may establish more distinct classes, as Arntzen (2004) suggested.
From an applied perspective, the use of recognizable stimuli would be the most socially significant variation to examine, as this would bring the protocol an additional step closer to examining actual curricula. For example, while arbitrarily arranged into classes, the images used could be familiar to participants of school age. A crayon, an apple, and a cat could make up one class while a pen, an orange, and a dog could make up the second class. These images would be readily nameable for verbally competent individuals, but have no functional relation that would allow participants to differentiate them into classes outside of the experimental procedure.
The Role of the Verbal Repertoire
The current work was based on the hypothesis that the verbal repertoire was the foundation for the display of analogical reasoning, and the results of the current study support this assertion. The correlation between proficiency on the Level 3 Vocal Label Test, in which participants were required to vocally label the AC and CA compounds, and the Analogy Test, in which participants were required to match AC and CA relations, was shown across all participants in both experiments. Even PT-13’s findings can be interpreted as a display of verbal mediation, though the vocalized relations were exactly opposite of what was intended. The role of naming will be examined below, as well as the possible role of rule governance.
Naming. PT-2, PT-3, PT-4, PT-5, and PT-10 proficiently vocally labeled the Level 3 compounds upon their first attempts and subsequently scored highly on the Level 3 Analogy Test. This correlation continued with PT-7 and PT-8, who both required the RCMT prior to success in labeling the Level 3 compounds. Both participants overcame prior failures on the Level 3 Analogy Tests once they proficiently vocally labeled the relations.
Similarly, PT-6, PT-9, PT-11, and PT-12 accurately related the Level 3 compounds only after being specifically vocally trained to label the same images in the RL3VLT. PT-13’s data, while seemingly in contrast to this hypothesis, actually correlate with this assertion. The success that PT-13 displayed on the Level 3 Analogy Test was likely due to verbal mediation, though she failed to correctly vocally label all the stimuli. This type of failure could only be achieved when the vocal labels emitted were exactly opposite of those designated by the experimenter. Thus, when PT-13 was intended to relate compounds as “same”-“same” (e.g., A1C1-A2C2), PT-13 was relating based on her vocal labels of “different”-“different.” Taken together, these data strongly implicate the role of verbal mediation in the formation of the equivalence-equivalence relations.
The performances of PT-6, PT-9, PT-11, PT-12, and PT-13 actually indicate that the involvement of the verbal repertoire may be more crucial to the display of analogical reasoning than equivalence class formation; none of these participants could display the desired stimulus-stimulus relations until the vocal label response was trained. Though PT-11 and PT-12 passed the Component Matching Test, they failed to display equivalence-equivalence responding across all levels of stimuli until they were able to vocally label all levels of stimuli. The participants were able to accurately relate the terms of the compounds to one another, but failed to match the actual compounds. This demonstration of equivalence relations and failure to display equivalence-equivalence was a common finding with child-aged participants in prior research (Carpentier et al., 2003; 2004); however, adult participants have rarely responded in this manner. The failures in the current study and prior literature may be caused by the absence of verbal mediation.
PT-11 and PT-12’s Level 3 Vocal Label Tests indicated weak, derived stimulus-stimulus relations when required to tact the relations as “same” or “different.” The terms, when presented side by side, may not have functioned to evoke speaker behavior of saying “same” or “different” due to the required strength of the response. Skinner (1957) asserted that in some circumstances verbal behavior may occur on the covert level due to a history of socially mediated punishment. While not intentionally established in the current protocol, the audience, composed of the experimenter and data collector(s), may have functioned to weaken the verbal behavior of PT-11 and PT-12, as the individuals did not wish to vocalize an incorrect response in the presence of their peers. Thus, in the absence of verbal mediation the participants relied upon spurious stimulus-stimulus relations and failed the Level 3 Analogy Tests.
To address this limitation, conditions should be modified to either allow complete privacy to the participants or require the participants to give continuous verbal reports throughout the procedure. The software and procedures could be modified to allow the participants to be alone in the room, with data collectors observing covertly from outside the room or scoring data from video recording. This modification should remove overt stimuli related to the availability of socially mediated punishment (e.g., the experimenters), which may lead to more vocalizations (Skinner, 1957). Participants may be reluctant to vocalize when alone in a room, so an alternative approach would be to require the participants to vocalize aloud. The experimenters could modify the instructions to include elements of a “talk aloud” method (Ericsson & Simon, 1993), such that participants are required to vocalize throughout training and testing.
If the participants have derived the stimulus relations correctly and can accurately vocally label them, this modification may strengthen their performance, as cohesion between the verbal and selection repertoires would be likely via joint control (Lowenkron, 1998). For example, during the analogy testing, the participants would label the sample compound “same” and then the comparisons as “same” and “different.” The repetition of the label for the sample and the correct comparison (e.g., “same” then “same” or “different” then “different”) should effectively occasion the selection response, as the participants all demonstrated the ability to follow this general procedure with familiar stimuli during Pretraining. Overall, future research should examine the means to remove any interference with participants’ verbal behavior to prevent response suppression.
Rule Governance. In the absence of strong, experimentally established verbal behavior, rule governance may have also come to influence several of the participants. PT-13 most obviously generated false “rules” related to the conditions of the experiment, as evidenced by the reversal of her performance following the RL3VLT. During the task, PT-13 vocalized that the stimuli were “opposite,” and following this performance went on to pass the task and the Level 3 Analogy Test, but fail the Component Matching Test. In her failure, PT-13 reversed her previous performance exactly. These correlations and her reports strongly implicate the involvement of verbal rules. PT-13 stated that she believed the experimenter had changed the conditions of the experiment in RL3VLT and subsequently reversed all her prior responses. This response pattern was maintained even in the absence of reinforcement during the interspersed maintenance trials. As all participants were typically developing adults, the involvement of self-derived rules of performance is a viable consideration.
Prior research has shown that elements of the procedure itself may function to establish rules related to performance that are unintended by the experimenter. Green, Sigurdardottir, and Saunders (1991) have suggested that differential instructions between testing conditions can impede the transfer of function across stimuli within classes. The segmented instructions influenced the participants’ combination of repertoires, as the participants inferred that the distinction between the tasks was intentional. In the current study, at the initiation of each task requiring a response topography different from that of the prior task, the experimenter read the participants a corresponding set of instructions. This may have allowed participants to develop rules that prevented the cohesion between the trained and derived relations. Participants PT-8 and PT-9 reported that when encountering the “unfamiliar” Level 3 compounds, they assumed that they would soon be specifically trained with these images based on the prior experimental procedures. This rule was likely derived from the transition between the baseline task with abstract images and the Level 1 Vocal Label Training. The participants stated that this rule kept them from exerting substantial effort to derive the Level 3 relations. The segmented tasks combined with a history of training then testing may have functioned to create a rule that yielded prompt dependency for several of the participants.
Modifications to these procedures could address these limitations by using more explicit directions to influence the overt and covert behavior of participants. Participants should be specifically instructed to utilize various “strategies” during tests for emergent relations. For example, during the analogy testing, the participants could be explicitly told to identify whether the sample and comparison were “same” or “different” before making a selection.
During the Level 2 (BA and CB) and Level 3 (AC and CA) tasks, participants could be directed to use the previously trained A-B and B-C relations to derive the new relations. By specifically indicating how to respond to the stimuli, spurious or participant-specific strategies may be prevented. Furthermore, by telling the participants the relations between the tasks, the segmenting effect noted above would be prevented. For example, the experimenter could say to the participant, “This task is similar to the previous task and you should use the same strategy…”
The use of more explicit verbal directions would bring the current work into greater alignment with the existing literature from the cognitive psychology approach (Thibaut, French, & Vezneva, 2010; Natsopoulos, Christou, Koutselini, Raftopoulos, & Karefillidou, 2002). Explicit instruction and verbal discussion of the target problems are commonly used teaching strategies, as the “cognitive” approach seeks to directly influence how individuals “think” about analogies (Chen, 1996). The goals of behavior analytic and cognitive psychology approaches are likely similar; however, the terminology used differs widely according to the philosophical values of each camp. Thus, future behavior analytic studies should utilize strategies to influence participants’ covert verbal behavior (e.g., thinking) by increasing the response requirements of their overt verbal behavior (e.g., talking).
In summary, the variables responsible for the disruption of the desired equivalence class relations, verbal behavior, and analogy responding can only be hypothesized based on the current data. The participants’ own verbal reports cannot be relied upon too heavily, as the literature has shown that these repertoires do not always correspond to their selection behavior (Randell & Remington, 2006). The current examination can only apply established conceptual approaches and past research findings to suggest interpretations of these data.
Directions for future research to expand upon the current literature are included below.
Future Research
The current protocol was the first to utilize a verbal behavior response within an equivalence-equivalence protocol and requires additional protocol variations to produce consistent responding with adult participants. However, once consistent findings can be observed across adult participants, the research line should be extended to typically developing children and then to children with disabilities. Aside from these future research suggestions, variations on the model could also be developed to extend the use of verbal behavior in teaching analogical reasoning. The explicit use of individual stimulus labels and class labels should be examined, as these procedures more closely adhere to the typical instructional methods used in educational settings.
In an effort to closely replicate the studies by Debert and colleagues (2007; 2009), specific names for the individual stimuli and broader category names were not established as an element of the current protocol. This omission is counter to the typical curriculum introduction that children would encounter within a school setting. For example, a child would learn to label the names of food items (e.g., apple, carrot) and labels for their corresponding categories (e.g., fruit, vegetables) prior to being presented with an analogy utilizing these stimuli (e.g., apple is to fruit as carrot is to vegetable). Based on reports from some of the participants, the absence of non-arbitrary relations for the “sameness” and “differentness” between the figures caused them to utilize learned relations from outside of the experimental protocol. This limitation could be addressed, while maintaining experimental control, by utilizing arbitrary category names. The participants could be trained to label the images with the distinct names (e.g., “vek” and “zog”) prior to the training and testing sequences already established.
With this modification, by the time the participants had the opportunity to label the relationship between the compounds, they would have a non-arbitrary basis to emit the responses. For example, upon seeing A1C1 the participants could label “same,” as they would have previously learned that A1 is “vek” and C1 is “vek.” Alternatively, if a future protocol taught participants to individually label the images with arbitrary names, they would have additional intraverbal-based strategies to utilize in testing. The participants could be taught to label A1 as “mek,” B1 as “noy,” and C1 as “biv” and, when presented with the images in the Vocal Label Tests or Component Matching Tests, utilize verbal skills to determine whether the images were “same” or “different” (e.g., “Mek is same with noy, noy is same with biv, biv is also same with mek.”).
This modification would approximate a more natural instructional sequence and set the stage for a study utilizing educationally relevant classes. For example, a study could be conducted to establish analogical reasoning using images related to functional categories such as tools, appliances, and art supplies. The items would be of similar size and color to prevent visual matching strategies from influencing class formation and would be unfamiliar to the participants at the start of the study. Participants would be trained to emit the category names and possibly the individual stimulus names following the same procedures described above. If successful, the protocol would yield actual analogical relations based on actual categorical relations.
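The category-name proposal can be made concrete with a minimal sketch. This is an illustration only, not software used in the study: the mapping of figures to “vek” and “zog” follows the example labels in the text, while the function names `relation_label` and `select_comparison` are hypothetical. The sketch assumes a participant who tacts each figure’s category, derives the relational label for a compound from those tacts, and then selects the comparison compound whose relational label matches that of the sample.

```python
# Hypothetical category tacts: "vek" and "zog" are the example names from the text;
# assigning them to Class 1 and Class 2 figures is an assumption for illustration.
CATEGORY = {
    "A1": "vek", "B1": "vek", "C1": "vek",
    "A2": "zog", "B2": "zog", "C2": "zog",
}


def relation_label(first, second):
    """Return the relational tact ("same" or "different") for a two-figure compound."""
    return "same" if CATEGORY[first] == CATEGORY[second] else "different"


def select_comparison(sample, comparisons):
    """Select the comparison compound whose relational label matches the sample's."""
    target = relation_label(*sample)
    matches = [c for c in comparisons if relation_label(*c) == target]
    if len(matches) != 1:
        raise ValueError("expected exactly one comparison matching the sample label")
    return matches[0]


if __name__ == "__main__":
    # A1 and C1 share a category name, so the compound A1C1 is labeled "same".
    print(relation_label("A1", "C1"))
    # With sample A1C1, the "same" comparison A2C2 is selected over A1C2.
    print(select_comparison(("A1", "C1"), [("A2", "C2"), ("A1", "C2")]))
```

In this sketch the relational label is never stored for individual compounds; it is derived from the category names at the moment of testing, which is the sense in which the trained names would give participants a non-arbitrary basis for saying “same” or “different.”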
A hammer with a blender would be considered “different,” as they are from the classes of “tool” and “appliance.” However, a hammer with a level would be considered “same,” as they are both “tools.” While the current study did not yield a procedure that is ready for application with young learners or individuals with disabilities, findings from the study function to guide future research and yield some points of immediate application.
Implications for Practice
In the process of identifying the most efficient and adaptable means to instruct individuals with special needs, the foundational elements of learning should always be central to the endeavor. The current work sought to utilize language to develop distinct equivalence classes and found that participants responded idiosyncratically to the method. The lack of uniformity in these results is a clear reminder that learning is not a passive process. In fact, many of the participants reported utilizing abstract and complicated reasoning in an attempt to facilitate their responding. Remembering the extent of a learner’s history should prompt educators to specifically instruct their students on means to appropriately apply their knowledge. In the absence of cohesive, useful strategies, learners may come to rely on spurious stimulus-stimulus relations or continued assistance from their instructors.
Critical thinking of the complexity required to solve analogies should not be an assumed byproduct of prior education. The participants demonstrated this in testing; even when given all the information necessary to solve the analogies, not all individuals were able to apply their knowledge appropriately. Thus, it is recommended that educators consider the specific strategies they wish their students to engage in when teaching and then challenging students with critical thinking tasks.
In an applied example, simply teaching a child that 50%, ½, and .5 can all be considered “half” does not necessarily yield correct performance on a word problem. Teaching critical thinking skills requires not only the establishment of individual concepts, but also the means to apply them. This poses a challenge; teachers do not have access to their students’ thoughts. However, Skinner’s (1957) interpretation of verbal behavior gave a conceptual basis for “thinking” to be examined through observable means. If thinking can be considered covert verbal behavior, one simply needs to establish the desired “thoughts” via overt training. Thus, instruction in analogies or other critical thinking skills can be approached through requiring students to verbalize their responses aloud or write them. Teachers would be able to shape the students’ overt behavior and likely influence their covert behavior when presented with similar problems in the future. For example, had the current participants been required to discuss their reasons for relating the images aloud during training, the experimenter could have corrected their faulty reasoning and perhaps yielded stronger performance in testing.
The role of the verbal repertoire was well established in the current work and highlights the importance of training this repertoire with learners of all skill levels. Many practitioners of ABA and special education teachers rely on non-verbal responses, such as matching or selection from an array, to evaluate learning and comprehension. Considering the findings of the current study, those behaviors may only be displayed based on covert verbal mediation. Thus, educators should emphasize the use of verbal behavior throughout teaching and training, as this repertoire may be facilitating the others.
Conclusion
The equivalence-equivalence model for analogical reasoning allows for the specific instruction of a skill considered highly abstract, yet crucial for intellectual functioning (Stewart & Barnes-Holmes, 2004). While the model has yet to be extended to individuals with disabilities, the growth of this line of research will hopefully reach this goal. The inclusion of language and perhaps the modifications suggested above may yield a procedure that can eventually produce reliable, consistent results across adult participants and typically developing children. Once stable and efficient findings can be produced with these populations, these methods should be examined with children with disabilities. The current work is a step in that direction that will ideally be forwarded in future research.
REFERENCES
Arntzen, E. (2004). Probability of equivalence formation: Familiar stimuli and training sequence. The Psychological Record, 54, 275-291.
Barnes, D., Hegarty, N., & Smeets, P. M. (1997). Relating equivalence relations to equivalence relations: A relational framing model of complex human functioning. The Analysis of Verbal Behavior, 14, 57-83.
Barnes-Holmes, D., Regan, D., Barnes-Holmes, Y., Commins, S., Walsh, D., Stewart, I., Smeets, P. M., Whelan, R., & Dymond, S. (2005). Relating derived relations as a model of analogical reasoning: Reaction times and event-related potentials. Journal of the Experimental Analysis of Behavior, 84(3), 435-451.
California State Board of Education. (1999). English-language arts content standards for California public schools kindergarten through grade twelve. Retrieved from: http://www.cde.ca.gov/be/st/ss/documents/elacontentstnds.pdf
Carpentier, F., Smeets, P. M., & Barnes-Holmes, D. (2002). Matching functionally same relations: Implications for equivalence-equivalence as a model for analogical reasoning. The Psychological Record, 52, 351-370.
Carpentier, F., Smeets, P. M., & Barnes-Holmes, D. (2003).
Equivalence-equivalence as a model of analogy: Further analyses. The Psychological Record, 53, 349-371.
Carpentier, F., Smeets, P. M., Barnes-Holmes, D., & Stewart, I. (2004). Matching derived functionally-same stimulus relations: Equivalence-equivalence and classical analogies. The Psychological Record, 54, 255-273.
Chen, Z. (1996). Children’s analogical problem solving: The effects of superficial, structural, and procedural similarity. Journal of Experimental Child Psychology, 62, 410-431.
Danielsson, H., Henry, L., Ronnberg, J., & Nilsson, L. (2010). Executive functions in individuals with intellectual disability. Research in Developmental Disabilities, 31, 1299-1304.
Debert, P., Matos, M. A., & McIlvane, W. (2007). Conditional relations with compound abstract stimuli using a go/no-go procedure. Journal of the Experimental Analysis of Behavior, 87, 89-96.
Debert, P., Huziwara, E. M., Faggiani, R. B., De Mathis, M. E. S., & McIlvane, W. J. (2009). Emergent conditional relations in a go/no-go procedure: Figure-ground and stimulus-position compound relations. Journal of the Experimental Analysis of Behavior, 92, 233-243.
DSM-IV-TR, 2005, American Psychiatric Association, 2005
Eikeseth, S., & Smith, T. (1992). The development of functional and equivalence classes in high-functioning autistic children: The role of naming. Journal of the Experimental Analysis of Behavior, 58, 123–133.
Elias, N. C. & Goyos, C. (2010). MestreLibras no Ensino de Sinais: Tarefas Informatizadas de Escolha de Acordo com o Modelo e Equivalência de Estímulos. In: Enicéia Gonçalves Mendes; Maria Amelia Almeida. (Org.). DAS MARGENS AO CENTRO: perspectivas para as políticas e práticas educacionais no contexto da educação especial inclusiva. Araraquara: Junqueira&Marin Editora e Comercial Ltda, p. 223-234.
Ericsson, K. A., & Simon, H. A. (1993). Protocol analysis: Verbal reports as data (rev. ed.). Cambridge, MA: MIT Press.
Fisher, W. W., Piazza, C. C., Bowman, L. G., Hagopian, L.
P., Owens, J. C., & Slevin, I. (1992). A comparison of two approaches for identifying reinforcers for persons with severe and profound disabilities. Journal of Applied Behavior Analysis, 25, 491-498.
Freeman, K. E. & Goswami, U. (2001). Does half a pizza equal half a box of chocolates? Proportional matching in an analogy task. Cognitive Development, 16, 811-829.
Goswami, U., & Brown, A. L. (1989). Melting chocolate and melting snowmen: Analogical reasoning and causal relations. Cognition, 35, 69–95.
Goswami, U., & Brown, A. L. (1990). Higher-order structure and relational reasoning: Contrasting analogical and thematic relations. Cognition, 36, 207–226.
Green, G., Sigurdardottir, Z. G., & Saunders, R. R. (1991). The role of instructions in the transfer of ordinal functions through equivalence classes. Journal of the Experimental Analysis of Behavior, 55(3), 287-304.
Hayes, S. C., Barnes-Holmes, D., & Roche, B. (2001). Relational frame theory: A post-Skinnerian account of human language and cognition. New York: Plenum.
Horne, P. J., & Lowe, C. F. (1996). On the origins of naming and other symbolic behavior. Journal of the Experimental Analysis of Behavior, 65(1), 185-241.
Horne, P. J., Lowe, C. F., & Randle, V. R. L. (2004). Naming and categorization in young children: II. Listener behavior training. Journal of the Experimental Analysis of Behavior, 81, 267-288.
Individuals with Disabilities Education Act of 2004, 20 U.S.C. Sec. 1400 et seq. (2004).
Inhelder, B., & Piaget, J. (1958). The growth of logical thinking from childhood to adolescence: An essay on the construction of formal operational structures. New York, NY: Basic Books.
Johnston, J. M., & Pennypacker, H. S. (1993a). Readings for strategies and tactics of behavioral research (2nd ed.). Hillsdale, NJ: Erlbaum.
Johnson-Martin, N. M., Attermeier, S. M., & Hacker, B. J. (2004). The Carolina curriculum for infants and toddlers with special needs (3rd ed.). Baltimore, MD: Brookes Publishing Company.
Karsten, A. M. & Carr, J. E. (2009). The effects of differential reinforcement of unprompted responding on the skill acquisition of children with autism. Journal of Applied Behavior Analysis, 42, 327-334.
Keintz, K. S., Miguel, C. F., Kao, B., & Finn, H. (in press). Using conditional discrimination training to produce emergent relations between coins and their values in children with autism. Journal of Applied Behavior Analysis.
Lane, S. D., & Critchfield, T. S. (1998). Classification of vowels and consonants by individuals with moderate mental retardation: Development of arbitrary relations via match-to-sample training with compound stimuli. Journal of Applied Behavior Analysis, 31, 21–41.
LeBlanc, L. A., Miguel, C. F., Cummings, A. R., Goldsmith, T. R., & Carr, J. E. (2003). The effects of three stimulus-equivalence testing conditions on emergent US geography relations of children diagnosed with autism. Behavioral Interventions, 18, 279-289.
Lipkens, R. & Hayes, S. C. (2009). Producing and recognizing analogical relations. Journal of the Experimental Analysis of Behavior, 91(1), 105-126.
Lovaas, O. I., Schreibman, L., Koegel, R., & Rehm, R. (1971). Selective responding by autistic children to multiple sensory input. Journal of Abnormal Psychology, 77(3), 211-222.
Lowe, C. F., Horne, P. J., Harris, F. D. A. & Randle, V. R. L. (2002). Naming and categorization in young children: Vocal tact training. Journal of the Experimental Analysis of Behavior, 78, 527-549.
Lowe, C. F., Horne, P. J., & Hughes, J. C. (2005). Naming and categorization in young children: III. Vocal tact training and transfer of function. Journal of the Experimental Analysis of Behavior, 83, 47-65.
Lowenkron, B. (1998). Some logical functions of joint control. Journal of the Experimental Analysis of Behavior, 69(3), 327-354.
Maguire, R. W., Stromer, R., Mackay, H. A., & Demis, C. A. (1994). Matching to complex samples and stimulus class formation in adults with autism and young children.
Journal of Autism and Developmental Disorders, 24(6), 753-772.
Mahoney, A. M., Miguel, C. F., Ahearn, W. H., & Bell, J. (2011). The role of common motor responses in stimulus categorization by preschool children. Journal of the Experimental Analysis of Behavior, 95, 237-262.
Markham, M. R. & Dougher, M. J. (1993). Compound stimuli in emergent stimulus relations: Extending the scope of stimulus equivalence. Journal of the Experimental Analysis of Behavior, 60, 529-542.
McBride, B. J. & Schwartz, I. S. (2003). Effects of teaching early interventionists to use discrete trials during ongoing classroom activities. Topics in Early Childhood Special Education, 23(1), 5-17.
Michael, J. (2004). Concepts and principles of behavior analysis (rev. ed.). Kalamazoo, MI: Society for the Advancement of Behavior Analysis.
Miguel, C. F., Petursdottir, A. I., Carr, J. E. & Michael, J. (2008). The role of naming in stimulus categorization by preschool children. Journal of the Experimental Analysis of Behavior, 89, 383-405.
Miguel, C. F., & Petursdottir, A. I. (2009). Naming and frames of coordination. In R. A. Rehfeldt & Y. Barnes-Holmes (Eds.), Derived relational responding: Applications for learners with autism and other developmental disabilities: A progressive guide to change (pp. 129-148). Oakland, CA: New Harbinger.
Molen, M. J. (2010). Working memory structure in 10- and 15-year old children with borderline intellectual disabilities. Research in Developmental Disabilities, 31, 1258-1263.
Natsopoulos, D., Christou, C., Koutselini, M., Raftopoulos, A., & Karefillidou, C. (2002). Structure and coherence of reasoning ability in Down syndrome adults and typically developing children. Research in Developmental Disabilities, 23, 297-307.
Peters, M. T. & Heron, T. E. (1993). When the best is not good enough: An examination of best practice. The Journal of Special Education, 26(4), 371-385.
Piaget, J., Montangero, J. & Billeter, J. (1977). La formation des correlats. In J.
Piaget (Ed.), Recherches sur L'Abstraction Reflechissante I (pp. 115-129). Paris: Presses Universitaires de France.
Randell, T. & Remington, B. (2006). Equivalence relations, contextual control, and naming. Journal of the Experimental Analysis of Behavior, 86(3), 337-354.
Ruiz, F. J. & Luciano, C. (2011). Cross-domain analogies as relating derived relations among two separate relational networks. Journal of the Experimental Analysis of Behavior, 95(3), 369-385.
Schiff, R., Bauminger, N., & Toledo, I. (1999). Analogical problem solving in children with verbal and nonverbal learning disabilities. Journal of Learning Disabilities, 42(1), 3-13.
Schwering, A., Kuhnberger, K. U., & Kokinov, B. (2009). Analogies: Integrating cognitive abilities. Cognitive Systems Research, 10, 175-177.
Sidman, M. (1960). Tactics of scientific research: Evaluating experimental data in psychology. New York, NY: Basic Books.
Sidman, M. (2000). Equivalence relations and the reinforcement contingency. Journal of the Experimental Analysis of Behavior, 74(1), 127-146.
Sidman, M., & Tailby, W. (1982). Conditional discrimination vs. matching-to-sample: An expansion of the testing paradigm. Journal of the Experimental Analysis of Behavior, 37, 5-22.
Skinner, B. F. (1938). The behavior of organisms: An experimental analysis. New York: Appleton-Century-Crofts.
Skinner, B. F. (1953). Science and human behavior. New York: Macmillan.
Skinner, B. F. (1957). Verbal behavior. New York: Appleton-Century-Crofts.
Sternberg, R. J. (1977). Intelligence, information processing, and analogical reasoning. Hillsdale, NJ: Erlbaum.
Sternberg, R. J. (1985). Beyond IQ. New York: Cambridge University Press.
Stewart, I., Barnes-Holmes, D., Roche, B., & Smeets, P. M. (2001). Generating derived relational networks via the abstraction of common physical properties: A possible model of analogical reasoning. The Psychological Record, 51, 381–408.
Stewart, I., Barnes-Holmes, D., Roche, B., & Smeets, P. M. (2002).
A functional-analytic model of analogy: A relational frame analysis. Journal of the Experimental Analysis of Behavior, 78(3), 375-396.
Stewart, I., Barnes-Holmes, D., & Roche, B. (2004). A functional-analytic model of analogy using the relational evaluation procedure. Psychological Record, 54, 531–552.
Stewart, I. & Barnes-Holmes, D. (2009). Training analogical reasoning as relational responding. In R. A. Rehfeldt & Y. Barnes-Holmes (Eds.), Derived relational responding: Applications for learners with autism and other developmental disabilities: A progressive guide to change (pp. 129-148). Oakland, CA: New Harbinger.
Stromer, R., McIlvane, W. J., & Serna, R. W. (1993). Complex stimulus control and equivalence. The Psychological Record, 43(4), 585-598.
Thibaut, J. P., French, R., & Vezneva, M. (2010). The development of analogy making in children: Cognitive load and executive functions. Journal of Experimental Child Psychology, 106, 1-19.
Touchette, B. (1971). Transfer of stimulus control: Measuring the moment of transfer. Journal of the Experimental Analysis of Behavior, 15, 347-364.
Voress, J. K. & Maddox, T. (2002). Developmental assessment of young children (DAYC). Los Angeles, CA: Western Psychological Services.
Ward, R., & Yu, D. C. T. (2000). Bridging the gap between visual and auditory discrimination learning in children with autism and severe developmental disabilities. Journal of Developmental Disabilities, 7, 142-155.
Wulfert, E., Dougher, M. J., & Greenway, D. E. (1991). Protocol analysis of the correspondence of verbal behavior and equivalence class formation. Journal of the Experimental Analysis of Behavior, 56(3), 489-504.
Zuriff, G. E. (2003). Science and human behavior, dualism, and conceptual modification. Journal of the Experimental Analysis of Behavior, 80(3), 345-352.