IMUSCLE Boilerplate Method Section

The following text must appear verbatim on every paper, poster, and publication that uses IMUSCLE Data:

This material is based upon work supported by the National Science Foundation under Grant No: HRD-1136143. Any opinions, findings, conclusions, or recommendations expressed in this material are those of the author(s) and do not reflect the views of the National Science Foundation.

What follows is a fairly thorough description of methods. You may copy and paste sections of this text as necessary for the particular product being produced. Note that this method section does not describe every possible measure that can be constructed from the data. Individual researchers will have to describe the measures they use as well as their psychometric properties. Information about the construction and properties of many commonly used constructed variables can be found in the IMUSCLE data dictionaries (to be produced).

Data used in this paper were collected as part of the Incremental Mindset and Utility for Science Learning and Engagement (IMUScLE) Project (IMUScLE, Schmidt, Shumow & Durik, 2011), a quasi-experimental study designed to test the impact of targeted treatments in middle and high school science classrooms on the interest, engagement and achievement of male and female students in science.

Setting

Data were collected from 2011-2013 in 29 science classrooms in two middle schools (n = 14 classrooms,¹ with control writing samples and time one and time two surveys from two additional classrooms) and a single comprehensive high school (n = 15 classrooms) serving students from a diverse community located on the fringe of a large metropolitan area. Median family income for residents in the community served by the school was around $63,000 at the time of the study, with residential home values averaging close to $203,000. Sixty percent of students in the school district were considered “low income.”

¹ Two additional middle school classrooms participated in a modified study protocol in which they only completed surveys and the writing task, which was teacher-administered. Because these classrooms were not observed and did not complete the ESM, they will be omitted from many analyses. If these two classrooms are included, the total number of middle school classrooms is 16.

The first middle school served 6th-8th grade students with an enrollment of approximately 760 students. Sixty-three percent of the student body was Hispanic, 16% White, 14% Black, 3% Asian and 3% two or more races. Seventy-seven percent of the students were considered low income. The average class size was 18 pupils.

The second middle school served 6th-8th grade students with an enrollment of approximately 640 students. Fifty-two percent of the student body was Hispanic, 32% White, 12% Black, and 2% two or more races. Sixty-two percent of the students were considered low income. The average class size was 16 pupils.

The high school served 9th-12th graders, with an enrollment of approximately 3,550 in 2012. Thirty-four percent of the student body was White, 45% Hispanic, 15% Black, 4% Asian and 3% two or more races. Average class size was 23.3 students, and teachers in the school district had an average of 13 years’ experience. The graduation rate was 75%. In order to graduate from high school, students are required to take three years of science courses. The two most typical science course-taking sequences to fulfill this requirement are biology-chemistry-physics or general science-biology-chemistry. Students who are reading below grade level in 9th grade are placed into general science rather than biology. General science was selected as the context for this study because these are the students with the greatest need. This high school also offered a school-within-a-school (Freshman Academy) exclusively for first-year freshmen who were identified as most likely to benefit from small class size and individualized attention. In addition, some science classes (co-op) had greater than average proportions of students with IEPs placed in them by design; those classes were co-taught by a science teacher and a special educator.

Treatment and control groups were assigned to naturally occurring sections of regular-track 7th grade life science classes and 9th grade general science classes. Each teacher had multiple class sections participating in the study; the range of participating sections was 2-5. Due to assigned teaching loads in participant schools and the nature of the treatment procedures, it was not possible to assign all four treatment groups to each teacher. Rather, teachers within schools were randomly assigned to one of two “clusters” of treatment groups carefully designed to afford some control for teacher effects by having each teacher be responsible for multiple treatment groups. Treatment groups were spread across the two middle schools to control for school effects. Teachers in the first cluster group were assigned the mindset and mindset + utility conditions and, as part of the treatment, received education about influencing mindset. Because teacher education was involved in treatment, these teachers were only assigned to conditions including the mindset component (combination of the mindset only and the mindset + utility conditions) to avoid contamination effects. In the second cluster group, teachers were assigned the utility and control groups, as these groups both involved a writing task and did not involve teacher education. Teachers were aware of the class mindset condition but were blind to the assignment of utility value treatment or control group. Distribution of treatment groups is represented in Table 2.

Table 2.
Assignment of treatment groups across middle schools (MS), high schools (HS) and teachers (T)

MS1 MS2 HS
T1 T2 T3 T4 T5 T6 T7 T8 T9 Total
Treatment
Mindset only 2 2 1 1 3 9
Utility only 1 2 1 2 6
Mindset+Utility 1 2 1 1 2 7
Control 2 2 2 1 7
Total 3 3 4 4 2 2 3 3 5 29

*Numbers in each cell indicate the number of classrooms assigned to this treatment condition.
*In MS1, there was 1 additional classroom of utility and 1 of control that participated in the study in a more limited way.

Participants

Teachers

Seventh grade teachers. All four seventh grade teacher participants were White females. They ranged in age from 26 to 54 years; two were in their twenties and two in their fifties. They had between four and 20 years of teaching experience, with the vast majority of each teacher’s experience being in their present school. The two older teachers had each taught briefly in another school. The two older teachers were tenured; the two younger were not. Both of the older teachers had earned master’s degrees.

Ninth grade teachers. Six ninth grade teachers participated in our study, with two of these teachers co-teaching one class. All teachers were White and only one was male. Two of the teachers had earned a four-year college degree, and four had master’s degrees. The mean age of the teachers was 37.33 years. Teachers had between 2 and 10 years of teaching experience, with an average of 6.41 years of experience at any school and an average of 5.5 years of teaching experience at their current school. The two youngest teachers (age = 26) were not tenured; the older teachers were. Demographic characteristics of the teachers are displayed in Table 3.
Table 3
Demographic characteristics for Teacher Sample

                                                  # Teachers
Variable                                          Middle School   High School   Total
                                                  (n=4)           (n=6)         (N=10)
Sex
  Male                                            0               1             1
  Female                                          4               5             9
Race
  White                                           4               6             10
Education Level Completed
  Four Year College Degree                        2               2             4
  Master’s Degree                                 2               4             6
# of Years
  Mean Age (range 26-59 MS)                       39.75           37.33         38.30
  Mean years of Teaching Experience (range 2-20)  10.75           6.41          8.15

Students

Demographic Characteristics

Total Sample. In total, 726 students participated in the study. Fifty-one percent of the students were in 7th grade (mean age = 12.24), and 49% were in 9th grade (mean age = 14.30). The sample was half male and half female. The student sample was 18% White, 61% Hispanic (regardless of race), 12% African American, 2% Asian, 1% Native American, and 6% multiracial (non-Hispanic). According to school records, 71% of students in the sample were eligible to receive free or reduced lunch. Fifty percent of the students in the sample reported that neither of their parents had attained a college degree. Twelve percent said that at least one parent had graduated from college, and 10% indicated that at least one parent had earned an advanced degree. Twenty-eight percent of students in the sample did not know their parents’ educational attainment. Thirty percent of the total sample received the mindset treatment, 24% received the utility treatment, 21% received both utility and mindset treatments, and 25% of the students were in the control group.

Seventh Grade Sample. Three hundred and seventy-four 7th grade students participated in the study (mean age = 12.24). The seventh grade sample was 45% male and 55% female. The sample was 22% White, 56% Hispanic (regardless of race), 11% African American, 3% Asian, less than 1% Native American, and 7% multiracial (non-Hispanic). According to school records, 61% of students in the sample were eligible to receive free or reduced lunch.
Forty-three percent of the students in the sample reported that neither of their parents had attained a college degree. Twelve percent said that at least one parent had graduated from college, and 11% indicated that at least one parent had earned an advanced degree. Thirty-four percent of students in the sample did not know their parents’ educational attainment. Four classrooms, or 26% of the seventh grade sample, received the mindset treatment; four classrooms, or 26%, received the utility treatment; three classrooms, or 17%, received both utility and mindset treatments; and five classrooms, or 31% of the students, were in the control group.²

² If you are including the 2 classrooms that did not participate fully in the study (survey & writing sample only), add 1 classroom to the utility group and 1 to the control group.

Ninth Grade Sample. Three hundred and fifty-two 9th grade students participated in the study (mean age = 14.30). The ninth grade sample was 54% male and 46% female. The sample was 15% White, 66% Hispanic (regardless of race), 13% African American, less than 1% Asian, less than 1% Native American, and 5% multiracial (non-Hispanic). According to school records, 81% of students in the sample were eligible to receive free or reduced lunch. Fifty-seven percent of the students in the sample reported that neither of their parents had attained a college degree. Twelve percent said that at least one parent had graduated from college, and 9% indicated that at least one parent had earned an advanced degree. Twenty-three percent of students in the sample did not know their parents’ educational attainment. Five classrooms, or 34% of the ninth grade sample, received the mindset treatment; three classrooms, or 21%, received the utility treatment; four classrooms, or 26%, received both utility and mindset treatments; and three classrooms, or 19% of the students, were in the control group.

Demographic characteristics of students are displayed in Tables 4 through 7.
Table 4
Demographic characteristics of All Students (N=726)

Variable                        Percentage
Sex
  Male                          49.2
  Female                        50.8
Race
  Hispanic, regardless of race  61.0
  White only                    18.5
  Black only                    12.0
  Multi Racial (non-Hispanic)   6.2
  Asian/Pacific Islander only   1.9
  American Indian only          0.4
Grade Level
  7th                           51.5
  9th                           48.5
Free/Reduced Lunch              70.7
Parent Education
  High school or less           36.4
  Some college                  11.8
  Graduated from college        11.9
  Advanced Degree               10.1
  Don’t Know                    29.8
Intervention Group
  Mindset                       29.5
  Utility                       23.8
  Mindset&Utility               21.3
  Control                       25.3
School
  Middle School 1               27.4
  Middle School 2               24.1
  High School                   48.5

Table 5
Demographic characteristics of All Middle School Students

                                Percentage
Variable                        Middle School 1   Middle School 2   Total
                                (n=199)           (n=175)           (n=374)
Sex
  Male                          43.2              46.9              44.9
  Female                        56.8              53.1              55.1
Race
  Hispanic, regardless of race  59.2              53.0              56.3
  White only                    13.1              31.5              21.7
  Black only                    13.6              8.3               11.1
  Multi Racial (non-Hispanic)   8.4               6.5               7.5
  Asian/Pacific Islander only   5.2               0.6               3.1
  American Indian only          0.5               0.0               0.3
Free/Reduced Lunch              70.1              49.0              60.6
Parent Education
  High school or less           31.6              26.5              29.2
  Some college                  10.9              13.8              12.4
  Graduated from college        11.1              13.9              12.4
  Advanced Degree               7.4               15.6              11.2
  Don’t Know                    39.0              30.1              34.8
Intervention Group
  Mindset                       23.1              28.0              25.4
  Utility                       26.6              25.7              26.2
  Mindset&Utility               11.1              24.6              17.4
  Control                       39.2              21.7              31.0

Table 6
Demographic characteristics of High School Students

                                Percentage
Variable                        Regular   CoLab     Freshman Academy  Total
                                (n=165)   (n=73)    (n=114)           (n=352)
Sex
  Male                          53.3      52.1      55.3              53.7
  Female                        46.7      47.9      44.7              46.3
Race
  Hispanic, regardless of race  65.8      62.9      68.5              66.1
  White only                    16.1      15.7      13.0              15.0
  Black only                    13.7      15.7      10.2              13.0
  Multi Racial (non-Hispanic)   3.7       2.9       7.4               4.7
  Asian/Pacific Islander only   0.6       1.4       0.0               0.6
  American Indian only          0.0       1.4       0.9               0.6
Parent Education
  High school or less           45.6      38.6      45.4              44.1
  Some college                  10.1      17.1      9.0               11.2
  Graduated from college        12.5      15.7      7.3               11.5
  Advanced Degree               10.6      4.3       9.1               8.8
  Don’t Know                    21.3      24.3      29.1              24.4
Intervention Group
  Mindset                       30.3      0.0       60.5              33.8
  Utility                       29.7      35.6      0.0               21.3
  Mindset&Utility               27.3      0.0       39.5              25.6
  Control                       12.7      64.4      0.0               19.3

Table 7
Demographic characteristics of Students by Teacher

                                Middle School Teachers              High School Teachers
Variable                        11       12       21       22       31       32       33       34       35
                                (n=131)  (n=68)   (n=83)   (n=92)   (n=73)   (n=47)   (n=70)   (n=48)   (n=114)
Sex
  Male                          45.0     39.7     50.6     43.5     52.1     57.4     54.3     47.9     55.3
  Female                        55.0     60.3     49.4     56.5     47.9     42.6     45.7     52.1     44.7
Race
  Hispanic, regardless of race  61.4     54.7     55.7     50.6     62.9     68.9     60.9     70.2     68.5
  White only                    11.8     15.6     29.1     33.7     15.7     17.8     18.8     10.6     13.0
  Black only                    10.2     20.3     6.3      10.1     15.7     11.1     14.5     14.9     10.2
  Multi Racial (non-Hispanic)   7.9      9.4      7.6      5.6      2.9      2.2      4.3      4.3      7.4
  Asian/Pacific Islander only   7.9      0.0      1.3      0.0      1.4      0.0      1.4      0.0      0.0
  American Indian only          0.8      0.0      0.0      0.0      1.4      0.0      0.0      0.0      0.9
Parent Education
  High School or Less           32.3     30.1     35.1     19.1     38.6     42.2     41.1     55.3     45.4
  Some College                  10.2     12.8     13.0     14.6     17.1     4.4      10.3     14.9     9.0
  Graduated from College        8.7      15.9     16.9     11.2     15.7     15.6     11.8     10.6     7.3
  Advanced Degree               9.4      3.2      14.3     16.9     4.3      13.3     14.7     2.1      9.1
  Don’t know                    39.4     38.1     20.8     38.2     24.3     24.2     22.0     17.0     29.1
Free/Reduced Lunch              72.3     66.2     51.4     46.8     84.1     69.6     74.3     89.6     82.9
Intervention Group
  Mindset                       0.0      67.6     0.0      53.3     0.0      53.2     0.0      52.1     60.5
  Utility                       40.5     0.0      54.2     0.0      35.6     0.0      70.0     0.0      0.0
  Mindset&Utility               0.0      32.4     0.0      46.7     0.0      46.8     0.0      47.9     39.5
  Control                       59.5     0.0      45.8     0.0      64.4     0.0      30.0     0.0      0.0

Table 8
Demographic characteristics of Student by Intervention Groups

Variable                        Mindset   Utility   Mindset&Utility  Control
Sex
  Male                          50.9      50.9      45.8             48.4
  Female                        49.1      49.1      54.2             51.6
Race
  Hispanic, regardless of race  62.4      60.4      60.8             60.2
  White only                    18.0      21.3      20.3             14.8
  Black only                    14.1      9.5       10.8             13.1
  Multi Racial (non-Hispanic)   5.4       7.1       7.4              5.1
  Asian/Pacific Islander only   0.0       1.8       0.0              5.7
  American Indian only          0.0       0.0       0.7              1.1
Parent Education
  High school or less           31.9      31.8      44.3             40.0
  Some college                  11.6      9.0       10.9             15.5
  Graduated from college        10.6      16.2      12.2             9.1
  Advanced Degree               9.2       10.8      10.2             10.3
  Don’t Know                    36.7      32.3      22.5             25.1
Grade Level
  7th                           44.4      56.6      41.9             63.0
  9th                           55.6      43.4      58.1             37.0
School
  Middle School 1               21.5      30.6      14.2             42.4
  Middle School 2               22.9      26.0      27.7             20.7
  High School                   55.6      43.4      58.1             37.0

Procedures

General Data Collection Procedures

Brainology® Intervention. The Brainology® intervention consisted of an interactive online software program based on Carol Dweck’s mindset research. Students participated in the interactive program for six weeks, which included brain science education as well as information on study skills. The program was completed either in the school’s computer lab or using laptops in the science classroom, depending on available resources for that class. One full class period per week was devoted to the program, supplemented by brief homework assignments or additional in-class activities on other days. Each week, the program included an opening activity led by one of the IMUScLE researchers, followed by the computer module section. Students were required to apply their Brainology® content knowledge during the module, and were also given frequent opportunities to reflect on the material in an “e-journal” during the computer module component. Following the completion of the module, students were given a follow-up activity (this was completed as homework if they did not finish in class). In addition, participant teachers selected additional supplementary activities from the Brainology® teachers’ manual to reinforce relevant concepts during the week.

Utility Value Treatment. Once a week for a period of six weeks, students were prompted at the end of science class to write five sentences or more about the usefulness of the day’s topic to their life (utility value). A control group of students also completed a writing task where they were asked to write five sentences or more summarizing what they did in class that day. The writing task in both conditions took approximately 10-15 minutes to complete.
Researchers collected these statements, and teachers did not discuss the writing task with their students, so that teachers were not aware of whether their students were in the treatment or control group. Similar treatments have been found to have effects on interest and achievement in undergraduate populations, though mostly in subjects other than science (Hulleman et al., 2010), and one study has shown similar effects in 9th grade science classrooms (Hulleman & Harackiewicz, 2009).

Coding of the Utility and Control Writing Tasks

Researchers typed these writing tasks verbatim (i.e., retaining the students’ typos and grammatical errors) in an Excel spreadsheet immediately after they were collected. The writing tasks were then coded in Excel according to a coding scheme developed by the researchers. Any type of utility statement that was made about science was coded first on the specificity of the content’s usefulness to real life (i.e., 1 = utility of content is stated without goal, 2 = content impacts real life for a passive outcome, 3 = content is useful for some specific goal). Utility statements were then coded on three aspects: a) for whom the content was useful (e.g., self, others), b) time frame of the content’s usefulness (e.g., immediate future, long term), and c) for what goal the content was useful. In addition to utility value, all writing tasks were coded separately for other values that may have been present in the essay, namely, attainment, intrinsic, and cost values. Coders first coded these writing tasks individually, and then met with another coder to compare their codes, resolving any disagreements and reaching a consensus by discussion. Interrater reliability was assessed on 54% of the utility writing task administrations. (There were 7 utility + 3 control periods = 10 for HS and 7 utility + 5 control periods = 12 for MS; the total number of utility periods was 14, and 14 x 6 = 84 is the total number of utility essay administrations.)
We coded 45 of the 84 utility administrations for reliability (54%), which yielded high reliability on all coding categories: 82% agreement on identifying the utility statements in the essay, 88% agreement on identifying the level of the primary utility statement, and more than 95% agreement in all other categories. Once all writing tasks were coded in Excel, they were converted to SPSS for statistical analysis. The resulting SPSS dataset contained a total of 2797 student essays: 1548 essays were collected from 290 seventh graders, and 1249 essays were collected from 261 ninth graders.

Instruments and Measures

Student Survey

Student mindset. Four items were used to measure students’ beliefs about the malleability of intelligence. The items asked students to report on a six-point scale (from disagree a lot = 1 to agree a lot = 6) whether they believed it was possible to change one’s intelligence in science (2 items) or whether science intelligence is fixed (2 items, which were reverse scored to create this variable). A factor analysis provided evidence of the construct validity of this subscale. Cronbach’s alpha for these items was .60 in the initial survey, .74 in the post-intervention survey, and .74 in the follow-up survey. Items were drawn from published studies (Aronson et al., 2002; Blackwell et al., 2007), which reported test-retest reliabilities ranging from .77 to .82.

Learning goals. Two similar subscales were created. A mastery goals scale was created from four items on the student survey (I do science work to learn new things, I want to work on hard science work, hard assignments mean I’ll learn, and my goal in science is to learn as much as possible). Cronbach’s alpha was .77, .79 and .82 on the initial, post, and follow-up surveys respectively. A productive goals scale was created from five items (the four on the mastery scale and main reason to work is to show I am good at it).
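The reliability figures reported for the survey scales above are Cronbach's alpha values computed after reverse scoring the fixed-mindset items. The sketch below illustrates that calculation with the standard alpha formula. The function names and the response data are hypothetical and are not drawn from the IMUScLE dataset; the project's own analyses were run in SPSS.

```python
from statistics import variance


def reverse_score(value, low=1, high=6):
    """Flip a Likert response so that low <-> high (1 <-> 6 on a six-point scale)."""
    return low + high - value


def cronbach_alpha(rows):
    """Cronbach's alpha; rows holds one list of item scores per respondent."""
    k = len(rows[0])                    # number of items in the scale
    item_columns = list(zip(*rows))     # one tuple of scores per item
    sum_item_var = sum(variance(col) for col in item_columns)
    total_var = variance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - sum_item_var / total_var)


# Hypothetical responses (NOT IMUScLE data): two growth items, then two fixed items.
raw = [
    [6, 5, 1, 2],
    [4, 4, 3, 3],
    [2, 3, 5, 4],
    [5, 6, 2, 1],
    [3, 2, 4, 5],
]

# Reverse-score the two fixed-mindset items (the last two) before computing alpha.
scored = [[a, b, reverse_score(c), reverse_score(d)] for a, b, c, d in raw]
alpha = cronbach_alpha(scored)
print(f"alpha = {alpha:.2f}")
```

Skipping the reverse-scoring step would leave the fixed and growth items negatively correlated and would drive alpha down, which is why it is applied before the scale is built.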
The productive goals scale was identified by a factor analysis of items related to mindset. Cronbach’s alpha was .81, .82, and .83 on the initial, post, and follow-up surveys. These subscales were created from z scores of the items because some items were measured on a six-point scale and other items were measured on a five-point scale. A single item indicates a performance approach goal orientation (goal is to perform better than other students). That item was measured on a five-point scale from 1 = strongly disagree to 5 = strongly agree (Elliot & Murayama, 2008; Blackwell et al., 2007; Midgley et al., 1998).

Utility value beliefs. To assess students’ utility value beliefs, they were asked three questions about the usefulness of their science learning and their ability to apply that learning to their life outside of school (items from Eccles et al., 1993). They answered using a seven-point Likert scale from 1 = strongly disagree to 7 = strongly agree. These questions were used to create a utility value composite variable. Cronbach’s alphas were .81, .85, and .88 for the initial, post and follow-up surveys respectively.

Science interest. A science interest variable was also created from student ratings of the importance of science, their excitement about learning science and their intrinsic interest (items used in Harackiewicz et al., 2007). Cronbach’s alphas were .92, .90, and .93 for the initial, post and follow-up surveys respectively.

Success expectancies. Students were asked about how well they expected to do in science class and how capable they were of learning new things in science (items from Eccles et al., 1993). These questions were combined to create a measure of success expectancies. Cronbach’s alphas were .74, .74, and .81 for the initial, post and follow-up surveys respectively.

Perceived competence.
Student ratings of their science ability, both in comparison to other students and to other subjects, were combined to create a composite variable of perceived competence (items from Wigfield & Eccles, 2000). Cronbach’s alphas were .80, .84, and .86 for the initial, post and follow-up surveys respectively.

Occupational aspirations. Occupational aspirations were assessed as an indication of science interest during the initial survey and during the final survey that student participants completed. The survey asked students to report “what job do you expect to have when you are 30 years old?” Individual responses were coded and were later grouped by career field. This method is in keeping with the formatting and coding employed in national surveys conducted by the National Center for Education Statistics.

Experience Sampling Method. During each year of data collection, students’ subjective experience in each science classroom was measured repeatedly using a variant of the Experience Sampling Method (ESM; Csikszentmihalyi & Larson, 1987). Turner and colleagues have used procedures very similar to those used in the current study, combining ESM and observational data to demonstrate relationships among specific instructional practices and students’ subjective experience (Turner et al., 1998; Schwinle & Turner, 2006). Following the day’s lesson, students completed an Experience Sampling Form (ESF) in which they were prompted to “think about their work in class today” and report on several dimensions of their subjective motivational and affective experience using Likert scales. The ESFs took approximately three minutes to complete, and were administered 11 times in each year: two times before treatment, once per week during the six-week treatment period, and once per month for three months following treatment completion. In total, 6610 ESM responses were collected. From 7th graders, 3171 responses were collected, for an average of eight responses per participant (80% response rate).
From 9th graders, 3439 responses were collected, for an average of 9.3 responses per participant (93% response rate). Participant non-response to the ESM was nearly entirely attributable to school absence. The method has a high degree of external or “ecological” validity, capturing participants’ responses in everyday life. There are indications that the internal validity of the ESM is stronger than that of one-time questionnaires as well. Zuzanek (1999) has shown that the immediacy of the questions reduces the potential for failure of recall and the tendency to choose responses on the basis of social desirability (see Csikszentmihalyi & Larson, 1987, and Hektner, Schmidt, & Csikszentmihalyi, 2007 for extensive evidence on validity and reliability).

Classroom Observations. Classrooms were observed on 11 different occasions before, during, and after the intervention. This is considerably more than the number of observations that some studies suggest is sufficient (e.g., Shih, 2013), allowing us to effectively capture the qualities of these classrooms. On each of these 11 occasions, a team of two to three trained observers recorded instructional activities and multiple dimensions of classroom context, including event sampling of explicit and implied messages conveyed by teachers and students regarding… [MINDSET AND/OR UTILITY- MODIFY AS NEEDED]. Observers were intentionally placed in different positions in order to capture student messages that might not be heard from across the classroom. One of the observers was always a principal investigator. Principal investigators had extensive experience observing classrooms, and together they trained the other observers. Trainees received a field manual with detailed instructions, participated in several half-day training sessions, and practiced observing independently, using videotapes filmed in science classrooms similar to those participating in the study.
Observers did not enter the field until they reached 90% or greater inter-rater reliability with the ratings the senior observation instructor had preassigned on two videos. Reliability on classroom ratings among coders was high (see below for details). Notes from all coders present were used to compile a comprehensive set of field notes documenting [MINDSET AND/OR UTILITY- MODIFY AS NEEDED] messages expressed by teachers and students in the classroom. These field notes were later coded (see description of coding below).

Field Notes

Activity. The method that the teacher was using and the type of work that students were doing were recorded. We adopted the criteria of Duke (2000, p. 210), classifying the instructional practice in which the majority of students in the classroom were engaged. The time when the activity began and when the activity code changed was recorded. The following descriptors were used to categorize and code the described activity. Instances of observer disagreement were nearly nonexistent.

1. Teacher presentation: pertained to large-group instruction when a teacher explained concepts or ideas, presented facts about science, or demonstrated concepts or explanations about subject matter (unless related to lab work). Teacher presentation may involve teacher questioning of students (IRE pattern) interspersed with teacher presentation, stopping a film to elaborate or emphasize content, or elaborating and providing more information about a student presentation.
2. Individual student seatwork: used when students worked independently on class assignments under teacher guidance.
3. Group seatwork: described activities during which students worked in pairs or small groups on class assignments under teacher guidance.
4. Tests/quizzes: used to describe test preparation, test taking, and reviewing graded tests.
5. Whole-class discussion: described teacher-led student discussion.
To be coded as discussion, the teacher must have used open-ended questions; asked for students’ explanations before presenting the teacher’s own answer; or asked students to formulate their own questions, alternative perspectives, or problem-solving process (Barak & Shakhman, 2008, p. 15).
6. Student presentations/demonstrations: described activities when students shared their work in a formal way, such as reading their written work, showing results or conclusions from lab reports or models, or going to the board to demonstrate. It implies a kind of special preparation in a science topic (Thier & Daviss, 2002).
7. Video/movie: used when students watched a movie related to a science topic.
8. Lab work: used to describe anything related to direct opportunities for science laboratory experiences (Von Secker & Lissitz, 1999), including lab instructions, preparation to conduct a lab, direct experiences in experimentation, or discussing and reviewing observations, conclusions, and questions about completed lab work.
9. Non-instructional time: described course-related but not content-related activities, including announcing due dates, test dates, activity schedule, or changes in the class schedule or routine; distributing materials; getting things set up; and checking whether students did the homework (not its content).
10. Off-task activity: described any activity unrelated to science. Examples include discussing what you are doing over the weekend, sports, the pledge of allegiance, etc. This category is also used when the majority of students seem to be off task even though they have been instructed to be working.
11. Codes related to study: 21 (ESF), 22 (utility writing), 23 (control writing).

Subject matter. The concepts and content of the science activities were recorded. One of the observers collected any handouts that were distributed to the students. The page numbers of the textbook were noted if the students were working in it.

Utility events.
Observers recorded instances during the class which pertained to any type of value statement that was made about science (utility, attainment, intrinsic, cost values). Both the initiator and referent audience were noted. In describing the event, observers recorded multiple aspects of it, including: (a) whether the instance referred to science generally (“science is fun”) or to a specific topic in science (e.g., humidity, Newton’s first law); (b) to whom it was identified as of value; (c) when the value accrued or would accrue (e.g., past, shortly, long term); (d) in what way it was of value (e.g., career, health, school, hobbies); and (e) the general relationship between science and usefulness (relationship exists, passive value, achieves a particular goal). Field notes were coded using the NVivo10 software program. Once coding in NVivo10 was completed, data were analyzed using SPSS. The field notes were coded so that any change in the type of value, type of utility, the initiator, or the referent audience signaled a new event. The aspects pertaining to the event were then coded for each event that appeared in the notes.

A total of 154 (11 days x 14 class periods) field notes were collected in middle schools. Just over 20% of these field notes (n=33) were coded for inter-rater reliability by two different coders, and yielded a 96-100% reliability across all coding categories. A total of 165 (11 days x 15 class periods) field notes were collected in the high school. Twenty percent of these field notes (n=33) were coded for inter-rater reliability by two different coders, and yielded a 96-100% reliability across all coding categories.

Mindset events. Observational event-sampled field notes were coded to identify teacher-provided messages related to mindset.
For each teacher, we coded field notes from a total of 11 days: one day of regular instruction per week in each classroom for the two weeks prior to the intervention, the six weeks in which the Brainology® program was being implemented, and three weeks post-intervention later in the school year. The day of the week we observed varied from week to week. Field notes were coded using the NVivo10 software program. Altogether, 29% of field notes were coded in pairs. After demonstrating greater than 90% agreement on what was a mindset message and greater than 85% agreement on the dimensions of that message, coders completed coding individually. Once all coding was completed by individual coders, 20% of field notes in each grade were coded for inter-rater reliability: A total of 154 (11 days x 14 class periods) field notes were collected in middle schools. Just over 20% of these field notes (n=33) were coded for inter-rater reliability by two different coders, yielding 98-100% reliability across all coding categories. A total of 165 (11 days x 15 class periods) field notes were collected in the high school. Twenty percent of these field notes (n=33) were coded for inter-rater reliability by two different coders, yielding 98-100% reliability across all coding categories. Mindset messages were identified as any explicit statement or behavior that referred to Brainology® program content, task difficulty/ease, effort, study strategies, ability, or performance criteria, regardless of whether the reference explicitly mentioned mindset. Each mindset message was coded along multiple dimensions that recorded the nature of the message as promoting or undermining a growth orientation. Messages that were coded as promoting a growth mindset specifically mentioned growth of intelligence, referenced Brainology® content, emphasized effort, or suggested/modeled study strategies.
Messages that were coded as undermining a growth mindset included those that clearly mentioned a fixed view of intelligence, valued low effort, or focused on task ease, difficulty, and ability without reference to effort. Once coding in NVivo10 was completed, data were analyzed using SPSS.
Global ratings by activity. For each instructional activity except non-instructional and off-task activities, five ratings were made. On Task (1 = less than ¼, 2 = ¼ to ½, 3 = more than ½ to ¾, 4 = more than ¾) referred to the proportion of students who appeared to be on task during the classroom activity. This global rating was dependent upon attention and participation. Instruction Relative to Academic Skill Level of Class indicated whether what was planned for and asked of the students fit with the academic competence of the majority of the students, using four categories: 1 = frustration, 2 = required directive control, 3 = at students’ instructional level, or 4 = students were already at mastery level. Conceptual Development indicated the degree to which teachers promoted higher-order thinking, critical thinking, elaboration (why, how, compare), and problem solving, leading students to go beyond fact and recall to make inferences, hypothesize, analyze, interpret, and reason, on a four-point scale from 1 = almost none to 4 = extensive. Direct Instruction (or Drill) indicated the degree to which rote learning (surface learning strategies like repetition) was emphasized and was rated on a scale from 1 = almost none to 4 = extensive. Instructional Feedback described the extent to which teachers supported and extended student learning through responses, scaffolding, promotion of student skills, and participation in activities, on a scale from 1 = almost none to 4 = extensive. Spot checks of the reliability of these ratings indicated good agreement (ranging from 88% to 94%).
Global ratings by class period.
Immediately following the class period, observers rated three aspects of the overall classroom environment during the class period observed. Emotional climate of the class described overall interaction patterns between teachers and students in the class and was rated on a three-point scale as negative (indicating unpleasantness, anger, or hostility), neutral (generally flat, not emotionally charged), or positive (respectful, friendly, caring, helpful). Percent agreement was 93% between the two principal investigators; the average agreement corrected for chance among all raters was .85. Productivity/Organization indicated how well the class was organized and run in terms of routines, directions, and time management and was rated on a four-point scale from 1 = chaotic to 4 = highly efficient. Percent agreement was 87% between the two principal investigators; the average agreement corrected for chance among all raters was .80. Teacher enthusiasm described the interest and passion communicated by the teacher during the class period using a four-point scale from 1 = projects boredom to 4 = passionate. Percent agreement was 93% between the two principal investigators; the average agreement corrected for chance among all raters was .78.
School records. Information was provided from students’ records by a school official during the three years of the project. Background characteristics of the participants included whether they received free or reduced-price lunches, whether they were gifted students, and whether they had Individual Educational Plans (IEPs). Their absences from school were also provided. During the three years of treatment we received students’ quarterly grades and annual test scores. Scores from the ISAT and the Illinois student achievement PSAE (7th and 11th grade), EXPLORE (9th grade), PLAN (10th grade), and ACT (11th grade) were also collected.
Science courses the students took during the time of the study were noted and a measure of science interest was derived from that record.
Teacher Instruments
Teacher Survey. Prior to the start of data collection in classrooms, participant teachers completed a survey in which they provided information about their demographic characteristics, professional training, and current teaching assignment. Also included in the survey were a series of questions used by Blackwell, Trzesniewski & Dweck (2007) to assess mindset, learning goals, and positive beliefs about effort. These items exactly mirrored those in the student survey. Teachers were asked a number of questions about male vs. female students in science. Specifically, they were asked to indicate whether males and females differed from one another in terms of aptitude, effort, interest, perceived utility, anxiety, mindset, learning goals, effort beliefs, and effective motivational strategies.
Teacher Interviews. Following the completion of the six-week treatment, each teacher participated in a one-on-one interview, which was audio-recorded. In addition to discussing their impressions of the various treatments that were administered in their classrooms, teachers were also asked to discuss the themes they felt were important in the units that had been observed. Teachers were asked to discuss whether they observed gender differences in science interest, aptitude, or achievement. It was expected that teachers would give socially desirable responses indicating few gender differences. Therefore, more specific beliefs about gender were tapped. First, teachers were asked to identify a particular student in their class who had the greatest potential for a science career and explain why they predicted this. Student gender and the reasons given for selection were noted.
Second, teachers in the Mindset and Mindset + Utility conditions identified the male and female students in their classes who had the strongest growth and fixed mindsets and were asked to compare and contrast them. Similarly, teachers in the Utility and Control conditions identified the male and female students who had the highest and lowest utility value for science, and were asked to compare and contrast them.
Data Available for Subsamples of Students
Brainology® Program Output: Students in Mindset or Mindset X Utility Conditions
Brainology® survey (Mindset Assessment Profile). Students completed the Mindset Assessment Profile (MAP) tool, which was designed to assess their beliefs about the malleability of intelligence, the relative importance of learning and perfect performance, and their attitudes toward effort and mistakes. Students completed this test before and after the Brainology® intervention to assess any changes in their mindset beliefs. The MAP was a paper-and-pencil test consisting of the following 8 items, which students rated on a 6-point Likert-type scale (1 = disagree a lot, 6 = agree a lot):
1. No matter how much intelligence you have you can always change it a good amount.
2. You can learn new things, but you cannot really change your basic amount of intelligence.
3. I like school work best when it makes me think hard.
4. I like school work best when I can do it really well without too much trouble.
5. I like school work that I'll learn from even if I make a lot of mistakes.
6. I like school work best when I can do it perfectly without any mistakes.
7. When something is hard, it just makes me want to work more on it, not less.
8. To tell the truth, when I work hard at my schoolwork, it makes me feel like I'm not very smart.
Data were entered into SPSS. Negatively worded items (2, 4, 6, 8) were reverse coded. The dataset includes responses from 165 seventh graders and 209 ninth graders.
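The MAP scoring rule (reverse coding items 2, 4, 6, and 8, then summing the eight ratings into a profile score) can be sketched as follows. This is a minimal Python illustration only: the project entered and scored these data in SPSS, and the function name score_map and the example ratings are hypothetical, not drawn from the IMUScLE dataset.

```python
# Hypothetical helper for illustration; the project itself scored the MAP in SPSS.
REVERSED_ITEMS = (2, 4, 6, 8)   # negatively worded items named in the text
SCALE_MIN, SCALE_MAX = 1, 6     # 1 = disagree a lot, 6 = agree a lot


def score_map(responses):
    """Return the mindset profile score for one student's eight MAP ratings.

    `responses` maps item number (1-8) to the raw rating (1-6).
    Higher totals indicate a stronger growth mindset orientation.
    """
    if sorted(responses) != list(range(1, 9)):
        raise ValueError("expected ratings for items 1 through 8")
    total = 0
    for item, rating in responses.items():
        if not SCALE_MIN <= rating <= SCALE_MAX:
            raise ValueError(f"item {item}: rating {rating} is off the scale")
        if item in REVERSED_ITEMS:
            # Reverse code: 1<->6, 2<->5, 3<->4
            rating = SCALE_MIN + SCALE_MAX - rating
        total += rating
    return total


# Made-up ratings for a single student
example = {1: 5, 2: 2, 3: 4, 4: 3, 5: 6, 6: 1, 7: 5, 8: 2}
print(score_map(example))  # 40
```

With eight items rated 1 to 6, totals computed this way can range from 8 to 48.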
The mindset profile score was obtained by summing the individual scores for the 8 items, with higher scores indicating a stronger growth mindset orientation. For middle school, alpha reliability was .60 for the pre-test (M=27, SD=6) and .65 for the post-test (M=32, SD=6.5). For high school, alpha reliability was .52 for the pre-test (M=27, SD=5) and .50 for the post-test (M=28, SD=5).
Brainology® log of student responses to interactive aspects of program.
Student reflections during each unit of Brainology®. Within the Brainology® program, students were asked to give quick reflections within each unit in response to several open-ended questions (Introduction: Quick Reflection on what I learned from Dr. C. about my ability to control my own brain; Unit 1: e.g., What do I think my brain has to do with my life?, Quick Reflection on what I learned about what the brain is involved in, Quick Reflection on how what I am learning can help me in school; Unit 2: Do I have any ideas about how my brain works?, Quick Reflection on what the brain looks like and how neurons send messages, Quick Reflection on the tools I learned from Dr. C. to better control my own brain; Unit 3: How do I think the brain learns?, Quick Reflection on how the brain gets stronger with exercise; Unit 4: Ideas on how my memory works, Quick Reflection on the three different types of memory, Quick Reflection on the five BRAIN strategies to help me learn and remember new things). Students’ responses (usually one or two sentences) to these questions were then compiled in an SPSS datafile. The datafile includes responses from 163 seventh graders and 206 ninth graders.
Students’ end-of-unit reflections. Within the Brainology® program, students were asked to give quick reflections at the end of each unit in response to three open-ended questions and one Likert-type question, in which they were asked to indicate 1. what they learned in the unit, 2. how helpful it was (1 = not at all, 4 = a lot), 3. why it was helpful, and 4. any other thoughts they had.
Students’ responses to these questions were then compiled in an SPSS datafile. The datafile includes responses from 163 seventh graders and 206 ninth graders.
Pre-Post Challenges in School. Within the Brainology® program, students completed a survey that asked about their effort beliefs and study strategies as well as the challenges they face in school. Students completed this survey twice, before and after the Brainology® program. Students were asked to rate the following six items on a 6-point scale (0 = disagree a lot, 5 = agree a lot): 1. I work hard to learn new things, 2. I know study techniques that help me learn effectively when I study, 3. If an assignment is hard, it means I'll probably learn a lot doing it, 4. I have trouble paying attention in class (reverse coded), 5. I believe that the harder I study, the more successful I will be in school, 6. I believe that I can succeed in school. The overall score was obtained by taking the mean of the individual scores for the six items. For middle school, pre-test M = 3.7, SD = .79, alpha reliability = .70; post-test M = 3.8, SD = .76, alpha reliability = .75. For high school, pre-test M = 3.4, SD = .78, alpha reliability = .71; post-test M = 3.5, SD = .76, alpha reliability = .72. A seventh item on this survey asked students to indicate the specific challenges they face in school (7. What are your biggest challenges in school? Yes = 1, No = 0). The specific challenges students rated were: 1. I have trouble concentrating on school work, 2. I get really nervous when I take a test, 3. I forget things that I read or hear in class, 4. Some subjects are very hard for me to learn, 5. I’m too far behind in my class, 6. I’m just not a good student, 7. I don’t know how to take notes in class, 8. I don't have enough time to do everything, 9. I lose papers, notes or assignments, 10. I don’t like school, 11. There’s nobody to help me, 12. I don’t know how to study for a test, 13. I don’t have a good place to study or do homework, 14. Personal problems get in the way, 15. Other challenges (open-ended). The number of areas meriting attention was computed by summing students’ ratings on the first 14 challenges (the open-ended item was not included). For middle school, pre-test M = 2.09, SD = 1.69, post-test M = 1.58, SD = 1.57. For high school, pre-test M = 2.74, SD = 1.70, post-test M = 1.89, SD = 1.79.
Final Brainology® evaluation. Within the Brainology® program, students responded to a post-program survey that asked about their final thoughts upon completing the program. Students were asked about 1. their impressions of Brainology® (open-ended), 2. whether they enjoyed it (1 = not at all, 4 = yes, a lot; Seventh grade M = 3.23, SD = .76; Ninth grade M = 2.88, SD = .69), 3. whether they found it helpful (1 = not at all, 4 = yes, a lot; Seventh grade M = 3.44, SD = .66; Ninth grade M = 2.97, SD = .73), and 4. their thoughts and suggested improvements (open-ended).
Brainology® seatwork. Following each weekly Brainology® activity, students completed an in-class worksheet to reinforce the lesson of that Brainology® activity. Altogether, students completed eight worksheets that were scanned and archived. Those worksheets covered issues including: (a) experiences with success on challenging tasks as a result of effort; (b) report and analysis of sleep and dietary behavior in the past day together with setting short-term goals to improve; (c) creating mnemonic and visual cues for learning; (d) personal stress inventory and coping strategies; (e) knowledge about neurons, fight-flight responses, and coping with test anxiety; (f) links between attitudes and success; (g) analysis of fixed and growth mindsets; and (h) making a plan to use study strategies.
Data for Students in Utility or Control Writing Conditions
Science anxiety. Science anxiety was measured in eight general science classrooms in the high school on the initial and post surveys.
Three self-report items were adapted from the anxiety scale of the Achievement Emotions Questionnaire (AEQ) (Pekrun, Goetz, Frenzel, Barchfeld, & Perry, 2011). The instructions and items from the AEQ were adjusted to be science course specific. Students were instructed to report how they typically feel while attending class, studying, or taking a test about science. Students rated their responses on a 5-point Likert scale ranging from 1 (Completely disagree) to 5 (Completely agree). Student scores for these three self-report items were summed to create a total science anxiety score for both the pre and post measurements. Cronbach’s alpha coefficients were .76 for the initial survey and .80 for the post survey.
Coping behaviors. Coping behaviors were measured among the same students who responded to the stress scale. The coping behaviors used by the students were measured through seven self-report items adapted from scales of the Adolescent Coping Orientation for Problem Experiences (A-COPE) as well as coping behaviors described in previous studies (Patterson & McCubbin, 1987; Prins & Hanewald, 1999). Coping behaviors measured were (a) relaxing, (b) positive self-talk, (c) self-reliance and optimism, (d) social support, (e) engaging in difficult work, and (f) seeking diversions. Seven items were adapted to reflect the six coping behaviors: one item measured each type of coping, except that two items were combined to reflect coping through seeking diversions (Cronbach’s alphas of .65 and .60 for the initial and post surveys, respectively). Instructions for the measure were specific to how often students engage in certain coping behaviors when they face difficulties or feel tense about science. Students responded to each of these items on a five-point Likert scale ranging from 1 (Never) to 5 (Most of the time).
References
Aronson, J., Fried, C. B., & Good, C. (2002).
Reducing the effects of stereotype threat on African American college students by shaping theories of intelligence. Journal of Experimental Social Psychology, 38, 113-125.
Barak, M., & Shakhman, L. (2008). Reform-based science teaching: Teachers’ instructional practices and conceptions. Eurasia Journal of Mathematics, Science and Technology Education, 4(1), 11-20.
Blackwell, L., Trzesniewski, K., & Dweck, C. (2007). Implicit theories of intelligence predict achievement across an adolescent transition: A longitudinal study and an intervention. Child Development, 78(1), 246-263.
Csikszentmihalyi, M., & Larson, R. (1987). Validity and reliability of the experience sampling method. Journal of Nervous and Mental Disease, 175, 526-536.
Duke, N. K. (2000). 3.6 minutes per day: The scarcity of informational texts in first grade. Reading Research Quarterly, 35, 202-224. Reprinted in Mason, P. A., & Schumm, J. S. (Eds.) (2003). Promising practices in urban reading instruction. Newark, DE: International Reading Association.
Eccles, J., Wigfield, A., Harold, R. D., & Blumenfeld, P. (1993). Age and gender differences in children’s self- and task perceptions during elementary school. Child Development, 64(3), 830-847.
Elliot, A. J., & Murayama, K. (2008). On the measurement of achievement goals: Critique, illustration, and application. Journal of Educational Psychology, 100(2), 613-638.
Harackiewicz, J. M., Durik, A. M., Barron, K. E., Linnenbrink-Garcia, E. A., & Tauer, J. M. (2007). The role of achievement goals in the development of interest: Reciprocal relations between achievement goals, interest and performance. Journal of Educational Psychology.
Hektner, J. M., Schmidt, J. A., & Csikszentmihalyi, M. (2007). Experience sampling method: Measuring the quality of everyday life. Thousand Oaks, CA: Sage.
Hulleman, C. S., Godes, O., Hendricks, B. L., & Harackiewicz, J. M. (2010). Enhancing interest and performance with a utility value intervention. Journal of Educational Psychology, 102, 880-895.
Hulleman, C. S., & Harackiewicz, J. M. (2009). Promoting interest and performance in high school science classes. Science, 326, 1410-1412.
Midgley, C., Kaplan, A., Middleton, M., Maehr, M. L., Urdan, T., & Anderman, L. H. (1998). The development and validation of scales assessing students’ achievement goal orientations. Contemporary Educational Psychology, 23, 113-131.
Patterson, J. M., & McCubbin, H. I. (1987). Adolescent coping style and behaviors: Conceptualization and measurement. Journal of Adolescence, 10(2), 163-186.
Pekrun, R., Goetz, T., Frenzel, A. C., Barchfeld, P., & Perry, R. P. (2011). Measuring emotions in students’ learning and performance: The Achievement Emotions Questionnaire (AEQ). Contemporary Educational Psychology, 36, 36-48.
Prins, P. J. M., & Hanewald, G. J. F. P. (1999). Coping self-talk and cognitive interference in anxious children. Journal of Consulting and Clinical Psychology, 67, 435-439.
Schmidt, J. A., Shumow, L., & Durik, A. (2011). Incremental mindset and utility for science learning and engagement (IMUScLE). Grant proposal funded by the National Science Foundation, Washington, DC.
Schweinle, A., Turner, J. C., & Meyer, D. K. (2006). Striking the right balance: Students' motivation and affect in upper elementary mathematics classes. Journal of Educational Research, 99(5), 271-293.
Shih, J. C. (2013). How many classroom observations are sufficient? Empirical findings in the context of a longitudinal study. Middle Grades Research Journal, 8(2), 41-49.
Thier, M., & Daviss, B. (2002). The new science literacy: Using language skills to help students learn science. Portsmouth, NH: Heinemann.
Turner, J. C., Meyer, D. K., Cox, K. E., Logan, C., DiCintio, M., & Thomas, C. T. (1998). Creating contexts for involvement in mathematics. Journal of Educational Psychology, 90(4), 730-745.
Von Secker, C. E., & Lissitz, R. W. (1999). Estimating the impact of instructional practices on student achievement in science. Journal of Research in Science Teaching, 36(10), 1110-1126.
Wigfield, A., & Eccles, J. (2000). Expectancy-value theory of achievement motivation. Contemporary Educational Psychology, 25, 68-81.
Zuzanek, J. (1999). Experience sampling method: Current and potential research applications. Paper presented at the workshop on time-use measurement and research, National Research Council, Washington, DC.