Innovative Assessment in Large Classes
Author(s): Richard W. Buchanan and Martha Rogers
Source: College Teaching, Vol. 38, No. 2 (Spring, 1990), pp. 69-73
Published by: Taylor & Francis, Ltd.
Stable URL: http://www.jstor.org/stable/27558399

Richard W. Buchanan is senior lecturer in marketing at Massey University in New Zealand. Martha Rogers is an assistant professor of marketing at Bowling Green State University in Bowling Green, Ohio.

We would like to offer some useful suggestions to solve some of the assessment problems frequently encountered with large classes. "Large classes" will be defined here as those with eighty students or more. Although this definition is somewhat arbitrary, it has been our experience that eighty students is the breaking point where traditional teaching techniques are no longer workable and new ones must be tried. This breaking point is particularly noticeable in the area of assessment. We've watched many of our colleagues struggle along with traditional approaches, such as essay examinations, up to points where class enrollments exceed eighty. Then they normally collapse from overwork, delegate assessment to lower-level assistants, or start looking for new approaches.

This paper will offer some solutions to three problems:

1. How to offer students in large classes an opportunity to be assessed in an essay format without straining the resources available for grading
2. How to deal with students who miss a required examination
3. How to generate large numbers of new, relevant examination questions on a regular basis

It is useful to begin by stressing that this paper is not, and was never intended to be, an elegant scientific examination of all the factors within its focus. It is our intention to share techniques that have worked for us in sections numbering between 50 and 350 students. One author typically teaches between two and three thousand students per year.

We have had only one graduate assistant assigned to each of us for a period of five to ten hours per week, and thus finding a means of dealing with mass numbers became a matter of survival. Virtually all of the solutions suggested by this article were the result of trial and error. As such, this paper cannot lay claim to having tested all possible solutions. In addition, although we have kept reasonably accurate records to test the effectiveness of various solutions, we have made no attempt to present them as anything other than approximations.

Our three assessment solutions will be presented and should be used simultaneously, as a total system. This is in keeping with our experience that it is best to treat instructional design as a system rather than to treat individual parts in isolation. To do otherwise often causes the solution to one problem to exacerbate another. Therefore, this paper will not only relate those parts of the system designed to deal with selected problems but will also mention some solutions for problems created by the new system itself.

Objective Tests: Imperfect but Unavoidable

Although people teaching large classes often try to avoid multiple-choice/true-false tests, we have found that such efforts seem to be appreciated by almost no one. Although colleagues may criticize the limitations of anything other than essay tests, they usually are willing to accept an alternative if more than fifty students are involved. Administrators may make noises about the desirability of essay examinations, but, in our experience, they are rarely willing to trade the time it takes to grade them for a lack of participation in matters of either administration or research/publication. Finally, students are not nearly so fond of them as their comments to the contrary might suggest.

For all these reasons we are assuming that the basis for assessment will primarily be objective questions. This assumption normally unleashes a storm of student complaints to the effect that "I just don't do well on objective tests." Although this may be the case for some, we have found that, generally, the belief just doesn't hold true.

Through the years we have often made it a point to offer both essay and objective final examinations to students who have been tested up to that time in an objective format. Those who have taken the essay option have been graded on the basis of their examination without our first checking to see what their performance had been on objective test items. Only rarely has their letter grade on the essay final examination been different from the letter grade on previous objective tests. This observation concurs with the findings of Cowles and Hubbard (1952), Thompson (1965), and Bracht and Hopkins (1970). A study by Warren (1979) indicates that it may actually be easier for students to get high marks with multiple-choice than with essay tests (also see Hogan [1981]).

This rule-of-thumb, however, is not true for all students. And, even if it were true, it will not be useful for quieting students' objections if they think it is not true for them. For this reason, we've found it necessary to provide some way for students to be assessed in an essay format, while still protecting ourselves against the enormous time investment required to evaluate all students in this manner.

Some idea of how great a time investment may be involved can be determined by considering a hypothetical example. Suppose that a more or less standard ten-question, short-answer test intended to be taken in fifty minutes were to be given. Assuming that it takes a minimum of two to three minutes to grade each question means that assessing each paper in the most minimal fashion requires a total of from twenty to thirty minutes. Multiplying this figure by a not uncommon student load of six hundred students produces a figure of from two hundred to three hundred hours. Even if instructors were to spend all of their time grading papers on a forty-hour week basis, each exam would take from five to seven weeks to process.

Some might argue that this situation could be alleviated by the use of graders, but this technique has problems of its own. Among them are coordination/management of the graders, variability among graders, and the fact that students don't normally like to have their work assessed by someone other than the instructor.

All of these factors argue for a solution that offers students a chance to be assessed in an essay format but that will limit the number of students so assessed to reasonable numbers.

Self-Selective Essay Exams

We found that the only system that would fit into the preceding constraints had to be based on what many would term a "cafeteria" approach.
The philosophical basis of this approach (which is frequently used in structuring employee benefit plans) is to offer "consumers" a number of options from which they can select the combination of items they prefer.

Students are, therefore, offered the following three options: (1) four objective concept tests only, (2) four objective concept tests and an optional final, or (3) three objective concept tests and an optional final. In options one and three, each test is worth 25 percent of their course grade; in option two, each test is worth 20 percent.

Those students electing to take the optional final are told

1. their current grade prior to the final (i.e., Should they quit while they're ahead?);
2. that the final examination can hurt them as well as help them (i.e., a concept test can, under some circumstances, be dropped, but a final cannot be dropped if attempted);
3. the approximate percentage of students taking the final examination and the fraction of these improving their grades over the years;
4. that the final examination will consist of either a fifty-question objective test or a ten-question short-answer essay, both covering the entire course;
5. that students will have to decide prior to taking the final which version they will attempt (i.e., they could not look at both and decide which version was easier); and
6. that most students in the past have preferred the objective version because it loads their risk into small (two points each) components rather than large (ten-point) "hunks."

When the options are presented to them in this manner, only 10 to 15 percent of the students enrolled in large courses have elected to attempt the final. Of those taking the final, no more than 20 percent chose the essay version, and, typically, only six or seven in a class of three hundred students.
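
The weighting under the three options can be sketched in a few lines of Python. This is only an illustration, not the authors' procedure: the function name `course_grade`, the 0-100 score scale, and the error handling are assumptions made for the sketch. The article states only the weight of each test; the final is assumed here to carry the remaining share, which works out to the same weight as one concept test.

```python
from typing import Optional, Sequence


def course_grade(concept_tests: Sequence[float],
                 final: Optional[float] = None) -> float:
    """Combine component scores (assumed 0-100 scale) into a course percentage.

    Hypothetical helper: the weights follow the three options in the text,
    and the final is assumed to weigh the same as one concept test.
    """
    n_tests = len(concept_tests)
    if n_tests == 4 and final is None:
        weight = 0.25  # option 1: four concept tests only
    elif n_tests == 4 and final is not None:
        weight = 0.20  # option 2: four concept tests plus the optional final
    elif n_tests == 3 and final is not None:
        weight = 0.25  # option 3: three concept tests plus the final
    else:
        raise ValueError("scores do not match any of the three options")

    components = list(concept_tests)
    if final is not None:
        components.append(final)
    return weight * sum(components)


if __name__ == "__main__":
    print(course_grade([80, 90, 70, 60]))       # option 1
    print(course_grade([80, 90, 70, 60], 100))  # option 2
    print(course_grade([80, 90, 70], 100))      # option 3
```

Because every component carries equal weight within an option, each calculation reduces to a simple average of the components taken, which is one way to see why a weak final can lower a grade as easily as a strong one can raise it.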

These numbers, though manageable enough, have been distilled even further by a refinement of the system that was produced to meet what proved to be a product of the authors' teaching styles. When teaching large classes, we've found it useful to make sure that the lectures contain enough material not covered in the supporting text to make it worthwhile for students to attend lectures. We tell the students that this material will be both presented and the subject of examination questions (i.e., at least 30 percent of a test's items will not be found in the book).

Because it is generally impossible to videotape the lectures, those students who miss many classes have a very real problem, although they could miss at least one concept test without penalty. However, if the final examination covers both the text and the lecture, they are still at risk for those topics covered during their absences. For this reason we decided to make the objective version of the final examination cover the text only while the essay version is drawn from both the text and lecture. Generally, the lectures are more oriented to applications than to knowledge of definitions or facts, and we believe that these applications are better tested in an essay format.

Once this refinement was made, the percentage of students taking the final exam remained about the same, but the number electing the essay version has dropped to a fraction of 1 percent. Still, it has always been there if anyone wanted to complain about not doing well on objective tests. To the best of our knowledge, no complaints about the unavailability of essay tests have ever been made about our large classes.

It may also be useful to know that the percentage of students attempting the final usually falls over time, possibly because the grapevine eventually spreads the word that the final is not a particularly soft option. At any rate, the ceiling on the people attempting it seems to be about 10 to 15 percent of those enrolled.

How many concept tests should be administered? Students complain if there are fewer than four concept tests, because administering three or fewer exams causes the amount of material to be covered on each one to be unmanageable. Having more than four seems impractical because it multiplies the resources needed beyond a point of diminishing returns.

The basis of this system is in direct contrast to what seems to be an academic tradition of placing relatively greater emphasis on the final examination than on others such as the concept tests. However, it is not our intention to load most of a student's evaluation into his or her performance on only one day of the term.

Abolishing Makeup Exams

Once the "cafeteria" style is adopted, it then becomes possible to use it to solve other problems such as makeup exams.

Having students absent from a required exam is never a comfortable situation. Professors dread the inconvenience of constructing a makeup exam and find distasteful the thought of serving as judge, jury, and executioner in determining whether excuses are acceptable. At the same time, students don't like having their integrity questioned by an unpredictable, often insensitive system that they frequently suspect of being punitive. These more or less standard complaints explode in their intensity when multiplied by the enrollments of a large class.

Before tackling the problem of makeup tests, we realized that 15 to 25 percent of students might be absent from any given examination. When applied to a class enrollment of 80 to 350, and multiplied by several sections, the total number of students likely to be involved is beyond the scope of traditional methods for handling them.

The first problem is processing the flood of individuals who show up at an instructor's door either prior to or shortly after an examination with their excuses for being absent. If only five minutes is spent with each person, the total time invested would leave little time for doing anything else. Beyond this, we have felt totally helpless to determine which excuses are truthful, justified, or both.

Even if all the absentees could be accommodated, their sheer numbers make it impossible to arrange a time and place for a makeup exam that they can all attend. Finally, if a makeup test is allowed, there is no way to make it fair for all concerned. If anyone is allowed to take the test prior to the regular class, then someone is bound to feel that those taking the makeup will pass questions on to their friends. And, if a totally different test is given as a makeup, someone will argue that it is harder (easier) than the regular test.

At this point it would have been tempting to surrender the entire matter and decide to accept absolutely no excuses except those that conform to university policy and are supported by approved documentation (i.e., student health center doctor's excuse, etc.). But common sense suggests that this limitation would overlook some perfectly valid situations, and this would lead to further conflict. Although such conflict may be permissible in smaller class settings, it definitely is not for large ones. One thing that large classes teach their instructors is never to tolerate any situation that strikes a large number of students as unfair. Reasonable university administrators are used to discarding the opinions of what they may perceive as a handful of disgruntled students. They are much more likely to take action if fifty or a hundred gather outside their door.

After considering all the problems associated with makeup exams, we decided to offer the students the option (previously discussed) of being assessed on the basis of three concept tests and an optional final examination that becomes mandatory if a student misses one of the concept tests. At the time the students are informed of this option they are also told that

1. they do not need to inform the instructor or get permission to miss a test;
2. by taking this option, they are also giving up the ability to drop a low test (i.e., what really is happening is that students are given the ability to drop a low test score, either a bad performance on a test taken or no performance on one they missed);
3. if they miss another test or the final they will fail the course; and, most important,
4. no makeups will be given for any reason to anyone.

Besides all this they are also told the specifics of the final examination, which have already been introduced in a preceding section.

This system has had remarkable results. Only a handful of students come to the office door each year to ask about the possibility of a makeup. Beyond this, the percentage of students electing to miss any given concept test has averaged around 5 percent of those enrolled. And we are relatively certain that any who do miss a test under these circumstances have reasons that they think are justifiable.

Limitations of this part of the system should be mentioned. Most important, when a student misses an exam, that student has not been assessed on a significant percentage of course material. Although we have not yet tried it, one solution to this drawback would be to give more weight on the final exam to those items assessing the material on the missed exam. This weighting would be procedurally simple. Each student taking the final exam will do so either voluntarily as a fifth exam or to make up for a missed exam. The student's record will reveal which is the case, and, if the latter, which exam was missed. It is then a relatively simple matter to weight the items from the missed exams more heavily.

Additionally, in a few rare cases, a student has tried to test the system either by challenging it or by missing two examinations. In the first category, an entire hockey team had their coach call, first, a department chair, and then the dean, trying to get an excused absence. These matters were easily dealt with as soon as the specifics, the rationale behind the system, and the clarity of the presentation to students at the outset of the course were explained to the administration. When students miss two examinations, we've found it easy to deal with them on a case-by-case basis. Frequently those who miss two exams never even bother to come in and simply accept their failing grades.

Generating Test Items

It is certainly true that generating test questions is not particularly easy for any course. However, large class sizes produce unique pressures. One problem is introduced by the sheer volume of the class. The students are likely to fill the largest auditorium available, or at least a large amphitheater classroom. Thus, when tests are given there is no way to spread students out with a seat between each of them. There must be at least two (and possibly more) versions of each test given for each examination. We find that reordering the items accomplishes this.

If two or more sections are taking the examination at different times, each "sitting" will probably need entirely different examinations as well. Otherwise, information about the exam will flow from the earlier class to the later one. The most obvious way this can happen is if copies of the test are pilfered and removed from the examination room. However, even with stringent security, this is not the only way for exams to "get out." We once learned of a student in an early section who had taken the exam with a tape recorder in his pocket. He apparently sat at the back of the room and mouthed the questions into the recorder; then, he left the room, looked up the answers, and gave copies of the tests to his friends. We've also heard that social organizations have directed individual members to carry questions from the test in memory (i.e., "you do one to five and she will do six through ten," etc.).

Finally, large class sizes usually demand that all new tests be constructed for each test each year. Large classes, which tend to be entry-level courses, are tempting targets for the development of files that can be passed on from year to year once students learn that exams may be repeated.

The thing that makes all of these seemingly paranoid fears more real is that a large class size escalates the value of misappropriated information or copies of exams. A graduate assistant, caught in a campus security forces raid, had apparently been selling copies of exams for $100 each.

All of these concerns mandate that a large number of test items be developed continually. The only problem is that the instructor of a large course may tend to specialize in it after a while. Since the same textbooks and relatively similar lectures are used year after year, the instructor may find a diminishing ability to generate new objective test items.

A popular solution to this problem, test banks supplied by textbook companies, may fail on two counts. One difficulty is that the test bank has questions that apply only to the text. As already stated, we find it desirable that lecture content and textual material be different. If this is the case, there may be a large body of information that will not be tested if questions come from test banks only. Once students figure this out (and they will), lecture attendance will fall.

Furthermore, our experience with textbook test banks has been mixed. Some of the questions seem poorly worded, ambiguous, or irrelevant. Ironically enough, the resource that can solve this problem is the same as the one that causes most of the other problems: the large size of the class.

Student-Generated Test Items

The sheer volume of students in a large class represents a source of aid seldom recognized by teachers. Channeled in the right direction, the sum total of talents within a large class is usually more than equal to its challenges. Where a small class may have only four or five outstanding students, big ones may have fifty or more.

We have designed a system that enables this resource to be put to work. In a handout issued before the first exam, students are told that they can submit potential examination questions in a specified format. The motivations for students' writing test items are that (1) they can have the satisfaction of seeing their own questions used with their names attached (i.e., the instructor will identify the author of the question on the exam if the student wishes it); (2) if they submit the question they presumably will get it correct on the exam; and (3) the teacher agrees to "pay" them two points additional credit for each question chosen (the same as each question is worth on an examination/grading system based upon total points).

Telling students the correct format for submitting questions has proved to be crucial, as otherwise the instructor can be deluged with pieces of paper that are very difficult to process. For this reason we insist that students can submit up to ten questions per exam, that all questions must be on a standard 5"-x-7" card, that each question must be either typed or legibly printed on a separate card, and that information giving the correct answer, the source (i.e., page of the text, date of lecture, etc.), and the identity of the author must be provided for each question.

Over the years students operating under these constraints have provided many of our test items. We have been happily surprised by the quality of the questions. Although as many as ten may be sifted to get one good question (and even this one usually requires rewriting), we believe that those selected have been of a caliber at least as good as many of those in test banks and are often less trivial and more conceptual.

The students' reaction is always difficult to assess. We've tried to be particularly attentive to any dissatisfaction. Although we've occasionally had the complaint that "the exam (grade) doesn't reflect how much I know," the student comments from mandatory course and instructor evaluations have been fairly repetitive of those received about questions we've generated traditionally. The only complaint unique to this system is that "the instructor shouldn't be so lazy as to let others write his exams," and these seem to be rare and to lack passion.

Reactions from colleagues and university administrators have been difficult to assess because most of them seem oblivious to the system. Although we've been careful to get administrative approval of this approach, permission to use it has proved easy to get. Once permission has been granted, we have yet to hear much about the whole concept from anyone not connected to the course, presumably because there have been no complaints.

At any rate, the system does seem to generate questions that are good to excellent once they have been filtered and rewritten. We believe it's important to make every effort not to include simplistic questions that measure merely rote memorization.

Our experience has shown that on the average there are normally from one to one-and-one-half questions submitted per test per student enrolled. Thus if three hundred students are enrolled in the class, the instructor may expect from 300 to 450 questions to be submitted per test, and more for multiple sections. Usually the total number of questions climbs with each successive test, as some students discover that they can increase their grades and others realize that they are now in academic trouble and need the bonus points. Furthermore, students repeatedly tell us that writing questions is an effective way to review for exams.

The total number of questions submitted may frighten some teachers, but it shouldn't because a number of techniques can make the job of processing them easier. First, their sheer numbers mandate that having some remote location for their deposit, like a faculty mailbox, is probably a good idea. Once the questions are all in one large stack, we suggest making an outline of topics to be covered on the exam. Then the items that "fit" can be used until the exam covers all the necessary topics. If an item looks promising it is kept; if not, it is discarded. In order to reward as many individual students as possible, we accept no more than two questions per student. This is fairly easy to keep track of as students submit the questions in batches that are paper-clipped or banded together.

It should be noted that it is not necessary to read all the questions submitted. In fact, we find that it's best to be honest about telling the students that we will simply reach into the stack and draw out questions until we have the right mix of good ones to create the exam desired.
They seem to accept this lottery approach without much complaint.

Using this procedure, we've found that a standard fifty-item test can be constructed in from two to three hours per test. This certainly compares favorably with the time necessary to create the questions oneself. And since questions selected are all typed on standard-sized cards with correct answers attached, this system is also usually popular with the word-processing department; it is an easy matter for them to turn a standard title page and a rubber-banded stack of fifty questions into a finished examination. Once their work is complete, they can then pass the stack of questions on to a person creating an answer key for grading purposes. Finally, copies of this key with the page number where test items are located can be passed back to students with their answer sheets so that they can check to see that an answer they missed really does exist.

For those faced with the responsibilities of teaching large classes, this article was intended to resolve some practical aspects of the problems of assessment. It is sometimes hard to depart from traditional methods without much soul-searching about whether the innovations are somehow a dilution of the quality of the original. The fact is, we have little empirical evidence to guide us, and we must accept the changes that resources at hand dictate in a way that seems best for everyone.

REFERENCES

Bracht, G. H., and K. D. Hopkins. 1970. Communality of essay and objective tests of academic achievement. Educational and Psychological Measurement 30 (Summer): 359-64.

Cowles, J. T., and J. P. Hubbard. 1952. A comparative study of essay and objective examinations for medical students. Journal of Medical Education, Part 2. 27:14-17.

Hogan, T. P. 1981. Relationship between free response and choice-type tests of achievement: A review of the literature. Washington, D.C.: National Institute of Education. (ERIC Document Reproduction No. ED 224 811)

Thompsen, R. E. 1965. A study of the comparative predictive validities of the essay and objective sections of the college entrance examination board advanced placement examination in physics. Princeton: Educational Testing Service. Test Development Report 65-4.

Warren, G. 1979. Essay versus multiple choice tests. Journal of Research in Science Teaching 16 (November): 563-67.