© 2014 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.

An Ethics Transfer Case Assessment Tool for Measuring Ethical Reasoning Abilities of Engineering Students Using Reflexive Principlism Approach

Justin L. Hess1*, Jonathan Beever2+, Andrew Iliadis2,3*, Lorraine G. Kisselburgh3*, Carla B. Zoltowski4*, Matthew J. M. Krane5*, & Andrew O. Brightman6*
1Engineering Education, 2Philosophy, 3Communication, 4EPICS, 5Materials Engineering, 6Biomedical Engineering
*Purdue University, West Lafayette, IN, USA, +Penn State University, State College, PA, USA

Abstract: This work in progress paper presents initial results on the development and testing of a novel assessment tool utilizing an ethics transfer case methodology targeted at measuring the ethical reasoning ability of engineering students employing reflexive principlism. This work evaluates the reliability and transferability of a rubric-based assessment of students’ responses to a transfer case study employed at Purdue University in the Spring of 2014. The scoring rubric was developed to assess students’ ability to apply the reasoning components of reflexive principlism including: (a) identification, (b) specification, (c) empathic perspective-taking, (d) justification, and (e) reflectivity. To determine reliability of the scoring rubric, two raters independently scored 19 students’ pre-course responses through 3 iterations of the rubric’s development, until 85% overall inter-rater agreement was reached. Two additional scorers, normed on the coding framework, then provided feedback on wording and applied the rubric to the same 19 student responses.
Initial results from this analysis and discussion of the assessment tool are presented.

Keywords: engineering ethics; reflexive principlism; transfer case; principles; ethical reasoning assessment; perspective taking

978-1-4799-3922-0/14/$31.00 ©2014 IEEE

I. INTRODUCTION

Professional codes are helpful for resolving many ethical dilemmas, but they are insufficient when encountering novel ethical issues which have no precedent. As Flanagan and Clarke wrote, “Regulations are designed to remedy old problems not anticipate new ones” (p. 493) [1]. Due to the rapid emergence of new technologies, the chance of encountering a novel ethical dilemma within engineering practice is today greater than ever before. Following such encounters, new codes may emerge to address such issues, but this slow, ex post facto code development offers little assistance for the engineer to reason through novel issues and reach morally acceptable and socially responsible solutions when faced with a dilemma for which there is no existing code. A coherent framework for ethical reasoning that engineers may use when faced with novel ethical dilemmas has not yet been adopted within engineering ethics education. Furthermore, assessment of such a framework is also needed if engineering educators are to determine its pedagogical effectiveness.

A. Ethics Transfer Case Methodology

We propose that use of an ethics transfer case, a case study external to the context of the course, may be an effective means of assessing development of ethical reasoning in the context of engineering ethics education. Transfer of knowledge is the concept that students will take what they learn from within the course and implement it in some context outside of the course [2]. Thiel and others suggested, “Cases have been useful in promoting transfer in an ethics context because they provide models for addressing ethical problems faced in one’s work” (p. 268) [3].
The theory here is that cases offer students an opportunity to work through real ethical issues which may be relevant to their future work.

Clear and effective methods of assessing transfer of learning regarding ethical reasoning processes are largely lacking within engineering ethics education. To address this need we designed an ethics transfer case that students complete before and after an engineering ethics course. The intent of developing this new assessment tool is twofold: first, we want to determine whether students transfer the knowledge gained during the course to a context outside of the course; second, by presenting the ethics transfer case pre- and post-course, we want to assess whether students show significant change in their ability to apply this ethical reasoning methodology to an engineering moral dilemma.

In this ethics transfer case study, students independently work through a real-world ethical dilemma that is not taught or referred to during the course. The case study we designed and implemented deals with the engineering design and distribution of wood stoves in light of more stringent Environmental Protection Agency (EPA) regulations (see Appendix A). Students are asked to take the perspective of an engineer who works for a top wood-stove manufacturer and who is also a consultant for the EPA, and then reason through the case to determine the most ethical course of action. Students are asked to create a visualization and a written description of their reasoning process.

Transfer of knowledge can be described as near, when knowledge is transferred to a similar context, or far, when knowledge is transferred to a dissimilar context [2]. The context of the transfer case methodology can also be described as near or far based upon its temporal and spatial distance from the learning activity [2].
Because students will complete this activity at the end of the course, immediately following learning activities, this transfer activity is considered near and therefore only a preliminary indicator of the potential transfer of reflexive principlism into the real world.

2014 IEEE Frontiers in Education Conference 2734

B. Reflexive Principlism as an Ethical Reasoning Approach

Reflexive principlism is a version of a method for ethical reasoning that was developed in the context of biomedical ethics but is particularly relevant to engineering; it specifies and balances common normative principles in the context of a particular case [4; 5]. Reasoning components of reflexive principlism include (1) identification of ethical principles, (2) specification of those principles within the context of a given case (i.e. ethical dilemma), (3) considering and evaluating the perspectives of multiple stakeholders, (4) finding balance and coherence among those principles to help resolve that dilemma, and (5) reflective analysis of the suitability of a proposed solution. Reflexive principlism considers the repetition of the reasoning process through multiple cases to be essential for developing ethical reasoning.

C. Research Purpose

This work in progress paper presents initial results on the development and testing of a new assessment tool designed to measure the ability of engineering students to employ the reflexive principlism ethical reasoning framework. The primary purpose of this study was to develop a valid and reliable rubric that may be used to assess students’ usage of reflexive principlism in response to an engineering ethics case study. Additionally, our goal is to use this rubric to examine how well students apply reflexive principlism to an ethical problem after participating in our designed ethics course [5].

II. METHODS AND RESULTS

Nineteen graduate students participated in the study, representing a broad range of ethnicities and engineering disciplines.
Students were required to participate in a 1-credit-hour ethics course to meet curriculum requirements. Students were presented with the case study (described in Appendix A) before beginning any other course activities, and asked to describe and make a diagram of their thinking and decision-making processes as they developed their response to the case. Students were also asked to identify information needed to improve and support their decision. Students were expected to spend roughly 60 minutes on the activity, but were allowed to complete the activity at their own pace. The final response was uploaded through an online survey tool (Qualtrics). Students’ pre-course responses were evaluated to refine the instrument while seeking inter-rater reliability. These results are presented in the following sections. We do not present findings from the post-course responses in this paper.

A. Construct Validity

The scoring rubric is intended to measure the ability and tendency to apply reflexive principlism. To achieve construct validity, that is, assuring that the scoring rubric measures what we believe it measures [6], our team of experts from Philosophy, Engineering Education, and Biomedical Engineering worked to develop a rubric based on the five reasoning components of reflexive principlism: identification, specification and balancing, empathic perspective-taking, justification and coherence, and reflectivity. Each expert from our research team independently designed the rubric items nearest their area of expertise. Afterwards, this group of experts met and discussed any possible interpretative difficulties with the individual items developed by the others. After minor changes in the rubric format, item descriptions, and the scoring system, the rubric was disseminated to the larger research team, which included professors from the Schools of Materials Engineering and Communication and partners in the biomedical device industry.
This group provided feedback on the overall quality of the developed measures, made suggestions on the scope and explicitness of specific items, and encouraged developers to articulate their rubric items in concise language.

B. Instrument Reliability

In order to determine reliability of the scoring rubric measures, the research team worked through several iterations. After each iteration, the level of agreement between the scorers was calculated as the total points possible minus the summed magnitude of the differences between the scorers, divided by the total points possible and multiplied by 100% [7]. In the first three iterations, two scorers applied the rubric to the 19 pre-course submissions and calculated their level of reliability following each iteration until reaching 85% agreement. Scorer 1 was a doctoral candidate in the School of Engineering Education with a B.S. in Civil Engineering and a minor in Philosophy. Scorer 2 was a postdoctoral researcher in ethics with a Ph.D. in Philosophy. The scorers coded each survey individually and then compared their level of agreement on each item, category, and overall.

1) Iteration 1: In the first iteration, the rubric contained 25 items in 5 categories. The raters’ agreement level was very low: 8 items had a level of agreement in the 50-70% range, 13 in the 70-80% range, and only 2 items were greater than 80%. To address the largest discrepancies, the raters discussed the responses with the lowest level of agreement. A third member of the research team met with the raters during this discussion to provide recommendations for interpreting responses. Next, the raters attempted to reconcile any divergent interpretations of the rubric items where the agreement rate was below 70%.

2) Iteration 2: In the next iteration, the raters individually recoded the responses. At the end of this phase, the level of agreement between the two raters was 80.3%. While this overall level is acceptable, it was not sufficient because several codes remained below 70%.
Therefore, during this phase the coders presented their results to the entire research team, who provided recommendations for changes to the items showing the lowest levels of agreement.

3) Iteration 3: In the next iteration, the raters individually re-coded the responses, and removed the 3 lowest-scoring items, yielding 85% agreement between raters. Table 1 shows the inter-rater reliability for each category. Tables 2-6 show the items corresponding to each reasoning category.

4) Iteration 4: To determine if the scoring rubric could be implemented by others, the scorers normed two additional scorers using an example case study where their initial scores disagreed on only 1 of the 25 items. During this 2-hour session, the group of scorers reviewed several of the codes from Iteration 3 whose level of agreement was below 80%. To increase validity, wording on several codes was revised and scores on a few codes were parsed. Two codes were removed as they were determined to be redundant. Scorer 3 was a professor of Biomedical Engineering and Scorer 4 was a doctoral student in Philosophy and Communication. At the conclusion of these iterations, 24 codes in 5 categories were retained. Table 1 shows the results from this final iteration.

This work was made possible by grants from the National Science Foundation (1237868) and the NSF Graduate Research Fellowship Program (DGE1333468). Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation.
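The percent-agreement calculation described in Section II.B can be illustrated with a minimal Python sketch. The function name `percent_agreement`, its argument layout, and the sample scores are hypothetical and are not taken from the study. The sketch also assumes that "total points possible" is summed over every scored item, which the text does not state explicitly.

```python
from typing import Sequence


def percent_agreement(
    scores_a: Sequence[int],
    scores_b: Sequence[int],
    max_points: Sequence[int],
) -> float:
    """Return (total possible - sum of |a - b|) / total possible * 100."""
    if not (len(scores_a) == len(scores_b) == len(max_points)):
        raise ValueError("all three sequences must have the same length")
    total_possible = sum(max_points)
    if total_possible <= 0:
        raise ValueError("total points possible must be positive")
    # Magnitude of the disagreement between the two raters, summed over items.
    total_difference = sum(abs(a - b) for a, b in zip(scores_a, scores_b))
    return (total_possible - total_difference) / total_possible * 100.0


# Hypothetical scores from two raters on four rubric items (not study data).
rater_1 = [3, 1, 2, 0]
rater_2 = [3, 0, 2, 1]
item_max = [3, 1, 3, 1]  # assumed maximum points per item

agreement = percent_agreement(rater_1, rater_2, item_max)
print(f"{agreement:.1f}%")  # (8 - 2) / 8 * 100 = 75.0%
```

Note that this is a simple percent-agreement measure; unlike chance-corrected statistics such as Cohen's kappa, it does not adjust for agreement expected by chance, so the two kinds of values are not directly comparable.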
Table 1: Iteration 3 and 4 Inter-Rater Reliability by Category

Category Name        IRR-3*    IRR-4**
Justification        90.0%     95.3%
Identification       86.5%     96.3%
Perspective Taking   88.4%     96.9%
Specification        86.2%     94.7%
Reflectivity         81.6%     97.6%
Overall              86.3%     96.2%

*Inter-rater reliability of iteration 3 from original 2 scorers
**Inter-rater reliability of iteration 4 from 2 normed scorers

III. DISCUSSION

Here we present the final rubric design, discussing each category along with the specific items used to measure the reflexive principlism components. In general, responses could be scored from 0-3, but a few codes were binary, scoring 0 or 1, and a few items ranged to 4 points. The highest score attainable on the activity using the finalized rubric was 53 points.

A. Identification

To reason through an ethics case, at a minimum, one must successfully identify the implications of four principles upon which reflexive principlism is based: (a) beneficence, (b) nonmaleficence, (c) justice, and (d) respect for autonomy. These four principles are intended to provide a broad and binding normative framework to the process of ethical decision-making. Table 2 presents the rubric items pertaining to identification of principles.

Table 2: Rubric items corresponding to Identification

Autonomy (1 point): Students are able to explicitly identify 2 or more components of respect for autonomy (e.g. supporting goals, cultures, upholding values, freedom of decision-making, valuing views)
Beneficence (1 point): Students are able to explicitly identify 2 or more components of beneficence (e.g. making money, right thing, doing good)
Justice (1 point): Students are able to explicitly identify 2 or more components of justice (e.g. fairness, equality, what is fair, due, or owed)
Nonmaleficence (1 point): Students are able to explicitly identify 2 or more components of non-maleficence (e.g. safety, protection of environment, health, avoiding harms)
Conflicts of Principles (3 points): Does the response identify a central value conflict and how that influences their decision process?
Central conflict (3 points): Does the response identify a central value conflict and how that influences their decision process?

B. Specification and Balancing

According to Beauchamp and Childress, following Richardson [8], specification of principles is “a process of reducing the indeterminate character of abstract norms and generating more specific, action-guiding content” [9]. When sufficiently specified in the context of a particular case, the principles are determinate enough that it is clear where, when, why, how, by what means, and to whom they apply [8]. Balancing principles is the process by which specified principles are brought into coherence with one another and with the external facts and values at stake in the case. While all four principles can be relevantly applied, one or another might play a more central role in terms of framing and relieving ethical tension. Table 3 presents the rubric items corresponding to specification and balancing of the principles.

Table 3: Rubric items corresponding to Specification

Core value(s) (4 points): Is at least 1 value specified and prioritized in terms of the four principles as critical to the decision? (e.g. minimizing non-maleficence by preventing air pollution)
Level and accuracy of specificity (3 points): Specification is highly detailed, including a consideration of each of the 4 principles within the case constraints (e.g. explicitly defines autonomy given case constraints and describes how it differs from some other possible specification)
Rationale for balancing (3 points): Student recognizes balancing of all 4 principles is necessary and makes a reasonable assessment of that prioritization based on the details of the case and level of impact.

C. Empathic Perspective-Taking

Throughout the reflexive principlism process, one must actively consider the potential stakeholders, especially those with the highest stakes. Cognitively, empathetic relating is accomplished through imaginative perspective-taking, where the engineer either (a) imagines themselves in the other’s position or (b) imagines how the other experiences or sees the situation. The rubric items for this category are in Table 4.

Table 4: Rubric items corresponding to Perspective-Taking

Breadth (3 points): Several stakeholders identified
Most Impacted (1 point): Does the student specify which stakeholder(s) have the most at stake?
Users’ Needs (3 points): 3 or more external stakeholders' needs used to inform decision
Dual-perspectives (1 point): Does the student reason back and forth between external stakeholder values and their own?
Impartiality (1 point): Does the student consider the viewpoint of other stakeholders as equal to their own (weighs others fairly against own)?
Moral framework (1 point): Are 2 or more principles used explicitly as a basis to reason from other stakeholders' perspectives? (e.g. beneficence appears to be most important for x because of y).
Other-centric (1 point): Does the student take into account that different stakeholders will have different values or weigh principles differently? (identifying at least one alternative value perspective)
Seeking feedback (1 point): Does the response indicate a need for direct feedback from any external stakeholders identified as important to the decision?

D. Justification and Coherence

In order to come to a decision, one must establish coherence among the four principles, other relevant moral beliefs, facts of the matter, and other relevant epistemic commitments held by stakeholders [9]. This building of coherence is done reflectively; that is, through a dynamic process of testing against the well-established moral beliefs the principles denote.
Table 5 shows the rubric items pertaining to Justification.

Table 5: Rubric items corresponding to Justification

Decision Made (4 points): Was there an explicit decision proposed?
Coherence (3 points): There is evaluation of the relationship among multiple stakeholders, principles, and relevant codes/regulations
Range of Implications (3 points): From the decision, several long- and short-term ethical implications or thought-experiments considered.

E. Reflectivity

Reflectivity, the process of conscious deliberation on decisions and reasons, is essential to ethical decision making via reflexive principlism. Shifting epistemic and ethical conditions, from new scientific knowledge or engineering design options to social, economic, or personal constraints, demand that the decision-maker engage in a constant process of reflection that parallels the design process familiar to engineers. Over time and by habituation, the conscious process of reflection can become internalized and reflexive; yet, novel cases and conditions constantly push back against a merely reflexive process. Table 6 shows the rubric items pertaining to these ideas.

Table 6: Rubric items corresponding to Reflectivity

Feedback Loops (4 points): The visual or written response clearly indicates the final solution will feed back into earlier considerations and have to be reevaluated through the proposed framework
Plans for Reevaluation (3 points): The response suggests several decisions or considerations will need to be reevaluated based on gathering additional information
Inclusion of Strengths (2 points): Strengths assessed from 2 or more external stakeholders' perspectives
Inclusion of Objections (2 points): Weaknesses assessed from 2 or more external stakeholders' perspectives

IV. CONCLUSION & FUTURE WORK

We have developed a new assessment tool for measuring the ethical reasoning abilities of students, specifically their ability to apply reflexive principlism to an ethics transfer case. This study has focused on the development of a valid and reliable rubric to evaluate students’ responses. The analysis focused on the reliability between raters who iteratively developed and applied the rubric to students’ pre-course responses in an engineering ethics course. The scoring reliability of the raters involved in the initial developmental phase improved with each iteration, and the final rubric yielded inter-rater reliability levels of approximately 95% or higher (94.7% to 97.6%) across 5 dimensions: Justification, Identification, Perspective-taking, Specification, and Reflectivity. This suggests the scoring rubric is reliable and has potential for expanded utilization in other courses.

The limitations of this study are the small number of test cases and the specificity of the tool toward a particular framework of ethical reasoning (reflexive principlism). The applicability of this tool and of the ethics transfer case methodology in other frameworks of ethical reasoning, and in non-engineering STEM discipline contexts, is still to be determined. This work in progress report will be expanded in future studies that use the new assessment tool to evaluate students’ post-course responses along with changes in students’ ethical reasoning abilities. We anticipate this tool will fill a gap in the assessment of ethical reasoning for ethics educators across all engineering disciplines and invite testing and validation within other disciplines.

APPENDIX A: ETHICS TRANSFER CASE

Heating with wood is a time-honored and practical tradition in forested areas and has been making a comeback in Maine. A greater percentage of homes in Maine use wood as their primary heat source – 14 percent – than any state other than Vermont.
An estimated 50 percent of Maine homes also use wood as a supplemental heat source. The trend is helpful for cutting expensive oil bills, but not for improving air quality. Typical wood stoves emit more of the pollution that aggravates asthma and other respiratory conditions than the oil and gas heating systems they are meant to supplement or replace.

Twenty-six years ago the U.S. Environmental Protection Agency (EPA) set emission standards for wood heaters at 7.5 grams per hour. Some states have already set stricter standards, such as Washington’s 4.5 grams per hour. Several states, not including Maine, have filed a notice to sue the EPA for failing to revise its outdated standards for residential wood heat. As a result, the EPA has proposed a new standard for 2019: 1.3 grams per hour. This is even lower than the level achieved by one of the top stove designers in Maine who has just completed an extensive redesign for efficiency and air quality on a new wood stove, which still emitted 2.3 grams per hour.

There are at least 7 million older-technology stoves currently being used throughout the United States. This past year, fewer than 74,000 new units were sold across the country. A well-built wood stove lasts for generations, so even if the EPA does decide to double down on the regulations, switching out all the old-style stoves with cleaner models will take some time. In addition, one wood stove manufacturer estimated that it will cost nearly $1 million to re-engineer its stoves to meet the 2019 standards and could drive up the cost of a stove by 25%.

Another option proposed to the EPA by this wood stove manufacturer representative is to implement a wood stove change-out program. During the summer of 2013, some wood stove dealers offered $300 credits to people who exchanged their old stove for a new one, which sells for between $1,000 and $3,000. These buyers also gained a $300 federal rebate.
This federal rebate has expired as of 2014, although some rebates are still offered at the state level, such as the $250 Efficiency Maine rebate.

Both engineers and policy-makers face complex ethical decisions in this case. Imagine that you are the lead engineer with one of the top wood-stove manufacturers in the State of Maine and a consultant with the EPA. How would you reason through advising your company on the most ethical course of action?

For more information on this case visit: www.pressherald.com/news/tougher-pollution-limits-forwood-stoves-might-just-backfire.html?pagenum=full

Task: Create a diagram or flowchart of your thought process that led to and supports your conclusion.
(1) Along with your visualization, describe each of the steps you used to come to your decision.
(2) Provide a brief explanation of why you used these steps to make your decision.
(3) Would you need any other information to improve your decision? If so, what is it and how would you obtain it?
(4) Please identify any external sources that you used to inform and support your decision and how you obtained these materials.

REFERENCES

[1] Flanagan, J., & K. Clarke. 2007. "Beyond a Code of Professional Ethics: A Holistic Model of Ethical Decision-Making for Accountants." Abacus, Vol. 43(4), pp. 488-518.
[2] Barnett, S. M., & S. J. Ceci. 2002. "When and where do we apply what we learn?: A taxonomy for far transfer." Psychological Bulletin, Vol. 128(4), pp. 612-637.
[3] Thiel, C. E., S. Connelly, L. Harkrider, L. D. Devenport, Z. Bagdasarov, J. F. Johnson, & M. D. Mumford. 2013. "Case-Based Knowledge and Ethics Education: Improving Learning and Transfer Through Emotionally Rich Cases." Science and Engineering Ethics, Vol. 19(1), pp. 265-286.
[4] Beauchamp, T. L. "The ‘four principles’ approach to health care ethics." In R. Ashcroft, A. Dawson, H. Draper & J. McMillan (eds.). 2007. Principles of health care ethics. (2nd ed.) West Sussex, UK: Wiley, pp. 310.
[5] Kisselburgh, L., C. Zoltowski, J. Beever, J. L. Hess, M. Krane, & A. Brightman. 2013. "Using scaffolded, integrated, and reflexive analysis (SIRA) of cases in a cyber-enabled learning infrastructure to develop moral reasoning in engineering students." In Institute of Electrical and Electronics Engineers. Frontiers in Education. Oklahoma City, OK, pp. 1561-1563.
[6] Goodwin, L. D., & N. L. Leech. 2003. "The Meaning of Validity in the New Standards for Educational and Psychological Testing: Implications for Measurement Courses." Measurement & Evaluation in Counseling & Development (American Counseling Association), Vol. 36(3).
[7] Gwet, K. 2001. "Handbook of inter-rater reliability." Gaithersburg, MD: STATAXIS Publishing Company, pp. 223-246.
[8] Richardson, H. S. 2000. "Specifying, Balancing, and Interpreting Bioethical Principles." Journal of Medicine and Philosophy, Vol. 25(3), pp. 285-307.
[9] Beauchamp, T. L., & J. F. Childress. 2013. Principles of Biomedical Ethics. 7th ed. New York: Oxford University Press.