QUEST Honors Program Learning Outcomes Assessment
Fall 2015 Results

Fall 2015 Learning Outcomes Assessment Results Summary without Client Evaluations:

[Chart: Summary of Fall 2015 data. Stacked bars show, for each learning outcome element (LO1.1 Tool Selection through LO9.4 Ethics, including separate client (CL) and faculty advisor (FA) ratings where applicable), the proportion of assignments rated Unacceptable, Developing, Proficient, or Advanced.*]

* The incomplete bars in this chart indicate the proportion of assignments where a reviewer declined to evaluate the element.

Notes about the summary:

Overall, we are doing very well, as indicated by the abundance of blue and green on the chart. However, in the spirit of continuous improvement, we need to understand more about the red areas, which indicate that the work was unacceptable in the corresponding element.

The greatest area needing improvement continues to be data analysis (Learning Outcome 3). Of the Unacceptable ratings, slightly more than half were for 190H reports (the remaining ones were for 490H reports; 390H papers were not evaluated for this learning outcome), as seen in the chart below.

[Chart: LO3 Data Analysis Comparison. Average scores for LO3.1 Qualitative Data Analysis, LO3.2 Quantitative Data Analysis, LO3.3 Multi-Methods Synthesis, and LO3.4 Methodology Choice, comparing 190H and 490H.]

Part of the issue may be that students are not told to provide the level of detail in their analysis that the assessment rubric requires; this might be rectified in the future by requiring an appendix that provides the level of detail described in the rubric. We are also looking at options for making sure that students are learning the necessary data analysis concepts in the program.

A new area of concern is Learning Outcome 4 (Evaluate, analyze, and recommend solutions to real-world problems). This is the first semester that we assessed 190H and 390H papers for this learning outcome. The two unacceptable ratings for LO4.2 Methodology were both from 390H assignments. For LO4.3 Analysis, there was one unacceptable rating for a 190H assignment and one for a 390H paper, and for LO4.4 Recommendations, all three were from 390H papers.
[Chart: LO4 Problem Solving Comparison. Average scores for LO4.1 Problem Identification, LO4.2 Methodology, LO4.3 Analysis, and LO4.4 Recommendations, comparing 190H, 390H, and 490H.]

The remaining red areas are as follows:
LO2.3 Prototype and test – 1 unacceptable rating
LO6.1 Organization – 1 unacceptable rating
LO6.2 Audience Engagement & Professionalism – 3 unacceptable ratings
LO6.4 Effective use of content – 2 unacceptable ratings
LO7.2 Professional Writing – 1 unacceptable rating
LO7.4 Perspectives – 1 unacceptable rating
LO9.4 Ethics – 1 unacceptable rating

All of these instances point to a problem with calibration, since the specific assignments that received these ratings were rated higher by other reviewers.

Summary of Client Evaluations

[Chart: Client Evaluations. Proportion of responses rated Unacceptable, Developing, Proficient, Advanced, or Declined for the overall QUEST stakeholder perspective (as rated by clients and by faculty advisors), LO4.1 Problem Identification, LO4.4 Recommendations, LO9.1 Listening, LO9.2 Communication, and LO9.3 Attire, along with the number of observations for each.]

As indicated by the low number of observations (far right in the chart above), we continue to have difficulty getting responses from all of our clients. Clients are surveyed mid-project and at the end of the project, so ideally we would have 20 responses for the overall assessment and LO9.1–9.3. The LO4 elements are asked only once of each client, so the most we can have for these is 10 observations. For comparison, the faculty advisors' overall evaluations are included (second row from the top).

Inter-rater Reliability

Calibration of the assessments continues to be an issue. The Fall 2015 assessments indicate that there needs to be greater agreement among reviewers. The reliability results for learning outcomes 1, 2, and 3 are shown below.

Learning Outcome 1:
Element                     Percent Agreement   Obs (N)
LO1.1 Tool Selection        100                 5
LO1.2 Fit                   50                  8
LO1.3 Tool Use              44                  16
LO1.4 Solution Evaluation   80                  5
Overall                     59                  34

Test         Value    p        N    Agreement
Kappa2 sq    0.432    0.0037   34   moderate
Kappa2 lnr   0.369    0.0023   34   fair

Learning Outcome 2:
Element                                                          Percent Agreement   Obs (N)
LO2.1 Problem Identification                                     40                  15
LO2.2 Idea Generation, Screening, Evaluation, and Selection      27                  15
LO2.3 Prototyping, modeling, testing, and integrating feedback   47                  15
LO2.4 Analysis of the innovation's feasibility                   47                  15
Overall                                                          40                  60

Test         Value    p       N    Agreement
Kappa2 sq    0.0251   0.812   60   very slight
Kappa2 lnr   -0.051   0.538   60   none

Learning Outcome 3:
Element                            Percent Agreement   Obs (N)
LO3.1 Qualitative Data Analysis    52.2                23
LO3.2 Quantitative Data Analysis   45                  20
LO3.3 Multi-Methods Synthesis      66.7                9
LO3.4 Methodology Choice           10                  10
Overall                            45.2                62

Test         Value   p        N    Agreement
Kappa2 sq    0.356   0.0042   62   fair
Kappa2 lnr   0.252   0.003    62   fair

We looked at the percentage agreement at the element level and at the learning outcome level. At the element level, we simply used percent agreement since there were so few observations. The results show that agreement is rarely above 50%, indicating the need for calibration training for the reviewers. At the outcomes level, we used Cohen's weighted kappa (squared and linear weights) to examine inter-rater reliability since there were two raters and ordinal data. The Agreement column on the far right is based on the guidelines of Landis and Koch (1977), who characterized values < 0 as indicating no agreement, 0–0.20 as slight, 0.21–0.40 as fair, 0.41–0.60 as moderate, 0.61–0.80 as substantial, and 0.81–1 as nearly perfect agreement. Based on these results, we see that there is only fair agreement for learning outcomes 1 and 3, and no agreement for learning outcome 2.
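To make the outcome-level statistic concrete, the following is a minimal sketch of how percent agreement and Cohen's weighted kappa can be computed for two raters on the 4-point rubric scale. The ratings shown and the use of scikit-learn's cohen_kappa_score are illustrative assumptions; they are not the Fall 2015 data or necessarily the tool used to produce the values in the tables above.

    # Hypothetical ratings for ten assignments on the 4-point rubric scale
    # (1 = Unacceptable, 2 = Developing, 3 = Proficient, 4 = Advanced).
    from sklearn.metrics import cohen_kappa_score

    rater1 = [3, 4, 2, 3, 3, 4, 2, 1, 3, 4]
    rater2 = [3, 3, 2, 4, 3, 4, 3, 2, 3, 4]

    # Element-level check: simple percent agreement.
    percent_agreement = sum(a == b for a, b in zip(rater1, rater2)) / len(rater1)
    print(f"Percent agreement: {percent_agreement:.1%}")

    # Outcome-level check: Cohen's weighted kappa. The 'linear' and 'quadratic'
    # weights correspond to the linear (lnr) and squared (sq) variants reported above.
    print(f"Weighted kappa (linear):  {cohen_kappa_score(rater1, rater2, weights='linear'):.3f}")
    print(f"Weighted kappa (squared): {cohen_kappa_score(rater1, rater2, weights='quadratic'):.3f}")

Weighted kappa credits near-misses on the ordinal scale (e.g., Proficient vs. Advanced) more than complete disagreements, which is why it is preferred over simple percent agreement at the outcome level.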
For Learning Outcome 7 (written communications), we had three reviewers, so slightly different testing was needed. Pairwise comparison of reviewers for overall agreement shows only slight agreement. The Fleiss kappa statistic (for more than two raters) confirms this finding; an illustrative sketch of this calculation is included after the references.

Element      Percent Agreement   Obs (N)
raters 1&2   55                  80
raters 1&3   52.5                80
raters 2&3   46.2                80

Test                   Value   p        N    Agreement
Kappa.fleiss           0.112   0.0396   80   slight
raters 1&2 kappa2 sq   0.265   0.0119   80   slight
raters 1&3 kappa2 sq   0.220   0.0403   80   slight
raters 2&3 kappa2 sq   0.174   0.0758   80   slight

Reliability was not assessed for learning outcome 4 due to a lack of data. The remaining learning outcomes (5, 6, 8, and 9) were not assessed for reliability since there were many different reviewers.

Comparison of outcomes for 190H, 390H and 490H

The charts below compare the average scores for each learning outcome element for assignments completed in the three required QUEST courses. For learning outcomes 5, 8, and 9, data was only received for BMGT/ENES 490H assignments, so no comparison could be made.

[Chart: LO1 Quality Management Comparison. Average scores for LO1.1 Tool Selection, LO1.2 Fit, LO1.3 Tool Use, and LO1.4 Solution Evaluation, comparing 190H and 490H. (Note: 190H assignments were only assessed for LO1.3.)]

[Chart: LO2 Product Development Comparison. Average scores for LO2.1 Problem Identification, LO2.2 Idea Generation, Evaluation, and Selection, LO2.3 Prototyping and testing, and LO2.4 Analysis of the innovation's feasibility, comparing 190H and 390H.]

[Chart: LO4 Problem Solving Comparison. Average scores for LO4.1 Problem Identification, LO4.2 Methodology, LO4.3 Analysis, and LO4.4 Recommendations, comparing 190H, 390H, and 490H.]

[Chart: LO6 Oral Communications Comparison. Average scores for LO6.1 Organization, LO6.2 Audience Engagement & Professionalism, LO6.3 Credibility, and LO6.4 Effective use of content, comparing 190H, 390H, and 490H.]

[Chart: LO7 Written Communications Comparison. Average scores for LO7.1 Objective and tone, LO7.2 Conventions of Professional Writing, LO7.3 Argument and Evidence, and LO7.4 Perspectives, comparing 190H, 390H, and 490H.]

References:

Landis, J.R.; Koch, G.G. (1977). "The measurement of observer agreement for categorical data". Biometrics 33 (1): 159–174.

Viera, Anthony J.; Garrett, Joanne M. (2005). "Understanding interobserver agreement: the kappa statistic". Family Medicine 37 (5): 360–363.
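As referenced in the Learning Outcome 7 discussion above, the sketch below shows how a Fleiss kappa can be computed for three reviewers. The ratings are hypothetical, and the use of statsmodels is an assumption for illustration, not necessarily the tool behind the Kappa.fleiss value reported in the table.

    # Hypothetical ratings: rows are assignments, columns are the three reviewers;
    # values are rubric scores from 1 (Unacceptable) to 4 (Advanced).
    import numpy as np
    from statsmodels.stats.inter_rater import aggregate_raters, fleiss_kappa

    ratings = np.array([
        [3, 3, 4],
        [2, 3, 3],
        [4, 4, 4],
        [3, 2, 3],
        [1, 2, 2],
        [3, 3, 3],
    ])

    # aggregate_raters converts subject-by-rater scores into the
    # subject-by-category count table that fleiss_kappa expects.
    table, _categories = aggregate_raters(ratings)
    print(f"Fleiss kappa: {fleiss_kappa(table, method='fleiss'):.3f}")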