Assessment (2)

Different types of assessment
- Paper-and-pencil tests
- Oral examination
- Project work
- Oral presentation
- Role play
- Learning portfolio
- Experiment
- Model making
- Discussion and sharing
- Debate

Reliability and validity
- Among the types of assessment just listed, which do you think has higher reliability?
- And which has higher validity?

Reliability and validity
- Are the assessments that society generally trusts (public examinations) really trustworthy?
- The problem of "high scores, low ability"
- Are they completely untrustworthy?

Reliability and validity
- Long-established methods may run counter to the truth
- What do scores reflect?
- And what does rank mean?

          Chinese  English  Maths  Total
  小明       46       47      98    191
  志強       67       52      35    154
  芳芳       75       73       4    152
  杜鵑       48       54      87    189

Reflection
- Why do we need assessment? As a "wooden men alley" (木人巷)?
- Do examinations serve teaching, or does teaching serve examinations?

Skill of Assessment (評估技巧)
In the educational process, there are a variety of ways to assess students' abilities and skills. To select a suitable method, however, several factors need to be considered:
- objectives of the assessment (評估的目的)
- available time and resources (時間和資源)
- students' ability (學生的能力)

Skill of Assessment (評估技巧)
Commonly used assessment methods:
- formative evaluation (形成性評量)
- summative evaluation (總結性評量)
- criterion-referenced evaluation (標準參照評量)
- norm-referenced evaluation (常模參照評量)

Skill of Assessment (評估技巧)
Commonly used assessment methods also include:
- Test (測驗): used to assess students' cognitive ability (認知的能力)
- Checklist (核對用的清單): used to assess the skills students have mastered (掌握的技巧)
- Rating scale (評核量表): also used to assess mastered skills, as well as attitudes and value judgments

Written Test (測驗)
Test items may include:
- Short answer (短答題)
- True/False (是非題)
- Matching (配合題)
- Multiple choice (選擇題)
- Essay (文字題), or long question

Discussion
For each of the test item types below, can you state its characteristics?
- Short answer
- True/False
- Matching
- Multiple choice
- Essay (or long question)

Notes on Designing Questions (Short-answer Items)
- The answer should be concise, specific, and the only correct one
- Avoid copying directly from the textbook or handouts
- Phrase questions directly rather than as incomplete sentences
- Blanks to be filled in should be of similar length
- Avoid too many blanks in one question
- The answer should not be too long

Notes on Designing Questions (True/False Items)
- Items must be absolutely true or absolutely false
- The number of words in each item should be similar
- The numbers of true items and false items should also be similar
- Avoid:
  - items that contain two ideas at once
  - negative propositions, especially double negatives (重覆負向)
  - patterned questions
  - general descriptions (概括性的敘述) such as "usually", "generally", "sometimes", etc.

Notes on Designing Questions (Matching Items)
- Use the same type of data in all matching items
- The number of stems (前題) and the number of responses (反應) can differ
- A response may be used once, more than once, or not at all
- The relationship between stems and responses must be clearly identified

Types of Multiple-choice Items
- one correct answer variety (單一正確答案題型)
- best answer variety (最佳答案題型)
- multiple response variety (多重答案題型)
- negative variety (否定題型)

Notes on Designing Questions (Multiple-choice Items)
- The question itself should be meaningful
- Avoid negative questions
- The grammar, style and length of all choices should be consistent
- In the best answer variety, prompt students to select the best choice
- Provide only one correct choice for each question
- In the multiple response variety, clearly state how many choices are required
- The correct choice should not be traceable from clues or patterns
- Use the choices "All of the above" (以上皆是) and "None of the above" (以上皆非) with care

Notes on Designing Questions (Essay Items or Long Questions)
- Questions must be clear and concise
- Questions need to cover the range you intend to test
- Limit the question to the range of the ideal answer, and provide
  hints for critical areas
- If possible, prepare a suggested solution, or define standards for answering the question, such as key words, coverage, etc.
- Questions should be independent of each other; that is, group homogeneous items into one question
- A question may be broken down into manageable sub-questions

Concept of Checklist Items (核對用清單)
- A series of standards for measuring a performance or a product
- A systematic method of checking a person's performance against certain crucial behaviors
- Or of checking whether certain characteristics exist in a product
- Can be used for criterion-referenced evaluation (標準參照評量) or norm-referenced evaluation (常模參照評量)

Notes on Designing Questions (Checklist Items)
- Items should cohere around the important features and/or behaviors
- Add common mistakes as negative hints
- Items need to converge and should not be too numerous
- Focus on the critical areas, which can be observed through certain characteristics or signals
- Avoid irrelevant list items

Concept of Rating Scale (評核量表)
- Similar to a checklist, but with scale levels
- Used when grading an item on a scale is meaningful, so the number of scale points matters
- Mainly for evaluating quality
- Used to decide whether a characteristic exists and to what degree

Notes on Designing Questions (Rating Scale)
- Similar to designing a checklist, but the number of scale points must be considered
- For statistical reasons, five or seven scale points are often used
- The scale used needs to lend itself to analysis

References
http://www.ied.edu.hk/apfslt/v3_issue2/chengmh/index.htm#contents

Group discussion (1)
Topic: Scientific inquiry: the water cycle
- Assess students' understanding of the water cycle process, the formation of clouds and rain, environmental pollution and related matters
- What type of assessment would you suggest?

Group discussion (2)
Topic: Economic development of modern China
- Assess students' understanding of modern China's economic development, its chronology, its characteristics and individual major events
- What type of assessment would you suggest?

Group discussion (3)
Topic: The impact of globalization on culture and the economy
- Assess students' understanding of "globalization" and their personal views
- What type of assessment would you suggest?
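
Supplementary sketch: reading the score table
The four-student score table earlier in these slides (Chinese, English and Maths marks with totals) can be read in two ways. The minimal Python sketch below recomputes the totals, ranks the students against each other (a norm-referenced reading), and then compares each mark with a fixed standard (a criterion-referenced reading). The pass mark of 50 and the function names are assumptions made only for illustration; the slides do not state a pass mark.

```python
# Marks copied from the score table in the slides.
scores = {
    "小明": {"Chinese": 46, "English": 47, "Maths": 98},
    "志強": {"Chinese": 67, "English": 52, "Maths": 35},
    "芳芳": {"Chinese": 75, "English": 73, "Maths": 4},
    "杜鵑": {"Chinese": 48, "English": 54, "Maths": 87},
}

# Hypothetical criterion: the slides do not give a pass mark.
PASS_MARK = 50


def totals(table):
    """Sum each student's marks across all subjects."""
    return {name: sum(marks.values()) for name, marks in table.items()}


def rank_by_total(table):
    """Norm-referenced view: order students relative to each other."""
    total = totals(table)
    return sorted(total, key=total.get, reverse=True)


def subjects_passed(table, pass_mark=PASS_MARK):
    """Criterion-referenced view: compare each mark with a fixed standard."""
    return {
        name: [subject for subject, mark in marks.items() if mark >= pass_mark]
        for name, marks in table.items()
    }


if __name__ == "__main__":
    print(totals(scores))
    print(rank_by_total(scores))
    print(subjects_passed(scores))
```

Under this assumed pass mark, 小明 ranks first on total (191) yet reaches the standard only in Maths, while 芳芳 ranks last (152) yet reaches it in Chinese and English. The same marks give different pictures depending on which reading is used.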