Confidence Experiment Münster, EWICS Workshop, 28/04/04

Eugenio Alberdi, Meine van der Meulen, Robin Bloomfield, Bev Littlewood, Peter Ayton
DIRC Easter Meeting, 16 March 2005

Motivation
• ‘Dependability Cases’ TA
  - uncertainty in quantitative probabilistic claims for dependability: always present
  - need for a rigorous and formal understanding of the role of confidence in dependability arguments
• Goals of the exercise:
  - Investigate how experts express confidence in dependability judgements:
    to support a claim at a particular quantitative level AND at a particular level of confidence
    elicit distributions underlying experts’ own beliefs
    look at a distribution of a population of experts
  - See impact of empirical data on modelling work
• Psychology: studies of human confidence in probabilistic judgments
  - e.g. overconfidence and incoherent judgments
  - influence of how information is presented
  - not normally studied with experts

Procedure
• Presentation from Meine on the application
• Participants asked to:
  - choose the ‘pfd interval’ (pfd: probability of failure on demand) in which the described functionality fits best
  - state their confidence in the different ‘pfd intervals’ for that application
• 4 phases:
  - I: after hearing Meine’s presentation
  - II: after they get an answer to the 1 question they are allowed to ask
  - III: after they hear Meine’s answers to all questions asked by other participants
  - IV: after a “Delphi”-like interaction amongst participants

Application Domain
[Figure: diagram of the application; labels: Hoist, Nuclear material, Transfer port]
• Safety function: if the safety pushbutton is pressed, the carousel will stop moving
• Goal: assess pfd of (new) software that controls the safety function
• Information about: test reports, safety analyses, formal proof

Participants
• 12 delegates at EWICS meeting: 3 left early
• Remaining 9 participants:
  - Self-reported experience in the assessment of safety software:
    3 “not very experienced”: 2 researchers; 1 security & business development
    3 “fairly experienced” (5-10 years): 2 researchers; 1 researcher and assessor
    3 “very experienced” (10-30 years): 1 researcher; 1 assessor; 1 researcher and assessor
  - Countries: Austria, Germany, Norway, Poland, UK

Mean Confidence Values (9 participants)
[Figure: mean confidence (%) in each pfd interval, from >10^-1 (highest pfd) through 10^-1 to 10^-2, 10^-2 to 10^-3, 10^-3 to 10^-4 and 10^-4 to 10^-5 down to <10^-5 (lowest pfd), for Phases I to IV]

Chosen Interval - Confidence
[Figure: chosen pfd interval and stated confidence (%) by participant for Phases I to IV; shown in two versions in the original slides]

More experienced (3 participants)
[Figure: mean confidence (%) in each pfd interval, from >10^-1 (highest pfd) to <10^-5 (lowest pfd)]

Less experienced (3 participants)
[Figure: mean confidence (%) in each pfd interval, from >10^-1 (highest pfd) to <10^-5 (lowest pfd)]

Some conclusions
• Feasible task: mixture of cooperation and resistance
• Considerable variability among participants:
  - expertise levels, backgrounds, assessments, confidence levels, proneness to change their mind…
• Less experienced participants: less likely to change their minds
• More experienced participants: tend to perceive the system as being less reliable (higher pfd)
• Increasing scepticism about the system’s reliability as the study evolved and more information became available
• Greatest likelihood of changing judgement in the 4th (“Delphi”) phase of the experiment

Future Work
• Further analysis of collected data:
  - Correspondence between “qualitative” & “quantitative” confidence
  - Implications for modelling
• Further data collection:
  - Better screened & larger set of experts
  - Control: different groups getting
    different types of information? (e.g. better safety arguments, less “ideal” system)
    or information presented in different ways?
• Refine some of the procedures:
  - collection of think-aloud protocols?
  - more open ended?
  - how questions are asked…
• Psychological literature on people’s confidence in uncertain judgments
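
The elicitation in the Procedure (confidence spread over six pfd intervals, then averaged over participants) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the function names, the reading of confidence as percentages that sum to 100, the treatment of a "claim" as a cumulative sum over intervals, and above all the three example distributions, which are invented and are not data from the study.

```python
# Hypothetical illustration only: the distributions below are invented,
# not values collected in the experiment.

INTERVALS = [">10^-1", "10^-1 to 10^-2", "10^-2 to 10^-3",
             "10^-3 to 10^-4", "10^-4 to 10^-5", "<10^-5"]
# Upper pfd bound of each interval (1.0 for the open-ended worst interval).
UPPER_BOUNDS = [1.0, 1e-1, 1e-2, 1e-3, 1e-4, 1e-5]


def is_coherent(dist, tol=1e-9):
    """A distribution is treated as coherent if it has one non-negative
    confidence value per interval and the values sum to 100%."""
    return (len(dist) == len(INTERVALS)
            and all(c >= 0 for c in dist)
            and abs(sum(dist) - 100.0) <= tol)


def modal_interval(dist):
    """Interval carrying the most confidence (assumed here to stand in
    for the interval a participant would choose as the best fit)."""
    best = max(range(len(dist)), key=lambda i: dist[i])
    return INTERVALS[best]


def confidence_in_claim(dist, bound):
    """Confidence (%) that the pfd is below `bound`: the sum of the
    confidence in every interval whose upper bound is <= `bound`."""
    return sum(c for c, ub in zip(dist, UPPER_BOUNDS) if ub <= bound)


def mean_confidence(dists):
    """Mean confidence per interval over a group of participants."""
    n = len(dists)
    return [sum(d[i] for d in dists) / n for i in range(len(INTERVALS))]


# Three invented participants, each giving confidence (%) per interval.
p1 = [0, 10, 30, 40, 15, 5]
p2 = [5, 20, 45, 20, 10, 0]
p3 = [0, 0, 20, 50, 20, 10]
group = [p1, p2, p3]

if __name__ == "__main__":
    for name, dist in zip(("p1", "p2", "p3"), group):
        print(name, is_coherent(dist), modal_interval(dist),
              confidence_in_claim(dist, 1e-2))
    print(mean_confidence(group))
```

Under these assumptions, a claim such as "pfd better than 10^-2" is supported at whatever confidence the participant places in the four intervals below that bound, which is one way of pairing a quantitative level with a confidence level as described in the goals of the exercise.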
