IWBF 2014

ENFSI Monopoly Programme 2011
Improving Forensic Methodologies across Europe (IFMAE)

Methodological guidelines for semi-automatic and automatic speaker recognition for case assessment and interpretation

Dr. Andrzej Drygajlo
Speech Processing and Biometrics Group, Swiss Federal Institute of Technology Lausanne (EPFL)
School of Criminal Justice – University of Lausanne

European Network of Forensic Science Institutes
Forensic Speech and Audio Analysis Working Group

ENFSI Monopoly Programme 2011
• Improving Forensic Methodologies across Europe (IFMAE)
  – IFMAE concerns broad activities including best practices, validation studies, proficiency tests and collaborative exercises
  – The validation of forensic methods remains an area where a lot of work needs to be done. This applies across all forensic fields and particularly for comparative biometric methods and techniques.
  – It is a critical element in the mutual recognition of forensic investigative and evaluative results across the European Union (EU)

Monopoly Programme 2011
• Monopoly 2011
  – Applicants need to be aware that the currently active monopoly programmes for 2009 and 2010 (MP2009 and MP2010) include 4 projects of direct relevance:
    MP2009/P4 – The development of guidelines for the validation of analytical and comparative methods in forensic science
    MP2009/P5 – The development of guidelines for conducting proficiency tests and collaborative exercises in forensic science
    MP2010/M1 – ENFSI standard for the formulation of evaluative reports in forensic science
    MP2010/M7 – The evaluation of computer forensic proficiency tests within computer forensics

List of Monopoly Projects (MP2011 Programme Bid)
ENFSI Project No. | Allocated Funding | Project Title | Project Leader (ENFSI Laboratory)
Z1 | €150,000 | Virtual Mobile Forensic Laboratory (VMFL) | Hans-Jürgen Stenger (LKA, Germany)
Z2 | €90,000 | Internet Accessible Database on Textile Fibres | Kornelia Nehse (LKA Berlin, Germany); Georg Jochem (BKA, Germany)
Z3 | €120,000 | Methodological Guidelines for Semiautomatic and Automatic Speaker Recognition for Case Assessment and Interpretation | Andrzej Drygajlo (CFLPP, Poland)
Z4 | €50,000 | International Cooperation for Testing, Validation and Application of Ink Dating Methods | Jürgen Bügler (LKA, Bavaria, Germany)
Z5 | €85,000 | Standardization of Forensic Image and Video Enhancement (S-Five) | Patrick De Smet (NICC, Belgium)

FSAAWG proposal submission – 1 March 2011
• “Methodological guidelines for semi-automatic and automatic speaker recognition for case assessment and interpretation”
  – Project leader: Dr. Andrzej Drygajlo, Chair of ENFSI FSAAWG
  – Beneficiary: Central Forensic Laboratory of the Polish Police (CFLPP), Warsaw, Poland (Financial and Management Coordinator)
  – Financial support: €120,000 (36 months)

Brief Project Overview (specific issues)
• This project aims at introducing methodological guidelines that provide a coherent way of quantifying and presenting recorded voice as scientific evidence.
• Two main issues are addressed in this project:
  – The first is building basic methodological support for semi-automatic and automatic speaker recognition, corresponding to challenges found in real casework and modern communication networks
  – The second is creating a common methodology for semi-automatic and automatic speaker recognition in forensic applications within the framework of support to the criminal-justice system, evaluating this methodology in different real-world applications, and spreading it in police and forensic environments

Brief Project Overview (key objectives)
• The goal of this project is the development of a standard approach for automatic and semi-automatic forensic speaker recognition (FSR) based on scientifically approved methods for the calculation and interpretation of forensic evidence
• The four main objectives of the project are as follows:
  1. Best Practice Manuals for forensic semi-automatic and automatic speaker recognition
  2. Methodological guidelines for implementation of semi-automatic and automatic tools
  3. Validation studies and evaluation protocols for semi-automatic and automatic speaker recognition technology in a forensic environment – Proficiency tests
  4. Comparative study of forensic speaker recognition performance of phonetic experts and automated methods – Collaborative exercises

Forensics and Biometrics
• Forensics (forensic science) refers to the application of scientific principles and technical methods to the investigation of criminal activities, in order to demonstrate the existence of a crime, and to determine the identity of its author(s) and their modus operandi.
  – Forensic (adj.) means the use of science or technology in the investigation and establishment of facts or evidence in a court of law.
• Biometrics is the science of establishing the identity of individuals based on their biological and behavioral characteristics.

Forensic Speaker Recognition
[Figure: casework diagram relating the trace (questioned recording) to the suspect]
Forensic speaker recognition (FSR) is the process of determining if a specific individual (suspected speaker) is the source of a questioned voice recording (trace).

Forensic expert perspective (“ENFSI Standard”)
• The expert should base his opinion upon four principles:
  – Balance – the expert should address at least two competing propositions (adversary system)
  – Logic – the expert should address the probability of the evidence given the proposition and relevant background information, and not the probability of the proposition given the evidence and background information
  – Robustness – the expert should provide an opinion that is capable of scrutiny by other experts and of cross-examination
  – Transparency – the expert should be able to demonstrate how he came to his conclusion in a way that is suitable for a wide audience (i.e. participants in the justice system)

Forensic specificity
The forensic expert’s role is to testify to the worth of the evidence by using, if possible, a quantitative measure of this worth. It is up to the judge and/or the jury to use this information as an aid to their deliberations and decision.
• The role of forensic science is the provision of opinion to help answer questions of importance to investigators and to courts of law
• Respective duties of the actors involved in the judicial process: jurists, forensic experts, judges, etc.
Forensic specificity
A criminal trial is a process for decision making in the face of uncertainty.
Probability theory is the calculus of reasoning in the face of uncertainty.

Bayesian Interpretation of Forensic Evidence
The odds form of Bayes’ theorem:

$$
\underbrace{\frac{P(H_0 \mid I)}{P(H_1 \mid I)}}_{\text{prior odds}}
\times
\underbrace{\frac{P(E \mid H_0, I)}{P(E \mid H_1, I)}}_{\text{likelihood ratio (LR)}}
=
\underbrace{\frac{P(H_0 \mid E, I)}{P(H_1 \mid E, I)}}_{\text{posterior odds}}
$$

  – Prior odds (prior background knowledge): province of the court
  – Likelihood ratio (LR) (new data): province of the forensic expert
  – Posterior odds (posterior knowledge on the issue): province of the court
  – I: background information

Bayesian Interpretation of Forensic Evidence
• H0 – the suspected speaker is the source of the questioned recording
• H1 – the speaker at the origin of the questioned recording is not the suspected speaker

$$
\frac{P(H_0 \mid I)}{P(H_1 \mid I)} \times \frac{P(E \mid H_0, I)}{P(E \mid H_1, I)} = \frac{P(H_0 \mid E, I)}{P(H_1 \mid E, I)}
$$

Strength of evidence (likelihood ratio):

$$
\mathrm{LR} = \frac{P(E \mid H_0, I)}{P(E \mid H_1, I)}
$$

  – Numerator: similarity
  – Denominator: typicality
Evidence evaluation and its value? Relevance and the formulation of propositions?

Univariate (Scoring) Method
[Figures: two slides illustrating the univariate (scoring) method; content not recoverable]

Strength of Evidence - Likelihood Ratio
A likelihood ratio of 9.16 means that it is 9.16 times more likely to observe the score (E) given the hypothesis H0 (the suspect is the source of the questioned recording) than given the hypothesis H1 (another speaker from the relevant population is the source of the questioned recording).
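
The univariate (scoring) method and the odds form of Bayes’ theorem above can be illustrated with a minimal Python sketch. It is not the method used in the project: it assumes, purely for illustration, that the scores under H0 and under H1 each follow a Gaussian distribution fitted to sample scores. The function names, the Gaussian model, and the example numbers are hypothetical and do not come from the slides.

```python
import math
from statistics import mean, stdev


def gaussian_pdf(x, mu, sigma):
    """Density of a normal distribution N(mu, sigma^2) at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def score_likelihood_ratio(evidence_score, h0_scores, h1_scores):
    """LR = p(E | H0) / p(E | H1) for a single score E.

    h0_scores: scores observed when the suspect is the source (assumed sample).
    h1_scores: scores observed for other speakers of the relevant population
               (assumed sample).
    Both score distributions are modelled as Gaussians (an assumption).
    """
    p_h0 = gaussian_pdf(evidence_score, mean(h0_scores), stdev(h0_scores))
    p_h1 = gaussian_pdf(evidence_score, mean(h1_scores), stdev(h1_scores))
    return p_h0 / p_h1


def posterior_odds(prior_odds, likelihood_ratio):
    """Odds form of Bayes' theorem: posterior odds = prior odds x LR."""
    return prior_odds * likelihood_ratio


if __name__ == "__main__":
    # Hypothetical scores, for illustration only.
    h0_scores = [9.0, 10.0, 11.0]
    h1_scores = [-1.0, 0.0, 1.0]
    lr = score_likelihood_ratio(6.0, h0_scores, h1_scores)
    print("LR for E = 6.0:", lr)
    # The expert reports the LR; the prior odds belong to the court.
    print("Posterior odds for prior odds 1:100 and LR 9.16:",
          posterior_odds(1.0 / 100.0, 9.16))
```

The split into two functions mirrors the division of roles on the slide: the expert computes only the likelihood ratio, while combining it with prior odds is left to the court.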
Interpretation of Biometric Evidence
Multivariate (Direct) Method

Evaluation of the Strength of Evidence
Principle
  – Estimation and comparison of likelihood ratios that can be obtained from the evidence E:
    – when the hypothesis H0 is true: the suspected speaker truly is the source of the questioned recording (trace)
    – when the hypothesis H1 is true: the suspected speaker is truly not the source of the questioned recording (trace)

Tippett plots: measures of LR accuracy
[Figure: Tippett plot; x-axis: Likelihood Ratio (LR), y-axis: estimated probability; reference line at LR = 1]

Old and New Paradigm
• Old Paradigm
  – Individual expertise
  – Categoric identification
  – Reliability assessed by: false positives, false negatives and inconclusives
• New Paradigm
  – Mathematical modelling
  – Databases
  – Quantified weight of evidence on a continuous scale
  – Calibration

Project tasks
1. Best Practice Manuals for forensic semi-automatic and automatic speaker recognition
  – The main objective of this task is to define a complete set of interpretation methods based on Bayes' approach to be used in the forensic speaker recognition domain independently of the baseline speaker recognition system
2. Methodological guidelines for implementation of semi-automatic and automatic tools
  – The main objective of this task is to establish a robust methodology for forensic speaker recognition based on statistical and probabilistic methods
3. Validation studies and evaluation protocols for semi-automatic and automatic speaker recognition technology in a forensic environment – Proficiency tests
  – The main goal of this task is the assessment of the methodology developed and of the related speaker recognition techniques, and wide dissemination of the results of this project through proficiency tests
4. Comparative study of forensic speaker recognition performance of phonetic experts and automated methods – Collaborative exercises
  – The main goal of this task is the comparison of the inference of identity of source by phonetic experts with that of automated systems

Project outputs and deliverables
• The 5 specific deliverables are methodological documents to be published in electronic and printed form:
  – D1. Best Practice Manuals (Task 1)
  – D2. Methodological guidelines (Task 2)
  – D3. Assessment procedures and proficiency tests (Task 3)
  – D4. Collaborative exercises (Task 4)
  – D5. Final report (Tasks 1, 2, 3 and 4)
• The 3 specific outputs are dissemination events (seminars and workshop) open to the European forensic community:
  – O1. Opening Seminar (Tasks 1, 2, 3 and 4)
  – O2. Workshop (Tasks 1 and 2)
  – O3. Dissemination Seminar (Tasks 3 and 4)

Project specific activities
• Year 2013
  – Opening Seminar, 21-22 May 2013, Lausanne, 2 days (output O1)
  – Project team meeting, 23-24 May 2013, Lausanne, 2 days (drafting of deliverables D1 and D2)
  – Project team meeting, September 2013, Helsinki, 3 days (drafting of deliverables D1 and D2)
• Year 2014
  – Project team meeting, May 2014, Paris, 3 days (drafting of deliverables D3 and D4)
  – Project team meeting, September 2014, Wiesbaden, 2 days (finalization of deliverables D1 and D2)
  – Workshop, September 2014, Wiesbaden, 2 days (output O2)
• Year 2015
  – Project team meeting, April 2015, The Hague, 3 days (drafting of deliverables D3 and D4)
  – Project team meeting, September 2015, Warszawa, 2 days (finalization of deliverables D3 and D4)
  – Dissemination Seminar, September 2015, Warszawa, 2 days (output O3)
  – Editing final report (deliverable D5)

Best Practice Manual and Guidelines
1. Aims
2. Scope
3. Methodology
4. Case Assessment
5. Evaluative Reporting
6. Quality Assurance
7. References
8. Glossary

3. Methodology
• Methodology, commonalities and differences of FASR and FSASR
• Automatic and semi-automatic speaker recognition
• Pre-processing
• Features
• Modeling, scoring and further components
• System performance testing
• Database collection

4. Case Assessment
• Acceptance criteria for forensic audio material
• Application of methods in case work
• Information of requirements
• Expert intervention
• Recordings
• Prioritisation and sequence of examinations
• Practices applicable to Forensic ASR and SASR examinations
• Laboratory examinations
• Analysis in detail
• Analysis Protocols

5. Evaluative Reporting
• Evaluation
• Interpretation
• Reporting

6. Quality Assurance
• Personnel
• Competence requirements (Qualifications, competence and experience, Training and Assessment)
• Maintenance of competence
• Proficiency Testing (PT) and Collaborative Exercise (CE)
• Documentation
• Equipment
• Validation
• Accommodation
• Audit

Conclusions
• Statistical evaluation, and particularly Bayesian methods such as the calculation of likelihood ratios based on automatic (deterministic and statistical) pattern recognition methods, have been criticized, but they are the only demonstrably rational means of quantifying and evaluating the value of biometric evidence available at the moment.
• The data-driven methodology provides a coherent way of assessing and presenting the biometric evidence of a questioned recording.
• The future methods to be developed for the interpretation of voice as forensic evidence should combine the objectivity of automatic signal processing and pattern recognition with the methodological transparency solicited in forensic investigations.

Prevention of and Fight Against Crime
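
As a supplement to the evaluation slides, the Tippett plots named earlier as a measure of LR accuracy can be sketched numerically. The Python sketch below assumes two lists of likelihood ratios are already available, one from comparisons where H0 is known to be true and one where H1 is known to be true. For each threshold it reports the proportion of LRs greater than that threshold, which is one common way of drawing a Tippett curve. The function names and the LR values are hypothetical.

```python
def tippett_curve(lrs, thresholds):
    """Proportion of likelihood ratios strictly greater than each threshold."""
    n = len(lrs)
    return [sum(1 for lr in lrs if lr > t) / n for t in thresholds]


def misleading_rates(lrs_h0_true, lrs_h1_true):
    """Rates of misleading evidence at the reference line LR = 1.

    Returns (proportion of H0-true LRs below 1,
             proportion of H1-true LRs above 1).
    """
    below = sum(1 for lr in lrs_h0_true if lr < 1.0) / len(lrs_h0_true)
    above = sum(1 for lr in lrs_h1_true if lr > 1.0) / len(lrs_h1_true)
    return below, above


if __name__ == "__main__":
    # Hypothetical LR values, for illustration only.
    lrs_h0_true = [0.5, 2.0, 10.0, 4.0]
    lrs_h1_true = [0.1, 0.2, 3.0, 0.5]
    thresholds = [0.1, 1.0, 10.0]
    print("H0-true curve:", tippett_curve(lrs_h0_true, thresholds))
    print("H1-true curve:", tippett_curve(lrs_h1_true, thresholds))
    print("Misleading rates at LR = 1:",
          misleading_rates(lrs_h0_true, lrs_h1_true))
```

Under these assumptions, a well-performing system would show the two curves well separated on either side of LR = 1, with low rates of misleading evidence.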