Mark K. Hinders

Intelligent Feature Selection for Machine Learning Using the Dynamic Wavelet Fingerprint

Department of Applied Science, William & Mary, Williamsburg, VA, USA

ISBN 978-3-030-49394-3    ISBN 978-3-030-49395-0 (eBook)
https://doi.org/10.1007/978-3-030-49395-0

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2020

Squirrels are the bane of human existence.¹ Everyone accepts this. As with deer, many more squirrels exist today than when Columbus landed in the New World in 1492. Misguided, evil squirrel lovers provide these demons with everything they need to survive and thrive: food (bird feeders), water (birdbaths), and shelter (attics).

¹ Wisdom from "7 Things You Must Know About Stupid Squirrels" by Steve Bender, https://www.notestream.com/streams/56e0ec0c6e55e/. Also: "Tapped: A Treasonous Musical Comedy," a live taping at Theater Wit in Chicago, IL from Saturday, July 25, 2016. See https://www.youtube.com/watch?v=JNBd2csSkqg starting at 1:59.

Preface

Wavelet Fingerprints

Machine learning is the modern terminology for what we've always been trying to do, namely, make sense of very complex signals recorded by our instrumentation. Nondestructive Evaluation (NDE) is an interdisciplinary field of study concerned with the development of analysis techniques and measurement technologies for the quantitative characterization of materials, tissues, and structures by non-invasive means. Ultrasonic, radiographic, thermographic, electromagnetic, and optical methods are employed to probe interior microstructure and characterize subsurface features. Applications are in non-invasive medical diagnosis, intelligent robotics, security screening, and online manufacturing process control, as well as the traditional NDE areas of flaw detection, structural health monitoring, and materials characterization.
The focus of our work is to implement new and better measurements with both novel instrumentation and machine learning that automates the interpretation of the various (and multiple) imaging data streams. Twenty years ago, we were facing applications where we needed to automatically interpret very complicated ultrasound waveforms in near real time. We were also facing applications where we wanted to incorporate far more information than could be presented to a human expert in the form of images. Time-scale representations, such as the spectrogram, looked quite promising, but there's no reason to expect that a boxcar FFT would be optimal. We had been exploring non-Fourier methods for representing signals and imagery, e.g., Gaussian weighted Hermite polynomials, and became interested in wavelets.

The basic idea is that we start with a time-domain signal and then perform a time-scale or time–frequency transformation on it to get a two-dimensional representation, i.e., an image. Most researchers then immediately extracted a parameter from that image to collapse things back down to a 1D data stream. In tissue characterization with diagnostic ultrasound, one approach was to take a boxcar FFT of the A-scan line, which returned something about the spectrum as a function of anatomical depth. The slope, mid-point value, and/or intercept gave parameters that seemed to be useful to differentiate healthy from malignant tissues. B-scans are made up of a fan of A-scan lines, so one could use those spectral parameters to make parameter images. When Jidong Hou tried that approach using wavelets instead of windowed FFTs, he rendered them as black and white contour plots instead of false-color images. Since they often looked like fingerprints, we named them wavelet fingerprints.

Since 2002, we have applied this wavelet fingerprint approach to a wide variety of different applications and have found that it's naturally suited to machine learning. There are many different wavelet basis functions that can be used. There are adjustable parameters that allow for differing amounts of de-noising and richness of the resulting fingerprints. There are all manner of features that can be identified in the fingerprints, and innumerable ways that identified features can be quantified. In this book, we describe several applications of this approach. We also discuss stupid squirrels and flying saucers and tweetstorms. You may find some investment advice.

Williamsburg, Virginia
April 2020
Mark K. Hinders

Acknowledgements

MKH would especially like to thank his research mentors, the late Profs. Asim Yildiz and Guido Sandri, as well as their research mentors, Julian Schwinger and J. Robert Oppenheimer. Asim Yildiz (DEng, Yale) was already a Professor of Engineering at the University of New Hampshire when Prof. Schwinger at Harvard told him that he was "doing good physics" already so he should "get a union card." Schwinger meant that Yildiz should get a Ph.D. in theoretical physics with him at Harvard, which Yildiz did while still keeping his faculty position at UNH, mentoring his own engineering graduate students, running his own research program, etc. He also taught Schwinger to play tennis, having been a member of the Turkish national team as well as all the best tennis clubs in the Boston area.
When Prof. Yildiz died at age 58, his genial and irrepressibly jolly BU colleague took on his orphaned doctoral students, including yours truly, even though the students' research areas were all quite distant from his own. Prof. Sandri had done postdoctoral research with Oppenheimer at the Princeton Institute for Advanced Study and then was a senior scientist at Aeronautical Research Associates of Princeton for many years before spending a year in Milan at the Istituto di Matematica del Politecnico and then joining BU. In retirement, he helped found Wavelet Technologies, Inc. to exploit mathematical applications of novel wavelets in digital signal processing, image processing, data compression, and problems in convolution algebra.

Pretty much all of the work described in this book benefited from the talented efforts of Jonathan Stevens, who can build anything a graduate student or a startup might ever need, whether that's a mobile robot or an industrial scanner or an investigational device for medical ultrasound, or a system to stabilize and digitize decaying magnetic tape recordings.

The technical work described in this volume is drawn from the thesis and dissertation research of my students in the Applied Science Department at The College of William and Mary in Virginia, which was chartered in 1693. I joined the faculty of the College during its Tercentenary and co-founded the new Department of Applied Science at America's oldest University. Our campus abuts the restored eighteenth-century Colonial Williamsburg where W&M alum Thomas Jefferson and his compatriots founded a new nation. It's a beautiful setting to be pushing forward the boundaries of machine learning in order to solve problems in the real world.

Contents

1 Background and History
  1.1 Has Science Reached the End of Its Tether?
  1.2 Did the Computers Get Invited to the Company Picnic?
    1.2.1 But Who Invented Machine Learning, Anyway?
  1.3 You Call that Non-invasive?
  1.4 Why Is that Stupid Squirrel Named Steve?
  1.5 That's Promising, but What Else Could We Do?
    1.5.1 Short-Time Fourier Transform
    1.5.2 Other Methods of Time–Frequency Analysis
    1.5.3 Wavelets
  1.6 The Dynamic Wavelet Fingerprint
    1.6.1 Feature Creation
    1.6.2 Feature Extraction
    1.6.3 Edge Computing
  1.7 Will the Real Will West Please Step Forward?
    1.7.1 Where Are We Headed?
  References

2 Intelligent Structural Health Monitoring with Ultrasonic Lamb Waves
  2.1 Introduction to Lamb Waves
  2.2 Background
  2.3 Simulation Methods for SHM
  2.4 Signal Processing for Lamb Wave SHM
  2.5 Wavelet Transforms
    2.5.1 Wavelet Fingerprinting
  2.6 Machine Learning with Wavelet Fingerprints
  2.7 Applications in Structural Health Monitoring
    2.7.1 Dent and Surface Crack Detection in Aircraft Skins
    2.7.2 Corrosion Detection in Marine Structures
    2.7.3 Aluminum Sensitization in Marine Plate-Like Structures
    2.7.4 Guided Wave Results—Crack
  2.8 Summary
  References

3 Automatic Detection of Flaws in Recorded Music
  3.1 Introduction
  3.2 Digital Music Editing Tools
  3.3 Method and Analysis of a Non-localized Extra-Musical Event (Coughing)
  3.4 Errors Associated with Digital Audio Processing
  3.5 Automatic Detection of Flaws in Cylinder Recordings Using Wavelet Fingerprints
  3.6 Automatic Detection of Flaws in Digital Recordings
  3.7 Discussion
  References

4 Pocket Depth Determination with an Ultrasonographic Periodontal Probe
  4.1 Introduction
  4.2 Related Work
  4.3 Data Collection
  4.4 Feature Extraction
  4.5 Feature Selection
  4.6 Classification: A Binary Classification Algorithm
    4.6.1 Binary Classification Algorithm Examples
    4.6.2 Dimensionality Reduction
    4.6.3 Classifier Combination
  4.7 Results and Discussion
  4.8 Bland–Altman Statistical Analysis
  4.9 Conclusion
  Appendix
  References

5 Spectral Intermezzo: Spirit Security Systems
  References
6 Classification of Lamb Wave Tomographic Rays in Pipes to Distinguish Through Holes from Gouges
  6.1 Introduction
  6.2 Theory
  6.3 Method
    6.3.1 Apparatus
    6.3.2 Ray Path Selection
  6.4 Classification
    6.4.1 Feature Extraction
    6.4.2 DWFP
    6.4.3 Feature Selection
    6.4.4 Summary of Classification Variables
    6.4.5 Sampling
  6.5 Decision
  6.6 Results and Discussion
    6.6.1 Accuracy
    6.6.2 Flaw Detection Algorithm
  6.7 Conclusion
  References

7 Classification of RFID Tags with Wavelet Fingerprinting
  7.1 Introduction
  7.2 Classification Overview
  7.3 Materials and Methods
  7.4 EPC Extraction
  7.5 Feature Generation
    7.5.1 Dynamic Wavelet Fingerprint
    7.5.2 Wavelet Packet Decomposition
    7.5.3 Statistical Features
    7.5.4 Mellin Features
  7.6 Classifier Design
  7.7 Classifier Evaluation
  7.8 Results
    7.8.1 Frequency Comparison
    7.8.2 Orientation Comparison
    7.8.3 Different Day Comparison
    7.8.4 Damage Comparison
  7.9 Discussion
  7.10 Conclusion
  References

8 Pattern Classification for Interpreting Sensor Data from a Walking-Speed Robot
  8.1 Overview
  8.2 rMary
    8.2.1 Sensor Modalities for Mobile Robots
  8.3 Investigation of Sensor Modalities Using rMary
    8.3.1 Thermal Infrared (IR)
    8.3.2 Kinect
    8.3.3 Audio
    8.3.4 Radar
  8.4 Pattern Classification
    8.4.1 Compiling Data
    8.4.2 Aligning Reflected Signals
    8.4.3 Feature Creation with DWFP
    8.4.4 Intelligent Feature Selection
    8.4.5 Statistical Pattern Classification
  8.5 Results
    8.5.1 Proof-of-Concept: Acoustic Classification of Stationary Vehicles
    8.5.2 Acoustic Classification of Oncoming Vehicles
  8.6 Conclusions
  References

9 Cranks and Charlatans and Deepfakes
  9.1 Digital Cranks and Charlatans
  9.2 Once You Eliminate the Possible, Whatever Remains, No Matter How Probable, Is Fake News
  9.3 Foo Fighters Was Founded by Nirvana Drummer Dave Grohl After the Death of Grunge
  9.4 Digital Imaging Is Why Our Money Gets Redesigned so Often
  9.5 Social Media Had Sped up the Flow of Information
  9.6 Discovering Latent Topics in a Corpus of Tweets
    9.6.1 Document Embedding
    9.6.2 Topic Models
    9.6.3 Uncovering Topics in Tweets
    9.6.4 Analyzing a Tweetstorm
  9.7 DWFP for Account Analysis
  9.8 In-Game Sports Betting
  9.9 Virtual Financial Advisor Is Now Doable
  References
Chapter 1
Background and History

Mark K. Hinders

Abstract Machine learning is the modern lingo for what we've been trying to do for decades, namely, to make sense of the complex signals in radar and sonar and lidar and ultrasound and so forth. Deep learning is fashionable right now and those sorts of black-box approaches are effective if there is a sufficient volume and quality of training data. However, when we have appropriate physical and mathematical models of the underlying interaction of the radar, sonar, lidar, ultrasound, etc. with the materials, tissues, and/or structures of interest, it seems odd to not harness that hard-won knowledge. We explain the key issue of feature vector selection in terms of autonomously distinguishing rats from squirrels. Time–frequency analysis is introduced as a way to identify dynamic features of varmint behavior, and the dynamic wavelet fingerprint is explained as a tool to identify features from signals that may be useful for machine learning.

Keywords Machine learning · Pattern classification · Wavelet · Fingerprint · Spectrogram

1.1 Has Science Reached the End of Its Tether?

Radar blips are reflections of radio waves. Lidar does the same thing with lasers. Sonar uses pings of sound to locate targets. Autonomous vehicles use all of these, plus cameras, in order to identify driving lanes, obstructions, other vehicles, bicycles and pedestrians, deer and dachshunds, etc. Whales and dolphins echolocate with sonar. Bats echolocate with chirps of sound that are too high in frequency for us to hear, which is called ultrasound. Both medical imaging and industrial nondestructive testing have used kHz and MHz frequency ultrasound for decades. Infrasound comprises frequencies that are too low for us to hear, and it hasn't found a practical application yet. It all started with an unthinkable tragedy.

Not long after its maiden voyage in 1912, when the unsinkable Titanic struck an iceberg and sank [1], Sir Hiram Maxim self-published a short book and submitted a letter to Scientific American [2] entitled, "A New System for Preventing Collisions at Sea" in which he said:

The wreck of the Titanic was a severe and painful shock to us all; many of us lost friends and acquaintances by this dreadful catastrophe. I asked myself: "Has Science reached the end of its tether? Is there no possible means of avoiding such a deplorable loss of life and property? Thousands of ships have been lost by running ashore in a fog, hundreds by collisions with other ships or with icebergs, nearly all resulting in great loss of life and property." At the end of four hours it occurred to me that ships could be provided with what might be appropriately called a sixth sense, that would detect large objects in their immediate vicinity without the aid of a searchlight. Much has been said, first and last, by the unscientific of the advantages of a searchlight. Collisions as a rule take place in a fog, and a searchlight is worse than useless even in a light haze, because it illuminates the haze, making all objects beyond absolutely invisible.
Some have even suggested that a steam whistle or siren might be employed that would periodically give off an extremely powerful sound, that is, a veritable blast, an ear-piercing shriek, and then one is supposed to listen for an echo, it being assumed that if any object should be near, a small portion of the sound would be reflected back to the ship, but this plan when tried proved an absolute failure. The very powerful blast given off by the instrument is extremely painful to the ears and renders them incapable of hearing the very feeble echo which is supposed to occur only a few seconds later. Moreover, sirens or steam whistles of great power are extremely objectionable on board passenger ships; they annoy the passengers and render sleep impossible. It is, therefore, only too evident that nothing in the way of a light or noise producing apparatus could be of any use whatsoever.

Maxim knew that bats used some form of inaudible sound—outside the range of human hearing—to echolocate and feed, but he thought it was infrasound rather than ultrasound. He then proceeded to describe an extremely low-frequency directional steam whistle or siren that could be used to (echo)locate icebergs during foggy nights when collisions were most likely to occur. Whether his patented apparatus would have been effective at preventing collisions at sea is a question that's a little like whether Da Vinci's contraptions would have flown. He got the general idea right, and can be credited with stimulating the imaginations of those who subsequently worked out all the engineering details. His figure reproduced below is quite remarkable. The key idea is that the time delay of the echoes determines distance because the speed of sound is known, but more importantly the shape of the echoes gives information about the object which is returning those echoes. Analysis of those echo waveforms can, in principle, tell the difference between a ship and an iceberg, and also differentiate large and small icebergs. He even illustrates how clutter affects the echoes differently from backscattering targets. Science has not reached the end of its tether, even after a century of further development. This is exactly how radar and sonar and ultrasound work (Fig. 1.1).

Fig. 1.1 The infrasonic echo waves would be recorded by a stretched membrane that the infrasound waves would vibrate, and those membrane vibrations could jingle attached bells or wiggle pens tracing lines on paper as Maxim illustrated [3]. Maxim's concept was discussed in Nature, still a leading scientific journal today [4]

Maxim's suggested apparatus embodies a modified form of "siren," through which high-pressure steam can be made to flow in order to produce sound waves having about 14 to 15 vibrations per second, and consequently not coming within the frequency range of human hearing. These waves, it is correctly asserted, would be capable of traveling great distances, and if they struck against a body ahead of the ship they would be reflected back toward their source, "echo waves" being formed [3]. This self-published pamphlet was discussed in [4].

Well more than a century before Maxim, Lazzaro Spallanzani performed extensive experiments on bats, and concluded that bats possess some sort of sixth sense, in that they use their ears for detecting objects, if not actually seeing them [5]. His blinded bats got around just fine in either light or dark, and bats he deafened flew badly and hurtled themselves into objects in both dark and light.
The problem was that no technology existed until the early twentieth century that could measure the ultrasonic screeches bats were using to echolocate. Prof. G. W. Pierce in the Physics Department at Harvard happened to build such a device, and in 1938 an undergraduate zoology student, who was banding bats to study their migration patterns, asked if they could use that (ultra)sonic detector apparatus to listen to his bats. Young Donald Griffin brought a cage full of bats to Prof. Pierce's acoustics laboratory wherein humans first heard their down-converted "melody of raucous noises from the loudspeaker" of the device [6, 7]. Donald Griffin and a fellow student found that bats could use reflected sounds to detect objects (Fig. 1.2).

Fig. 1.2 Donald R. Griffin: 1915–2003 [8, 9]. Dr. Griffin coined the term echolocation to describe the phenomenon he continued to study for 65 years

What Griffin discovered about bats just before WWII, other scientists discovered about dolphins and related mammals just after WWII. Curiously, unequivocal demonstration of echolocation in dolphins wasn't accomplished until 1960. Nevertheless, the first underwater sonar apparatus was constructed in order to detect submarines in 1915, using the concepts sketched out by Maxim. By WWII sonar was an essential part of antisubmarine warfare, spurred on by Nazi U-boats sinking merchant ships transporting men and materiel to European allies [10].

Practical radar technology was ready just in time to turn the tide during the Battle of Britain, thanks to the cavity magnetron [11] and amateur scientist and Wall Street tycoon Alfred Lee Loomis, who personally funded significant scientific research at his private estate before leading radar research efforts during World War II [12, 13]. The atomic bomb may have ended the war, but radar and sonar won it.¹ During the entirety of the Cold War, uncounted billions have been spent continuing to refine radar and sonar technology, countermeasures, counter-countermeasures, etc., with much of that mathematically esoteric scientific work quite highly classified. The result is virtually undetectable submarines that patrol the world's oceans standing ready to assure mutual destruction as a deterrent to sneak attack. More visibly, but highly stealthy, radar-evading fighters and bombers carrying satellite-guided precision weapons can destroy any fixed target anywhere on the planet with impunity while minimizing collateral damage. It's both comforting and horrifying at the same time.

¹ Time magazine was all set to do a cover story about the importance of radar technology in the impending Allied victory, but the story got bumped off the August 20, 1945 cover by the A-bombs dropped on Japan and the end of WWII: "The U.S. has spent half again as much (nearly $3 billion) on radar as on atomic bombs. As a military threat, either in combination with atomic explosives or as a countermeasure, radar is probably as important as atomic power itself. And while the peacetime potentialities of atomic power are still only a hope, radar already is a vast going concern—a $2 billion-a-year industry, six times as big as the whole prewar radio business."

What humans have been trying to figure out since Maxim's 1912 pamphlet is how best to interpret the radar blips and sonar pings and lidar reflections and ultrasound images [14].
The fundamental issue is that the shape, size, orientation, and composition of the object determine the character of the scattered signal, so an enormous amount of mental effort has gone into mathematical modeling and computer simulation to try to understand enough about that exceedingly complex physics in order to detect navigation hazards, enemy aircraft and submarines, tumors, structural flaws, etc. Much of that work has been in what mathematical physicists call forward scattering: I know what I transmit, and I know the size & shape & materials & location & orientation of the scattering object. I then want to predict the scattered field so I'll know what to look for in my data. The real problem is mathematically much more difficult, called inverse scattering, wherein I know what I transmit, and I measure some of the scattered field. I then want to estimate the size & shape & materials & location & orientation of the scatterer. Many scientific generations have been spent trying to solve inverse scattering problems in radar and sonar and ultrasound. In some special cases, or with perhaps a few too many simplifying assumptions, there has been a fair amount of success [15].

1.2 Did the Computers Get Invited to the Company Picnic?

Once upon a time "computer" was a job description [16, 17]. They were excellent jobs for mathematically talented women at a time when it was common for young women to be prevented from taking math classes in school. My mother describes being jealous of her high school friend who was doing math homework at lunch, because she wasn't allowed to take math. As near as I can tell it wasn't her widower father who said this, but it was instead her shop-keeper/school-teacher aunt who was raising her. This was the 1950s, when gender roles had regressed to traditional forms after the war. Recently, a mathematically talented undergraduate research student in my lab was amused that her MATLAB code instantly reproduced plots from a 1960 paper [18] that had required "the use of two computers for two months," until I suggested that she check the acknowledgements, where the author thanked those two computers by name. That undergraduate went on to earn a degree with honors—double majoring in physics and anthropology, with a minor in math and research in medical ultrasound—and then a PhD in medical imaging. My mother became a teacher because her freshman advisor scoffed when she said she wanted to be a freelance writer and a church organist: you're a woman, you'll be either a teacher or a nurse. She started teaching full time with an associate degree and earned BA, MA, and EdD while working full time and raising children. I remember her struggles with inferential statistics and punch cards for analyzing her dissertation data on an electronic computer (Fig. 1.3).

Fig. 1.3 Computers hard at work in the NACA Dryden High Speed Flight Station "Computer Room" in 1949 [19]. Seen here, left side, front to back, Mary (Tut) Hedgepeth, John Mayer, and Emily Stephens. Right side, front to back, Lilly Ann Bajus, Roxanah Yancey, Gertrude (Trudy) Valentine (behind Roxanah), and Ilene Alexander

Hedy Lamarr was the face of Snow White and Catwoman. In prewar Vienna she was the bored arm-candy wife of an armaments manufacturer. In Hollywood, she worked 6 days a week under contract to MGM, but kept equipment in her trailer for inventing between takes. German U-boats were devastating Atlantic shipping, and Hedy thought a radio-controlled torpedo might help tip the balance.
She came up with a brilliant, profoundly original, jamming-proof way to guide a torpedo to the target: frequency hopping. The Philco Magic Box remote control [20] probably inspired the whole thing. The American film composer George Antheil said, "All she wants to do is stay home and invent things." He knew how to synchronize player pianos, and thought that pairs of player-piano rolls could synchronize the communications of a ship and its torpedo, on 88 different frequencies of course. They donated their patented invention to the Navy, who said, "What do you want to do, put a player piano in a torpedo? Get out of here." The patent was labeled top secret and the idea hidden away until after the war. They also seized her patent because she was an alien. The Navy just wanted Hedy Lamarr to entertain the troops and sell war bonds. She sold $25M worth of them. That could have paid for quite a few frequency-hopping torpedoes (Fig. 1.4).

Fig. 1.4 Hedy Lamarr, bombshell and inventor [21]. You may be surprised to know that the most beautiful woman in the world made your smartphone possible

Computers have been getting smaller and more portable since their very beginning. First, they were well-paid humans in big rooms, then they were expensive machines in big rooms, then they got small enough and cheap enough to sit on our desktops, then they sat on our laps and we got used to toting them pretty much everywhere, and then they were in our pockets and a teenager's semi-disposable supercomputer comes free with an expensive data plan. Now high-end computer processing is everywhere. We might not even realize we're wearing a highly capable computer until it prompts us to get up and take some more steps. We talk to computers sitting on an end table and they understand, just like in Star Trek. In real life, the artificially intelligent machines learn about us and make helpful suggestions on what to watch or buy next and where to turn off just up ahead.

With the advent of machine learning we can take a new approach to making sense of radar and sonar and ultrasound. We simply have to agree to stipulate that the physics of that scattering is too complex to formally solve the sorts of inverse problems we are faced with, so we accept the more modest challenge of using forward scattering models to help define features to use in pattern classification. Intelligent feature selection is the current incarnation of the challenge that was laid down by Maxim. In this paradigm, we use advanced signal processing methods to proliferate candidate "fingerprints" of the scattering object in the time traces we record. Machine learning then formally sorts out which combinations of those signal features are most useful for object classification.

1.2.1 But Who Invented Machine Learning, Anyway?

Google [22, 23] tells me that machine learning is a subset of artificial intelligence where computers autonomously learn from data; they don't have to be programmed by humans but can change and improve their algorithms by themselves. Today, machine learning algorithms enable computers to mine unstructured text, autonomously drive cars, monitor customer service calls, and find me that thing I didn't even know I needed to buy or watch. An enormous amount of data is the fuel these algorithms need, surpassing the limitations of calculation that existed prior to ubiquitous computing (Fig. 1.5).

Fig. 1.5 A mechanical man has real intelligence if he can fool a redditor into believing he is also human [24]
Machine learning probably has its roots at the dawn of electronic computation, when humans were encoding their own algorithmic processes into machines. Alan Turing [25] described the "Turing Test" in 1950 to determine if a computer has real intelligence by fooling a human into believing it is also human. Also in the 1950s, Arthur Samuel [26] wrote the first machine learning program to play the game of checkers; as it played more, it noted which moves made up winning strategies and incorporated those moves into its program. Frank Rosenblatt [27] designed the first artificial neural network, the perceptron, which he intended to simulate the thought processes of the human brain. Marvin Minsky [28] convinced humans to start using the term Artificial Intelligence. In the 1980s, Gerald Dejong [29] introduced explanation-based learning, where a computer analyzes training data and creates a general rule it can follow by discarding unimportant data, and Terry Sejnowski [30] invented NetTalk, which learns to pronounce words the same way a baby does. 1980s-style expert systems are based on rules. These were rapidly adopted by the corporate sector, generating new interest in machine learning.

Work on machine learning then shifted from a knowledge-driven approach to a data-driven approach in the 1990s. Scientists began creating programs for computers to analyze large amounts of data and learn from the results. Computers started becoming faster at processing data, with computational speeds increasing 1,000-fold over a decade. Neural networks began to be fast enough to take advantage of their ability to continue to improve as more training data is added. In recent years, computers have beaten learned humans at chess (1997) and Jeopardy (2011) and Go (2016). Many businesses are moving their companies toward incorporating machine learning into their processes, products, and services in order to gain an edge over their competition. In 2015, the non-profit organization OpenAI was launched with a billion dollars and the objective of ensuring that artificial intelligence has a positive impact on humanity.

Deep learning [31] is a branch of machine learning that employs algorithms to process data and imitate the thinking process of carbon-based life forms. It uses layers of algorithms to process data, understand speech, visually recognize objects, etc. Information is passed through each layer, with the output of the previous layer providing input for the next layer. Feature extraction is a key aspect of deep learning, which uses an algorithm to automatically construct meaningful "features" of the data for purposes of training, learning, and understanding. The data scientist had heretofore been solely responsible for feature extraction, especially in cases where an understanding of the underlying physical process aids in identifying relevant features. In applications like radar and sonar and lidar and diagnostic ultrasonography, the hard-won insight of humans and their equation-based models is now working in partnership with machine learning algorithms. It's not just the understanding of the physics the humans bring to the party, it's an understanding of the context.

1.3 You Call that Non-invasive?

It almost goes without saying that continued exponential growth in the price performance of information technology has enabled the gathering of data on an unprecedented scale.
This is especially true in the healthcare industry. More than ever before, data concerning people's lifestyles, data on medical care, diseases and treatments, and data about the health systems themselves are available. However, there is concern that this data is not being used as effectively as possible to improve the quality and efficiency of care. Machine learning has the potential to transform healthcare by deriving new and important insights from the vast amount of data generated during the delivery of healthcare [32]. There is a growing awareness that only a fraction of the enormous amount of data available to make healthcare decisions is being used effectively to improve the quality and efficiency of care and to help people take control of their own health [33].

Deep learning offers considerable promise for medical diagnostics [34] and digital pathology [35–45]. The emergence of convolutional neural networks in computer vision produced a shift from hand-designed feature extractors to automatically generated feature extractors trained with backpropagation [46]. Typically, substantial expertise is needed to implement machine learning, but automated deep learning software is becoming available for use by naive users [47], sometimes resulting in coin-flip results. It's not necessarily that there's anything wrong with deep learning approaches; it's typically that there isn't anywhere near enough training data. If your black-box training dataset is insufficient, your classifier isn't learning, it is merely memorizing your training data. Naive users will often underestimate the amount of training data that's going to be needed by a few (or several) orders of magnitude and then not test their classifier(s) on different enough datasets. We are now just beginning to see AI systems that outperform radiologists on clinically relevant tasks such as breast cancer identification in mammograms [48], but there is a lot of angst [49, 50] (Fig. 1.6).

Fig. 1.6 A server room in a modern hospital [51]

Many of the things that can now be done with machine learning have been talked about for 20 years or more. An example is a patented system for diagnosis and treatment of prostate cancer [52] that we worked on in the late 1990s. A talented NASA gadgeteer and his business partner conceived of a better way to do ultrasound-guided biopsy of the prostate (US Patent 6,824,516 B2) and then arranged for a few million dollars of Congressional earmarks to make it a reality. In modern language, it was all about machine learning, but we didn't know to call it that back then. Of course, training data has always been needed to do machine learning; in this case, it was to be acquired in partnership with Walter Reed Army Medical Center in Washington, DC.

The early stages of a prostate tumor often go undetected. Generally, the impetus for a visit to the urologist is when the tumor has grown large enough to mimic some of the symptoms of benign prostatic hyperplasia (BPH), which is a bit like being constipated, but for #1 not #2. Two commonly used methods for detecting prostate cancer have been available to clinicians. Digital rectal examination (DRE) has been used for years as a screening test, but its ability to detect prostate cancer is limited. Small tumors often form in portions of the prostate that cannot be reached by a DRE. By the time the tumor is large enough to be felt by the doctor through the rectal wall, it is typically a half inch or larger in diameter.
Considering that the entire prostate is normally on the order of one and a half inches in diameter, this cannot be considered early detection. Clinicians may also have difficulty distinguishing between benign abnormalities and prostate cancer, and the interpretation and results of the examination may vary with the experience of the examiner. The prostate-specific antigen (PSA) is an enzyme measured in the blood that may rise naturally as men age. It also rises in the presence of prostate abnormalities. However, the PSA test cannot distinguish prostate cancer from benign growth of the prostate and other conditions of the prostate, such as prostatitis. If the patient has a blood test that shows an elevated PSA level, then that can be an indicator, but the relationship between PSA level and tumor size is not definitive, nor does it give any indication of tumor location. PSA testing also fails to detect some prostate cancers—about 20% of patients with biopsy-proven prostate cancer have PSA levels within normal range. Transrectal ultrasound (TRUS) is sometimes used as a corroborating technique; however, the images produced can be ambiguous. In the 1990s, there were transrectal ultrasound scanners in use, which attempted to image the prostate through the rectal wall. These systems did not produce very good results [53]. Both PSA and TRUS enhance detection when added to DRE screening, but they are known to have relatively high false positive rates and they may identify a greater number of medically insignificant tumors. Thus, PSA screening might lead to treatment of unproven benefit, which then could result in impotence and incontinence. It's part of what we currently call the overdiagnosis crisis.

When cancer is suspected because of an elevated PSA level or an anomalous digital rectal exam, a fan pattern of biopsies is frequently taken in an attempt to locate a tumor and determine if a cancerous condition exists. Because of the crudeness of these techniques, the doctor has only a limited control over the placement of the biopsy needle, particularly in the accuracy of the depth of penetration. Unless the tumor is quite large, the chance of hitting it with the biopsy needle is not good. The number of false negatives is extremely high. In the late 1990s, it was estimated that 80% of all men would be affected by prostate problems with advancing age [54, 55]. Prostate cancer was then one of the top three killers of men from cancer, and diagnosis only occurred in the later stages, when successful treatment probabilities are significantly reduced and negative side effects from treatment are high. The need was critical for accurate, consistently reliable, early detection and analysis of prostate cancer. Ultrasound has the advantage that it can be used to screen for a host of problems before sending patients to more expensive and invasive tests and procedures that ultimately provide the necessary diagnostic precision.

Fig. 1.7 Prostate cancer detection and mapping system that "provides the accuracy, reliability, and precision so additional testing is not necessary. No other ultrasound system can provide the necessary precision." The controlling computer displays the data in the form of computer images (presented as level of suspicion (LOS) mapping) to the operating physician as well as archiving all records. Follow-on visits to the urologist for periodic screenings will use this data, together with new examination data, to determine changes in the patient's prostate condition. This is accomplished by a computer software routine. In this way, the physician has at his fingertips all of the information to make an accurate diagnosis and to choose the optimum protocol for treatment

The new system (Fig. 1.7) was designed consisting of three complementary subsystems: (1) transurethral scanning stem (TUScan), (2) transrectal scanning, and (3) slaved biopsy system, referred to together as TRSB. One of the limitations of diagnostic ultrasound is that while higher frequencies give better resolution, they also have less depth of penetration into body tissue. By placing one scanning system within the area of interest and adding a second system at the back of the area of interest (Fig. 1.8), the necessary depth of penetration is halved for each system. This permits the effective use of higher frequencies for better resolution.

Fig. 1.8 The ultrasound scan of the prostate from within the urethra is done in lockstep with the complementary, overlapping ultrasound scan from the TRSB probe which has been placed by the doctor within the rectum of the patient. The two ultrasound scanning systems face each other and scan the same volume of prostate tissue from both sides. This arrangement offers a number of distinct advantages over current systems
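The arithmetic behind that frequency/penetration trade is easy to sketch. The following is a back-of-the-envelope illustration, not a specification from the patented system; the attenuation coefficient of roughly 0.5 dB/cm/MHz is a textbook ballpark for soft tissue, assumed here only for illustration.

```python
# Hedged illustration: pulse-echo attenuation in soft tissue, assuming a
# typical attenuation coefficient of ~0.5 dB/cm/MHz (a textbook ballpark,
# not a value from the TRSB system itself).

def round_trip_loss_db(freq_mhz, depth_cm, alpha=0.5):
    """Total loss for an echo: out to the reflector and back again."""
    return 2 * alpha * freq_mhz * depth_cm

# Scanning the whole prostate from one side: ~4 cm path at 5 MHz.
print(round_trip_loss_db(5.0, 4.0))   # 20.0 dB

# Scanning from both sides halves each path to ~2 cm, so the frequency
# can double to 10 MHz (finer resolution) for the same 20 dB loss.
print(round_trip_loss_db(10.0, 2.0))  # 20.0 dB
```

Under this assumption, halving each system's path length lets each scanner roughly double its frequency for the same total loss, which is the rationale for scanning the prostate from both sides.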
The design goal was to develop an "expert" system that uses custom software to integrate and analyze data from several advanced sensor technologies, something that is beyond the capacity of human performance. In so doing, such a system could enable detection, analysis, mapping, and confirmation of tumors as small as one-fourth the size of those detected by traditional methods.

Fig. 1.9 The advanced prostate imaging and mapping system serves as an integrated platform for diagnostics and treatments of the prostate that integrates a number of ultrasound technologies and techniques with "expert system" software to provide interpretations, probability assessment, and level of suspicion mapping for guidance to the urologist

That's all very nice, of course, but the funding stream ended before clinical data was acquired to develop the software expert system. Machine learning isn't magic; training data is always needed. Note in Fig. 1.9 that in the 1990s we envisioned incorporating medical history into the software expert system. That became possible after the 2009 Stimulus Package included $40 billion in funding for healthcare IT [56] to "make sure that every doctor's office and hospital in this country is using cutting-edge technology and electronic medical records so that we can cut red tape, prevent medical mistakes, and help save billions of dollars each year." Real-time analysis, via machine learning, of conversations between humans is now a routine part of customer service (Fig. 1.10).

1.4 Why Is that Stupid Squirrel Named Steve?

Context matters. Machine learning is useful precisely because it helps us to draw upon a wealth of data to make some determination. Often that determination is easy for humans to make, but we want to off-load the task onto an electronic computer. Here's a simple example: squirrel or rat? Easy, but now try to describe what it is about squirrels and rats that allows you to tell one from another.
The context is that rats are nasty and rate a call to the exterminator, while squirrels are harmless and even a little cute, unless they're having squirrel babies in my attic, in which case they rate a call to a roofer to install aluminum fascias. The standard machine learning approach to distinguishing rats from squirrels is to train the system with a huge library of images that have been previously identified as rat or squirrel. But what if you wanted to articulate specific qualities that distinguish rats from squirrels? We're going to need to select some features.

Fig. 1.10 The heart of the system is software artificial intelligence that develops a level of suspicion (LOS) map for cancer throughout the prostate volume. Rather than depending on a human to search for suspect features in the images, the multiple ultrasonic, 4D Doppler, and elastography datasets are processed in the computer to make that preliminary determination. A false color LOS map is then presented to the physician for selection of biopsy at those areas which are likely cancer. The use of these two complementary ultrasound scanners within the prostate and the adjacent rectum provides all of the necessary information to the system computer to enable controllable, precise placement of the integral slaved biopsy needle at any selected point within the volume of the prostate that the doctor wishes. Current techniques have poor control over the depth of penetration, and only a limited control over the angle of entry of the biopsy needle. With current biopsy techniques, multiple fans of 6 or more biopsies are typically taken and false negatives are common

Fig. 1.11 It's easy to tell the difference between a rat and a squirrel

Fig. 1.12 Only squirrels climb trees. I know this because I googled "rats in trees" and got no hits

Fig. 1.13 Squirrels always have bushy tails. Yes, I'm sure those are squirrels

Fig. 1.14 Rats eat whatever they find in the street. Yes, that's the famous pizza rat [57]

Fig. 1.15 Rats are mean ole fatties. Squirrels are happy and healthy

Here are some preliminary assessments based on the images in Figs. 1.11, 1.12, 1.13, 1.14, and 1.15:

• If squirrels happen to have tails, then bushiness might work pretty well, because rat tails are never bushy.
• Ears, eyes, and heads look pretty similar, at least to my eye.
• Body fur may or may not be very helpful.
• Feet and whiskers look pretty similar.
• Squirrels are more likely to be up in trees, while rats are more likely to be down in a gutter.
• Both eat junk food, but squirrels especially like nuts.
• Rats are nasty and they scurry, whereas squirrels are cute and they scamper.

We can begin to quantify these qualities by drawing some feature spaces. For example, any given rat or squirrel can be characterized by whether it's up in a tree or down in the gutter or somewhere in between, to give an altitude number. The bushiness of the tail can be quantified as well (Fig. 1.16). Except for unfortunate tailless squirrels, most of them will score quite highly in bushiness, while rats will have a very low bushiness score unless they just happen to be standing in front of something that looks like a bushy tail. Since we've agreed that both rats and squirrels eat pizza, but squirrels have a strong preference for nuts, maybe plotting nuttiness versus bushiness (Fig. 1.17) would work, as sketched in code below.

Fig. 1.16 Feature space showing the distribution of rats and squirrels quantified by their altitude and bushiness of their tails

Fig. 1.17 Feature space showing the distribution of rats and squirrels quantified instead by their nuttiness and bushiness of their tails
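To make the feature-space cartoon concrete, here is a minimal sketch in Python; the (bushiness, nuttiness) scores are invented for illustration, not measured from any actual varmints.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)

# Invented feature scores on a 0-1 scale: columns are (bushiness, nuttiness).
squirrels = rng.normal(loc=[0.8, 0.9], scale=0.1, size=(50, 2))
rats = rng.normal(loc=[0.1, 0.4], scale=0.1, size=(50, 2))

plt.scatter(squirrels[:, 0], squirrels[:, 1], marker="^", label="squirrels")
plt.scatter(rats[:, 0], rats[:, 1], marker="o", label="rats")
plt.xlabel("tail bushiness")
plt.ylabel("nuttiness")
plt.legend()
plt.show()  # two clusters, one per varmint, as in Fig. 1.17
```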
But recall that squirrels scamper and rats scurry. If we could quantify those, my guess is that would give well-separated classes, like Fig. 1.18.

Fig. 1.18 An ideal feature space gives tight, well-separated clusters, but note that there always seems to be a squirrel or two mixed in with the rats and vice versa

The next step in pattern classification is to draw a decision boundary that divides the phase space into the regions with the squirrels and the rats, respectively. Sometimes this is easy (Fig. 1.19), but it will never be perfect. Once this step is done, any new image to be classified gets a (scamper, scurry) score which defines a point in the phase space relative to the decision boundary. Sometimes drawing the decision boundary is tricky (Fig. 1.20), and so the idea is to draw it as best you can to separate the classes. It doesn't have to be a straight line, of course, but it's important not to overfit things to the data that you happen to be using to train your classifier (Fig. 1.21).

Fig. 1.19 Easy decision boundary

Fig. 1.20 Trickier decision boundary

Fig. 1.21 Complicated decision boundary

These phase spaces are all 2D, but there's no apparent reason to limit things to two dimensions. In 3D, the decision boundary line becomes a plane or some more complicated surface. If we consider more features, then the decision boundary will be a hypersurface that we can't draw, but that we can define mathematically. Maybe then we define a feature vector consisting of (altitude, bushiness, nuttiness, scamper, scurry). We could even go back and consider a bunch of other characteristics of rats and squirrels and add them. Don't go there. Too many features will make things worse. Depending upon the number of classes you want to differentiate, there will be an optimal size feature vector, and if you add more features the classifier performance will degrade. This is another place where human insight comes into play (Fig. 1.22). A code sketch of fitting a decision boundary, and of letting the data vote on which few features to keep, follows below.

Fig. 1.22 The curse of dimensionality is surprising and disappointing to everybody who does pattern classification [58]
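Here is a minimal sketch of both steps under the same invented-data assumption as above: a linear classifier supplies the decision boundary, and a standard wrapper method (scikit-learn's SequentialFeatureSelector) picks a two-feature subset of the five-element (altitude, bushiness, nuttiness, scamper, scurry) vector rather than using all of them. This is one of many ways to do feature selection, not the book's DWFP method itself.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.feature_selection import SequentialFeatureSelector

rng = np.random.default_rng(1)
n = 200

# Invented five-element feature vectors:
# (altitude, bushiness, nuttiness, scamper, scurry)
squirrels = rng.normal([0.7, 0.8, 0.9, 0.8, 0.2], 0.15, size=(n, 5))
rats = rng.normal([0.1, 0.1, 0.4, 0.2, 0.8], 0.15, size=(n, 5))
X = np.vstack([squirrels, rats])
y = np.array([1] * n + [0] * n)  # 1 = squirrel, 0 = rat

clf = LogisticRegression()

# Cross-validated forward selection: keep the two most useful features
# instead of all five, to sidestep the curse of dimensionality.
selector = SequentialFeatureSelector(clf, n_features_to_select=2)
selector.fit(X, y)
print(selector.get_support())  # boolean mask over the five columns

# Fit on the reduced feature space; the weights define a linear
# decision boundary (a line in 2D, a hyperplane in more dimensions).
clf.fit(selector.transform(X), y)
print(clf.coef_, clf.intercept_)
```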
If we also calculate a tail bushiness metric for each frame of the video clip, we could form an average bushiness in order to avoid outliers due to some extraneous bushiness from a shrubbery showing up occasionally in an image. We then plot these as (CG Stability, Average Bushiness) as sketched in Fig. 1.24, in case you haven't been following along with my phase space cartoons decorated with rat-and-squirrel clipart.

Fig. 1.23 Motion of the center of gravity is the key quantity. The blue dashed line shows the center of gravity while sitting. The green line shows the center of gravity lowered during a scurry. The red line shows the up and down of the center of gravity while scampering

Fig. 1.24 Use multiple frames from video to form new, more sophisticated feature vectors while avoiding the curse of dimensionality

1.5 That's Promising, but What Else Could We Do?

We might want to use the acoustic part of the video, since it's going to be available anyway. We know that
• Rats squeak but squirrels squawk.
• Rats giggle at ultrasound frequencies if tickled.
• Squirrel vocalizations are kuks, quaas, and moans.
• If frequencies are different enough, then an FFT should provide feature(s).
• Time-scale representations will allow us to visualize the time-varying frequency content of complex sounds.

Many measurements acquired from physical systems are one-dimensional time-domain signals, commonly representing amplitude as a function of time. In many cases, useful information can be extracted from the signal directly. Using the waveform of an audio recording as an example, the total volume of the recording at any point in time is simply the amplitude of the signal at that time point. More in-depth analysis of the signal could show that regular, sharp, high-amplitude peaks are drum hits, while broader peaks are sustained organ notes. Amplitude, peak sharpness, and peak spacing are all examples of features that can be used to identify particular events occurring in the larger signal. As signals become more complicated, such as an audio recording featuring an entire orchestra as compared to a single instrument, or with added noise, it becomes more difficult to identify particular features in the waveform and correlate them to physical events. Features that were previously used to differentiate signals then no longer do so reliably.

One of the most useful, and most common, transformations we can make on a time-domain signal is the conversion to a frequency-domain spectrum. For a real signal f(t), this is accomplished with the Fourier transform

F(\omega) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} f(t)\, e^{-i\omega t}\, dt.   (1.1)

The resultant signal F(\omega) is in the frequency domain, with angular frequency \omega related to the natural frequency \xi (with units of cycles per second) by \omega = 2\pi\xi. An inverse Fourier transform will transform this signal back to the time domain. Since this is the symmetric formulation of the transform, the inverse transform can be written as

f(t) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} F(\omega)\, e^{i\omega t}\, d\omega.   (1.2)
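As a quick numerical illustration of (1.1), the discrete transform of a sampled 3 kHz tone peaks at 3 kHz. This is a sketch using NumPy's FFT rather than the integral itself:

import numpy as np

fs = 44100.0                          # sampling rate in Hz
t = np.arange(0, 1.0, 1.0 / fs)       # one second of samples
x = np.sin(2 * np.pi * 3000.0 * t)    # a 3 kHz tone

spectrum = np.fft.rfft(x)
freqs = np.fft.rfftfreq(t.size, d=1.0 / fs)
print(freqs[np.argmax(np.abs(spectrum))])   # prints 3000.0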
Since the Fourier transform is just an extension of the Fourier series, looking at this series is the best way to understand what actually happens in the Fourier transform. The Fourier series, discovered in 1807, decomposes any periodic signal into a sum of sines and cosines. This series can be expressed as the infinite sum

f(t) = \frac{a_0}{2} + \sum_{n=1}^{\infty} \left[ a_n \cos(nt) + b_n \sin(nt) \right],   (1.3)

where the a_n and b_n are the Fourier coefficients. By finding the values of these coefficients that best describe the original signal, we are describing the signal in terms of some new basis functions: sines and cosines. The relation to the complex exponential given in the Fourier transform comes from Euler's formula, e^{2\pi i\theta} = \cos 2\pi\theta + i \sin 2\pi\theta. In general, any continuous signal can be represented by a linear combination of orthonormal basis functions (specifically, the basis functions must define a Hilbert space). Sines and cosines fulfill this requirement and, because of their direct relevance to describing wave propagation, provide a physically relatable explanation for what exactly the decomposition does: it describes the frequency content of a signal.

In practice, since real-world signals are sampled from a continuous measurement, calculation of the Fourier transform is accomplished using a discrete Fourier transform. A number of stable, fast algorithms exist and are staples of any numerical signal processing software. As long as the Nyquist–Shannon sampling theorem is respected (the sampling rate f_s must be at least twice the maximum frequency content present in the signal), no information about the original signal is lost.

1.5.1 Short-Time Fourier Transform

While the Fourier transform allows us to determine the frequency content of a signal, all time-domain information is lost in the transformation. The spectrum of the audio recording tells us which frequencies are present, but not when those notes were being played. The simple solution to this problem is to look at the Fourier transform over a series of short windows along the length of the signal. This is called the short-time Fourier transform (STFT), and is implemented as

STFT\{f(t)\}(\tau, \omega) \equiv F(\tau, \omega) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} f(t)\, \bar{w}(t-\tau)\, e^{-i\omega t}\, dt,   (1.4)

where \bar{w}(t-\tau) is a windowing function that is nonzero for only a short time, typically a Hann window, described in the discrete domain by \bar{w}(n) = \sin^2\left( \frac{\pi n}{N-1} \right). Since this is an invertible process, it is possible to recreate the original signal using an inverse transform, but windowing of the signal makes inversion more difficult.

Taking the squared magnitude of the STFT (|F(\tau, \omega)|^2) and displaying the result as a color-mapped image with frequency on the vertical axis and time on the horizontal axis shows the evolution of the frequency spectrum as a function of time. These plots are often referred to as spectrograms, an example of which is shown in Fig. 1.25. It is important to note that this transformation from the one-dimensional time domain to a joint time–frequency domain creates a two-dimensional representation of the signal. Adding a dimension to the problem gives us more information about our signal at the expense of more difficult analysis.

Fig. 1.25 The spectrogram (bottom) of the William and Mary Alma Mater, performed by the William and Mary Chorus, provides information about the frequency content of the signal not present in the time-domain waveform (top)

The more serious limitation of the STFT comes from the uncertainty principle known as the Gabor limit,

\Delta t \, \Delta \omega \ge \frac{1}{2},   (1.5)

which says that a function cannot be both time and band limited. It is impossible to simultaneously localize a function in both the time domain and the frequency domain, which leads to resolution issues for the STFT. A short window will provide precise temporal resolution and poor frequency resolution, while a wide window has the exact opposite effect.
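The tradeoff in (1.5) is easy to see numerically. In the sketch below (SciPy assumed), a short window gives many time bins and few frequency bins, and a wide window gives the reverse; neither can be sharp in both:

import numpy as np
from scipy.signal import stft

fs = 8000.0
t = np.arange(0, 2.0, 1.0 / fs)
x = np.sin(2 * np.pi * 440.0 * t)     # a sustained note
x[8000:8080] += 5.0                   # plus a brief click at t = 1 s

for nperseg in (64, 1024):            # short window, then wide window
    f, tau, Z = stft(x, fs=fs, nperseg=nperseg)
    print(nperseg, 'window:', tau.size, 'time bins x', f.size, 'freq bins')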
1.5.2 Other Methods of Time–Frequency Analysis

The development of quantum mechanics in the twentieth century ushered in a number of alternative time–frequency representations, because the mathematics are similar in the position–momentum and time–frequency domains. One of these is the Wigner–Ville distribution, introduced in 1932, which maps the quantum mechanical wave function to a probability distribution in phase space. In 1948, Ville wrote a time–frequency formulation,

W(\tau, \omega) = \int_{-\infty}^{\infty} f\left(\tau + \frac{t}{2}\right) f^{*}\left(\tau - \frac{t}{2}\right) e^{-i\omega t}\, dt,   (1.6)

where f^{*}(t) is the complex conjugate of f(t). This can be thought of as the Fourier transform of the autocorrelation of the original signal f(t), but because it is not a linear transform, cross-terms occur when the input signal is not monochromatic. Gabor also tried to improve the resolution issues with the STFT by introducing the transform

G(\tau, \omega) = \int_{-\infty}^{\infty} e^{-\pi(t-\tau)^2} e^{-i\omega t} f(t)\, dt,   (1.7)

which is basically the STFT with a Gaussian window function. Like the STFT, this is a linear transformation and there is no problem with cross-terms. By combining the Wigner–Ville and Gabor transforms, we can mitigate the effects of the cross-terms and improve the resolution of the time–frequency representation. One possible representation of the Gabor–Wigner transform is

D(\tau, \omega) = G(\tau, \omega) \times W(\tau, \omega).   (1.8)

The spectrogram (STFT) is rarely the optimal time–frequency representation, but there are others such as the Wigner (Fig. 1.26) and positive transforms (Fig. 1.27). We can also use wavelet transforms to form analogous time-scale representations. There are many mother wavelets to choose from.

Fig. 1.26 Time-domain waveform (bottom) and its power spectrum (rotated, left) along with the Wigner transform as a false-color time–frequency image

Fig. 1.27 Time-domain waveform (bottom) and its power spectrum (rotated, left) along with the positive transform as a false-color time–frequency image

1.5.3 Wavelets

The overarching issue with any of the time–frequency methods is that the basis of the Fourier transform is chosen with the assumption that the signals to be analyzed are periodic or infinite in time. Most real-world signals are not periodic but change character over time. This problem becomes even more clear when looking at finite signals with sharp discontinuities. Approximating such signals as a linear combination of sinusoids creates overshoot at the discontinuities. The well-known Gibbs phenomenon is illustrated in Fig. 1.28.

Fig. 1.28 Attempting to approximate a square wave using Fourier components (sines and cosines) creates large oscillations near the discontinuities. Known as the Gibbs phenomenon, this overshoot increases with frequency (as more terms are added to the Fourier series) but eventually approaches a finite limit

Instead we can use a basis of finite signals, called wavelets [59], to better approximate real-world signals. The wavelet transform is written as
W(\tau, s) = \frac{1}{\sqrt{s}} \int_{-\infty}^{\infty} f(t)\, \psi\left( \frac{t-\tau}{s} \right) dt.   (1.9)

A comparison to the STFT (1.4) shows that this transform decomposes the signal not into linear combinations of sines and cosines, but into linear combinations of wavelet functions \psi(\tau, s). We can relate this to the Fourier decomposition (1.3) by defining the wavelet coefficients

c_{jk} = W(k 2^{-j}, 2^{-j}).   (1.10)

Here, \tau = k 2^{-j} is referred to as the dyadic position, and s = 2^{-j} is called the dyadic dilation. We are decomposing our signal in terms of a wavelet that can move (position \tau) and deform by stretching or shrinking (scale s). This transforms our original signal into a joint time-scale domain, rather than a frequency domain (Fourier transform) or joint time–frequency domain (STFT). Although the wavelet transform doesn't provide any direct frequency information, scale is related to the inverse of frequency, with low-scale decompositions relating to high frequency and vice versa. This relationship is often exploited to de-noise signals by removing information at particular scales (Fig. 1.29).

Fig. 1.29 A signal s(t) is decomposed into approximations (A) and details (D), corresponding to low- and high-pass filters, respectively. By continually decomposing the approximation coefficients in this manner and removing the first several levels of details, we have effectively applied a low-pass filter to the signal

In addition to representing near-discontinuous signals better than the STFT, the dyadic (factor-of-two) decomposition of the wavelet transform allows an improvement in time resolution at high frequencies (Fig. 1.30).

Fig. 1.30 The STFT has similar time resolution at all frequencies, while the dyadic nature of the wavelet transform affords better time resolution at high frequencies (low scale values)

In the time domain, wavelets are completely described by the wavelet function (mother wavelet \psi(t)) and a scaling function (father wavelet \phi(t)). The scaling function is necessary because stretching the wavelet in the time domain reduces the bandwidth, requiring an infinite number of wavelets to accurately capture the entire spectrum. This is similar to Zeno's paradox, in which trying to get from point A to point B by crossing half the remaining distance each step is logically fruitless. The scaling function is an engineering solution to this problem, allowing us to get close enough for all practical purposes by covering the rest of the spectrum. In order to completely represent a continuous signal, we must make sure that our wavelets form an orthonormal basis. Since as part of the decomposition we are allowed to scale and shift our original wavelet, we only need to ensure that the mother wavelet is continuously differentiable and compactly supported. For our analysis, we typically use the wavelet definitions and transform algorithms included in MATLAB.³

³ See, for example, https://www.mathworks.com/products/wavelet.html.

The Haar wavelet is the simplest example of a wavelet: a discontinuous step function with a uniform scaling function. The Haar wavelet is also the first (db1) of the Daubechies family of wavelets, abbreviated dbN, with order N the number of vanishing moments. Historically, these were the first compactly supported orthonormal set of wavelets, and they were soon followed by Daubechies' slightly modified and least asymmetric Symlet family. The Coiflet family, also exhibiting vanishing moments, was also created by Daubechies at the request of other researchers.
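The scale-based de-noising just described can be sketched in a few lines. PyWavelets is assumed here as a stand-in for the MATLAB Wavelet Toolbox, and the signal length must be divisible by 2^level for the stationary transform:

import numpy as np
import pywt

rng = np.random.default_rng(1)
t = np.linspace(0, 1, 1024)
noisy = np.sin(2 * np.pi * 5 * t) + 0.3 * rng.standard_normal(t.size)

coeffs = pywt.swt(noisy, 'db4', level=3)   # list of (cA, cD), finest level last
cA, cD = coeffs[-1]
coeffs[-1] = (cA, np.zeros_like(cD))       # remove the finest-scale details
denoised = pywt.iswt(coeffs, 'db4')        # effectively a low-pass filter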
The Meyer wavelet has both its scaling and wavelet functions defined in the frequency domain, but it is not technically a wavelet because its wavelet function is not compactly supported. However, \psi \to 0 as x \to \infty fast enough that the pseudowavelet is infinitely differentiable. This allows the existence of good approximations for use in discrete wavelet transforms, and we often consider the Meyer and related Discrete Meyer functions as wavelets for our analysis. Both the Mexican hat and Morlet wavelets are explicitly defined and have no scaling function. The Mexican hat wavelet is proportional to the second derivative of the Gaussian probability density function, while the Morlet wavelet is defined as \psi(x) = C e^{-x^2} \cos(5x), with scaling constant C.

1.6 The Dynamic Wavelet Fingerprint

While alternative time–frequency transformations can improve the resolution limits of the STFT, they often create their own problems, such as the cross-terms in the Wigner–Ville transform. Combinations of transforms can reduce these effects while still offering increased resolution, but this then comes at the cost of computational complexity. Wavelets offer an alternative basis for decomposition that is more suited to finite real-world signals, but without the direct relationship to frequency.

One of the issues with time–frequency representations of signals is the added complexity of the resultant time–frequency images. Just as displaying a one-dimensional signal requires a two-dimensional image, viewing a two-dimensional signal requires a three-dimensional visualization method. Common techniques include three-dimensional surface plots that can be rotated on a computer screen, or color-mapped two-dimensional images where the value at each point is mapped to a color. While these visualizations work well for human interpretation of the images, computers have a difficult time distinguishing between those parts of the image we care about and those that are just background clutter. This difficulty with image segmentation is especially true for noisy signals. The human visual system is evolutionarily adapted to be quite good at this,⁴ but computers lack such an advantage. Automated image segmentation methods work well for scenes where a single object is moving in a predictable path across a mostly stationary background.

⁴ Those ancient humans who didn't notice that tiger behind the bush failed to pass their genes on to us.

We have developed an alternative time–frequency representation called the dynamic wavelet fingerprint (DWFP) that we have found useful to reveal subtle features in noisy signals. This technique takes a one-dimensional time-domain waveform and converts it to a two-dimensional time-scale image [60], generating a pre-segmented binary image that can be analyzed using image processing techniques.

1.6.1 Feature Creation

The DWFP process first filters a one-dimensional signal using a stationary discrete wavelet transform. This decomposes the signal into wavelet components at a set number of levels, removes the chosen details, and then uses the inverse stationary wavelet transform to recompose the signal. The number of levels, details to remove, and wavelet used for the transform are all user-specified parameters. A Tukey window can also be applied to the filtered signal at this point to smooth out behavior at the edges. Next, the wavelet coefficients are created using a continuous wavelet transform. The normalized coefficients form a three-dimensional surface, and can be thought of as "peaks" or "valleys" depending on whether the coefficients are positive or negative.
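Continuing the sketch from above, the filtered signal becomes a normalized coefficient surface via windowing and the continuous wavelet transform (PyWavelets and SciPy assumed; the parameter names echo the user parameters tabulated later in Table 1.1, but the values here are arbitrary):

import numpy as np
import pywt
from scipy.signal.windows import tukey

def dwfp_surface(filtered, ns=64, wvt='morl'):
    windowed = filtered * tukey(filtered.size, alpha=0.1)  # smooth the edges
    coefs, _ = pywt.cwt(windowed, scales=np.arange(1, ns + 1), wavelet=wvt)
    return coefs / np.abs(coefs).max()   # rows are scales, columns are time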
Slicing this surface (both slice thickness and number of slices are user parameters) and projecting the slices to a plane generates a two-dimensional binary image. The vertical axis of this image is scale (inversely related to frequency), and the horizontal axis remains time, allowing direct comparison to the original one-dimensional time-domain signal. The image often resembles a set of fingerprints (hence the name), but most importantly the image is pre-segmented and can be easily analyzed by standard image processing techniques. Since the slicing process does not distinguish between peaks (positive coefficients) and valleys (negative coefficients), we can instead do the slicing operation in two steps, keeping the peak and valley projections separate. This generates two fingerprint images for each signal, one for peaks and one for valleys, which can be analyzed separately or combined into a (still segmented) ternary image.

A number of additional features can be extracted from this fingerprint image. Some of the features we extract are functions of time, for example, a simple count of the number of ridges at each time point. However, many of the features that we want to extract from the image are tied to a particular individual fingerprint, requiring us to first identify and consecutively label the individual fingerprints. We use a measure of near-connectedness, in which pixels of the same value within a set distance of each other are considered connected, to label each individual fingerprint. This measure works well as long as each individual fingerprint is spatially separated from its neighbors, something that is not necessarily true for the ternary fingerprint images. For those cases, we actually decompose the ternary image into two separate binary images, label each one individually, and then recombine and relabel the two images (Fig. 1.31).

Fig. 1.31 To consecutively label the individual fingerprints within the fingerprint image, the valleys (top left) and peaks (top right) images are first labeled individually and then combined into an overall labeled image (bottom)

In some cases, the automated labeling will classify objects as a fingerprint even though they may not represent our expectation of a fingerprint. While this won't affect the end results, because such fingerprints won't contain any useful information, it can slow down an already computationally intensive process. To reduce these false fingerprints, an option is added to restrict the allowed solidity range for an object to be classified as an individual fingerprint.
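A sketch of the slicing and near-connected labeling steps follows (scikit-image assumed; dilating before labeling is one plausible way to implement the "within a set distance" connectivity, not necessarily the author's exact procedure):

import numpy as np
from scipy.ndimage import binary_dilation
from skimage.measure import label

def slice_image(surface, numslices=6, slicethickness=0.12):
    # Keep thin bands of coefficient values: the projected slice contours.
    img = np.zeros(surface.shape, dtype=bool)
    for k in range(numslices):
        lo = k / numslices
        img |= (surface >= lo) & (surface < lo + slicethickness)
    return img    # repeat on -surface for the valleys image

def label_nearly_connected(binary_img, distance=2):
    fattened = binary_dilation(binary_img, iterations=distance)
    labels = label(fattened, connectivity=2)   # 8-connectivity in 2D
    return labels * binary_img                 # keep labels only on original pixels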
1.6.2 Feature Extraction

Once the location and extent of each individual fingerprint has been determined, we apply standard image processing libraries included in MATLAB to extract features from the image. The resemblance of our images to fingerprints, for which a large image recognition literature already exists, can be exploited in this process. These parameter waveforms are then linearly interpolated to facilitate a direct comparison to the original time-domain signal. Typically, about 25 one-dimensional parameter waveforms are created for each individual measurement. Some of these features are explained in more detail below.

A number of features are extracted from both the raw signal and the wavelet fingerprint image using the MATLAB image processing toolbox regionprops analysis to create an optimized feature vector for each instance.
1. Area: Number of pixels in the region
2. Filled Area: Number of pixels in the region with any holes filled in
3. Extent: Ratio of pixels in the region to pixels in the bounding box (the smallest rectangle that completely encloses the region), calculated as Area / (Area of bounding box)
4. Convex Area: Area of the convex hull (the smallest convex polygon that contains the region)
5. Equivalent Diameter: Diameter of a circle with the same area as the region, calculated as \sqrt{4 \cdot \text{Area} / \pi}
6. Solidity: Proportion of pixels in the convex hull that are also in the region, calculated as Area / Convex Area
7. xCentroid: Center of mass of the region along the horizontal axis
8. yCentroid: Center of mass of the region along the vertical axis
9. Major Axis Length: Pixel length of the major axis of the ellipse that has the same normalized second central moments as the region
10. Minor Axis Length: Pixel length of the minor axis of the ellipse that has the same normalized second central moments as the region
11. Eccentricity: Eccentricity of the ellipse that has the same normalized second central moments as the region, computed as the ratio of the distance between the foci of the ellipse and its major axis length
12. Orientation: Angle (in degrees) between the x-axis and the major axis of the ellipse that has the same second moments as the region
13. Euler Number: Number of objects in the region minus the number of holes in those objects, calculated using 8-connectivity
14. Ridge Count: Number of ridges in the fingerprint image, calculated by looking at the number of transitions between pixels on and off at each point in time

The user has control of a large number of parameters in the DWFP creation and feature extraction process (Table 1.1), which affect the appearance of the fingerprint images, and thus the extracted features. There is no way to tell a priori which combination of parameters will create the ideal representation for a particular application. Past experience with analysis of DWFP images helps us to avoid an entirely brute force implementation for many applications.

Table 1.1 List of user parameters in the DWFP creation and feature extraction process

Wavelet filtering
• filtmethod (filt, filtandwindow, window, or none): how to filter the data
• wvtpf (wavelet name): filtering wavelet
• numlvls (Z+): number of levels to filter
• swdtoremove ([Z+]): details to remove

Wavelet transform
• wvt (wavelet name): transform wavelet
• ns (Z+): number of scales for the transform
• normconstant (Z+): normalization constant
• numslices (Z+): number of slices
• slicethickness (R+): thickness of each slice

Feature extraction
• saveimages (binary switch): save fingerprint images?
• fullorred (full or reduced): require certain solidity
• solidity_range ([R ∈ [0, 1], R ∈ [0, 1]]): allowable solidity range
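Most of the features in the numbered list above have direct analogues in scikit-image's regionprops (assumed here as the Python counterpart of the MATLAB toolbox), so a sketch of the extraction stage is short:

from skimage.measure import regionprops

def fingerprint_features(labeled_img):
    feats = []
    for r in regionprops(labeled_img):   # one entry per labeled fingerprint
        feats.append({
            'area': r.area,
            'extent': r.extent,
            'solidity': r.solidity,
            'equiv_diameter': r.equivalent_diameter,
            'eccentricity': r.eccentricity,
            'orientation': r.orientation,
            'euler_number': r.euler_number,
            'centroid': r.centroid,                # (yCentroid, xCentroid)
            'major_axis_length': r.major_axis_length,
            'minor_axis_length': r.minor_axis_length,
        })
    return feats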
However, in some cases, the signals to be analyzed are so noisy that humans are incapable of picking out useful patterns in the fingerprint images. For these applications, we use the formal machine learning language of pattern classification and a computing cluster to run this feature extraction process in parallel for a large number of parameter combinations. We first have to choose a basis function, which could be sinusoidal, wavelet, or whatever. Then we have to identify something(s) in the resulting images that can be used to make features. The tricky bit is automatically extracting those image features: it's computationally expensive. Then we have to somehow downselect the best image features from all the ones that are promising. You'll be unsurprised that the curse of dimensionality comes into play here.

All we wanted to do was distinguish rats from squirrels, and somehow we're now doing wavelet transforms to tell the difference between scampering-squawking squirrels and scurrying-squeaking rats. It would seem simpler just to brute force the problem by training a standard machine learning system with a large library of squirrel versus rat images. Yes, that would be simpler conceptually. Go try that if you like. We'll wait.

1.6.3 Edge Computing

Meanwhile, the world is pushing hard into the Internet of Things (IoT) right now. The sorts of low-power sensor hubs that are in wearables have an amazing amount of capability, both in terms of sensors and local processing of sensor data, which enables what's starting to be called distributed machine learning, edge computing, etc.

For the last 30 years, my students and I have been modeling the scattering of radar, sonar, and ultrasound waves from objects, tissues, materials, and structures. The reflection, transmission, refraction, and diffraction of light, and the conduction of heat, are also included in this body of work. As computers have become more and more capable, three-dimensional simulations of these interactions have become a key aspect of sorting out very complex behaviors. Typically, our goal has been to solve inverse problems where we know what the excitation source is, and some response of the system is measured. Success is being able to automatically and in (near) real time deduce the state of the object(s), tissue(s), material(s), and/or structure(s). We also want quantitative outputs with a resolution appropriate for that particular case.

The mathematical modeling techniques and numerical simulation methods are the same across a wide range of physical situations and fields of application. Why the sky is blue [61–65] and how to make a bomber stealthy [66–69] both utilize Maxwell's equations. Seismic waves and ultrasonic NDE employ identical equations once feature sizes are normalized by wavelength. Sonar of whales and submarines is identical to that of echolocating bats and obstacle-avoiding robots. In general, inverse scattering problems are too hard, but bats do avoid obstacles and find food. Radar detects inbound threats and remotely estimates precipitation. Cars park themselves and (usually) stop before impact even if the human driver is not paying close attention and I'm jaywalking with a cup of Wawa coffee trying not to spill. Medical imaging is now so good that we find ourselves in an overdiagnosis dilemma.

Computed tomography is familiar to most people from X-ray CT scanners used in medicine and baggage screening. In a different configuration, CT is a workhorse method for seismic wave exploration for natural resources. We adapted these methods for ultrasonic guided wave characterization of flaws in large engineered structures like aircraft and pipelines.
By extracting features from signals that have gone through the region of interest in many directions, the reconstruction algorithms output a map of the quantities of interest, e.g., tissue density or pipe wall thickness. The key is understanding which features to extract from the signals for use in tomographic reconstruction, but that understanding comes from an analysis of the signal energy interacting with the tissue, structure, or material variations that matter.

For many years now, we've been developing the underlying physics to enable robots to navigate the world around them. We tend to focus on those sensors and imagers where the physics is interesting, and then develop the machine learning algorithms to allow the robots to autonomously interpret their sensors and imagers. For autonomous vehicles, we now say IoV instead, and there the focus is sensors/interpretation, but also communication among vehicles, which is how an autonomous vehicle sees around corners and such. The gadgets used for computed tomography tend to be large and expensive and often immobile, but their primary constraint is measurement geometries that can be overly restrictive. Higher and higher fidelity reconstructions require more and more ray paths, each of which is a signal from which features must be extracted in (near) real time. Our primary effort over the last two decades has been signal processing methods to extract relevant features from these sorts of signals, and we eventually began calling these sets of algorithms narrow-AI expert systems. They're always carefully tailored and finely tuned to the particular measurement scenario of interest at that time, and they use whatever understanding of the appropriate scattering problem we can bring to bear.

1.7 Will the Real Will West Please Step Forward?

In colonial India, British bureaucrats had a problem. They couldn't tell one native pensioner from another, so when one died another person would show up and continue to collect that pension. Photo ID wasn't a thing yet, and illiterate Indians couldn't sign for their cash. Fingerprints provided the answer, because there is this belief that they are unique to each individual. In Victorian times, prisons switched over from punishment to rehabilitation, but if you're going to give harsher sentences to repeat offenders you have to be able to tell whether someone has been to prison before. Again, without photography, and hence no ability to store and transmit images, it would be pretty simple to avoid harsher and harsher sentences by giving a fake name every subsequent time you were arrested.

The question then becomes: What features of individuals can be measured to uniquely identify them? Hairstyle or color won't work. Eye color would work, but weight wouldn't. Scars, tattoos, etc. should work if there are any. Alphonse Bertillon [71] had the insight that bones don't change after you've stopped growing, so a list of your particulars including this kind of measurement should allow you to be uniquely identified. His system included 11 measurements:
• Height,
• Stretch: length of body from left shoulder to right middle finger when the arm is raised,
• Bust: length of the torso from head to seat, taken when seated,
• Length of head: crown to forehead,
• Width of head: temple to temple,
• Length of right ear,
• Length of left foot,
• Length of left middle finger,
• Length of left cubit: elbow to tip of middle finger,
• Width of cheeks, and
• Length of left little finger.

This can be pretty expensive to do, and can't distinguish identical twins: Will West and William West were both in Leavenworth and had identical Bertillon measurements, Fig. 1.32.

Fig. 1.32 In 1903, a prisoner named Will West arrived at Leavenworth. William West had been serving a life sentence at Leavenworth since 1901. They had identical Bertillon measurements. Fingerprints were distinct, though. See [70]

Objections to this system included the cost of the instruments employed and their liability to get out of order; the need for specially instructed measurers of superior education; errors that frequently crept in when carrying out the processes and were all but irremediable; and modesty for women. The last objection was a problem in trying to quantify recidivism among prostitutes. The consensus was that a small number of repeat offenders were responsible for a large proportion of crimes, and that these "female defective delinquents" spread STDs. Bertillon anthropometry required a physical intimacy between the operator and the prisoner's body that was not deemed appropriate for female prisoners. This was a time when doctors did make house calls, but typically wouldn't have their female patients disrobe at all and might not even lay on hands, because such things would be immodest. It's no wonder that a common diagnosis was "women's troubles." Also a problem for Bertillonage was that bouffant hairstyles throw off height measures. When people complained to Bertillon, he responded that of course you can't make it work, you're not French.

Fingerprints seemed like a better solution all around. They also have the advantage of being cheap and easy, and the Henry fingerprint classification system allowed fingerprint information to be written down and filed for later use, and also sent by telegraph or letter:
• Basic patterns: arch, loop, whorl, and composite.
• Fingers numbered 1–10, starting at the left pinkie.
• Primary classification is a fraction with odd-numbered fingers over even-numbered fingers. Add 1, 2, 4, 8, or 16 when whorls appeared. 1/1 is zero whorls; 32/32 is ten whorls. Filing cabinets had 32 rows and 32 columns.
• Secondary classification indicated arches, tented arches, radial loops, or ulnar loops, with right-hand numerators and left denominators.
• Further subclassification of whorls was done by ridge tracing, since whorls originate in two deltas.
• Loops were subclassified by ridge counting, i.e., the number of ridges between delta and core.

Hence, 5/17 R/U OO/II 19/8 for the author of a surprisingly interesting history of fingerprinting [72], who has whorls in both thumbs; high-ridge-count radial loops in the right index and middle fingers; low-ridge-count ulnar loops in the left index and middle fingers; a nineteen-ridge-count loop in the right little finger; and an eight-ridge-count loop in the left little finger.
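As a worked example of the primary classification, here is a sketch reconstructed from the rules as stated above, with the pair weights 16, 8, 4, 2, 1 inferred from the 32/32 maximum (a hedged reading of the system, not an authoritative implementation):

def henry_primary(whorled_fingers):
    # whorled_fingers: set of finger numbers 1-10 that show a whorl pattern.
    weight = lambda finger: 16 >> ((finger - 1) // 2)   # pairs: 16, 8, 4, 2, 1
    odd = 1 + sum(weight(f) for f in whorled_fingers if f % 2 == 1)
    even = 1 + sum(weight(f) for f in whorled_fingers if f % 2 == 0)
    return f'{odd}/{even}'

print(henry_primary(set()))              # 1/1   (zero whorls)
print(henry_primary(set(range(1, 11))))  # 32/32 (ten whorls)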
This pattern classification system worked pretty well until the database got large enough that searching it became impractical. There was also a terrific battle for control over the official fingerprint repository between New York and the FBI. You can guess who won that. The key message is that the way biometrics have always worked is not by comparing images, but by comparing features extracted from them. In fingerprint analysis that's the way it works to this day, with the language being the number of points that match. People still believe that fingerprints are unique to the individual, and don't realize that extracting features from fingerprints can be quite subjective. I wonder what a jury would say if the suspect's cubit (length from elbow to fingertip) was used as proof of guilt (Fig. 1.33).

Fig. 1.33 From top left to bottom right: loop, double loop, central pocket loop, plain whorl, plain arch, and tented arch [72]

1.7.1 Where Are We Headed?

So you see, machine learning is the modern lingo for what we've always been trying to do. Sometimes we've called it pattern classification or expert systems, but the key issue is determining what's what from measured data streams. Machine learning is usually divided into supervised and unsupervised learning. Supervised learning requires training data with known class labels, and unless one has a sufficient amount of relevant training data these methods will return erroneous classifications. Unsupervised learning can reveal structures and inter-relations in data, with no class labels required. It can be used as a precursor for supervised learning, but it can also uncover the hidden thematic structure in collections of documents, phone conversations, emails, chats, photos, videos, etc. This is important because more than 90% of all data in the digital universe is unstructured. Most people have an idea that every customer service phone call is now monitored, but that doesn't mean that a supervisor is listening in; it means that computers are hoovering up everything and analyzing the conversations. Latent topics are the unknown unknowns in unstructured data, and contain the most challenging insights for humans to uncover. Topic modeling can be used as part of a human–machine teaming capability that leverages both the machine's strengths, to reveal structures and inter-relationships, and the human's strengths, to identify patterns and critique solutions using prior experiences. Listening to calls and/or reading documents will always be the limiting factor if we depend on humans with human attention spans, but topic modeling allows the machines to plow through seemingly impossible amounts of data to uncover the unknown unknowns that could lead to actionable insights.
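A minimal topic-modeling sketch follows (scikit-learn's LDA on a toy corpus; this is purely illustrative, not any of the systems described here):

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

docs = ['ultrasonic inspection of the pipe wall',
        'guided waves in aircraft structures',
        'pension fraud in colonial records',
        'fingerprint files sent by telegraph']
X = CountVectorizer(stop_words='english').fit_transform(docs)
lda = LatentDirichletAllocation(n_components=2, random_state=0).fit(X)
print(lda.transform(X))   # per-document mixture over the latent topics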
Of course, we don't want to do this in a truly unsupervised fashion, because we've been trying to mathematically model and numerically simulate complicated systems (and their responses) for decades. Our current work is designed to exploit those decades of work to minimize the burden of getting a sufficiently large and representative set of training data with assigned class labels. We still have both supervised and unsupervised machine learning approaches working in tandem, but now we invoke insight provided by mathematical models and numerical simulations to add what we sometimes call model-assisted learning. One area that many researchers are beginning to actively explore is health care, with the goal of replacing human scribes with machine learning systems that transcribe conversations between and among patients, nurses, doctors, etc. in real time while also drawing upon the patients' digital medical records, scans, and current bodily function monitors. Recall Fig. 1.9, where we imagined doing this 20 years ago.

Most people have heard about Moore's Law even if they don't fully appreciate that they're walking around with a semi-disposable supercomputer in their pocket and perhaps even on their wrist. What's truly new and amazing, though, is the smarts contained in the low-power sensor hubs that every tablet, smartphone, smartwatch, etc. are built around. Even low-end fitness trackers do so much more than the old-fashioned pedometers handed out in gym class. On-board intelligence defeats attempts to "finish your PE homework" on the school-bus ride home with the old shakey-shakey, even if some of the hype around tracking sleep quality and counting calories burned is a bit overblown. These modern MEMS sensor hubs have local processors to interpret the sensor data streams in an efficient enough manner that the battery draw is reduced by 90% compared to utilizing the main (general purpose) CPU. Of course, processing sensor data streams to extract features of interest is at the heart of machine learning. Now, however, sensor hubs and processing power are so small and cheap that we can deploy lots and lots of these things, connected to each other wirelessly and/or via the Internet, and do distributed machine learning on the Internet of Things (IoT).

There's been a lot of hype about the IoT because even non-technical business-channel talking-head experts understand that something new is starting to happen and there's going to be a great deal of money to be made (and lost). The hardware to acquire, digest, and share sensor data just plummeted from $10^4 to $10^1, and that trend line seems to be extending. Music and video streaming on demand are the norm, even in stadiums where people have paid real money to experience live entertainment. Battery fires are what make the news, but my new smoke detector lasts a decade without needing new batteries when I change the clocks twice a year, which happens automatically these days anyway. Even radar has gotten small enough to put on a COTS drone, and not just a little coffee-can radar but a real phased array radar using meta-materials to sweep the beam in two directions, with an antenna-lens combination about the size and weight of an iPad and at a cost already an order of magnitude less than before. For this drone radar to be able to sense-and-avoid properly, though, the algorithms are going to need to improve, because small drones operate in a cluttered environment, which explains why their output looks a lot like B-mode ultrasound. And while it's true that power lines and barbed-wire fences backscatter radar similarly, the fence will always have bushes, trees, cows, etc. that confound the signal. This clutter problem is a key issue for autonomous ground robots, because there's no clear distinction between targets and clutter, and some (me) would say pedestrians are neither (Fig. 1.34).

Fig. 1.34 Kids today may only ever get to drive Cozy Coupes, since by the time they grow up autonomous vehicles should be commonplace. I'll expect all autonomous vehicles to brake sharply if I step out into the street without looking

References

1. 1,340 Perish as Titanic sinks, only 886, mostly women and children, rescued. New York Tribune, New York. Page 1, Image 1, col. 1, 16 Apr 1912
2. Maxim SH (1912) Preventing collisions at sea, a mechanical application of the bat's sixth sense. Sci Am 80–82, 27 July 1912
3. Maxim SH (1912) A new system of preventing collisions at sea. Cassel and Co., London, 147 p
4. A new system of preventing collisions at sea. Nature 89(2230):542–543
5. Dijkgraaf S (1960) Spallanzani's unpublished experiments on the sensory basis of object perception in bats. Isis 51(1):9–20. JSTOR. www.jstor.org/stable/227600
6. Griffin DR (1958) Listening in the dark: the acoustic orientation of bats and men. Yale University Press, New Haven. Paperback edition, 1 Apr 1986. ISBN-13: 978-0801493676
7. Grinnell AD (2018) Early milestones in the understanding of echolocation in bats. J Comp Physiol A 204:519. https://doi.org/10.1007/s00359-018-1263-3
8. Donald R. Griffin obituary. http://www.nytimes.com/2003/11/14/nyregion/donald-r-griffin-88-dies-argued-animals-can-think.html
9. Donald R. Griffin: 1915–2003. Photograph by Greg Auger. Bat Research News 45(1) (Spring 2004). http://www.batresearchnews.org/Miller/Griffin.html
10. Au WWL (1993) The sonar of dolphins. Springer, New York
11. Brittain JE (1985) The magnetron and the beginnings of the microwave age. Phys Today 38(7):60. https://doi.org/10.1063/1.880982
12. Buderi R (1998) The invention that changed the world: how a small group of radar pioneers won the second world war and launched a technical revolution. Touchstone, reprint edition. ISBN-13: 978-0684835297
13. Conant J (2002) Tuxedo Park: a Wall Street tycoon and the secret palace of science that changed the course of World War II. Simon and Schuster, New York
14. Denny M (2007) Blip, ping, and buzz: making sense of radar and sonar. Johns Hopkins University Press, Baltimore. ISBN-13: 978-0801886652
15. Bowman JJ, Senior TBA, Uslenghi PLE, Asvestas JS (1970) Electromagnetic and acoustic scattering by simple shapes. North-Holland Pub. Co., Amsterdam. Paperback edition: CRC Press, Boca Raton, 1 Sept 1988. ISBN-13: 978-0891168850
16. Grier DA (2005) When computers were human. Princeton University Press, Princeton
17. The Human Computer Project needs help finding all of the women who worked as computers or mathematicians at the NACA or NASA. https://www.thehumancomputerproject.com/
18. Anderson VC (1950) Sound scattering from a fluid sphere. J Acoust Soc Am 22:426. https://doi.org/10.1121/1.1906621
19. NASA Dryden Flight Research Center Photo Collection (1949) NASA Photo E49-54. https://www.nasa.gov/centers/dryden/multimedia/imagegallery/Places/E49-54.html
20. Covert A (2011) Philco Mystery Control: the world's first wireless remote. Gizmodo. Accessed 11 Aug 2011. https://gizmodo.com/5857711/philco-mystery-control-the-worlds-first-wireless-remote
21. "Bombshell: the Hedy Lamarr story", Director: Alexandra Dean, opened in theaters on November 24, 2017. http://www.pbs.org/wnet/americanmasters/bombshell-hedy-lamarr-story-full-film/10248/, https://zeitgeistfilms.com/film/bombshellthehedylamarrstory. Photo credit to https://twitter.com/Intel. Accessed 11 Mar 2016
22. Marr B (2016) A short history of machine learning – every manager should read. Forbes. Accessed 19 Feb 2016. https://www.forbes.com/sites/bernardmarr/2016/02/19/a-short-history-of-machine-learning-every-manager-should-read/65578fb215e7
23. Gonzalez V (2018) A brief history of machine learning. Synergic Partners. Accessed Jun 2018. http://www.synergicpartners.com/en/espanol-una-breve-historia-del-machine-learning
24. Johnson D (2017) Find out if a robot will take your job. Time. Accessed 19 Apr 2017. http://time.com/4742543/robots-jobs-machines-work/
25. Alan Turing: the enigma. https://www.turing.org.uk/
26. Professor Arthur Samuel. https://cs.stanford.edu/memoriam/professor-arthur-samuel
27. "Professor's perceptron paved the way for AI – 60 years too soon". https://news.cornell.edu/stories/2019/09/professors-perceptron-paved-way-ai-60-years-too-soon
28. Minsky M, Professor of media arts and sciences. https://web.media.mit.edu/~minsky/
29. DeJong G (a.k.a. Mr. EBL). http://mrebl.web.engr.illinois.edu/
30. Sejnowski T, Professor and computational neurobiology laboratory head. https://www.salk.edu/scientist/terrence-sejnowski/
31. Foote KD (2017) A brief history of deep learning. Dataversity. Accessed 7 Feb 2017. http://www.dataversity.net/brief-history-deep-learning/
32. US Food and Drug Administration (2019) Proposed regulatory framework for modifications to artificial intelligence/machine learning (AI/ML)-based software as a medical device (SaMD) – discussion paper and request for feedback. www.fda.gov
33. Philips (2019) Adaptive intelligence: the case for focusing AI in healthcare on people, not technology. https://www.usa.philips.com/healthcare/resources/landing/adaptive-intelligence-in-healthcare
34. Liu X, Faes L, Kale AU, Wagner SK, Fu DJ, Bruynseels A, Mahendiran T, Moraes G, Shamdas M, Kern C, Ledsam JR, Schmid MK, Balaskas K, Topol EJ, Bachmann LM, Keane PA, Denniston AK (2019) A comparison of deep learning performance against health-care professionals in detecting diseases from medical imaging: a systematic review and meta-analysis. Lancet Digit Health 1(6):e271–e297. https://doi.org/10.1016/S2589-7500(19)30123-2
35. Krupinski EA, Graham AR, Weinstein RS (2013) Characterizing the development of visual search expertise in pathology residents viewing whole slide images. Hum Pathol 44(3):357–364. https://doi.org/10.1016/j.humpath.2012.05.024
36. Janowczyk A, Madabhushi A (2016) Deep learning for digital pathology image analysis: a comprehensive tutorial with selected use cases. J Pathol Inform 7:29
37. Roy S, Kumar Jain A, Lal S, Kini J (2018) A study about color normalization methods for histopathology images. Micron 114:42–61. https://doi.org/10.1016/j.micron.2018.07.005
38. Komura D, Ishikawa S (2018) Machine learning methods for histopathological image analysis. Comput Struct Biotechnol J 16:34–42. https://doi.org/10.1016/j.csbj.2018.01.001
39. Landau MS, Pantanowitz L (2019) Artificial intelligence in cytopathology: a review of the literature and overview of commercial landscape. J Am Soc Cytopathol 8(4):230–241. https://doi.org/10.1016/j.jasc.2019.03.003
40. Kannan S, Morgan LA, Liang B, Cheung MKG, Lin CQ, Mun D, Nader RG, Belghasem ME, Henderson JM, Francis JM, Chitalia VC, Kolachalama VB (2019) Segmentation of glomeruli within trichrome images using deep learning. Kidney Int Rep 4(7):955–962. https://doi.org/10.1016/j.ekir.2019.04.008
41. Niazi MKK, Parwani AV, Gurcan MN (2019) Digital pathology and artificial intelligence. Lancet Oncol 20(5):e253–e261. https://doi.org/10.1016/S1470-2045(19)30154-8
42. Wang S, Yang DM, Rong R, Zhan X, Xiao G (2019) Pathology image analysis using segmentation deep learning algorithms. Am J Pathol 189(9):1686–1698. https://doi.org/10.1016/j.ajpath.2019.05.007
43. Wang X et al (2019) Weakly supervised deep learning for whole slide lung cancer image analysis. IEEE Trans Cybern. https://doi.org/10.1109/TCYB.2019.2935141
44. Abels E, Pantanowitz L, Aeffner F, Zarella MD, van der Laak J, Bui MM, Vemuri VN, Parwani AV, Gibbs J, Agosto-Arroyo E, Beck AH, Kozlowski C (2019) Computational pathology definitions, best practices, and recommendations for regulatory guidance: a white paper from the Digital Pathology Association. J Pathol 249:286–294. https://doi.org/10.1002/path.5331
45. Tajbakhsh N, Jeyaseelan L, Li Q, Chiang J, Wu Z, Ding X (2019) Embracing imperfect datasets: a review of deep learning solutions for medical image segmentation. https://arxiv.org/abs/1908.10454
46. Janke J, Castelli M, Popovic A (2019) Analysis of the proficiency of fully connected neural networks in the process of classifying digital images: benchmark of different classification algorithms on high-level image features from convolutional layers. Expert Syst Appl 135:12–38. https://doi.org/10.1016/j.eswa.2019.05.058
47. Faes L, Wagner SK, Fu DJ, Liu X, Korot E, Ledsam JR, Back T, Chopra R, Pontikos N, Kern C, Moraes G, Schmid MK, Sim D, Balaskas K, Bachmann LM, Denniston AK, Keane PA (2019) Automated deep learning design for medical image classification by health-care professionals with no coding experience: a feasibility study. Lancet Digit Health 1(5):e232–e242. https://doi.org/10.1016/S2589-7500(19)30108-6
48. McKinney SM, Sieniek M, Godbole V et al (2020) International evaluation of an AI system for breast cancer screening. Nature 577:89–94. https://doi.org/10.1038/s41586-019-1799-6
49. Marks M (2019) The right question to ask about Google's Project Nightingale. Slate. Accessed 20 Nov 2019. https://slate.com/technology/2019/11/google-ascension-project-nightingale-emergent-medical-data.html
50. Copeland R, Mattioli D, Evans M (2020) Paging Dr. Google: how the tech giant is laying claim to health data. Wall Str J. Accessed 11 Jan 2020. https://www.wsj.com/articles/paging-dr-google-how-the-tech-giant-is-laying-claim-to-health-data-11578719700
51. Photo from https://www.reddit.com/r/cablegore/ but it gets reposted quite a lot
52. Rous SN (2002) The prostate book: sound advice on symptoms and treatment. W. W. Norton & Company, Inc., New York. ISBN 978-0-393-32271-2
53. Imani F et al (2015) Computer-aided prostate cancer detection using ultrasound RF time series: in vivo feasibility study. IEEE Trans Med Imaging 34(11):2248–2257. https://doi.org/10.1109/TMI.2015.2427739
54. Welch HG, Schwartz L, Woloshin S (2012) Overdiagnosed: making people sick in the pursuit of health, 1st edn. Beacon Press, Boston. ISBN-13: 978-0807021996
55. Holtzmann Kevles B (1998) Naked to the bone: medical imaging in the twentieth century, reprint edn. Basic Books, New York. ISBN-13: 978-0201328332
56. Agarwal S, Milch B, Van Kuiken S (2009) The US stimulus program: taking medical records online. McKinsey Q. https://www.mckinsey.com/industries/healthcare-systems-and-services/our-insights/the-us-stimulus-program-taking-medical-records-online
57. Pizza Rat is the nickname given to a rodent that became an overnight Internet sensation after it was spotted carrying a slice of pizza down the stairs of a New York City subway platform in September 2015. https://knowyourmeme.com/memes/pizza-rat
58. Surprised Squirrel Selfie image at https://i.imgur.com/Tl1ieNZ.jpg. https://www.reddit.com/r/aww/comments/4vw1hk/surprised_squirrel_selfie/. This was a PsBattle: a squirrel surprised by a selfie, three years ago
59. Daubechies I (1992) Ten lectures on wavelets. Society for Industrial and Applied Mathematics. https://epubs.siam.org/doi/abs/10.1137/1.9781611970104
60. Hou J (2004) Ultrasonic signal detection and characterization using dynamic wavelet fingerprints. Doctoral dissertation, William & Mary, Department of Applied Science
61. Howard JN (1964) The Rayleigh notebooks. Appl Opt 3:1129–1133
62. Strutt JW (1871) On the light from the sky, its polarization and colour. Philos Mag XLI:107–120, 274–279
63. van de Hulst HC (1981) Light scattering by small particles. Dover Books on Physics, corrected edition, 1 Dec 1981. ISBN-13: 978-0486642284
64. Kerker M (1969) The scattering of light and other electromagnetic radiation. Academic, New York
65. Bohren C, Huffman D (2007) Absorption and scattering of light by small particles. Wiley, New York. ISBN: 9780471293408
66. Knott EF, Tuley MT, Shaeffer JF (2004) Radar cross section. SciTech radar and defense, 2nd edn. SciTech Publishing, Raleigh
67. Richardson D (1989) Stealth: deception, evasion, and concealment in the air. Orion Books, London. ISBN-13: 978-0517573433
68. Sweetman B (1986) Stealth aircraft: secrets of future airpower. Motorbooks Intl, London. ISBN-13: 978-0879382087
69. Kenton Z (2016) Stealth aircraft technology. CreateSpace Independent Publishing Platform, Scotts Valley. ISBN-13: 978-1523749263
70. Mistaken identity. Futility Closet. Accessed 29 Apr 2011. http://www.futilitycloset.com/2011/04/29/mistaken-identity-2/
71. Cellania M (2014) Alphonse Bertillon and the identity of criminals. Mental Floss. Accessed 21 Oct 2014. https://www.mentalfloss.com/article/59629/alphonse-bertillon-and-identity-criminals
72. Cole SA (2002) Suspect identities: a history of fingerprinting and criminal identification. Harvard University Press, Cambridge. ISBN 9780674010024

Chapter 2 Intelligent Structural Health Monitoring with Ultrasonic Lamb Waves

Mark K. Hinders and Corey A. Miller

Abstract Structural health monitoring is a branch of machine learning where we automatically interpret the output of in situ sensors to assess the structural integrity and remaining useful lifetime of engineered systems. Sensors can often be permanently placed in locations that are inaccessible or dangerous, and thus not appropriate for traditional nondestructive evaluation techniques where a technician both performs the inspection and interprets the output of the measurement. Ultrasonic Lamb waves are attractive because they can interrogate large areas of structures with a relatively small number of sensors, but the resulting waveforms are challenging to interpret even though these guided waves have the property that their propagation velocity depends on remaining wall thickness. Wavelet fingerprints provide a method to interpret these complex, multi-mode signals and track changes in arrival time that correspond to thickness loss due to inevitable corrosion, erosion, etc. Guided waves follow any curvature of plates and shells, and will interact with defects and structural features on both surfaces. We show results on samples from aircraft and naval structures.

Keywords Lamb wave · Wavelet fingerprint · Machine learning · Structural health monitoring

2.1 Introduction to Lamb Waves

Ultrasonic guided waves are naturally suited to structural health monitoring of aerospace and naval structures, since large areas can be inspected with a relatively small number of transducers operating in various pitch-catch and/or pulse-echo scenarios.
They are confined to the plate-, shell-, or pipe-like structure itself and so follow its shape and curvature, with sensitivity to material discontinuities at either surface or in the interior. In structures, these guided waves are typically referred to as Lamb waves, a terminology we will tend to use here. Lamb waves are multi-modal and dispersive. When plotted versus a combined frequency-thickness parameter, Fig. 2.1, the phase and group velocities of the symmetric and antisymmetric families of modes are as shown for aluminum plates, although other structural materials have similar behavior. With the exception of the zeroth-order modes, all Lamb wave modes have a cutoff frequency-thickness value where their phase and group velocities tend to infinity and zero, respectively, and hence below which those modes do not propagate.

Fig. 2.1 Dispersion curves for an aluminum plate. Solutions to the Rayleigh–Lamb wave equations are plotted here for both symmetric (solid lines) and antisymmetric (dashed lines) modes, for both phase and group velocities

This characteristic change of group velocity with thickness is what makes Lamb waves so useful for detecting flaws, such as corrosion and disbonds, that represent an effective change in thickness. We exploited these properties years ago in making Lamb wave tomographic reconstructions, as shown in the false-color thickness maps in Fig. 2.2 for a partially delaminated doubler (left), a dished-out circular thinning (middle), and a circular flat-bottom hole (right).

Fig. 2.2 Lamb wave tomography allows the reconstruction of thickness maps in false-color images because slowness is related to plate thickness via the dispersion curves. Each reconstruction requires many criss-cross Lamb wave measurements with arrival times of the modes of interest automatically extracted from the recorded waveforms

The key technical challenge to using Lamb waves effectively turns out to be automatically identifying which modes are which in very complex waveform signals. A typical signal for an aluminum plate is shown in Fig. 2.3, with modes marked according to the group velocities from the dispersion curve. In sharp contradistinction to traditional bulk-wave ultrasonics, it's pretty clear that peak-detection sorts of approaches will fail rather badly. Although the dispersion curves tell us that the S2 mode is fastest, for this particular experimental case that mode is not generated with much energy, and the S2 mode happens to have an amplitude more or less in the noise. The A0, S0, and A1 modes have higher amplitude, but those three modes all have about the same group velocity, so they arrive jumbled together in time.

Fig. 2.3 A typical Lamb waveform with modes of interest indicated
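The tomography idea in Fig. 2.2 can be sketched as a one-line inversion: given a tabulated group-velocity curve for the mode of interest (the numbers below are placeholders, not real aluminum data), an extracted arrival time over a known path length pins down the thickness:

import numpy as np
from scipy.interpolate import interp1d
from scipy.optimize import brentq

fd = np.array([0.5, 1.0, 1.5, 2.0, 2.5])   # frequency-thickness, MHz*mm (hypothetical)
cg = np.array([3.1, 2.9, 2.6, 2.2, 1.9])   # group velocity, mm/us (hypothetical)
cg_curve = interp1d(fd, cg)

def thickness_mm(arrival_us, path_mm, f_mhz=1.0):
    # Find the thickness whose group velocity explains the arrival time.
    return brentq(lambda d: path_mm / cg_curve(f_mhz * d) - arrival_us, 0.5, 2.5)

print(thickness_mm(arrival_us=38.5, path_mm=100.0))   # about 1.5 mm here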
Angle blocks, comb transducers, etc. can be employed to select purer modes, or perhaps a wiser choice of frequency-thickness product can give signals that are easier to interpret. For example, fd = 4 MHz·mm can give a single fast S1 mode with all other modes much slower. Of course, most researchers simply choose a value below fd = 2 MHz·mm where all but the two fundamental modes are cut off. Some even go much lower in frequency, where the A0 mode is nearly cut off and the S0 mode is not very dispersive, although that tends to sacrifice the most useful aspect of Lamb waves for flaw detection, which is that the different modes each have different through-thickness displacement profiles. Optimal detection of a given flaw type depends on choosing modes with displacement profiles that will interact strongly with it, i.e., scatter from it. Moreover, this scattering interaction will cause mode mixing to occur, which can be exploited to better identify, locate, and size flaws. Lamb wave scattering from flaws is inherently a three-dimensional process, as can be seen from the screenshots of simulations of Lamb waves interacting with an attached thickness (left) and a rectangular thinning (right) in Fig. 2.4.

Fig. 2.4 Lamb wave scattering from flaws is inherently a three-dimensional process, as can be seen from the screenshots of simulations of Lamb waves interacting with an attached thickness (left) and a rectangular thinning (right)

2.2 Background

There is a large literature on the use of Lamb waves for nondestructive evaluation and structural health monitoring. Mathematically, they were first described by Lamb in 1917 [1], with experimental confirmation at ultrasonic frequencies published by Worlton in 1961 [2]. Viktorov's classic book [3] still provides one of the best practical descriptions of the use of Lamb waves, although the Dover edition of Graff's book [4] is more widely available and gives an excellent discussion of the mathematical necessities. Rose's more recent text [5] is also quite popular with Lamb wave researchers. Overviews of this research area can be found in several recent review articles [6–11] and books such as [12, 13]. There are two distinct guided wave propagation approaches available for structural health monitoring. The first tries to simplify the signal interpretation by selectively generating a single pure guided wave mode, either by using more complex transducers, e.g., angle blocks or interdigitated comb transducers, or by selecting a frequency-thickness low enough that all but the two fundamental guided wave modes are cut off. The second approach aims to minimize the size and complexity of the transducer itself but to then choose a particular range of frequency-thickness values where the multi-mode guided wave signals are still manageable. The first approach also typically uses a tone-burst excitation in order to narrow the frequency content of the signals, whereas the second approach allows either a spike excitation or a walking tone-burst scheme with a broader frequency content. Which is the appropriate choice depends on the specific SHM application, although it's important to point out that rapid advances in signal processing hardware and algorithms mean that the second approach is the clear winner over the long term. The "pure mode" approach is inherently limited by trying to peak detect as is done in traditional ultrasound.
Also, for scattering flaws such as cracks it's critical both to have some bandwidth in the signals, so that the frequency dependence of scattering can be exploited to size flaws, and to be able to interpret multi-mode signals, because guided wave mode conversion is the most promising approach to identify flaws without the need to resort to baseline subtraction. Many structural health monitoring approaches still use baseline subtraction [14–21], which requires one or more reference signals recorded from the structure in an unflawed state. Pure modes [22–32] and/or dispersion and temperature correction [33–39] are also often employed to simplify the signal interpretation, although mode conversion during scattering at a crack often enables baseline-free approaches [40–55], as does the time-reversal technique [56–64].

2.3 Simulation Methods for SHM

In optimizing guided wave SHM approaches and defining signal processing strategies, it's important to be able to simulate the propagation of guided waves in structures and to visualize how they interact with flaws. Most researchers tend to use one of the several commercially available finite element method (FEM) packages for this, or the semi-analytic finite element (SAFE) technique [65–80]. Not very many researchers are doing full 3D simulations, since this typically requires computer clusters rather than desktop workstations in order to grid the full volume of the simulation space at high enough spatial resolution to accurately capture multi-mode wave behavior as well as the flaw geometry. This is an issue because guided wave interaction with flaws is inherently a 3D process, and two-dimensional analyses can give quite misleading information. The primary lure of FEM packages is their ability to model complex structures, although this is less of an advantage when simulating guided wave propagation. A more attractive approach turns out to be variations of the finite-difference time-domain (FDTD) method [81, 82], where the elastodynamic field equations and boundary conditions are discretized directly and the simulation is stepped forward in time, recording the stress components and/or displacements across the 3D Cartesian grid at each time step. Fellinger et al. originally developed the basic equations of the elastodynamic finite integration technique (EFIT) in 1995, along with a unique way to discretize the material parameters to ensure continuity of stress and displacement across the staggered grid [83]. Schubert et al. demonstrated the flexibility of EFIT with discretizations in Cartesian, cylindrical, and spherical coordinates for a wide range of modeling applications [84–86]. Although commercial codes aren't mature, these approaches are relatively straightforward to implement on either multi-core workstations or large computer clusters in order to have sufficient processing power and memory to perform high-resolution 3D simulations of realistic structures. In our experience [87–92], the finite integration technique tends to be quite a bit faster than the finite element method because the computationally intensive meshing step is eliminated by using a simple uniform Cartesian grid. For pipe inspection, a different, but only slightly more complex, cylindrical discretization of the field equations and boundary conditions is used, which optimizes cylindrical EFIT for simulating guided wave propagation in pipe-like structures.
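To give the flavor of such time-stepping codes, here is a minimal two-dimensional velocity–stress update loop in MATLAB, the classic staggered-grid structure that EFIT-type and FDTD elastodynamic codes share. It is a schematic sketch only, with placeholder aluminum-like constants and a hypothetical point tone-burst source; real EFIT implementations add staggered material averaging, free-surface and absorbing boundaries, and 3D grids.

% Minimal 2D velocity-stress time stepping (Virieux-style staggered grid),
% schematic of the update structure inside EFIT/FDTD elastodynamic codes.
rho = 2700; cl = 6320; ct = 3130;        % aluminum-like placeholder values
mu = rho*ct^2; lam = rho*cl^2 - 2*mu;    % Lame constants from the wave speeds
nx = 400; ny = 60; dx = 1e-4;            % grid size and spacing (square cells)
dt = 0.9*dx/(cl*sqrt(2));                % stability (CFL-type) limit in 2D
vx = zeros(nx,ny); vy = zeros(nx,ny);    % particle velocities
txx = zeros(nx,ny); tyy = zeros(nx,ny); txy = zeros(nx,ny);  % stresses
src = @(tt) sin(2*pi*5e5*tt).*(tt < 4e-6);  % hypothetical 500 kHz tone burst
i = 2:nx-1; j = 2:ny-1;                  % update interior points only
for n = 1:3000
    % velocities from stress gradients (dy = dx on the square grid)
    vx(i,j) = vx(i,j) + (dt/rho)*((txx(i+1,j)-txx(i,j)) + (txy(i,j)-txy(i,j-1)))/dx;
    vy(i,j) = vy(i,j) + (dt/rho)*((txy(i,j)-txy(i-1,j)) + (tyy(i,j+1)-tyy(i,j)))/dx;
    % stresses from velocity gradients (Hooke's law in rate form)
    txx(i,j) = txx(i,j) + dt*((lam+2*mu)*(vx(i,j)-vx(i-1,j)) + lam*(vy(i,j)-vy(i,j-1)))/dx;
    tyy(i,j) = tyy(i,j) + dt*(lam*(vx(i,j)-vx(i-1,j)) + (lam+2*mu)*(vy(i,j)-vy(i,j-1)))/dx;
    txy(i,j) = txy(i,j) + dt*mu*((vx(i,j+1)-vx(i,j)) + (vy(i+1,j)-vy(i,j)))/dx;
    % inject the source into the normal stresses at one grid point
    txx(20,round(ny/2)) = txx(20,round(ny/2)) + src(n*dt);
    tyy(20,round(ny/2)) = tyy(20,round(ny/2)) + src(n*dt);
end
imagesc(sqrt(vx.^2 + vy.^2)');           % snapshot of the propagating field

Because the update stencil touches only nearest neighbors on a uniform grid, such codes parallelize naturally, which is part of what makes the workstation and cluster implementations mentioned above straightforward.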
In addition to simple plate-like or pipe-like structures, EFIT can also be used to simulate wave propagation in complex built-up structures. Various material combinations, interface types, etc. can be simulated directly, with cracks and delaminations introduced by merely adjusting the boundary conditions at the appropriate grid points. Simulation methods are also critical to optimizing guided wave structural health monitoring because the traditional approaches for modeling scattering from canonical flaws [93–95] fail for guided waves or are limited to artificial 2D situations. For low-enough frequency-thickness, the Mindlin plate theory [96] allows for analytical approaches to Lamb wave scattering from simple through-thickness holes and such, and can even account for some amount of mode conversion, but at the cost of assuming a simplistic through-thickness displacement profile. Most researchers have typically used the boundary element method (BEM) and related integral equation approaches to simulate 2D scattering, but these are in some sense inherently low-frequency methods. The sorts of high-frequency approaches that served the radar community so well until 3D computer simulations became viable aren't appropriate for guided wave SHM, although there was a significant effort to begin to derive a library of diffraction coefficients three decades ago. With currently available computers, almost all researchers are now using FEM or EFIT to isolate the details of guided wave interaction with flaws. There are also a variety of experimental studies reported recently [40–110] examining Lamb wave scattering from flaws, including Lamb wave tomography [104–113], which we worked on quite a bit a number of years ago [114–125].

2.4 Signal Processing for Lamb Wave SHM

Advanced signal processing methods are necessary for guided wave SHM, both because the signals are quite complex and because identifying and quantifying small flaws while covering large structures with a minimum number of sensors means propagation distances are going to be large and the "fingerprint" of the flaw scattering will usually be quite subtle. There is also the confounding issue that environmental changes, especially temperature, will affect the guided wave signals. This is a particular problem for baseline subtraction methods, which assume that one or more baseline signals have been recorded with the structure in an unflawed state so that some sort of signal difference metric can be employed to indicate the presence of damage. With baseline subtraction approaches, there is also the issue of natural fluctuations (noise) in the Lamb waveforms themselves, which usually means that some simplified representation of the signal, such as its envelope, is rendered before subtracting off the baseline. The danger is that the subtle flaw fingerprints may be further suppressed by simplifying the signals. Other ways to compare signals in the time domain are cross-correlation sorts of approaches, with or without stretching or multiple baselines to account for temperature variations, etc. Because scattering from flaws is frequency dependent, and because different Lamb wave modes have differing through-thickness displacement profiles and frequency-dispersion properties, the most promising signal processing approaches for Lamb wave SHM include joint time–frequency and time–scale methods.
Echolocation [126] is actually quite similar to structural health monitoring with guided waves, in that the time delay is used to locate flaws while the character of the scattered echoes is what allows us to identify and quantify the flaws. A 2D color image time–frequency representation (TFR) typically has time delay on the horizontal axis and frequency on the vertical axis. The simplest way to form a spectrogram is via a boxcar FFT, where an FFT is performed inside of a sliding window to give the spectrum at a sequence of time delays. The boxcar FFT is almost never the optimal TFR, however, since it suffers rather badly from an uncertainty effect. Making the time window shorter to better localize the frequency content in time means that there often aren't enough sample points to accurately form the FFT. Lengthening the window to get a more accurate spectrum doesn't solve the problem, since then the time localization is imprecise. Alternative TFRs have been developed to overcome many of the deficiencies of the traditional spectrogram [127]. However, since guided wave SHM signals are typically composed of one or more relatively short wave pulses, albeit often overlapping, it is natural to explore TFRs that use basis functions with compact support. Wavelets [128] are very useful for analyzing time-series data because the wavelet transform allows us to keep track of both time and frequency, or scale, features. Whereas Fourier transforms break down a signal into a series of sines and cosines in order to identify the frequency content of the entire signal, wavelet transforms keep track of local frequency features in the time domain. Ultrasonic signal analysis with wavelet transforms was first studied by Abbate in 1994, who found that if the mother wavelet was well chosen there was good peak detection even with large amounts of added white noise [129]. Massicotte, Goyette, and Bose then found that even noisy EMAT sensor signals were resolvable using the multi-scale method of the wavelet transform [130]. One of the strengths compared to the fast Fourier transform was that since the extraction algorithm did not need to include the inverse transform, the arrival time could be taken directly from the time–frequency domain of the wavelet transform. In 2002, Perov et al. considered the basic principles of the formulation of the wavelet transform for the purpose of an ultrasonic flaw detector and concluded that any of the known systems of orthogonal wavelets are suitable for this purpose as long as the number of levels does not drop below 4–5 [131]. In 2003, Lou and Hu found that the wavelet transform was useful in suppressing non-stationary wideband noise from speech [132]. In a comparison study between the Wigner–Ville distribution and the wavelet transform performed by Zou and Chen, the wavelet transform outperformed the Wigner–Ville distribution in terms of sensitivity to the change in stiffness of a cracked rotor [133]. In 2002, Hou and Hinders developed a multi-mode arrival time extraction tool that rendered the time-series data as 2D time-scale binary images [134]. Since then this technique has been applied to multi-mode extraction of Lamb wave signals for tomographic reconstruction [121, 123], time-domain reflectometry signals for wiring flaw detection [135], acoustic microscopy [136], and a periodontal probing device [137].
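For comparison with what follows, the boxcar-FFT spectrogram described above takes only a few lines in MATLAB for a signal vector s; the window length, hop size, and Hann taper below are arbitrary illustrative choices, not settings we prescribe:

% Boxcar (sliding-window) FFT spectrogram: a minimal sketch.
win = 64; hop = 8;                          % window length and step, in samples
w = 0.5 - 0.5*cos(2*pi*(0:win-1)'/(win-1)); % Hann taper to reduce leakage
nframes = floor((length(s) - win)/hop) + 1;
S = zeros(win/2+1, nframes);
for k = 1:nframes
    seg = s((k-1)*hop + (1:win));           % current boxcar of the signal
    X = fft(w .* seg(:));
    S(:,k) = abs(X(1:win/2+1));             % keep the nonnegative frequencies
end
% Shortening win sharpens the time localization but coarsens the spectrum,
% and vice versa: the uncertainty trade-off discussed above.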
Wavelets remain under active study worldwide for the analysis of a wide variety of SHM signals [138–153], as do various other time–frequency representations [154–159]. The Hilbert–Huang transform (HHT) [160–164], along with chirplets, Golay codes, fuzzy complex numbers, and related approaches, is also gaining popularity [165, 166]. Our preferred method, which we call dynamic wavelet fingerprints, is discussed in some detail below. Wavelets are often ideally suited to analyzing non-stationary signals, especially since there are a wide variety of mother wavelets that can be evaluated to find those that most parsimoniously represent a given class of signals. The wavelet transform coefficients can be rendered in an image similar to the spectrogram, except that the vertical axis will now be "wavelet scale" instead of frequency. The horizontal axis will still be time delay because the "wavelet shift" corresponds to that directly. Nevertheless, these somewhat abstract time-scale images can be quite helpful for identifying subtle signal features that may not be resolvable via other TFR methods.

2.5 Wavelet Transforms

Wavelets are ideally suited for analyzing non-stationary signals. They were originally developed to introduce a local formulation of time–frequency analysis techniques. The continuous wavelet transform (CWT) of a square-integrable, continuous function s(t) can be written as

C(a, b) = \int_{-\infty}^{+\infty} \psi_{a,b}^{*}(t)\, s(t)\, dt, \qquad (2.1)

where \psi(t) is the mother wavelet, * denotes the complex conjugate, and \psi_{a,b}(t) is given by

\psi_{a,b}(t) = |a|^{-p}\, \psi\left(\frac{t-b}{a}\right). \qquad (2.2)

Here, the constants a, b \in \mathbb{R}, where a is a scaling parameter defined by p \geq 0, and b is a translation parameter related to the time localization of \psi. The choice of p depends only upon which source in the literature is being followed, much like the different conventions for the Fourier transform, so we implement the most common value of p = 1/2. The mother wavelet can be any square-integrable function of finite energy and is often chosen based on its similarity to the inherent structure of the signal being analyzed. The scale parameter a can be considered to relate to different frequency components of the signal. For example, small values of a result in a compressed mother wavelet, which will then highlight many of the high-detail characteristics of the signal related to the signal's high-frequency components. Similarly, large values of a result in stretched mother wavelets, returning larger approximations of the signal related to the underlying low-frequency components. To better understand the behavior of the CWT, it can be rewritten as an inverse Fourier transform,

C(a, b) = \frac{1}{2\pi} \int_{-\infty}^{+\infty} \hat{s}(\omega)\, \sqrt{a}\, \hat{\psi}^{*}(a\omega)\, e^{j\omega b}\, d\omega, \qquad (2.3)

where \hat{s}(\omega) and \hat{\psi}(\omega) are the Fourier transforms of the signal and wavelet, respectively. From Eq. (2.3), it follows that stretching a wavelet in time causes its support in the frequency domain to shrink as well as shift its center frequency toward a lower frequency. This concept is illustrated in Fig. 2.5. Applying the CWT with only a single mother wavelet can therefore be thought of as applying a bandpass filter, while a series of mother wavelets via changes in scale can be thought of as a bandpass filter bank. An infinite number of wavelets are therefore needed for the CWT to fully represent the frequency spectrum of a signal s(t), since every time the value of the scaling parameter a is doubled, the bandwidth coverage is reduced by a factor of 2.
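Equation (2.3) is also the practical recipe for computing the CWT: one FFT of the signal and one inverse FFT per scale. The sketch below is illustrative rather than the toolbox implementation used later; the analytic Morlet wavelet, its center frequency w0 = 6, the scale grid, and the signal s with sampling interval dt are all assumptions.

% Compute the CWT via Eq. (2.3): one FFT of the signal, one inverse FFT
% per scale. Assumes a real signal vector s sampled at interval dt.
N = length(s);
shat = fft(s(:));                             % Fourier transform of the signal
omega = 2*pi/(N*dt)*[0:floor(N/2), -(ceil(N/2)-1):-1].';  % FFT angular freqs
w0 = 6;                                       % Morlet center frequency (assumed)
scales = 2.^linspace(0, 6, 48);               % illustrative scale grid
C = zeros(numel(scales), N);
for m = 1:numel(scales)
    a = scales(m);
    psihat = pi^(-1/4)*exp(-(a*omega - w0).^2/2).*(omega > 0);  % analytic Morlet
    C(m,:) = ifft(shat.*sqrt(a).*conj(psihat)).';  % Eq. (2.3) for every shift b
end
imagesc(abs(C));                              % coefficient magnitude, scale vs time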
An efficient and accurate discretization of this involves selecting dyadic scales and positions based on powers of two, resulting in the discrete wavelet transform (DWT). In practice, the DWT requires an additional scaling function to act as a low-pass filter, allowing for frequency spectrum coverage from ω = 0 up to the bandpass filter range of the chosen wavelet scale. Together, scaling functions and wavelet functions provide full-spectrum coverage of a signal. For each scaled version of the mother wavelet ψ(t), a corresponding scaling function φ(t) exists. Just as Fourier analysis can be thought of as the decomposition of a signal into various sine and cosine components, wavelet analysis can be thought of as a decomposition into approximations and details. These are generated through an implementation of the wavelet and scaling function filter banks. Approximations are the high-scale (low-frequency) components of the signal revealed by the low-pass scaling function filters, while details are the low-scale (high-frequency) components revealed by the high-pass wavelet function filters. This decomposition process is iterative, with the output approximations for each level used as the input signal for the following level, as illustrated in Fig. 2.6. In general, most of the information in a time-domain signal is contained in the approximations of the first few levels of the wavelet transform. The details of these low levels mostly contain high-frequency noise. If we remove the details of these first few levels and then reconstruct the signal with the inverse wavelet transform, we will have effectively de-noised the signal, keeping only the information of interest.

Fig. 2.5 Frequency-domain representation of a hypothetical wavelet at scale parameter values of a = 1, 2, 4. It can be seen that increasing the value of a leads to both a reduced frequency support and a shift in the center frequency component of the wavelet toward lower frequencies. In this sense, the CWT acts as a shifting bandpass filter of the input signal

2.5.1 Wavelet Fingerprinting

Once a raw signal has been filtered, we then pass it through the DWFP algorithm. Originally developed by Hou [134], the DWFP applies a wavelet transform to the original time-domain data, resulting in an image containing "loop" features that resemble fingerprints. The wavelet transform coefficients can be rendered in an image similar to a spectrogram, except that the vertical axis will be scale instead of frequency. These time-scale image representations can be quite helpful for identifying subtle signal features that may not be resolvable via other time–frequency methods. Combining Eqs. (2.1) and (2.2), the CWT of a continuous square-integrable function s(t) can be written as

C(a, b) = \frac{1}{\sqrt{a}} \int_{-\infty}^{+\infty} s(t)\, \psi^{*}\left(\frac{t-b}{a}\right) dt. \qquad (2.4)

Fig. 2.6 The signal is decomposed into approximations (A1) and details (D1) at the first level. The next iteration then decomposes the first-level approximation coefficients into second-level approximations and details, and this process is repeated for the desired number of levels. For wavelet filtering, the first few levels of details can be removed, effectively applying a low-pass filter to the signal
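In MATLAB's Wavelet Toolbox the de-noising step just described is a short chain of calls; the wavelet 'db4', the five decomposition levels, and the choice to zero the first two detail levels are illustrative assumptions, not settings we prescribe:

% Wavelet de-noising sketch: drop the first few detail levels, reconstruct.
% Assumes the MATLAB Wavelet Toolbox; s is the raw time-domain signal.
[c, l] = wavedec(s, 5, 'db4');      % 5-level DWT decomposition
c = wthcoef('d', c, l, [1 2]);      % zero the level-1 and level-2 details
s_dn = waverec(c, l, 'db4');        % inverse DWT gives the de-noised signal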
Unlike the DWT, where scale and translation parameters are chosen according to the dyadic scale (a = 2^m, b = n 2^m; n, m \in \mathbb{Z}), the MATLAB implementation of the CWT used here utilizes a range of real numbers for these coefficients. A typical range of scales is a = 1, ..., 50 and b = 1, ..., N for a signal of length N. This results in a two-dimensional array of coefficients, C(a, b), which are normalized to the range [−1, 1]. These coefficients are then sliced in a "thick" contour manner, where the number of slices and the thickness of each slice are defined by the user. To increase efficiency, the peaks (C(a, b) ≥ 0) and valleys (C(a, b) < 0) are considered separately. Each slice is then projected onto the time-scale plane. The resulting slice projections are labeled in an alternating, binary manner, resulting in a binary "fingerprint" image, I(a, b):

s(t) \xrightarrow{\; DWFP(\psi_{a,b}) \;} I(a, b). \qquad (2.5)

The values of slice thickness and number of slices can be varied to alter the appearance of the wavelet coefficients, as can changing which mother wavelet is used. The process of selecting mother wavelets for consideration is application-specific, since certain choices of ψ(t) will be more sensitive to certain types of signal features. In practice, the mother wavelets used are often chosen based on preliminary analysis results as well as experience. As noted above, most of the information in a signal is contained in the approximations of the first few levels of the wavelet transform, while the details of those levels are mostly high-frequency noise; setting the details of the first few levels to zero and reconstructing with the inverse wavelet transform effectively de-noises the signal, keeping only the information of the Lamb wave modes of interest. In our work, we start with the filtered ultrasonic signal and take a continuous wavelet transform (CWT). The CWT gives a surface of wavelet coefficients, and this surface is then normalized to [−1, 1]. Then, we perform a thick contour slice operation where the user defines the number of slices to use: the more slices, the thinner each contour slice. The contour slices are given the value of 0 or 1 in alternating fashion. They are then projected down to a 2D image, where the result often looks remarkably like the ridges of a human fingerprint, hence the name "wavelet fingerprints." Note that we perform a wavelet transform as usual, but then instead of rendering a color image we form a particular type of binary contour plot. This is illustrated in Fig. 2.7. The wavelet fingerprint has time on the horizontal axis and wavelet scales on the vertical axis. We've deliberately shown this at a "resolution" where the pixelated nature of the wavelet fingerprint is obvious. This is important because each of the pixels is either black or white: it is a binary image. The problem has thus been transformed from a one-dimensional signal identification problem into a 2D image recognition scenario. The power of the dynamic wavelet fingerprint (DWFP) technique is that it converts the time-series data into a binary matrix that is easily stored and transferred, and is amenable to edge computing implementations. There is also robustness to the simple algorithm (Fig. 2.8), since different mother wavelets emphasize different features in the signals.
The last piece of the DWFP technique is recognition of the binary image features that correspond to the waveform features of interest. In our applications, we have found that different modes show up as distinct features. We've also found that using a simple ridge-counting algorithm on the 2D images is often a helpful way to identify some of the features of interest. In Fig. 2.9, we show a small portion of a fingerprint image, blown up so the ridge counting at each time sample can be seen. Figure 2.10 shows longer fingerprints for two waveforms, with and without a flaw in the location indicated by the dashed rectangle. In this particular case, the flaw is identified by thresholding the ridge-count metric, as indicated by the bottom panel. Once such a feature has been identified in the time-scale space, we know its arrival in the time domain as well, and we can then draw conclusions about its location based on our knowledge of that guided wave mode's velocity.

Fig. 2.7 A visual summary of the DWFP algorithm [134]. A time-domain signal for which a set of wavelet coefficients is generated via the continuous wavelet transform. The coefficients are then "thickly" sliced and projected onto the time-scale plane, resulting in two-dimensional binary images, shown here with white peaks and gray valleys for distinction

The inherent advantage of traditional TFRs is thus to transform a one-dimensional time-trace signal into a two-dimensional image, which then allows powerful image processing methods to be brought to bear. The false-color images also happen to be visually appealing, but this turns out to be somewhat of a disadvantage when the goal is to automatically identify the features in the image that carry the information about the flaw(s). High-resolution color imagery is computationally expensive to both store and process, and segmentation is always problematic. This latter issue is particularly difficult for SHM because it's going to be something about the shape of the flaw signal(s) in the TFR image that we're searching for automatically via image processing algorithms. A binary image requires much less computer storage than does a grayscale or color image, and segmentation isn't an issue because it's a trivial matter to decide between black and white. These sorts of fingerprint images can be formed from any TFR, of course, although wavelets seem to work quite well for guided wave ultrasonics and the variety of applications that we have investigated. Figure 2.10 shows two such cases, with and without flaws.
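The ridge-counting metric of Figs. 2.9 and 2.10 reduces to counting connected runs of white pixels down each column of the binary image; a minimal sketch, with the three-ridge threshold purely illustrative:

% Ridge counting on a binary fingerprint image I (rows = scales, cols = time).
% A ridge at a given time sample is a connected run of white (1) pixels,
% so counting 0->1 transitions down each column counts the ridges.
[nscales, ntimes] = size(I);
ridges = zeros(1, ntimes);
for t = 1:ntimes
    col = [0; I(:, t)];                 % pad so a ridge at row 1 is counted
    ridges(t) = sum(diff(col) == 1);    % number of white runs in this column
end
flagged = ridges >= 3;                  % e.g., threshold the metric, as in Fig. 2.10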
Dynamic Wavelet Fingerprint Algorithm
Nondestructive Evaluation Laboratory – William & Mary Applied Science Department

The code at the bottom is a MATLAB algorithm to create a 2D fingerprint image from a 1D signal. This is the same algorithm used in the Wavelet Fingerprint Tool Box, and it can easily be implemented in any programming language that can perform a continuous wavelet transform. The following table describes the variables passed to the function.

datain       The raw 1D signal from which the wavelet fingerprint is created.
wvt          The name of the mother wavelet. For example: 'coif2', 'sym2', 'mexh', 'db10'.
ns           The number of scales to use in the continuous wavelet transform (start with 50).
numridges    The number of ridges used in the wavelet fingerprint (start with 5).
rthickness   The thickness of the ridges, normalized to 1 (start with 0.12).

The output variable fingerprint contains the wavelet fingerprint image. It is an array (ns by length(datain)) of 1's and 0's, where the 1's represent the ridgelines. The following is a sample call of this function.

>> fingerprint = getfingerprint( rawdata, 'coif3', 50, 5, 0.12 );

MATLAB Wavelet Fingerprint Algorithm

function [ fingerprint ] = getfingerprint( datain, wvt, ns, numridges, rthickness )
% Get continuous wavelet transform coefficients and normalize them to [-1,1]
cfX = cwt(datain, 1:ns, wvt);
cfX = cfX ./ max(max(abs(cfX)));
% Set image size and initialize all values to zero
fingerprint = zeros(ns, length(datain));
% rlocations is an array that holds the center of each slice (ridge)
rlocations = [-1:(1/numridges):-(1/numridges) (1/numridges):(1/numridges):1];
for sl = 1:length(rlocations)              % loop through each slice
    for y = 1:ns                           % loop through scales
        for x = 1:length(datain)           % loop through time samples
            if cfX(y,x) >= rlocations(sl)-(rthickness/2) && ...
               cfX(y,x) <= rlocations(sl)+(rthickness/2)
                fingerprint(y,x) = 1;      % set ridge pixels to white
            end
        end
    end
end

Fig. 2.8 The simple wavelet fingerprint algorithm is only a few lines of code in MATLAB

2.6 Machine Learning with Wavelet Fingerprints

As candidate TFRs are considered and binary fingerprint images are formed from them, the task is to down-select those that seem to best highlight the signal features of interest while minimizing the clutter from signal features that aren't of interest. We typically perform this in an interactive manner, since we've implemented the wavelet fingerprint method via a MATLAB GUI which allows us to easily read in signals, perform de-noising and windowing as needed, and then form wavelet fingerprints from any one of the 36 mother wavelets built into the MATLAB wavelet toolbox, selecting a variety of other parameters/options particular to the method. This works well because the human visual system is naturally adapted to identifying features in this sort of binary imagery (think of reading messy handwriting), and we're interested initially in naming features and describing qualitatively how the fingerprint features change as the flaws or other physical features in the measurement setup are varied. For example, a triangular feature might be the signature of a particular kind of flaw, with the number of ridges indicating the severity of the flaw and the position in time corresponding to the flaw's physical location. A feature of interest might instead be a circle or an oval, with the "circularity" or eccentricity and orientation as quantitative measures of severity. The point is that once such features are identified, it is a relatively simple matter to write algorithms to track them in fingerprint images. Indeed, there is quite a large literature on fingerprint classification algorithms, dating back to the telegraph days when fingerprint images were manually converted to strings of letters and numbers so they could be transmitted over large distances to attempt to identify criminal suspects [167].
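One way to do the down-selection described above is simply to loop the getfingerprint routine of Fig. 2.8 over candidate settings and inspect the montage; the candidate lists below are arbitrary, and rawdata is a placeholder signal:

% Sweep candidate mother wavelets and slice counts for visual down-selection.
wavelets = {'coif3', 'db10', 'sym2', 'mexh'};   % candidate mother wavelets
nridges  = [3 5 8];                             % candidate slice counts
for iw = 1:numel(wavelets)
    for ir = 1:numel(nridges)
        fp = getfingerprint(rawdata, wavelets{iw}, 50, nridges(ir), 0.12);
        subplot(numel(wavelets), numel(nridges), (iw-1)*numel(nridges)+ir);
        imagesc(fp); colormap(gray); axis off;
        title(sprintf('%s, %d ridges', wavelets{iw}, nridges(ir)));
    end
end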
For very subtle flaws, i.e., small cracks in large structures, it may not be possible to find simple, readily identifiable fingerprint features in any TFR, whether Fourier or wavelet-based. That isn't an insurmountable problem, of course, because the transformation of one-dimensional waveforms into two-dimensional TFR images allows a variety of image characterization and pattern classification approaches to be employed. Previous researchers have made use of the wavelet transform for pattern classification applications [168, 169]. One option is to integrate wavelets directly by capitalizing on the orthogonality property of wavelets to estimate the class density functions [170]. However, most applications of wavelets to pattern recognition focus on feature extraction techniques [171]. One common method involves finding the wavelet transform of a continuous variable (sometimes a signal) and computing the spectral density, or energy, which is the square of the coefficients [172, 173]. Peaks of the spectral density or the sum of the density can be used as features; these have been applied to flank wear estimation in turning processes and classifying diseased lung sounds [172], as well as to evaluating astronomical chirps and the equine gait [173]. This technique is similar to finding the cross-correlation but is only one application of wavelets to signal analysis. One alternate method is to deconstruct the signal into an orthogonal basis, such as Laguerre polynomials [174]. Another technique is the adaptive wavelet method [175–178], which stems from multiresolution analysis [179]. Multiresolution analysis applies a wavelet transform using an orthogonal basis, resulting in filter coefficients in a pyramidal computation scheme, while adaptive wavelet analysis uses a generalized M-band wavelet transform to similarly achieve decomposition into coefficients and insert those coefficients into matrix form for optimization. Adaptive wavelets result in efficient compression and have the advantage of being widely applicable. Wavelet methods are also often applied for feature extraction in images, such as for shape characterization and boundary detection [180, 181]. Wavelets are particularly useful for detecting singularities, and in 2D data spaces this results in an ability to identify corners and boundaries. Shape characterization itself is a precursor to template matching in pattern classification, in which outlines of objects are extracted from an image and matched to known shapes from a library. Other techniques are similar to those described above, including multiresolution analysis, which is also similar to the image processing technique of matched filters [182–186]. Either libraries of known wavelets or wavelets constructed from the original signal are used to match the signal of interest. Pattern recognition then proceeds in a variety of ways from the deconstructed wavelet coefficients. The coefficients with minimal cost can be used as features, or the results from each sequential step can be correlated individually. To reduce dimensionality, projection transforms are sometimes used as precursors to decomposition. Some authors have constructed a rotationally invariant projection that deconstructs an image into sub-images and transforms the mathematical space from 2D to 1D [187, 188]. Also, constructing a set of orthonormal bases, just as in the case of adaptive wavelets above, remains useful for images [189].
Because of the dimensionality, computing the square of the energy is cumbersome, and so a library of codebook vectors is necessary for classification [190]. There have been many successful applications of pattern recognition in ultrasound, some of which include wavelet feature extraction methods. In the industrial arena, Tansel et al. [191, 192] selected coefficients from wavelet decomposition for feature vectors and were able to use pattern recognition techniques to detect tool failure in drills. Learned and Willsky [193] constructed a wavelet packet approach, in which energy values are calculated from a full wavelet decomposition of a signal, to detect sonar echoes from submarines. Wu and Du [194] also used a wavelet packet description but found that feature selection required knowledge of the physical space, in this case, drill failure. Case and Waag [195] used Fourier coefficients instead of wavelet coefficients for features that successfully identified flaws in pipes. Comparing several techniques, including features selected from the wavelet, time, and spectral domains, Drai et al. [196] identified welding defects using a neural network classifier. Buonsanti et al. [197] compared ultrasonic pulse-echo and eddy current techniques to detect flaws in plates using a fuzzy logic classifier and wavelet features relevant to the physical domain. In medical diagnostics, an early example of applying classification techniques is Momenan et al.'s work [198], in which features selected offline are used to identify changes in tissues as well as clustering in medical ultrasound images. Bankman et al. [199] classified neural waveforms successfully with the careful application of preprocessing techniques such as whitening via autocorrelation. Meanwhile, Kalayci et al. [200] detected EEG spikes by selecting eight wavelet coefficients from two different Daubechies wavelet transforms for application in a neural network. Tate et al. [201] similarly extracted wavelet coefficients as well as other prior information to attempt to identify vegans, vegetarians, and meat-eaters by their magnetic resonance spectra. Interestingly, vegetarians were more difficult to identify, with a classification accuracy of 80%. Mojsilovic et al. [202] applied wavelet multiresolution analysis to identify infarcted myocardial tissue in medical ultrasound images. Georgiou et al. [203, 204] used wavelet decomposition to calculate scale-averaged wavelet power up to a threshold and detected the presence of breast cancer in ultrasound waveforms by means of hypothesis testing. Also using multiresolution techniques, Lee et al. [205] further selected fractal features to detect liver disease. Alacam et al. [206] improved existing breast cancer characterization of ultrasonic B-mode images by adding fractional differencing and moving average polynomial coefficients as features. We have applied pattern classification techniques to mine roof-fall prediction [207] and mobile robotic navigation [208], in addition to ultrasonic identification of structural flaws via detection of abstract features in dynamic wavelet fingerprints [209–214]. The goal of pattern recognition is to classify different objects into categories (classes) based on measurements made on those objects (features). Pattern recognition is a subset of machine learning, in which computers use algorithms to learn from data. Statistical pattern recognition, in particular, requires a statistical characterization of the likelihood of each object belonging to a particular class.
Pattern recognition usually proceeds in two different ways: unsupervised pattern recognition draws distinctions or clusters in the data without taking into account the actual class labels, while supervised pattern recognition uses the known class labels along with the statistical characteristics of the measurements to assign objects to the class that minimizes the error. We will focus our attention on supervised pattern classification. In applying pattern recognition techniques to finite, real-world datasets, merely classifying the data is not sufficient. To train the classifier and also evaluate its performance, we have to withhold some of the data to test the classifier. A classifier that is tested on a subset of the data it was trained on may appear to perform very well, but that tells us nothing about unseen data, and an overtrained classifier fits its training set at the expense of generalization. Therefore, for finite, real-world datasets, the available observations need to be split into a subset used for training and a subset used for testing. Pattern classification is the subset of machine learning that involves taking in raw data and grouping it into categories. Many excellent textbooks exist on the subject [215–220], and several review papers have explored the topic as well [221–223]. Emerging applications in the fields of biology, medicine, financial forecasting, signal analysis, and database organization have resulted in the rapid growth of pattern classification algorithms. We focus our research here on statistical pattern recognition, but other approaches exist, including template matching, structural classification, and neural networks [223]. Statistical pattern classification represents data by a series of measurements, forming a one-dimensional feature vector for each individual data point. A general approach for statistical pattern classification includes the following steps: preprocessing, feature generation, feature extraction/selection, learning, and classification. Preprocessing involves any segmentation and normalization of the data that leads to a compact representation of the pattern. Feature generation involves creating the feature vector from each individual pattern, while feature extraction and selection are optional steps that reduce the dimension of the feature vector using either linear transformations or the direct removal of redundant features. The learning step involves training a given classification algorithm, which then outputs a series of decision rules based on the data supplied. Finally, new data points are supplied to the trained classifier during the classification step, where they are categorized based on their own feature vectors relative to the defined decision rules. There is a theorem in pattern classification known as the Ugly Duckling Theorem, which states that in the absence of assumptions there is no "best" feature representation for a dataset, since assumptions about what "best" means are necessary for the choice of features [224]. Appropriate features for a given problem are usually unknown a priori, and as a result, many features are often generated without any knowledge of their relevancy [225]. This is frequently the motivation behind generating a large number of features in the first place.
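As a concrete instance of the train/test protocol described above, the following sketch assumes MATLAB's Statistics and Machine Learning Toolbox; the feature matrix X (one row per object, one column per feature) and the label vector Y are hypothetical placeholders, and the k-nearest-neighbor classifier is just one convenient choice:

% Hold-out evaluation sketch: never score a classifier on its training data.
cvp = cvpartition(Y, 'HoldOut', 0.3);          % stratified 70/30 split
Xtr = X(training(cvp), :);  Ytr = Y(training(cvp));
Xte = X(test(cvp), :);      Yte = Y(test(cvp));
mdl = fitcknn(Xtr, Ytr, 'NumNeighbors', 5);    % any classifier could go here
Yhat = predict(mdl, Xte);
accuracy = mean(Yhat == Yte);                  % measured on unseen data only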
If a specific set of features is known that completely defines the problem and accurately represents the input patterns, then there is no need for any reduction in the feature space dimensionality. In practice, however, this is often not the case, and an intelligent feature reduction technique can simplify the classifiers that are built, resulting in both increased computational speed and reduced memory requirements. Aside from performance gains, there is a well-known phenomenon in pattern classification affectionately called the Curse of Dimensionality, which occurs when the number of objects to be classified is small relative to the dimension of the feature vector. A generally accepted practice in classifier design is to use at least ten times as many training samples per class as the number of features [226]. The dataset becomes sparse when represented in too high-dimensional a feature space, degrading classifier performance. It follows that an intelligent reduction in the dimension of the feature space is needed. There are two general approaches to reducing the feature set: feature extraction and feature selection. Feature extraction reduces the feature set by creating new features through transformations and combinations of the original features. Principal component analysis, for example, is a commonly used feature extraction technique. Since we are interested in retaining the original physical interpretation of the feature set, we opt not to use any feature extraction techniques in our analysis. There are three general approaches to feature selection: wrapper methods, embedded methods, and filter methods [227]. Wrapper methods use formal classification to rank individual feature space subsets, applying an iterative search procedure that trains and tests a classifier using different feature subsets for accuracy comparison. This continues until a given stopping criterion is met [228]. This approach is computationally intensive, and there is often a trade-off among algorithms between computation speed and the quality of the results produced [229–231]. Additionally, these methods have a tendency to overtrain, perfectly fitting the data in the training set at the cost of poor generalization performance [232]. Similar to wrapper methods, an embedded method performs feature selection while constructing the classification algorithm itself; the difference is that the feature search is intelligently guided by the learning process itself. Filter methods perform their feature ranking by looking at intrinsic properties of the data, without the input of a formal classification algorithm. Traditionally, these methods are univariate and therefore don't account for multi-feature dependencies.
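A filter method can be as simple as scoring each feature by a univariate separability measure computed from the data alone. The sketch below uses a two-class Fisher ratio, which is one reasonable choice rather than the criterion used in our work; X and the 0/1 label vector Y are again placeholders:

% Filter-method feature ranking via a univariate Fisher ratio (two classes).
% X is an n-by-d feature matrix, Y a length-n vector of labels 0/1.
x0 = X(Y == 0, :);  x1 = X(Y == 1, :);
score = (mean(x0) - mean(x1)).^2 ./ (var(x0) + var(x1) + eps);
[~, order] = sort(score, 'descend');       % best-separating features first
keep = order(1:10);                        % e.g., retain the top ten features
Xreduced = X(:, keep);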
Regardless of feature selection, once a feature set has been finalized, the selection of an appropriate classification algorithm depends heavily on what is known about the domain of the problem. If the class-conditional densities for the problem at hand are known, then Bayes decision theory can be applied directly to design a classifier. This, however, is rarely the case for experimental data. If the training dataset has known labels associated with it, then the problem is one of supervised learning. If not, then the underlying structure of the dataset is analyzed through unsupervised classification techniques such as cluster analysis. Within supervised learning, a further dichotomy exists based on whether the form of the class-conditional densities is known. If it is known, parametric approaches such as Bayes plug-in classifiers can be developed that estimate the missing parameters from the training dataset. If the form of the densities is unknown, nonparametric approaches must be used, often constructing the decision boundaries geometrically from the training data. Wavelets are very useful for analyzing time-series data because wavelet transforms allow us to keep track of the time localization of frequency components. Unlike the Fourier transform, which breaks a signal down into sine and cosine components to identify frequency content, the wavelet transform measures local frequency features in the time domain. One direct advantage the wavelet transform has over the fast Fourier transform is that the time information of signal features can be taken directly from the transformed space, without an inverse transform required. One common wavelet-based feature used in classification is generated by wavelet packet decomposition (WPD) [233]. Yen and Lin successfully classified faults in a helicopter gearbox using WPD features generated from time-domain vibration analysis signals [234]. The use of wavelet analysis for feature extraction has also been explored by Jin et al. [235], where wavelet-based features have been used for damage detection in polycrystalline alloys. Gaul and Hurlebaus also used wavelet transforms to identify the location of impacts on plate structures [236]. Sohn et al. review the statistical pattern classification models currently being used in structural health monitoring [237]. Most implementations involve identifying one specific type of "flaw," such as loose bolts or small notches, and utilize only a few specific features to separate the individual flaw classes within the feature space [238–240]. Biemans et al. detected crack growth in an aluminum plate using wavelet coefficient analysis of guided waves [241]. Legendre et al. found that even noisy electromagnetic acoustic transducer sensor signals were resolvable using the multi-scale method of the wavelet transform [242]. The wavelet transform has also been shown to outperform other traditional time–frequency representations in many applications. Zou and Chen compared the wavelet transform to the Wigner–Ville distribution for identifying a cracked rotor through changes in stiffness [243], finding the wavelet transform to be more sensitive to the variation and generally superior to the Wigner–Ville distribution. We have introduced the principles of using the DWFP wavelet transform technique for time-domain signal analysis. We next apply this technique to two real-world industrial structural health monitoring (SHM) applications. First, we explore how the DWFP technique can be utilized to distinguish dents and the resulting rear-surface cracks generated in aircraft-grade aluminum plates. We then show how wavelet fingerprints can be used to identify flaws in steel and aluminum naval structures.

2.7 Applications in Structural Health Monitoring

The study of guided waves, originally explored by Lord Rayleigh in 1885 while investigating the propagation of surface waves in a solid, e.g., as in earthquakes, has continued over the years [244]. The key technical challenge in using Lamb waves effectively is automatically identifying which modes are which in very complex waveform signals. There are usually several guided wave modes present at any given frequency-thickness value, often having overlapping mode velocities. Since each mode propagates with its own modal structure, modes traveling with the same group velocity result in a superposition of the individual displacement and stress components.
The resulting signals are inherently complex, and sophisticated analysis techniques need to be applied to make sense of them. In contrast to traditional bulk-wave ultrasonics, standard peak-detection approaches often fail for Lamb wave analysis.

Fig. 2.11 A plate with no flaw present, a plate with a shallow, large crack present, and a plate with both a dent and a crack present

2.7.1 Dent and Surface Crack Detection in Aircraft Skins

Aircraft manufacturers and operators have concerns over fuselage damage commonly called "ramp rash," which occurs in the lower fuselage sections and can be caused by incidental contact with ground equipment such as air stairs, baggage elevators, and food service trucks [245]. Ramp rash costs airlines in both damage repair and downtime [246]. Even minor damage can become the source of serious mechanical failure if left unattended [247]. The formation of dents in the fuselage is common, but the generation of hidden cracks can lead to serious complications. Repairing this subsequent damage is critical for aviation safety; however, the hidden extent of the damage is initially unknown. In order to accurately estimate repair requirements, and therefore reduce unnecessary downtime losses, an accurate assessment of the damage is required. A continuously monitoring inspection system at fuselage areas prone to impact would provide an alternative to conventional point-by-point inspections. Guided waves have previously shown potential for damage detection in metallic aircraft components. We test a guided wave-based SHM technique for identifying potential flaws in metallic plate samples. We employ the DWFP to generate time-scale signal representations of the complex guided waveforms, from which we extract subtle features to assist in damage identification. The samples tested were aircraft-grade 1.62-mm-thick aluminum 2024 plates (with primer), approximately 45 cm × 35 cm in size. Some of them had dents, and some also had simulated cracks of varying length and orientation, images of which are shown in Fig. 2.11. Experimental data was collected using transducers manufactured by Olympus.1 Signals were generated and received using a Nanopulser 3 (NP3) from Automated Inspection Systems.2 The NP3 integrates both a pulser/receiver and a digitizer into a portable USB-interfaced unit, emitting a spike excitation which relies on the resonant frequency of the transducers to generate ultrasonic waves with a broad frequency content, maximizing scattering interaction with the surface cracks. The compact size of the NP3 makes it favorable for in-field use in aircraft damage analysis. A graphical user interface was developed in MATLAB to control the NP3 system. We explore both a pitch/catch scanning configuration, where one transducer is the transmitter while a second transducer is the receiver, as well as a pulse/echo configuration, where one transducer acts as both the transmitter and receiver. We control the position of each transducer by attaching two individual transducers to linear slides, controlled by stepper motors that advance them along the edges of the test area. The test area was centered around the flaw, with the transducers placed 215 mm apart from each other and advancing through 25 individual locations in 8.6 mm increments for a total scan length of 215 mm.

1 Waltham, MA (http://www.olympus-ims.com/en/).
2 Martinez, CA (http://www.ais4ndt.com/).
Each waveform is run through the DWFP algorithm to generate a fingerprint representation. This process is demonstrated for a typical waveform in Fig. 2.12 and involves windowing the full signal around the first arriving wave mode, which is then used as the input signal for the DWFP transformation.

Fig. 2.12 A raw Rayleigh–Lamb waveform collected from an unflawed plate sample. The signal is first windowed to the region of interest, here the first mode arrival, from which a DWFP image is generated

Dent Detection—Pitch/Catch Scanning

We first present results from the pitch/catch scanning configuration. The first sample considered contained a crack only, and the raw waveforms can be seen in Fig. 2.13. It can be seen that these signals are all very similar to each other, and little information is readily available to identify the crack in the time domain. Figure 2.14 provides the DWFP representations of these same signals. For each individual DWFP image, a specific feature is identified and automatically tracked throughout the scan progression. The feature in this case is the first arriving mode, indicated by the first "peak" (in white) in time. For each following DWFP image, this feature is identified and highlighted by a red star (∗), with the corresponding position in time given to the right of each fingerprint image. This first plate sample again shows no variation in the DWFP representations as the transducers are moved along the edges of the plate. The feature of interest varies little in its position. This indicates that the pitch/catch scanning configuration is inadequate for identifying surface cracks in the material. The second plate scanned contained a dent only. Raw waveforms collected as the pair of transducers progress along the sample edges can be seen in Fig. 2.15. It can be seen again that these signals are all very similar to each other, with the time-domain representations unable to highlight the dent. Figure 2.16 provides the DWFP representations of these same signals. The same feature described above, the first arriving mode indicated by the first DWFP peak, is identified and tracked again. It can be seen that there is a region of the sample where this feature shifts, indicating the presence of a discontinuity in the propagation caused by the dent. It follows that the pitch/catch scanning configuration is adequate for identifying dents in the material.

Crack Detection—Pulse/Echo Scanning

We now use the same transmitting transducer in pulse/echo mode to determine whether any energy is being reflected from either type of flaw to aid in detection. We again present results for the two samples previously considered: one with a crack and one with a dent. We first present the raw waveforms collected from interaction with the crack only, shown in Fig. 2.17. The raw waveforms contain an initial high-amplitude portion that is a result of energy reflecting within the transducer itself. We window our search to the time frame where a reflection would be expected to exist if present. It can be seen that these signals are all very similar to each other, and all have a low signal-to-noise ratio, making it difficult to easily detect any features identifying a flaw.
Figure 2.18 provides the DWFP representations of these same signals. For each individual DWFP image, a specific feature is identified and automatically tracked throughout the scan progression. The feature in this case is the first peak "doublet" feature after the 65 μs mark, where a doublet is indicated by two fingerprint features existing for different scales at the same point in time. For each following DWFP image, this feature is identified and highlighted by a red star (∗), with the corresponding position in time given to the right of each fingerprint image. If no such feature is found, the end point of the window is used. We can see that the DWFP representations of the pulse/echo signals are able to identify waveforms that correlate with the position of the crack if present.

Fig. 2.13 Raw waveforms collected from a plate sample with a crack only, using a pitch/catch scanning configuration. Little information is readily available from the time-domain representation of the signal. The dotted lines indicate the windowed region used for the DWFP image generation

The second plate under consideration contains a dent only. Raw waveforms collected as the transducer progressed along an edge of the sample can be seen in Fig. 2.19. It is again clear that these signals are all very similar to each other, with the low signal-to-noise ratios making it difficult to easily extract any information from them. Figure 2.20 provides the DWFP representations of these same signals. The same feature described above, the first peak "doublet" feature after the 65 μs mark, is again identified and tracked. It can be seen that there is no region in this scan where the feature is identified, indicating that this pulse/echo approach is not sufficient to identify dents.

Fig. 2.14 DWFP representations of the pitch/catch raw waveforms shown in Fig. 2.13 from a sample with a crack only. For each individual DWFP image, a specific feature is identified and automatically tracked throughout the scan progression. The feature in this case is the first arriving mode, indicated by the first "peak" (in white) in time. For each following DWFP image, this feature is identified and highlighted by a red star (∗), with the corresponding position in time given to the right of each fingerprint image

For each of the DWFP images, we have identified and automatically tracked in time a specific feature within the image. We summarize the extracted time locations of these features in Fig. 2.21, where the extracted times are plotted against their position relative to the plate. Vertical dotted lines indicate the actual (known) locations of each flaw present. We can clearly see that the pitch/catch waveforms are able to identify the dent, while the pulse/echo waveforms are able to identify the crack. This follows from the concept that the surface crack is not severe enough to distort much of the propagating waveform, but still reflects enough energy to be identified back at the initial transducer location.

Fig. 2.15 Raw waveforms collected from a plate sample with both a dent and a crack, using a pitch/catch scanning configuration. Little information is readily available from the time-domain representation of the signal. The dotted lines indicate the windowed region used for the DWFP image generation
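The automated tracking in Figs. 2.18 and 2.20 can be pictured as scanning the binary fingerprint for the first column, after a chosen start time, whose ridge count reaches two. A minimal sketch, where the binary image I, its time axis t, and the 65 μs start follow the description above and everything else is illustrative:

% Track the first "doublet" feature in a binary fingerprint image I,
% i.e., the first time sample after tstart with two or more ridges.
tstart = 65e-6;                              % begin searching after 65 us
cols = find(t >= tstart);                    % t is the time axis of the image
arrival = t(end);                            % default: end of window if no doublet
for x = cols(:)'
    col = [0; I(:, x)];
    if sum(diff(col) == 1) >= 2              % ridge count >= 2 means a doublet
        arrival = t(x);                      % feature arrival time, as reported
        break                                %   beside each fingerprint image
    end
end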
Crack Angle Dependence

In order to determine how the angle of the crack affects the reflected signal's energy with respect to the incident wave angle, an angle-dependent study was performed. Samples containing a large crack only, a dent without a crack, and a dent with a large crack were included. A point in the center of the flaw (either the center of the dent or the center of the crack) was chosen as the center of rotation, and the Rayleigh–Lamb transducers were placed 10 cm from this center point and stepped in 10° increments around the point of rotation, covering at least 120° from the starting location. A pulse/echo measurement was taken at each location to measure any energy reflected from the flaw.

Fig. 2.16 DWFP representations of the pitch/catch raw waveforms shown in Fig. 2.15 from a sample with a dent only. For each individual DWFP image, a specific feature is identified and automatically tracked throughout the scan progression. The feature in this case is the first arriving mode, indicated by the first "peak" (in white) in time. For each following DWFP image, this feature is identified and highlighted by a red star (∗), with the corresponding position in time given to the right of each fingerprint image

Two features were extracted from the recorded pulse/echo signals: the arrival time of the reflected signal and the peak instantaneous amplitude of that reflection. The instantaneous amplitude is calculated by first taking the discrete Hilbert transform of the signal s(t), which returns a copy of the original signal phase-shifted by 90° while preserving its amplitude and frequency content. The magnitude of the analytic signal formed from the original signal and this phase-shifted copy is the instantaneous amplitude, another name for the signal's envelope. The maximum of this instantaneous amplitude is what will now be referred to as the "peak energy" of the signal.

Fig. 2.17 Raw waveforms collected from a plate sample with a crack only, using a pulse/echo scanning configuration. The low signal-to-noise ratio makes it difficult to analyze these raw time-domain signals. The dotted lines indicate the windowed region used for the DWFP image generation

The first sample, which did not have a crack present, did not return any measurable reflected energy at any angle. Both of the samples with cracks show a measurable reflection from the crack when the incident angle is normal to the crack, regardless of whether or not a dent is present. Incident angles 0–20° from normal still produced measurable reflection energy; incident angles beyond that did not produce a significant measurable reflection. These results agree with the expectation that the cracks would be highly directional in their detectability (Fig. 2.22).

Fig. 2.18 DWFP representations of the pulse/echo raw waveforms shown in Fig. 2.17 from a sample with a crack only. For each individual DWFP image, a specific feature is identified and automatically tracked throughout the scan progression. The feature in this case is the first peak "doublet" after the 65 µs mark, where a doublet is indicated by two fingerprint features existing at different scales at the same point in time. For each following DWFP image, this feature is identified and highlighted by a red star (∗), with the corresponding position in time given to the right of each fingerprint image
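In Python, the two reflection features just defined can be computed directly from the analytic signal; this short sketch simply restates the envelope definition above using SciPy's hilbert function (the sampling rate is whatever the digitizer used).

```python
# Envelope-based reflection features: the envelope is the magnitude of
# the analytic signal s(t) + j*H{s}(t), and "peak energy" is its maximum.
import numpy as np
from scipy.signal import hilbert

def reflection_features(s, fs):
    envelope = np.abs(hilbert(s))           # instantaneous amplitude
    peak_energy = envelope.max()            # the "peak energy" feature
    arrival_time = envelope.argmax() / fs   # time of the envelope peak (s)
    return arrival_time, peak_energy
```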
2.7.2 Corrosion Detection in Marine Structures

Structural health monitoring is an equally important area of research for the world's navies. From corrosion due to constant exposure to harsh saltwater environments, to the more recent issue of sensitization due to the cyclic day/night heating experienced on the open water, maritime vessels are in constant need of repair. The biggest maintenance cost for these ships is often having to pull them out of service in order to characterize and repair any and all damage present. Navies and shipyards are actively researching intelligent monitoring systems that will provide constant feedback on the structural integrity of areas prone to this damage.

Fig. 2.19 Raw waveforms collected from a plate sample with a dent only, using a pulse/echo scanning configuration. The low signal-to-noise ratio makes it difficult to analyze these raw time-domain signals. The dotted lines indicate the windowed region used for the DWFP image generation

Lamb waves provide a natural approach to identifying corrosion in metals. Each mode's group velocity depends on both the inspection frequency used and the material thickness. Since corrosion can be thought of as a local material loss, it follows that mode velocities will change when passing through a corroded area. They will either speed up or slow down, depending on the modes and the frequency-thickness product being used. Often several modes will propagate simultaneously within a structure and will overlap each other if they share a similar group velocity. As long as a frequency-thickness regime is chosen so that a single mode is substantially faster than the rest and therefore arrives earlier, most of the slower, jumbled modes can be windowed out of the signal. Key to utilizing Lamb waves for SHM is understanding this dispersion-curve behavior, since this is what allows arrival-time shifts to be correlated directly to material thickness changes.

Fig. 2.20 DWFP representations of the pulse/echo raw waveforms shown in Fig. 2.19 from a sample with a dent only. For each individual DWFP image, a specific feature is identified and automatically tracked throughout the scan progression. The feature in this case is the first peak "doublet" after the 65 µs mark, where a doublet is indicated by two fingerprint features existing at different scales at the same point in time. For each following DWFP image, this feature is identified and highlighted by a red star (∗), with the corresponding position in time given to the right of each fingerprint image

In order to reliably identify these arrival-time shifts in Lamb wave signals, we again employ the dynamic wavelet fingerprint (DWFP) technique. The patterns in DWFP images then allow us to identify particular Lamb wave modes and directly track subtle shifts in their arrival times.

Apprentice shipbuilders fabricated a "T"-shaped plate sample made of 9.5-mm (3/8-in.)-thick mild steel for testing, as shown in Fig. 2.23. The sample was first ground down slightly in several different areas on both the top and bottom surfaces to simulate the effects of corrosion. The sample was then covered in yellow paint, and a 1-in.-thick green foam layer was bonded to both the top and bottom surfaces. The sample was intended to be representative of bulkhead sections.
Fig. 2.21 Extracted time locations of the features identified in each of the DWFP representations (feature arrival time in µs versus position in mm) for the sample with a crack only using pitch/catch and pulse/echo waveforms, as well as those from the sample with a dent only using pitch/catch and pulse/echo waveforms. We see that the pitch/catch scanning configuration is able to identify any dents present, while the pulse/echo configuration is better at identifying any cracks present

Fig. 2.22 Peak energy versus incident angle (0–120°) for three samples. A plate sample with a dent only: results show no reflection above the noise level at any angle, due to no crack being present. A plate sample with a dent and a large crack: results show significantly higher reflection energy when the Rayleigh–Lamb waves were incident at an angle close to broadside. A plate sample with a large crack only: results show the highest reflection energy at normal incidence to the crack

Shear-wave contact transducers in a parallel pitch/catch scanning configuration were used to systematically scan the full length of the T-plate. A Matec³ TB1000 pulser/receiver was paired with a Gage⁴ CS8012a A/D digitizer to collect data. We selected a 0.78 MHz tone-burst excitation pulse for this inspection, giving a frequency-thickness product of 7.4 MHz·mm. This corresponds to a region of the dispersion curve where the S2 mode is fastest and its group velocity is at a maximum, as shown in Fig. 2.24. As the thickness of the plate decreases, i.e., in a corroded region, the frequency-thickness product will decrease, resulting in a shifted, slower S2 group velocity and therefore a later arrival time of the S2 mode.

³ Northborough, MA (http://www.matec.com/).
⁴ Lockport, IL (http://www.gage-applied.com/).

Fig. 2.23 A 9.5-mm (3/8-in.)-thick steel bulkhead sample used in this analysis. Several areas on the sample were ground down to simulate wastage or corrosion. The sample was then covered in yellow paint, and a 1-in.-thick insulating layer of green foam was bonded to the surface. In this picture, a section of the green foam has been removed over one of the several underlying thinned regions

Fig. 2.24 Group velocity dispersion curve for steel. At a frequency-thickness product of 7.4 MHz·mm, the S2 mode is the fastest and therefore first-arriving mode. If the wave propagates through an area of reduced thickness, the frequency-thickness product drops as well, resulting in a slower S2 mode velocity and therefore a later arrival time
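To make the mechanism concrete, here is a back-of-the-envelope sketch in Python. Every number below is an assumed placeholder rather than a value read off Fig. 2.24; the only point is that a drop in group velocity over part of the ray path shows up as a measurably later mode arrival.

```python
# Arrival-time shift from local thinning: t = d / v_g, so a lower
# group velocity over the path means a later S2 arrival. All values
# here are illustrative assumptions, not data from this study.
d = 0.50          # assumed transducer separation along the ray path (m)
v_nominal = 5200  # assumed S2 group velocity at full plate thickness (m/s)
v_thinned = 5000  # assumed slower S2 group velocity over a thinned patch (m/s)

t_nominal = d / v_nominal   # expected S2 arrival on an unflawed path
t_thinned = d / v_thinned   # later arrival when the path crosses thinning
print(f"S2 arrives {(t_thinned - t_nominal) * 1e6:.1f} us later on the thinned path")
```

A shift of a few microseconds, as in this toy calculation, is exactly the scale of the 132.0 µs versus 134.0 µs threshold used in the results below.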
Because the entire T-plate sample was covered in the thick foam insulation, we removed a narrow strip from the edges of the plate in order to have direct contact between the transducers and the material. The remaining foam was left on the sample for scanning. This insulation removal could be avoided if transducers were bonded to the material during the construction process. The transducers were stepped in parallel along opposing exposed edges of the T-plate in 1 cm steps through 29 total locations, covering the full length of the sample (projection 1). The sample was then rotated 90°, and the process was repeated for the remaining two edges (projection 2).

In order to extract mode arrivals from the raw waveforms, we used the DWFP technique. We first filtered the raw waveforms with a third-order Coiflet mother wavelet. We then used the DWFP to generate 2D fingerprint images, which are used to identify the features of interest in the signals, here the S2 mode arrival. Proper mode selection is especially important for this plate sample because of the thick rubbery coating on each surface. Some modes are strongly attenuated by such coatings, although we found that the S2 mode was able both to propagate well and to detect the thinning flaws.

Results

The raw waveforms were converted to DWFP fingerprint images for analysis. The mode of interest is the first-arriving S2, so the signals were windowed around the expected S2 arrival time in order to observe any changes due to damage. A raw waveform is shown in Fig. 2.25, with a windowed DWFP representation for both an unflawed ray path and one that passes through a flaw on the test sample. In each DWFP image, the red circle and corresponding red triangle indicate the automatically identified S2 arrival time, provided to the right of each image.

Fig. 2.25 A raw Lamb wave signal (top) is first windowed around the region of interest, here the first mode arrival time. DWFP images of this region can then be compared directly between unflawed (middle) and flawed (bottom) signals. A simple tracking of the mode arrival time can be applied, shown here as the red dot and triangle with the corresponding arrival time on the right of each image, allowing for the identification of any flawed regions

It was found that several areas along the length of the plate in each orientation had easily identifiable changes in S2 mode arrival time relative to the 132.0 µs arrival time of the unflawed waveforms. We can apply a simple threshold to these extracted mode arrival times: any waveform with an arrival time later than 134.0 µs is labeled "flawed", and any that arrives before this threshold is labeled "unflawed". Since we collected data along two orthogonal directions in projections 1 and 2, we can map the flawed versus unflawed ray paths geometrically and identify any hotspots where ray paths in both orientations indicate a suspected flaw. This is illustrated in Fig. 2.26, where "flawed" waveforms from each projection are indicated in gray, and any spatial areas that indicate "flaw" in both orientations are highlighted in red. We also provide a photograph of the actual sample with the foam insulation completely removed for final identification of the flawed regions.
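A minimal sketch of this threshold-and-intersect logic follows. The grid construction is a simplifying assumption (one grid row per ray path in projection 1 and one grid column per ray path in projection 2); the 134.0 µs cut-off is the value stated above.

```python
# Flaw mapping from two orthogonal projections: threshold the extracted
# S2 arrival times, then intersect the "flawed" ray paths on a grid.
import numpy as np

THRESH = 134.0e-6   # arrival later than this => "flawed" ray path

def flaw_map(arrivals_p1, arrivals_p2, thresh=THRESH):
    """arrivals_p1/p2: 1D arrays of S2 arrival times (s), one entry per
    ray path, for projections 1 and 2. Returns a 2D boolean hotspot map
    that is True where ray paths in BOTH orientations flag a flaw."""
    flawed1 = np.asarray(arrivals_p1) > thresh   # grid rows (projection 1)
    flawed2 = np.asarray(arrivals_p2) > thresh   # grid columns (projection 2)
    return np.outer(flawed1, flawed2)            # intersection = hotspot

# hotspots = flaw_map(arrivals_projection1, arrivals_projection2)
```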
Fig. 2.26 Spatially overlaying the individual ray paths onto a grid of the bulkhead sample and thresholding their first mode arrival time, we can see agreement between our experimentally identified "flawed" areas highlighted in red and the known thinned regions of the plate identified by the blue (top) and white (bottom) ellipses

Excellent agreement was found between the suspected flaw locations, indicated by the blue/white ovals in each subfigure, and the sample's actual flaws. It should be noted that these results were collected with the transducers stepping in parallel (remaining directly across from each other) in two orthogonal directions, allowing us to do a reasonably good job of localizing, and to some extent sizing, the flaws. Two of the expected flaw areas are over-sized, and there is one "ghost flaw" that isn't actually present but is an artifact of the "shadows" created by two other flaws. In order to more accurately localize and size flaws, it is necessary to incorporate information from Lamb wave ray paths at other angles. The formal way to do this is called Lamb wave tomography [114–125].

2.7.3 Aluminum Sensitization in Marine Plate-Like Structures

Al–Mg-based aluminum alloys suffer from intergranular corrosion and intergranular stress corrosion cracking when the alloy has become "sensitized" [248]. The degree of sensitization (DoS) is currently quantified by the ASTM G67 Nitric Acid Mass Loss Test. A nondestructive method for rapid detection and assessment of DoS in AA5XXX aluminum alloys in the field is needed [249].

Two samples of aluminum plate, approximately 2 × 3 in size, were provided for this study; both had been removed from a Naval vessel due to sensitization-induced cracking. Sample 1, which was 0.375 in. thick, can be seen in Fig. 2.27, and Sample 2, which was 0.25 in. thick, can be seen in Fig. 2.28. The degree of sensitization of each was unknown, although Sample 1 appeared to have a line of testing pits from a previous destructive sensitization measurement.

The guided wave apparatus used for this data collection is the Nanopulser 3 (np3), a pocket-sized, computer-controlled pulser/receiver and analog-to-digital (A/D) converter combination unit developed by AIS⁵, connected to a computer running MATLAB. The np3 generates a spike excitation voltage, which is converted to mechanical energy by a piezoelectric transducer in contact with the sample surface. The resonant frequency of the transducer determines the frequency content of the ultrasonic waves that are generated. These Lamb waves then propagate throughout the sample while the receiving transducer converts the vibrations back into a voltage, which is then sampled and recorded via the on-board A/D. The transducer pairs used for the data shown here were Panametrics V204 2.25 MHz contact transducers made by Olympus, although a wide selection of transducer configurations was available for use with this apparatus.

Ultrasonic C-scans of each of the samples were performed prior to Lamb wave scanning, with a 10.0 MHz unfocused (immersion) transducer stepped point-by-point in a raster pattern above the area of interest, recording the pulse-echo ultrasonic waveform at each location. These signals are then gated around the reflection of interest—i.e., from the top surface, the bottom surface, or a defect in between—and a map of that reflection's time delay can be reconstructed.
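The gating step is simple to express in code. This sketch is an assumed, minimal version: for each grid point's A-scan, it gates a time window around the reflection of interest and extracts both the reflection's time-of-arrival and its maximum amplitude (the two C-scan representations used below). The gate limits are placeholders for whatever reflection is being mapped.

```python
# C-scan feature extraction: gate each pulse-echo A-scan and pull out
# a time-of-arrival map and a maximum-amplitude map for the grid.
import numpy as np

def cscan_maps(waveforms, fs, gate=(10e-6, 14e-6)):
    """waveforms: array (ny, nx, nt) of pulse-echo A-scans on the raster
    grid; fs: sampling rate (Hz); gate: (t0, t1) window in seconds.
    Returns (time-of-arrival map, max-amplitude map)."""
    i0, i1 = (int(t * fs) for t in gate)
    gated = np.abs(waveforms[..., i0:i1])
    amp_map = gated.max(axis=-1)                 # reflection strength
    toa_map = (gated.argmax(axis=-1) + i0) / fs  # reflection delay (s)
    return toa_map, amp_map
```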
Another possible feature that can be extracted is the maximum amplitude of the reflected signal. Many more exist, each offering its own advantages. Two examples, maximum amplitude and top-surface reflection time-of-arrival, can be seen in Figs. 2.29 and 2.30 for two regions of interest on one of the samples: a crack and a bottom-surface welded attachment, respectively.

⁵ Automated Inspection Systems, Martinez, CA (http://ais4ndt.com/index.html).

Fig. 2.27 Photographs of Sample 1 showing the top and bottom surfaces. The bottom surface has a vertical line of small pits, which are thought to be testing points from a previous destructive sensitization measurement, although no information about the results of those tests was made available for this study

Due to the large size and curvature of the two samples, they were each scanned in smaller regions, and the individual C-scans for each region were stitched together in the post-scan analysis to form composite images. As this technique uses a point-by-point measurement, each individual section contained a 61 × 61 grid of waveform recordings (3721 waveforms) and took approximately 35 min to complete. Each of the two samples therefore took roughly 12 h to fully scan in an automated, continuous manner. Post-processing to form the composite C-scans was done interactively and was also quite time-consuming. Results can be seen in Figs. 2.31 and 2.32. The Sample 1 results highlight both the macro-scale top-surface features of Fig. 2.27 and more subtle and reverse-side features. Results for Sample 2 also highlight the obvious macro-features, but don't return any unexpected features that weren't visible to the naked eye.

Once the two samples had been characterized by traditional C-scan techniques, guided waves were used to interrogate a variety of the features found on the samples. Unlike the individual point-by-point pulse/echo measurements collected in the C-scan, guided waves propagate along the plate samples between pairs of transducers, which can be widely separated. This allows for wave interactions with structural features and/or flaws along the entire "ray path" between two transducers arranged in a pitch/catch setup. Because of this, only a handful of waveforms are needed to interrogate a large region of interest. This rapid, large-area inspection capability is the inherent advantage of guided waves. No special surface treatment or coupling is required, and the transducers only need direct contact with the surface at the pitch/catch locations. Between the transducers, the surface can be covered or otherwise inaccessible, and the transducers can contact the plate from whichever surface is most convenient.

Fig. 2.28 Photographs of Sample 2 showing the top and bottom surfaces. The top surface has the original paint still attached, while the bottom surface has the remnants of attached insulation in places

Sample 1 had three regions where guided waves were used to characterize an existing feature, as can be seen in Fig. 2.33: an L-shaped surface weld attachment, a through-thickness crack, and a rear weld located on the opposite side of the sample. All three regions represent different types of features that the through-thickness profiles of guided waves will interact with. Sample 2 had two regions where guided waves were used, as shown in Fig. 2.34: an area chosen to demonstrate the distance guided waves are able to propagate, as well as the fact that they follow the curvature of the plate, and an area where a through-hole had been poorly repaired.
Results that follow are presented via both raw waveforms and DWFP representations, and we will see that the DWFPs allow for easier recognition of subtle differences between the raw signals.

Fig. 2.29 C-scan results from a subsection of Sample 1 that contained a through-thickness crack, using a time-of-arrival extraction technique, where the color relates to the height of the top surface of the sample. The same data was processed to extract a maximum amplitude value for each signal, where the color essentially relates to the strength of the reflected signal. Areas where the material was not consistent or was angled oddly show up as lower energy reflections. The photograph shows the actual scanned area highlighted by the green dotted line, as well as the visible crack, traces of which are seen in both of the C-scan representations

Guided Wave Results—Surface Weld

The first region inspected using guided waves, a surface weld located on Sample 1, included an obvious surface feature in the shape of an L where an attachment had previously been removed, as can be seen in Fig. 2.35. Transmitting and receiving transducers were placed 17 cm apart on opposite sides of the feature and stepped horizontally across the region in 2 cm steps, recording a total of 11 waveforms. Even positioning the transducers manually, this 340 cm² region took only around one minute to cover using guided waves, significantly shorter than the 40+ min required for an automated C-scan of the same area. Visually, it appears that waveforms 04–08 pass through, or immediately next to, this L feature.

Fig. 2.30 C-scan results from a subsection of Sample 1 that contained a weld on the reverse side of the plate, using a time-of-arrival extraction technique, where the color relates to the height of the top surface of the sample. The same data was processed to extract a maximum amplitude value for each signal, where the color essentially relates to the strength of the reflected signal. The weld on the reverse side of the plate scatters the incident wave, resulting in a lower energy reflection. The photograph shows the actual scanned area highlighted by the green dotted line. The weld is not visible from this side of the plate; however, it clearly shows up in the maximum amplitude C-scan image

The raw waveforms from the surface weld region can be seen in Fig. 2.36. In this case, it is not too difficult to see that waveforms 5 and 6 are lower in amplitude than the surrounding waveforms; these correspond to the two ray paths that travel directly through the L-shaped surface feature. These signals were windowed around the Lamb wave arrival time, as identified by the dotted lines in Fig. 2.36, and DWFP representations were generated, as shown in Fig. 2.37.

Fig. 2.31 Composite C-scan representation of Sample 1 using a time-of-arrival extraction technique. With point-by-point pulse-echo ultrasound, the time-of-arrival of the reflection measures the height of the top surface of the plate, which shows up as variations in color intensity in this image. All the surface features can be easily identified using this method; however, features beneath the surface or on the reverse side do not show up. An alternate representation of the C-scan data is obtained by extracting the maximum signal amplitude, which relates the color intensity to the reflected signal energy. In this image, not only are the surface features visible, but the two large welds on the reverse side are detectable as well. It should be noted that the curvature in these representations is largely an artifact of the section-by-section stitching algorithm and is not necessarily to scale
Fig. 2.32 Composite C-scan representation of Sample 2 using a time-of-arrival extraction technique. With point-by-point pulse-echo ultrasound, the time-of-arrival of the reflection measures the height of the top surface of the plate, which shows up as variations in color intensity in this image. The surface features can be easily identified using this method; however, the gradual thinning section is hard to identify from this image alone. An alternate representation of the C-scan data is obtained by extracting the maximum signal amplitude, which relates the color intensity to the reflected signal energy. In this image, not only are the surface features visible, but the gradual thinning region shows up more clearly as well. It should be noted that the curvature in these representations is largely an artifact of the section-by-section stitching algorithm and is not necessarily to scale

Fig. 2.33 Regions of Sample 1 where guided wave data was collected, including a surface weld, a through-thickness crack, and a weld located on the reverse side of the plate. Each region was inspected individually using guided waves

Fig. 2.34 Regions of Sample 2 where guided wave data was collected, including a full-length region to highlight the guided waves' ability to propagate long distances and follow plate curvature, as well as a section that included a thinned region with a through-hole

The red arrow in Fig. 2.37 indicates a "double-triangle" signal feature that changes with respect to the location of the L-shaped feature. The feature vanishes completely in signals 5 and 6, and appears slightly different in waveforms 4 and 7, corresponding to the ray paths that propagate along the edges of the L, so that the Lamb waves interact only weakly with it. By tracking this feature, the guided waveforms allow a relatively complete interrogation of the full 340 cm² region around the surface weld, interacting strongly with the surface L in the center of the test area.

Fig. 2.35 Picture of the L-shaped surface weld on Sample 1 with an overlay of the ray paths used in the guided wave inspection. A transmitting transducer generating a signal was placed at one end of ray path 01 with a receiving transducer to record the signal placed at the other. Both transducers were then moved to ray path 02, and the process was repeated across the weld region

2.7.4 Guided Wave Results—Crack

A through-thickness crack near the edge of Sample 1 can be seen in Fig. 2.38. Visually, the crack appeared to have propagated a few centimeters, but the actual end location of the crack was not known. Lamb waves are quite sensitive to cracks, since through-thickness cracks transmit very little of the Lamb wave signal. Transmitting and receiving transducers were placed 5 cm apart, on either side of the crack, and stepped parallel to the length of the crack in 2 cm increments. The raw waveforms from the region along the crack can be seen in Fig. 2.39.
Since ultrasonic vibrations at MHz frequencies do not travel well through air, any cracks or similar discontinuities in the material will reflect almost all of the wave energy. It follows that a receiving transducer will pick up essentially zero signal so long as a full-thickness crack lies directly between it and the transmitting transducer. This is seen in waveforms 01–03. If the crack extends through only part of the plate thickness, then some wave energy can propagate through the still-intact portion of the material. This can be seen in waveform 04, where a very low amplitude signal was recorded. The signal gains amplitude in waveforms 05 and 06, indicating that the crack is still present but less severe, and waveforms 07–09 appear to have no crack present. The DWFP representations reinforce these interpretations, as can be seen in Fig. 2.40. These results indicate that the extent of the crack is greater than visually apparent, i.e., it extends several centimeters past where it is obvious to the eye. Since guided waves propagate with a full-thickness displacement profile, they are useful for identifying and locating hidden cracks.

Fig. 2.36 Raw waveforms corresponding to ray paths covering a region with the L-shaped surface weld. The raw waveforms are difficult to interpret, with no clear indication of a weld present in any of the ray paths other than a slight reduction in amplitude in signals 5 and 6. Windowed sections of these signals, defined by the dotted lines, were used to generate DWFP fingerprints, as shown in Fig. 2.37

Fig. 2.37 DWFP representations of the raw waveforms covering an area with the L-shaped surface weld. A column beneath the red triangle highlights the location of a "double-triangle" signal feature which can be used to identify the surface weld. As the ray paths cross the surface weld, this double-triangle feature first loses amplitude in signal 4, and then disappears completely in signals 5 and 6, only to reappear once the ray path has crossed the weld. By tracking this signal feature, the guided waveforms allow a complete interrogation of the full 340 cm² region around the surface weld

Fig. 2.38 Picture of the through-thickness crack on Sample 1 with an overlay of the ray paths used in the guided wave inspection. A transmitting transducer generating a signal was placed at one end of ray path 01 with a receiving transducer to record the signal placed at the other. Both transducers were then moved to ray path 02, and the process was repeated along the crack region
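The crack-grading interpretation above reduces to comparing transmitted energy against an unflawed baseline. The sketch below is an assumed, illustrative rendering of that logic; the cut-off ratios are placeholder values, not thresholds from this study.

```python
# Grade a ray path by how much energy crosses the crack, relative to a
# baseline from an unflawed path: near-zero => through-thickness crack,
# partial => partial-depth crack, baseline-level => no crack.
import numpy as np

def grade_ray_path(waveform, baseline_energy, lo=0.05, hi=0.6):
    energy = float(np.sum(np.square(waveform)))
    r = energy / baseline_energy
    if r < lo:
        return "through-thickness crack"   # essentially zero signal
    if r < hi:
        return "partial-depth crack"       # some energy leaks past
    return "no crack"

# grades = [grade_ray_path(w, baseline_energy) for w in scan_waveforms]
```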
Fig. 2.39 Raw waveforms corresponding to ray paths covering a region with a through-thickness crack. The raw waveforms easily identify where the crack is located, and even suggest that the crack is longer than visually apparent. Since guided waves propagate with a full-thickness displacement profile, they are useful for identifying and locating cracks. Windowed sections of these signals, defined by the dotted lines, were used to generate DWFP fingerprints, as shown in Fig. 2.40

Fig. 2.40 DWFP representations of the raw waveforms covering a region with a through-thickness crack. The red triangle highlights the arrival time of the waveforms that do not propagate through the crack. It is clearly seen that waveforms 1–4 have almost zero signal because they are uniformly dark, indicating that the crack is not only present but extends the full depth of the plate and blocks the Lamb waves so that no signal is recorded at the receiving transducer. Waveforms 5 and 6 show a clear signal; however, the first feature seen in the DWFP is a bit further to the right than in waveforms 7–9, indicating that waveforms 5 and 6 contain slower Lamb wave modes as a result of a partial-thickness crack still being present. Waveforms 7–9 all appear essentially identical and correspond to no flaw being present. By tracking the mode arrival features in the DWFP, crack presence and severity can be determined

Guided Wave Results—Rear Weld

The scanning surface for the rear weld on Sample 1 can be seen in Fig. 2.41. However, the feature of interest in this section is actually on the opposite side of the plate. Highlighted by the blue dotted line in Fig. 2.41, an attachment on the rear side had been cut off at the weld, leaving behind the remaining weld in a box outline. The transmitting and receiving transducers were placed 10 cm apart, so that the ray paths would cross this box weld as shown in Fig. 2.41. Geometrically, it was known that ray paths 04–07 crossed the weld, with ray paths 03 and 08 not only crossing but running along the weld. The raw waveforms, shown in Fig. 2.42, are more difficult to interpret in this example, as there is no obvious signal feature that correlates with the box weld. The windowed and converted DWFP representations, shown in Fig. 2.43, offer a better comparison. As indicated by the red arrow, an early DWFP feature can be seen that shifts in time (to the location of the blue arrow) when the ray paths cross the weld located on the reverse side of the plate. Also, in waveforms 03 and 08, there is an even greater time shift, corresponding to the ray paths traveling along the weld rather than just crossing it, resulting in the earliest DWFP feature lying even further to the right than the blue arrow. Because such welds change the time-of-arrival of the wave modes, they are readily detectable as effective local increases in plate thickness. These results show that guided waves are sensitive to surface features on either side of a plate, which is essential in situations where only one side of the plate is accessible.

Guided Wave Results—Distance and Curvature

The region of Sample 2 used to demonstrate the propagation properties of guided waves can be seen in Fig. 2.44. The region of interest spans the full 46 cm length of the plate and follows the greatest curvature of the plate. Several ray paths were considered, again 2 cm apart from each other. There were no visible flaws in this section, and the sample still had its paint.

The raw waveforms from this region can be seen in Fig. 2.45. Again, the raw waveforms all appear very similar. Any differences should be minor, since no flaws were believed to be present along any of the ray paths. The DWFP representations can be seen in Fig. 2.46 and do not show significant differences between the waveforms. The initial mode arrival, indicated by the red arrow, does not change in time apart from slight variations due to coupling, since the data was collected by hand. All the fingerprint representations share the same overall structure, as expected. These results highlight the guided waves' ability to propagate longer distances, interacting with the full plate thickness throughout the entire propagation path. The curvature of the plate is also followed, allowing more complicated structures to be investigated using guided waves.
Fig. 2.41 Picture of the region of Sample 1 where a rear-located weld was present, indicated by the blue dotted line, with an overlay of the ray paths used in the guided wave inspection. A transmitting transducer generating a signal was placed at one end of ray path 01 with a receiving transducer to record the signal placed at the other. Both transducers were then moved to ray path 02, and the process was repeated along the width of the region

Fig. 2.42 Raw waveforms corresponding to ray paths covering a region with a weld on the reverse side of the plate. The raw waveforms do not easily identify the presence of the weld, which runs from waveforms 3–8. Welds, which act as effective local increases in plate thickness, result in a change in arrival time for the guided waves. Windowed sections of these signals, defined by the dotted lines, were used to generate DWFP fingerprints for further inspection, as shown in Fig. 2.43

Fig. 2.43 DWFP representations of the raw waveforms covering a region with a weld located on the reverse side of the plate. The red triangle highlights the arrival time of the waveforms that do not interact with the weld, which can be seen to correspond with a small initial group of fingerprints in 1–2 and 9–12. This group of fingerprints is shifted to the right, meaning later in time, in waveforms 5–7. These waveforms are known to cross the weld located on the rear side of the plate. This local increase in thickness slows the mode velocity momentarily, resulting in a later mode arrival time. Waveforms 3, 4, and 8 have an even greater shift in mode arrival, which agrees with the fact that these ray paths run along the weld, not just across it. By tracking a specific feature in the DWFP corresponding to a mode arrival, it is straightforward to track features on either side of the plate, regardless of which side is accessible for scanning

Guided Wave Results—Thinning Near Hole

A region of Sample 2 that contained a thinned region with a through-hole can be seen in Fig. 2.47. This is a 29 cm by 20 cm region of the plate, with the waveforms crossing an unflawed region and then a region with a poorly repaired through-hole. Around the repaired through-hole, the plate gradually loses thickness over a larger circular area. Again, the paint was left on the sample in its as-received state. The raw waveforms can be seen in Fig. 2.48. They are difficult to distinguish from each other, although they appear to show a gradual decrease in amplitude as the ray paths approach the flaw. When the signals are windowed and DWFP representations are generated, a more thorough analysis can be performed.

Fig. 2.44 Picture of the region of Sample 2 where the propagation distance was to be demonstrated, along with the property that the guided waves follow plate curvature, with an overlay of the ray paths used in the guided wave inspection. A transmitting transducer generating a signal was placed at one end of ray path 01 with a receiving transducer to record the signal placed at the other. Both transducers were then moved to ray path 02, and the process was repeated across the region. The curvature of the plate is also highlighted
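The "number of rings" comparison made in the next paragraph is easy to automate with standard image-processing tools. As one illustrative possibility (an assumption, not the procedure used for the figures here), connected fingerprint components inside the arrival window can be counted with SciPy:

```python
# Count connected fingerprint components ("rings"/objects) inside the
# arrival-time window of a binary DWFP image; a drop in the count can
# flag ray paths crossing the thinned region.
import numpy as np
from scipy import ndimage

def ring_count(fp, times, t0, t1):
    """fp: binary DWFP image (scales x time); times: NumPy time axis (s).
    Returns the number of connected components between t0 and t1."""
    cols = (times >= t0) & (times <= t1)
    _, n = ndimage.label(fp[:, cols])
    return n

# counts = [ring_count(fp, times, 74e-6, 81e-6) for fp in fps]
```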
Waveforms 1–4 appear very similar to each other, both in the time domain and in the number of rings present in the fingerprint objects. The red arrow corresponds to the point in time of the first mode arrival. Waveforms 5–10 all show a reduced number of rings in the fingerprint regions as the ray paths cross the reduced-thickness region around the repaired hole. They also show a shifted mode arrival, as indicated by the blue arrow. Waveform 11 shows a further time shift, corresponding to the ray path that crosses the through-hole itself. Again, the guided waves are successfully able to interrogate the plate sample and identify the flawed region. A series of recorded signals can cover a large region in very little time, while the DWFP representations allow for quick fingerprint feature identification, a process that can also be automated using standard image processing techniques. The rapid inspection and analysis of these signals allows for a thorough large-area inspection technique.

Fig. 2.45 Raw waveforms corresponding to ray paths propagating the full length of the plate. The raw waveforms all appear quite similar, with mode arrival times all the same. Windowed sections of these signals, defined by the dotted lines, were used to generate DWFP fingerprints, as shown in Fig. 2.46

2.8 Summary

We have demonstrated the usefulness of the dynamic wavelet fingerprinting technique for the analysis of time-domain signals, with three specific applications in the field of structural health monitoring. We have discussed the advantages of time–frequency signal representations, and the specific subset associated with time-scale wavelet transformations. We demonstrated how the DWFP technique can be used to automatically distinguish between dents and subtle surface cracks in aircraft-grade aluminum through a combination of pitch/catch and pulse/echo inspection configurations. Combined with straightforward image analysis techniques for identifying and tracking specific features of interest within these DWFP images, we have shown how to implement a low-power, portable Rayleigh–Lamb wave inspection system for mapping flaws in airplane fuselages caused by incidental contact on runways.

Additionally, we have presented a similar approach for the identification and localization of corrosion and cracks in marine structures. Even when the material is underneath a bonded layer of insulation, the guided wave modes were shown to reliably propagate the full length of the sample without significant distortion of the wavelet fingerprints.

Fig. 2.46 DWFP representations of the raw waveforms propagating the full length of the plate. The red triangle highlights the approximate arrival time of the guided wave modes in all the waveforms. Because the propagation distance was so long and the transducers were held by hand, the effects of transducer coupling and orientation show up as minor fluctuations in the fingerprints. The overall shape of the fingerprints and time-of-arrival is similar for all waveforms, indicating that there were no hidden flaws present in any of the ray paths. The main point here is that the signals were still very usable after propagating the full distance of the plate, even with the curvature present

Multiple simulated corrosion regions were identified using DWFP image analysis, again by identifying a specific feature of interest and tracking it.
The two sensitized aluminum samples provided for this work contained a variety of flaws and features that can be located and characterized with Lamb waves. The wavelet fingerprint representation of Lamb waveforms that we have used here provides a route to routine use of Lamb waves outside of laboratory environments. The method can be tailored to specific inspection scenarios of interest, and the identification of specific "flaw signatures" can be done automatically and in real time.

These examples all make use of a DWFP feature identification technique, where the feature identified is specific to the application. There is no one "magic" feature we can identify that works across all applications. Here, we had insight into the underlying physics of Lamb wave propagation and were able to predict what types of features in the signals would correlate with the changes we were interested in identifying, i.e., changes in mode arrival time or structure due to some form of material damage. For many time-domain signals, we either do not have prior knowledge of the expected signal changes, or they are too complicated for analytical solutions that might provide insight into the physical interaction at hand. The DWFP analysis technique provides enough freedom for generating and isolating useful features of interest; however, only formal pattern classification feature selection would guarantee the inclusion of the best features for a given application.

Fig. 2.47 Picture of the region of Sample 2 where gradual thinning occurs along with an eventual through-hole, highlighted by the blue dotted line, with an overlay of the ray paths used in the guided wave inspection. A transmitting transducer generating a signal was placed at one end of ray path 01 with a receiving transducer to record the signal placed at the other end. Both transducers were then moved to ray path 02, and the process was repeated up along the area of interest. It should be noted that ray path 01 corresponds to no flaw, progressing to ray path 11, which corresponds to the through-hole

Fig. 2.48 Raw waveforms corresponding to ray paths covering a region with gradual thinning leading up to a through-hole. Raw waveforms 5–11 appear to have a lower amplitude than waveforms 1–4; however, not much else can be determined directly from the raw signals. Windowed sections of these signals, defined by the dotted lines, were used to generate DWFP fingerprints for further inspection, as shown in Fig. 2.49

Fig. 2.49 DWFP representations of the raw waveforms covering a region with gradual thinning leading up to a through-hole. The red triangle highlights the arrival time of the waveforms that do not interact with the thinned region, as shown in waveforms 1–4. As the ray paths start interacting with the thinned region, there is an obvious shift in arrival time, where the first fingerprint feature is further to the right, as indicated by the blue triangle. This indicates that the plate thickness has decreased in this region. Waveform 11 has a significantly later mode arrival, corresponding to the ray path that crosses the through-hole. The hole scatters most of the waveforms' energy. By tracking the mode arrival through a specific feature in the DWFP, it is straightforward to identify when material thickness changes or when flaws scatter the propagating wave's energy

References
1. Lamb H (1917) On waves in an elastic plate. Proceedings of the Royal Society of London, Series A, vol XCIII, pp 114–128
2. Worlton DC (1961) Experimental confirmation of Lamb waves at megacycle frequencies. J Appl Phys 32(6):967–971
3. Viktorov IA (1967) Rayleigh and Lamb waves - physical theory and applications. Plenum Press, New York
4. Graff KE (1991) Wave motion in elastic solids. Dover, New York
5. Rose J (1999) Ultrasonic waves in solid media. Cambridge University Press, Cambridge
6. Rose JL (2002) A baseline and vision of ultrasonic guided wave inspection potential. J Press Vessel Technol 124:273–282
7. Giurgiutiu V, Cuc A (2005) Embedded nondestructive evaluation for structural health monitoring, damage detection, and failure prevention. Shock Vib Digest 37:83–105
8. Su Z, Ye L, Lu Y (2006) Guided Lamb waves for identification of damage in composite structures: a review. J Sound Vib 295:753–780
9. Raghavan A, Cesnik CES (2007) Review of guided-wave structural health monitoring. Shock Vib Digest 39(2):91–114
10. Giurgiutiu V (2007) Damage assessment of structures - a US Air Force Office of Scientific Research structural mechanics perspective. Key Eng Mater 347:69–74
11. Giurgiutiu V (2010) Structural health monitoring with piezoelectric wafer active sensors - predictive modeling and simulation. INCAS Bull 2(3):31–44
12. Adams DA (2007) Health monitoring of structural materials and components. Wiley, West Sussex
13. Giurgiutiu V (2008) Structural health monitoring with piezoelectric wafer active sensors. Academic, London
14. Filho JV, Baptista FG, Inman DJ (2011) Time-domain analysis of piezoelectric impedance-based structural health monitoring using multilevel wavelet decomposition. Mech Syst Signal Process, in press. https://doi.org/10.1016/j.ymssp.2010.12.003
15. Giridhara G, Rathod VT, Naik S, Roy Mahapatra D, Gopalakrishnan S (2010) Rapid localization of damage using a circular sensor array and Lamb wave based triangulation. Mech Syst Signal Process 24(8):2929–2946. https://doi.org/10.1016/j.ymssp.2010.06.002
16. Park S, Anton SR, Kim J-K, Inman DJ, Ha DS (2010) Instantaneous baseline structural damage detection using a miniaturized piezoelectric guided waves system. KSCE J Civil Eng 14(6):889–895. https://doi.org/10.1007/s12205-010-1137-x
17. Wang D, Ye L, Ye L, Li F (2010) A damage diagnostic imaging algorithm based on the quantitative comparison of Lamb wave signals. Smart Mater Struct 19:1–12. https://doi.org/10.1088/0964-1726/19/6/065008
18. Silva C, Rocha B, Suleman A (2009) A structural health monitoring approach based on a PZT network using a tuned wave propagation method. In: Paper presented at the 50th AIAA/ASME/ASCE/AHS/ASC structures, structural dynamics and materials conference, Palm Springs, California
19. Rocha B, Silva C, Suleman A (2010) Structural health monitoring system using piezoelectric networks with tuned Lamb waves. Shock Vib 17(4–5):677–695
20. Wang G, Schon J, Dalenbring M (2009) Use of Lamb wave propagation for SHM and damage detection in sandwich composite aeronautical structures. In: Paper presented at the 7th international workshop on structural health monitoring, Stanford, California, pp 111–118
21. Park S, Anton SR, Inman DJ, Kim JK, Ha DS (2009) Instantaneous baseline damage detection using a low power guided waves system. In: Paper presented at the 7th international workshop on structural health monitoring, Stanford, California, pp 505–512
22. Kim Y-G, Moon H-S, Park K-J, Lee J-K (2011) Generating and detecting torsional guided waves using magnetostrictive sensors of crossed coils. NDT E Int 44(2):145–151. https://doi.org/10.1016/j.ndteint.2010.11.006
23. Chati F, Leon F, El Moussaoui M, Klauson A, Maze G (2011) Longitudinal mode L(0,4) used for the determination of the deposit width on the wall of a pipe. NDT E Int 44(2):188–194. https://doi.org/10.1016/j.ndteint.2010.12.001
24. Mustapha S, Ye L, Wang D, Ye L (2011) Assessment of debonding in sandwich CF/EP composite beams using A0 Lamb wave at low frequency. Compos Struct 93(2):483–491. https://doi.org/10.1016/j.compstruct.2010.08.032
25. Huang Q, Balogun O, Yang N, Regez B, Krishnaswamy S (2010) Detection of disbonding in glare composites using Lamb wave approach. In: Paper presented at the review of progress in QNDE, AIP conference proceedings, vol 1211, pp 1198–1205
26. Koduru JP, Rose JL (2010) Modified lead titanate/polymer 1–3 composite transducers for structural health monitoring. In: Paper presented at the review of progress in QNDE, AIP conference proceedings, vol 1211, pp 1799–1806
27. Koehler B, Frankenstein B, Schubert F, Barth M (2009) Novel piezoelectric fiber transducers for mode selective excitation and detection of Lamb waves. In: Paper presented at the review of progress in QNDE, AIP conference proceedings, vol 1096, pp 982–989
28. Advani SK, Breon LJ, Rose JL (2010) Guided wave thickness measurement tool development for estimation of thinning in plate-like structures. In: Paper presented at the review of progress in QNDE, AIP conference proceedings, vol 1211, pp 207–214
29. Kannajosyula H, Puthillath P, Lissenden CJ, Rose JL (2009) Interface waves for SHM of adhesively bonded joints. In: Paper presented at the 7th international workshop on structural health monitoring, Stanford, California, pp 247–254
30. Chandrasekaran J, Krishnamurthy CV, Balasubramaniam K (2009) Higher order modes cluster (HOMC) guided waves - a new technique for NDT inspection. In: Paper presented at the review of progress in QNDE, AIP conference proceedings, vol 1096, pp 121–128
31. Ratassepp M, Lowe MJS (2009) SH0 guided wave interaction with a crack aligned in the propagation direction in a plate. In: Paper presented at the review of progress in QNDE, AIP conference proceedings, vol 1096, pp 161–168
32. Masserey B, Fromme P (2009) Defect detection in plate structures using coupled Rayleigh-like waves. In: Paper presented at the review of progress in QNDE, AIP conference proceedings, vol 1096, pp 193–200
33. De Marchi L, Ruzzene M, Buli X, Baravelli E, Speciale N (2010) Warped basis pursuit for damage detection using Lamb waves. IEEE Trans Ultrason Ferroelectr Freq Control 57(12):2734–2741. https://doi.org/10.1109/TUFFC.2010.1747
34. Limongelli MP (2010) Frequency response function interpolation for damage detection under changing environment. Mech Syst Signal Process 24(8):2898–2913. https://doi.org/10.1016/j.ymssp.2010.03.004
35. Clarke T, Simonetti F, Cawley P (2010) Guided wave health monitoring of complex structures by sparse array systems: influence of temperature changes on performance. J Sound Vib 329(12):2306–2322. https://doi.org/10.1016/j.jsv.2009.01.052
36. De Marchi L, Marzani A, Caporale S, Speciale N (2009) A new warped frequency transformation (WFT) for guided waves characterization. In: Paper presented at health monitoring of structural and biological systems, proceedings of SPIE - the international society for optical engineering, vol 7295
37. Xu B, Yu L, Giurgiutiu V (2009) Lamb wave dispersion compensation in piezoelectric wafer active sensor phased-array applications. In: Paper presented at health monitoring of structural and biological systems, proceedings of SPIE - the international society for optical engineering, vol 7295
38. Engholm M, Stepinski T (2010) Using 2-D arrays for sensing multimodal Lamb waves. In: Paper presented at nondestructive characterization for composite materials, aerospace engineering, civil infrastructure, and homeland security, proceedings of SPIE - the international society for optical engineering, vol 7649
39. De Marchi L, Ruzzene M, Xu B, Baravelli E, Marzani A, Speciale N (2010) Warped frequency transform for damage detection using Lamb waves. In: Paper presented at health monitoring of structural and biological systems, proceedings of SPIE - the international society for optical engineering, vol 7650
40. Thalmayr F, Hashimoto K-Y, Omori T, Yamaguchi M (2010) Frequency domain analysis of Lamb wave scattering and application to film bulk acoustic wave resonators. IEEE Trans Ultrason Ferroelectr Freq Control 57(7):1641–1648. https://doi.org/10.1109/TUFFC.2010.1594
41. Lu Y, Ye L, Wang D, Wang X, Su Z (2010) Conjunctive and compromised data fusion schemes for identification of multiple notches in an aluminium plate using Lamb wave signals. IEEE Trans Ultrason Ferroelectr Freq Control 57(9):2005–2016. https://doi.org/10.1109/TUFFC.2010.1648
42. Ye L, Ye L, Zhongqing S, Yang C (2008) Quantitative assessment of through-thickness crack size based on Lamb wave scattering in aluminium plates. NDT E Int 41(1):59–68. https://doi.org/10.1016/j.ndteint.2007.07.003
43. Moore EZ, Murphy KD, Nichols JM (2011) Crack identification in a freely vibrating plate using Bayesian parameter estimation. Mech Syst Signal Process, in press. https://doi.org/10.1016/j.ymssp.2011.01.016
44. An Y-K, Sohn H (2010) Instantaneous crack detection under varying temperature and static loading conditions. Struct Control Health Monit 17:730–741. https://doi.org/10.1002/stc.394
45. Parnell WJ, Martin PA (2011) Multiple scattering of flexural waves by random configurations of inclusions in thin plates. Wave Motion 48(2):161–175. https://doi.org/10.1016/j.wavemoti.2010.10.004
46. Wilcox PD, Velichko A, Drinkwater BW, Croxford AJ, Todd MD (2010) Scattering of plane guided waves obliquely incident on a straight feature with uniform cross-section. J Acoust Soc Am 128:2715–2725. https://doi.org/10.1121/1.3488663
47. Soni S, Kim SB, Chattopadhyay A (2010) Reference-free fatigue crack detection, localization and quantification in lug joints. In: Paper presented at the 51st AIAA/ASME/ASCE/AHS/ASC structures, structural dynamics and materials conference, Orlando, Florida
48. Hall JS, Michaels JE (2009) On a model-based calibration approach to dynamic baseline estimation for structural health monitoring. In: Paper presented at the review of progress in QNDE, AIP conference proceedings, vol 1096, pp 896–903
49. Sohn H, Kim SB (2010) Development of dual PZT transducers for reference-free crack detection in thin plate structures. IEEE Trans Ultrason Ferroelectr Freq Control 57(1):229–240
50. Park S, Lee C, Sohn H (2009) Frequency domain reference-free crack detection using transfer impedances in plate structures. In: Paper presented at health monitoring of structural and biological systems, proceedings of SPIE - the international society for optical engineering, vol 7295
51. Michaels TE, Ruzzene M, Michaels JE (2009) Frequency-wavenumber domain methods for analysis of incident and scattered guided wave fields. In: Paper presented at health monitoring of structural and biological systems, proceedings of SPIE - the international society for optical engineering, vol 7295
52. Lee C, Kim S, Sohn H (2009) Application of a baseline-free damage detection technique to complex structures. In: Paper presented at sensors and smart structures technologies for civil, mechanical, and aerospace systems, proceedings of SPIE - the international society for optical engineering, vol 7292
53. Ruzzene M, Xu B, Lee SJ, Michaels TE, Michaels JE (2010) Damage visualization via beamforming of frequency-wavenumber filtered wavefield data. In: Paper presented at health monitoring of structural and biological systems, proceedings of SPIE - the international society for optical engineering, vol 7650
54. Soni S, Kim SB, Chattopadhyay A (2010) Fatigue crack detection and localization using reference-free method. In: Paper presented at smart sensor phenomena, technology, networks, and systems, proceedings of SPIE - the international society for optical engineering, vol 7648
55. Ayers J, Apetre N, Ruzzene M, Sharma V (2009) Phase gradient-based damage characterization of structures. In: Paper presented at the 7th international workshop on structural health monitoring, Stanford, California, pp 1113–1120
56. Zhang J, Drinkwater BW, Wilcox PD, Hunter AJ (2010) Defect detection using ultrasonic arrays: the multi-mode total focusing method. NDT E Int 43(2):123–133. https://doi.org/10.1016/j.ndteint.2009.10.001
57. Kang T, Lee D-H, Song S-J, Kim H-J, Jo Y-D, Cho H-J (2011) Enhancement of detecting defects in pipes with focusing techniques. NDT E Int 44(2):178–187. https://doi.org/10.1016/j.ndteint.2010.11.009
58. Higuti RT, Martinez-Graullera O, Martin CJ, Octavio A, Elvira L, De Espinosa FM (2010) Damage characterization using guided-wave linear arrays and image compounding techniques. IEEE Trans Ultrason Ferroelectr Freq Control 57(9):1985–1995. https://doi.org/10.1109/TUFFC.2010.1646
59. Hall JS, Michaels JE (2010) Minimum variance ultrasonic imaging applied to an in situ sparse guided wave array. IEEE Trans Ultrason Ferroelectr Freq Control 57(10):2311–2323. https://doi.org/10.1109/TUFFC.2010.1692
60. Kumar A, Poddar B, Kulkarni G, Mitra M, Mujumdar PM (2010) Time reversibility of Lamb wave for damage detection in isotropic plate. In: Paper presented at the 51st AIAA/ASME/ASCE/AHS/ASC structures, structural dynamics and materials conference, Orlando, Florida
61. Zhao N, Yan S (2011) A new structural health monitoring system for composite plate. Adv Mater Res 183–185:406–410
62. Zhang H, Cao Y, Yu J, Chen X (2010) Time reversal and cross-correlation analysis for damage detection in plates using Lamb waves. In: Paper presented at the 2010 international conference on audio, language and image processing, Shanghai, China, ICALIP 2010 proceedings, pp 1516–1520
63. Teramoto K, Uekihara A (2009) Time reversal imaging for gradient sensor networks over the Lamb-wave field. In: Paper presented at the ICCAS-SICE 2009 - ICROS-SICE international joint conference, Fukuoka, Japan, proceedings, pp 2311–2316
Zhao N, Yan S (2010) Experimental research on damage detection of large thin aluminum plate based on lamb wave. In: Paper presented at sensors and smart structures technologies for civil, mechanical, and aerospace systems, proceedings of SPIE - the international society for optical engineering, vol 7647 65. Bellino A, Fasana A, Garibaldi L, Marchesiello S (2010) PCA-based detection of damage in time-varying systems. Mech Syst Signal Process 24(7) 2010:2250–2260. https://doi.org/10. 1016/j.ymssp.2010.04.009. Special Issue: ISMA 66. Garcia-Rodriguez M, Yanez Y, Garcia-Hernandez MJ, Salazar J, Turo A, Chavez JA (2010) Application of Golay codes to improve the dynamic range in ultrasonic Lamb waves aircoupled systems. NDT E Int 43(8):677–686. https://doi.org/10.1016/j.ndteint.2010.07.005 67. Kim K-S, Fratta D (2010) Travel-time tomographic imaging: multi-frequency diffraction evaluation of a medium with a high-contrast inclusion. NDT E Int 43(8):695–705. https://doi. org/10.1016/j.ndteint.2010.08.001 68. Xin F, Shen Q (2011) Fuzzy complex numbers and their application for classifiers performance evaluation. Pattern Recognit 44(7):1403–1417. https://doi.org/10.1016/j.patcog.2011.01.011 69. Gutkin R, Green CJ, Vangrattanachai S, Pinho ST, Robinson P, Curtis PT (2011) On acoustic emission for failure investigation in CFRP: Pattern recognition and peak frequency analyses. Mech Syst Signal Process 25(4):1393–1407. https://doi.org/10.1016/j.ymssp.2010.11.014 70. Duroux A, Sabra KG, Ayers J, Ruzzene M (2010) Extracting guided waves from crosscorrelations of elastic diffuse fields: applications to remote structural health monitoring. J Acoust Soc Am 127:204–215. https://doi.org/10.1121/1.3257602 71. Kerber F, Sprenger H, Niethammer M, Luangvilai K, Jacobs LJ (2010) Attenuation analysis of lamb waves using the chirplet transform. EURASIP J Adv Signal Process 2010, 6. Article ID 375171. https://doi.org/10.1155/2010/375171 72. Martinez L, Wilkie-Chancellier N, Glorieux C, Sarens B, Caplain E (2009) Transient spacetime surface waves characterization using Gabor analysis paper presented at the anglo french physical acoustics conference. J Phys: Conf Ser 195 012009. https://doi.org/10.1088/17426596/195/1/012009 106 M. K. Hinders and C. A. Miller 73. Roellig M, Schubert L, Lieske U, Boehme B, Frankenstein B, Meyendorf N (2010) FEM assisted development of a SHM-piezo-package for damage evaluation in airplane components. In: Paper presented at the 11th international conference on thermal, mechanical and multiphysics simulation, and experiments in microelectronics and microsystems, EuroSimE 2010, Bordeaux, France 74. Xu H, Xu C, Zhou S (2010) Study of lamb wave propagation in plate for UNDE by 2-D FEM model. In: Paper presented at the 2010 international conference on measuring technology and mechatronics automation, Changsha, China, ICMTMA 2010, vol 3, pp 556–559 75. Qu W, Xiao L (2009. Finite element simulation of lamb wave with piezoelectric transducers for composite plate damage detection. Adv Mater Res 79–82, 1095–1098 76. Fromme P (2010) Directionality of the scattering of the A0 lamb wave mode at cracks. In: Paper presented at the review of progress in QNDE, AIP conference proceedings, vol 11211, pp 129–136 77. Karthikeyan P, Ramdas C, Bhardwa MC, Balasubramaniam K (2009) Non-contact ultrasound based guided lamb waves for composite structure inspection: Some interesting observations. In: Paper presented at the review of progress in QNDE, AIP conference proceedings, vol 1096, pp 928–935 78. 
Fromme P (2009) Structural health monitoring of plates with surface features using guided ultrasonic waves. In: Paper presented at health monitoring of structural and biological systems, proceedings of SPIE - the international society for optical engineering, vol 7295 79. Moreau L, Velichko A, Wilcox PD (2010) Efficient methods to model the scattering of ultrasonic guided waves in 3D. In: Paper presented at Health monitoring of structural and biological systems proceedings of SPIE - the international society for optical engineering, vol 7650 80. Qu W, Xiao L, Zhou Y (2009) Finite element simulation of Lamb wave with piezoelectric transducers for plastically-driven damage detection. In: Paper presented at the 7th international workshop on structural health monitoring, California, Stanford, pp 737–744 81. Teramoto K, Uekihara A (2009) The near-field imaging method based on the spatio-temporal gradient analysis. In: Paper presented at the ICCAS-SICE 2009 - ICROS-SICE international joint conference, proceedings, pp 3393–3398 82. Teramoto K, Tamachi N (2010) Near-field acoustical imaging of cracks over the A0-mode lamb-wave field. In: Paper presented at the SICE annual conference, Proceedings, Taipei, Taiwan, pp 2742–2747 83. Fellinger P, Marklein R, Langenberg KJ, Klaholz S (1995) Numerical modeling of elastc wave propagation and scattering with EFIT- elastodynamic finite integration technique. Wave Motion 21(1):47–66 84. Schubert F, Peiffer A, Kohler B, Sanderson T (1998) The elastodynamic finite integration technique for waves in cylindrical geometries. J Acoust Soc Am 104(5):2604–2614 85. Schubert F, Koehler B (2001) Three-dimensional time domain modeling of ultrasonic wave propagation in concrete in explicit consideration of aggregates and porosity. J Comput Acoust 9(4):1543–1560 86. Schubert F (2004) Numerical time-domain modeling of linear and nonlinear ultrasonic wave propagation using finite integration techniques - theory and applications. Ultrasonics 42(1):221–229 87. Rudd K, Bingham J, Leonard K, Hinders M (2007) Simulation of guided waves in complex piping geometries using the elastodynamic finite integration technique. JASA 121(3):1449– 1458 88. Rudd K, Hinders M (2008) Simulation of incident nonlinear sound beam 3d scattering from complex targets. Comput Acoust 16(3):427–445 89. Bingham J, Hinders M (2009) Lamb wave detection of delaminations in large diameter pipe coatings. Open Acoust J 2:75–86 90. Bingham J, Hinders M (2009) Lamb wave characterization of corrosion-thinning in aircraft stringers: experiment and 3d simulation. JASA 126(1):103–113 91. Bingham J, Hinders M, Friedman A (2009) Lamb Wave detection of limpet mines on ship hulls. Ultrasonics 49:706–722 2 Intelligent Structural Health Monitoring with Ultrasonic Lamb Waves 107 92. Bingham J, Hinders M (2010) 3D Elastodynamic finite integration technique simulation of guided waves in extended built-up structures containing flaws computational acoustics, vol 18, Issue 2, pp 165–192 93. Bowman JJ, Senior TBA, Uslenghi PLE (1987) Electromagnetic and acoustic scattering by simple shapes. Hemisphere Publishing, New York 94. Varadan VV, Lakhtakia A, Varadan VK (eds) (1986) Low and high frequency asymptotics. Elsevier Science Publishing, New York 95. Varadan VV, Lakhtakia A, Varadan VK (eds) (1991) Field representations and introduction to scattering. Elsevier Science Publishing, New York 96. Mindlin RD In: Yang J (ed) (2006) Introduction to the mathematical theory of vibrations of elastic plates. 
World Scientific Publishing, Singapore 97. Takeda N, Takahashi I, Ito Y (2010).Visualization of impact damage in composite structures using pulsed laser scanning method. In: Paper presented at the 51st AIAA/ASME/ASCE/AHS/ASC structures, structural dynamics and materials conference, Orlando, Florid 98. Garcia-Rodriguez M, Ya Y, Garcia-Hernandez MJ, Salazar J, Turo A, Chavez JA (2010) Laser interferometric measurements of air-coupled lamb waves. In: Paper presented at the 9th international conference on vibration measurements by laser and non-contact techniques and short course, Ancona, Italy. AIP Conference Proceedings, vol 1253, pp 88–93 99. Kostson E, Fromme P (2009) Defect detection in multi-layered structures using guided ultrasonic waves. In: Paper presented at the review of progress in QNDE, AIP conference proceedings, vol 1096, pp 209–216 100. Chunguang X, Rose JL, Yan F, Zhao X (2009) Defect sizing of plate-like structure using Lamb waves. In: Paper presented at the review of progress in QNDE, AIP conference proceedings, vol 1096, pp 1575–1582 101. Fromme P (2010) Directionality of the scattering of the A0 Lamb wave mode at cracks. In: Paper presented at the review of progress in QNDE, AIP conference proceedings, vol 1211, pp 129–136 102. Schubert L, Lieske U, Kohler B, Frankenstein B (2009) Interaction of Lamb waves with impact damaged CFRP’s -effects and conclusions for acousto-ultrasonic applications. In: Paper presented at the 7th international workshop on structural health monitoring, California, Stanford, pp 151–158 103. Salas KI, Nadella KS, Cesnik CES (2009) characterization of guided-wave excitation and propagation in composite plates. In: Paper presented at the 7th international workshop on structural health monitoring, California, Stanford, pp 651–658 104. Zhang H, Sun X, Fan S, Qi X, Liu X, Donghui L (2008) An ultrasonic signal processing technique for extraction of arrival time from Lamb waveforms, advanced intelligent computing theories and applications. Springer Lect Notes Comput Sci 5226(2008):704–711. https://doi. org/10.1007/978-3-540-87442-3-87 105. Chai HK, Momoki S, Kobayashi Y, Aggelis DG, Shiotani T (2011) Tomographic reconstruction for concrete using attenuation of ultrasound. NDT E Int 44(2):206–215. https://doi.org/ 10.1016/j.ndteint.2010.11.003 106. Ramadas C, Balasubramaniam Krishnan, Makarand Joshi CV, Krishnamurthy, (2011) Characterisation of rectangular type delaminations in composite laminates through B- and D-scan images generated using Lamb waves. NDT E Int 44(3):281–289. https://doi.org/10.1016/j. ndteint.2011.01.002 107. Belanger P, Cawley P, Simonetti F (2010) Guided wave diffraction tomography within the born approximation. IEEE Trans Ultrason Ferroelectr Freq Control 57(6):1405–1418. https:// doi.org/10.1109/TUFFC.2010.1559 108. Wu1 C-H, Yang C-H (2010) An investigation on ray tracing algorithms in Lamb wave tomography. In: Paper presented at the 31st symposium on ultrasonic electronics, Proceedings, vol 31, Tokyo Japan, pp 483–484 109. Hu Y, Xu C, Xu H (2009) The application of wavelet transform for lamb wave tomography. In: Paper presented at the 1st international conference on information science and engineering, Nanjing, China. ICISE 2009, pp 681–683 108 M. K. Hinders and C. A. Miller 110. Lissenden CJ, Cho H, Kim CS (2010) Fatigue crack growth monitoring of an aluminum joint structure. In: Paper presented at the review of progress in QNDE, AIP conference proceedings, vol 1211, pp 1868–1875 111. 
Balvantin A, Baltazar A, Kim J (2010) Ultrasonic lamb wave tomography of non-uniform interfacial stiffness between contacting solid bodies. In: Paper presented at the review of progress in QNDE, AIP conference proceedings, vol 1211, pp 1463–1470 112. Xu C, Rose JL, Yan F, Zhao X (2009) Defect sizing of plate-like structure using lamb waves. In: Paper presented at the review of progress in QNDE, AIP conference proceedings, vol 1096, pp 1575–1582 113. Ng CT, Veidt M, Rajic N (2009) Integrated piezoceramic transducers for imaging damage in composite laminates. In: Paper presented at the second international conference on smart materials and nanotechnology in engineering, Proceedings of SPIE - the international society for optical engineering, vol 7493 114. McKeon J, Hinders M (1999) Parallel projection and crosshole Lamb wave contact scanning tomography. J Acoust Soc Am 106(5):2568–2577 115. McKeon J, Hinders M (1999) Lamb Wave scattering from a through hole. J Sound Vib 224(2):843–862 116. Malyarenko E, Hinders M (2000) Fan beam and double crosshole Lamb wave tomography for mapping flaws in aging aircraft structures. J Acous Soc Am 108(10):1631–1639 117. Malyarenko E, Hinders M (2001) Ultrasonic Lamb wave diffraction tomography. Ultrasonics 39(4):269–281 118. Hinders M, Malyarenko E, Leonard K (2002) Blind test of Lamb wave diffraction tomography. In: Thompson DO, Chimenti DE (eds) Reviews of progress in QNDE, vol 21, AIP CP 615, pp 278–283 119. Leonard K, Malyarenko E, Hinders M (2002) Ultrasonic Lamb wave tomography. Inverse Prob Special NDE Issue 18(6):1795–1808 120. Leonard K, Hinders M (2003) Guided wave helical ultrasonic tomography of pipes. JASA 114(2):767–774 121. Hou J, Leonard KR, Hinders M (2004) Automatic multi-mode Lamb wave arrival time extraction for improved tomographic reconstruction. Inverse Prob 20:1873–1888 122. Hinders M, Leonard KR (2005) Lamb wave tomography of pipes and tanks using frequency compounding. In: Thompson DO, Chimenti DE (eds) Reviews of progress in QNDE, vol 24, pp 867-874 123. Hinders M, Hou J, Leonard KR (2005) Multi-mode Lamb wave arrival time extraction for improved tomographic reconstruction. In: Thompson DO, Chimenti DE (eds) Reviews of progress in QNDE, vol 24, pp 736–743 124. Leonard K, Hinders M (2005) Multi-mode Lamb wave tomography with arrival time sorting. JASA 117(4):2028–2038 125. Leonard K, Hinders M (2005) Lamb wave tomography of pipe-like structures Ultrasonics 44(7):574–583 126. Griffin DR (1958) Listening in the dark: the acoustic orientation of bats and men. Yale University Press, New Haven 127. L. Cohen, (1995). Time-frequency analysis, Prentice-Hall signal processing series 128. Deubechies I (1992) Ten lectures on wavelets. Society for Industrial and Applied Mathematics 129. Abbate A, Koay J, Frankel J, Schroeder SC, Das P (1994) Application of wavelet transform signal processor to ultrasound. In: Paper presented at the IEEE Ultrasonics Symposium, pp 1147–1152 130. Masscotte D, Goyette J, Bose TK (2000) Wavelet-transorm-based method of analysis for lamb-wave ultrasonic nde signals. IEEE Trans Instrum Meas 49(3):524–529 131. Perov DV, Rinkeich AB, Smorodinskii YG (2002) Wavelet filtering of signals for ultrasonic flaw detector. Russ J Nondestr Test 38(12):869–882 132. Lou HW, Guang Rui H (2003) An approach based on simplified klt and wavelet transform for enhancing speech degraded by non-stationary wideband noise. J Sound Vib 268:717–729 2 Intelligent Structural Health Monitoring with Ultrasonic Lamb Waves 109 133. 
Zou J, Chen J (2004) A comparative study on time-frequency fracture of cracked rotor by Wigner-Ville distribution and wavelet transform. J Sound Vib 276:1–11 134. Hou J, Hinders MK (2002) Dynamic wavelet fingerprint identification of ultrasound signals. Mater Eval 60(9):1089–1093 135. Hinders M, Bingham J, Jones KR, Leonard K (2006) Wavelet thumbprint analysis of TDR signals for wiring flaw detection. In: Thompson DO, Chimenti DE (eds) Reviews of progress in QNDE, vol 25, pp 641–648 136. Hinders M, Hou J, McKeon JCP (2005) Ultrasonic inspection of thin multilayers. In: Thompson DO, Chimenti DE (eds) Reviews of progress in QNDE vol 24, pp 1137–1144 137. Ghorayeb SR, Bertoncini CA, Hinders MK (2008) Ultrasonography in Dentistry: A Review IEEE Transactions on Ultrasonics. Ferroelectrics and Frequency Control 55(6):1256–1266 138. Al-Badour F, Sunar M, Cheded L (2011) Vibration analysis of rotating machinery using timefrequency analysis and wavelet techniques. Mech Syst Signal Process In Press. https://doi. org/10.1016/j.ymssp.2011.01.017 139. Hein H, Feklistova L (2011) Computationally efficient delamination detection in composite beams using Haar wavelets. Mech Syst Signal Process In Press. https://doi.org/10.1016/j. ymssp.2011.02.003 140. Li H, Zhang Y, Zheng H (2011) Application of Hermitian wavelet to crack fault detection in gearbox. Mech Syst Signal Process 25(4):1353–1363. https://doi.org/10.1016/j.ymssp.2010. 11.008 141. Wang X, Zi Y, He Z (2011) Multiwavelet denoising with improved neighboring coefficients for application on rolling bearing fault diagnosis. Mech Syst Signal Process 25(1):285–304. https://doi.org/10.1016/j.ymssp.2010.03.010 142. Jiang X, Mahadevan S (2011) Wavelet spectrum analysis approach to model validation of dynamic systems. Mech Syst Signal Process 25(2):575–590. https://doi.org/10.1016/j.ymssp. 2010.05.012 143. Kim JH, Kwak H-G (2011) Rayleigh wave velocity computation using principal waveletcomponent analysis. NDT E Int 44(1):47–56. https://doi.org/10.1016/j.ndteint.2010.09.005 144. Jin X, Gupta S, Mukherjee K, Ray A (2011) Wavelet-based feature extraction using probabilistic finite state automata for pattern classification. Pattern Recognit 44(7):1343–1356. https://doi.org/10.1016/j.patcog.2010.12.003 145. Acciani G, Brunetti G, Fornarelli G, Giaquinto A (2010) Angular and axial evaluation of superficial defects on non-accessible pipes by wavelet transform and neural network-based classification. Ultrasonics 50(1):13–25. https://doi.org/10.1016/j.ultras.2009.07.003 146. Jha R, Watkins R (2009) Lamb wave based diagnostics of composite plates using a modified time reversal method. In: Paper presented at the 50th AIAA/ASME/ASCE/AHS/ASC structures, structural dynamics and materials conference, Palm Springs, California 147. Rathod VT, Panchal M, Mahapatra DR, Gopalakrishnan S (2009) Lamb wave based sensor network for identification of damages in plate structures. In: Paper presented at the 50th AIAA/ASME/ASCE/AHS/ASC structures, structural dynamics and materials conference, Palm Springs, California 148. Du C, Ni Q, Natsuki T (2010) Determination of vibration source locations in laminated composite plates using lamb wave analysis. Adv Mater Res 79–82:1181–1184 149. Raghuram V, Shukla R, Pramila T (2010) Studies on Lamb waves in long aluminium plates generated using laser based ultrasonics. 
In: Paper presented at the 9th international conference on vibration measurements by laser and non-contact techniques and short course, AIP conference proceedings, vol 1253, pp 100–105 150. Rathod VT, Mahapatra DR, Gopalakrishnan S (2009) Lamb wave based identification and parameter estimation of corrosion in metallic plate structure using a circular PWAS array. In: Paper presented at health monitoring of structural and biological systems, Proceedings of SPIE - the international society for optical engineering, vol 7295 151. Song F, Huang GL, Hu GK (2009) Online debonding detection in honeycomb sandwich structures using multi-frequency guided waves. In: Paper presented at the second international conference on smart materials and nanotechnology in engineering, Proceedings of SPIE - the international society for optical engineering, vol 7493 110 M. K. Hinders and C. A. Miller 152. Hinders MK, Bingham JP (2010) Lamb wave pipe coating disbond detection using the dynamic wavelet fingerprinting technique. In: Paper presented at the review of progress in QNDE AIP conference proceedings, vol 1211, pp 615–622 153. Bingham JP, Hinders MK (2010) Automatic multimode guided wave feature extraction using wavelet fingerprints. In: Paper presented at the review of progress in QNDE, AIP conference proceedings, vol 1211, pp 623–630 154. Treesatayapun C, Baltazar A, Balvantin A, Kim J (2009) Thickness determination of a plate with varying thickness using an artificial neural network for time-frequency representation of Lamb waves. In: Paper presented at the review of progress in QNDE, AIP conference proceedings, vol 1096, pp 619–626 155. Yu L, Wang J, Giurgiutiu V, Shin Y (2010) Corrosion detection/quantification on thin-wall structures using multimode sensing combined with statistical and time-frequency analysis. In: Paper presented at the ASME international mechanical engineering congress and exposition, proceedings, vol 14, pp 251–257 156. Bao P., Yuan, M., and Fu, Z. (2009). Research on monitoring technology of bolt tightness degree based on wavelet analysis. In: Paper presented at the ICEMI 2009 - proceedings of 9th international conference on electronic measurement and instruments, pp 4329–4333 157. Martinez L, Wilkie-Chancellier N, Glorieux C, Sarens B, Caplain E (2009) Transient spacetime surface waves characterization using Gabor analysis. In: Paper presented at the Anglo -French physical acoustics conference. J Phys: Conf Ser 195:1–9 158. Michaels TE, Ruzzene M, Michaels JE (2009) Incident wave removal through frequencywavenumber filtering of full wavefield data. In: Paper presented at the review of progress in QNDE, AIP conference proceedings, vol 1096, pp 604–611 159. Hanhui X, Chunguang X, Shiyuan Z, Yong H (2009) Time-frequency analysis for nonlinear lamb wave signal. In: Paper presented at the 2nd international congress on image and signal processing, CISP’09, Tianjin, China 160. Feldman M (2011) Hilbert transform in vibration analysis. Mech Syst Signal Process 25(3):735–802. https://doi.org/10.1016/j.ymssp.2010.07.018 161. Li C, Wang X, Tao Z, Wang Q, Shuanping D (2011) Extraction of time varying information from noisy signals: an approach based on the empirical mode decomposition. Mech Syst Signal Process 25(3):812–820. https://doi.org/10.1016/j.ymssp.2010.10.007 162. Yoo B, Pines DJ, Purekar AS, Zhang Y (2010) Piezoelectric paint based 2-D sensor array for detecting damage in aluminum plate. 
In: Paper presented at the 51st AIAA/ASME/ASCE/AHS/ASC structures, structural dynamics and materials conference, Orlando, Florida 163. Yuan M, Fu Z, Bao P (2009) Detection of bolt tightness degree based on HHT. In: Proceedings of 9th international conference on electronic measurement and instruments, paper presented at ICEMI 2009, Beijing, China, pp 4334–4337 164. Haiyan Z, Xiuli S, Zhidong S, Yueyu X (2009) Group velocity measurement of piezo-actuated lamb waves using hilbert-huang transform method. In: Paper presented at the proceedings of the 2nd international congress on image and signal processing, CISP’09, Tianjin, China 165. Martinez L, Wilkie-Chancellier N, Glorieux C, Sarens B, Caplain E (2009) Transient spacetime surface waves characterization using gabor analysis paper presented at the Anglo–french physical acoustics conference. J Phys: Conf Ser 195:012009. https://doi.org/10.1088/17426596/195/1/012009 166. Xu B, Giurgiutiu V, Yu L (2009) Lamb waves decomposition and mode identification using matching pursuit method. In: Paper presented at sensors and smart structures technologies for civil, mechanical, and aerospace systems, proceedings of SPIE - the international society for optical engineering, vol 7292 167. Cole SA (2001) Suspect identities: a history of fingerprinting and criminal identification. Harvard University Press, Cambridge 168. Varitz R (2007) Wavelet transform and pattern recognition method for heart sound analysis. United States Patent 20070191725 2 Intelligent Structural Health Monitoring with Ultrasonic Lamb Waves 111 169. Aussem A, Campbell J, Murtagh F (1998) Wavelet-based feature extraction and decomposition strategies for financial forecasting. J Comp Int Financ 6(2):5–12 170. Tang YY, Yang LH, Liu J, Ma H (2000) Wavelet theory and its application to pattern recognition. World Scientific, River Edge 171. Brooks RR, Grewe L, Iyengar SS (2001) Recognition in the wavelet domain: a survey. J Electron Imag 10(3):757–784 172. Nason GP, Silverman BW (1995) The stationary wavelet transform and some statistical applications. In: Oppenheim G Antoniadis A (ed) Wavelets and statistics. Lecture notes in statistics. Springer, pp 281–299 173. Pittner S, Kamarthi SV (1999) Feature extraction from wavelet coefficients for pattern recognition tasks. IEEE Trans Pattern Anal Mach Intel 21(1):83–88 174. Sabatini AM (2001) A digital-signal-processing technique for ultrasonic signal modeling and classification. IEEE Trans Instrum Meas 50(1):15–21 175. Coifman R, Wickerhauser M (1992) Entropy based algorithms for best basis selection. IEEE Trans Inform Theory 38:713–718 176. Szu HH, Telfer B, Kadambe S (1992) Neural network adaptive wavelets for signal representation and classification. Opt Eng 31:1907–1916 177. Telfer BA, Szu HH, Dobeck GJ, Garcia JP, Ko H, Dubey A, Witherspoon N (1994) Adaptive wavelet classification of acoustic and backscatter and imagery. Opt Eng 33: 2,192–2,203 178. Mallet Y, Coomans D, Kautsky J, De Vel O (1997) Classification using adaptive wavelets for feature extraction. IEEE Trans Pattern Anal Mach Intel 19:1058–1066 179. Mallat S (1989) A theory for multiresolution signal processing: the wavelet representation. IEEE Trans Pattern Anal Mach Intel 11:674–693 180. Antoine J-P, Barachea D, Cesar RM Jr, da Fontoura CL (1997) Shape characterization with the wavelet transform. Sig Process 62(3):265–290 181. Yeh C-H (2003) Wavelet-based corner detection using eigenvectors of covariance matrices. Pattern Recognit Lett 24(15):2797–2806 182. 
Chapa JO, Raghuveer MR (1995) Optimal matched wavelet construction and its application to image pattern recognition. Proc SPIE 2491(1):518–529 183. Liang J, Parks TW (1996) A translation-invariant wavelet representation algorithm with applications. IEEE Trans Sig Process 44(2):225–232 184. Maestre RA, Garcia J, Ferreira C (1997) Pattern recognition using sequential matched filtering of wavelet coefficients. Opt Commun 133:401–414 185. Murtagh F, Starck J-L, Berry MW (2000) Overcoming the curse of dimensionality in clustering by means of the wavelet transform. Comput J 43(2):107–120 186. Yu T, Lam ECM, Tang YY (2001) Feature extraction using wavelet and fractal. Pattern Recognit Lett 22:271–287 187. Tsai D-M, Chiang C-H (2002) Rotation-invariant pattern matching using wavelet decomposition. Pattern Recognit Lett 23(1–3):191–201 188. Du T, Lim KB, Hong GS, Yu WM, Zheng H (2004) 2D occluded object recognition using wavelets. In: 4th international conference on computer and information technology, pp 227– 232. https://doi.org/10.1109/CIT.2004.1357201 189. Saito N, Coifman RR (1994) Local discriminant bases. Proc SPIE 2303(2):2–14. https://doi. org/10.1117/12.188763 190. Livens S, Scheunders P, de Wouwer GV, Dyck DV, Smets H, Winkelmans J, Bogaerts W (1995) Classification of corrosion images by wavelet signatures and LVQ networks. In: Hlavác V, Sára R (eds) Computer analysis of images and patterns V. Springer, Berlin, pp 538–543 191. Tansel IN, Mekdeci C, Rodriguez O, Uragun B (1993) Monitoring drill conditions with wavelet based encoding and neural networks. Int J Mach Tool Manu 33:559–575 192. Tansel IN, Mekdeci C, McLaughlin C (1995) Detection of tool failure in end milling with wavelet transformations and neural networks (WT-NN). Int J Mach Tool Manu 35:1137–1147 193. Learned RE, Wilsky AS (1995) A wavelet packet approach to transient signal classification. Appl Comput Harmon A 2:265–278 112 M. K. Hinders and C. A. Miller 194. Wu Y, Du R (1996) Feature extraction and assessment using wavelet packets for monitoring of machining processes. Mech Syst Signal Process 10:29–53 195. Case TJ, Waag RC (1996) Flaw identification from time and frequency features of ultrasonic waveforms. IEEE Trans Ultrason Ferr Freq Cont 43(4):592–600 196. Drai R, Khelil N, Benchaala A (2002) Time frequency and wavelet transform applied to selected problems in ultrasonics NDE. NDT E Int 35(8):567–572 197. Buonsanti M, Cacciola M, Calcagno S, Morabito FC, Versaci M (2006) Ultrasonic pulseechoes and eddy current testing for detection, recognition and characterisation of flaws detected in metallic plates. In: Proceedings of 9th European conference on non-destructive testing, Berlin, Germany 198. Momenan R, Loew MH, Insana MF, Wagner RF, Garra BS (1990) Application of pattern recognition techniques in ultrasound tissue characterization. In: Proceedings of 10th international conference on pattern recognition, vol 1, pp 608–612 199. Bankman IN, Johnson KO, Schneider W (1993) Optimal detection, classification, and superposition resolution in neural waveform recordings. IEEE Trans Biomed Eng 40(8):836–841 200. Kalayci T, Özdamar Ö (1995) Wavelet preprocessing for automated neural network detection of EEG spikes. IEEE Eng Med Biol 14:160–166 201. Tate R, Watson D, Eglen S (1995) Using wavelets for classifying human in vivo Magnetic Resonance spectra. In: Antoniadis A, Oppenheim G (eds) Wavelets and statistics. Springer, New York, pp 377–383 202. 
Mojsilovic A, Popovic MV, Neskovic AN, Popovic AD (1995) Wavelet image extension for analysis and classification of infarcted myocardial tissue. IEEE Trans Biomed Eng 44(9):856– 866 203. Georgiou G, Cohen FS (2001) Tissue characterization using the continuous wavelet transform. I. Decomposition method. IEEE Trans Ultrason Ferr Freq Cont 48(2): 355–363 204. Georgiou G, Cohen FS, Piccoli CW, Forsberg F, Goldberg BB (2001) Tissue characterization using the continuous wavelet transform. II. Application on breast RF data. IEEE Trans Ultrason Ferr Freq Cont 48(2): 364–373 205. Lee W-L, Chen Y-C, Hsieh K-S (2003) Ultrasonic liver tissues classification by fractal feature vector based on M-band wavelet transform. IEEE Trans Med Imag 22(3):382–392 206. Alacam B, Yazici B, Bilgutay N, Forsberg F, Piccoli C (2004) Breast tissue characterization using FARMA modeling of ultrasonic RF echo. Ultrasound Med Biol 30(10):1397–1407 207. Bertoncini C, Hinders M (2010) Fuzzy classification of roof fall predictors in microseismic monitoring measurement, vol 43, pp 1690–1701. https://doi.org/10.1016/j.measurement. 2010.09.015 208. Fehlman W, Hinders M (2009) Mobile robot navigation with intelligent infrared image interpretation. Springer tracts in advanced robotics. Springer, Berlin 209. Cara Leckey, M. Rogge, C. Miller and M. Hinders, Multiple-mode Lamb wave scattering simulations using 3D elastodynamic finite integration technique. Ultrasonics 52(2), 193–344 (2012). https://doi.org/10.1016/j.ultras.2011.08.003 210. DO Thompson, DE Chimenti (eds) (2012) 3D simulations for the investigation of lamb wave scattering from flaws, review of progress in quantitative nondestructive evaluation, vol 31. In: AIP Conference Proceedings, vol 1430, pp 111–117. https://doi.org/10.1063/1.4716220 211. Miller C, Hinders M (2012) Flaw detection and characterization using lamb wave tomography and pattern classification review of progress in quantitative nondestructive evaluation. In: Thompson DO, Chimenti DE (eds) AIP conference proceedings, vol 31, pp 1430, 663–670. https://doi.org/10.1063/1.4716290 212. Miller C, Hinders M (2014) Classification of flaw severity using pattern recognition for guided wave-based structural health monitoring. Ultrasonics 54:247–258. https://doi.org/10.1016/j. ultras.2013.04.020 213. Miller C, Hinders M (2014) Multiclass feature selection using computational homology for Lamb wave-based damage characterization. J Intell Mater Syst Struct 25:1511. https://doi. org/10.1177/1045389X13508335 2 Intelligent Structural Health Monitoring with Ultrasonic Lamb Waves 113 214. Miller C, Hinders M (2014) Intelligent feature selection techniques for pattern classification of Lamb wave signals. In: AIP conference proceedings of review of progress in quantitative nondestructive evaluation, vol 1581, p 294. https://doi.org/10.1063/1.4864833 215. Duda RO, Hart PE, Stork DG (2001) Pattern classification, 2nd edn. Wiley, New York 216. Fukunaga K (1990) Introduction to statistical pattern recognition, 2nd edn. Computer science and scientific computing. Academic, Boston 217. Kuncheva LI (2004) Combining pattern classifiers. Wiley, New York 218. Bishop CM (2006) Pattern recognition and machine learning. Springer, Berlin 219. Webb AR (2012) Statistical pattern recognition. Wiley, New York 220. Ripley BD (1996) Pattern recognition and neural networks. Cambridge University Press, Cambridge 221. Nagy G (1968) State of the art in pattern recognition. Proc IEEE 56(5):836–863 222. 
Kanal L (1974) Patterns in pattern recognition: 1968–1974. IEEE Trans Inf Theory 20(6):697– 722 223. Jain AK, Duin RPW, Mao J (2000) Statistical pattern recognition: a review. IEEE Trans Pattern Anal Mach Intell 22(1):4–37 224. Watanabe S (1985) Pattern recognition: human and mechanical. Wiley-Interscience Publication, Wiley 225. Blum AL, Langley P (1997) Selection of relevant features and examples in machine learning. Artif Intell 97(1):245–271 226. Jain AK, Chandrasekaran B (1982) Dimensionality and sample size considerations in pattern recognition practice. In: Krishnaiah PR, Kanal LN (eds) Classification pattern recognition and reduction of dimensionality. Handbook of statistics, vol 2. Elsevier, pp 835–855 227. Saeys Y, Inza I, Larrañaga P (2007) A review of feature selection techniques in bioinformatics. Bioinformatics 23(19):2507–2517 228. Dash M, Liu H (1997) Feature selection for classification. Intell Data Anal 1(1–4):131–156 229. Raymer ML, Punch WF, Goodman ED, Kuhn LA, Jain AK (2000) Dimensionality reduction using genetic algorithms. IEEE Trans Evol Comput 4(2):164–171 230. Guyon I, Elisseeff A (2003) An introduction to variable and feature selection. J Mach Learn Res 3:1157–1182 231. Fan J, Lv J (2010) A selective overview of variable selection in high dimensional feature space. Stat Sin 20(1):101–148 232. Romero E, Sopena JM, Navarrete G, Alquézar R (2003) Feature selection forcing overtraining may help to improve performance. In: 2003 Proceedings of the international joint conference on neural networks, vol 3. IEEE, pp 2181–2186 233. Learned RE, Willsky AS (1995) A wavelet packet approach to transient signal classification. Appl Comput Harmon Anal 2(3):265–278 234. Yen GG, Lin KC (2000) Wavelet packet feature extraction for vibration monitoring. IEEE Trans Ind Electron 47(3):650–667 235. Jin X, Gupta S, Mukherjee K, Ray A (2011) Wavelet-based feature extraction using probabilistic finite state automata for pattern classification. Pattern Recognit 44(7):1343–1356 236. Gaul L, Hurlebaus S (1999) Wavelet-transform to identify the location and force-time-history of transient load in a plate. In: Chang F-K (ed) Structural health monitoring. Technomic Publishing Co, Lancaster, pp 851–860 237. Sohn H, Farrar CR, Hemez FM, Shunk DD, Stinemates DW, Nadler BR, Czarnecki JJ (2004) A review of structural health monitoring literature: 1996-2001. Technical report, Los Alamos National Laboratory, Los Alamos, NM. Report Number LA-13976-MS 238. Lee C, Park S (2011) Damage classification of pipelines under water flow operation using multi-mode actuated sensing technology. Smart Mater Struct 20(11):115002–115010 239. Min J, Park S, Yun CB, Lee CG, Lee C (2012) Impedance-based structural health monitoring incorporating neural network technique for identification of damage type and severity. Eng Struct 39:210–220 240. Sohn H, Farrar CR, Hunter NF, Worden K (2001) Structural health monitoring using statistical pattern recognition techniques. J Dyn Syst Meas Contr 123(4):706–711 114 M. K. Hinders and C. A. Miller 241. Biemans C, Staszewski WJ, Boller C, Tomlinson GR (1999) Crack detection in metallic structures using piezoceramic sensors. Key Eng Mater 167:112–121 242. Legendre S, Massicotte D, Goyette J, Bose TK (2000) Wavelet-transform-based method of analysis for Lamb-wave ultrasonic NDE signals. IEEE Trans Instrum Meas 49(3):524–530 243. Zou J, Chen J (2004) A comparative study on time-frequency feature of cracked rotor by Wigner-Ville distribution and wavelet transform. 
Chapter 3
Automatic Detection of Flaws in Recorded Music

Ryan Laney and Mark K. Hinders

Abstract This chapter describes the application of wavelet fingerprinting as a technique to analyze and automatically detect flaws in recorded audio. Specifically, it focuses on time-localized errors in digitized wax cylinder recordings and contemporary digital media. By taking the continuous wavelet transform of various recordings, we created a two-dimensional binary display of audio data. After analyzing the images, we implemented an algorithm that automatically detects where a flaw occurs by comparing the image matrix against the matrix of a known flaw. We were able to use this technique to automatically detect time-localized clicks, pops, and crackles in both cylinders and digital recordings. We also found that while other extra-musical noises, such as coughing, did not leave a consistently traceable mark on the fingerprint, the affected samples were still distinguishable from samples without the error.

Keywords Digital music editing · Wavelet fingerprint

3.1 Introduction

Practical audio recording began in the late nineteenth century with Thomas Edison's wax cylinders (Fig. 3.1). After experimentation with tinfoil as a recording medium, wax was found to be a more viable and marketable method of capturing sound. A performer would play into the recording apparatus, shaped like a horn, and the pressure would increase as the sound traveled down the horn. That pressure would cause a stylus to etch a groove into the wax mold, which could then be played back. Unfortunately, the material would degrade during playback, and the recording would become corrupted as the wax eroded. This problem persisted into the twentieth century, even as the production process continued to improve. Among the flaws produced are time-localized pops and crackles, which often render cylinder recordings unlistenable. The first step in preserving these cylinders is digitization [1], because once digitized the recording cannot undergo any further damage.

Fig. 3.1 An Edison phonograph (left). An Edison wax cylinder (right) [9]
While many cylinders have been digitized, many of these recordings are not of sufficient commercial value to merit fixing by a sound engineer. The National Jukebox project at the Library of Congress [2] has made digitized recordings from the extraordinary collections of the Library of Congress Packard Campus for Audio Visual Conservation and other contributing libraries and archives available to the public free of charge. At launch, the Jukebox included more than 10,000 recordings made by the Victor Talking Machine Company between 1901 and 1925, and its content is expected to grow regularly. Other ongoing efforts of note include the Smithsonian's Time-Based Media and Digital Art Working Group [3], the National Digital Information Infrastructure and Preservation Program [4], the New York Public Library, which has one of the oldest, largest, and most comprehensive institutional preservation programs in the United States, with activities dating back to 1911 [5], the Stanford Media Preservation Lab [6], the Columbia Center for Oral History [7], and the Media Digitization and Preservation Initiative at Indiana University [8].

Our work aims to make the next phase easier. If we can automatically detect the flaws that make a digitized recording unlistenable, an engineer will have a much easier time fixing them, and that process will be less expensive. Moreover, if multiple recordings of a particular piece exist, an engineer can select which one is worth fixing based on which has the fewest errors. Thus, more of these historically important recordings will be conserved and made available for the public to fully enjoy.

Most recording studios now use digital audio workstations (DAWs) on computers in conjunction with physical mixing stations, and it is common for audio to be recorded at bit depths of 24 bits or more and sample rates of 96 kHz or higher, although this is usually reduced to 16 bits and 44.1 kHz for commercial production. Unlike the case of digitized cylinders, engineers have each instrument's individual track (or tracks) to work with rather than just one master track.
Often this means that a flaw can be isolated to just one track, and a correction at that level is less musically offensive than altering the master. Common problems that recording engineers spend hours finding and fixing include bad splices, accidental noises from an audience, and extra-musical sounds from the performers such as vocal clicks, guitar pick noises, and unwanted breathing. A time-consuming and monotonous portion of many recording engineers' jobs is to go through a track little by little, find these errors, and then correct them. Fortunately, the circumstances of digital recording allow for much easier correction of errors. It is usually a simple matter to create a "correct" version of a track using various processing techniques, as the digital realm gives the engineer essentially unlimited room to duplicate, manipulate, and test different variations. This makes automatic detection of the errors much easier in most cases (Fig. 3.2).

Fig. 3.2 Gunnar Johansen was the first musician to be appointed artist in residence at an American university. He held that post at the University of Wisconsin in Madison from 1939 to 1976, during which he not only taught at the school but also performed several extended series of concerts on the university's radio station. Acting as his own recording producer and technician, he marketed recordings on his Artist Direct record label from his home near Blue Mounds, Wisconsin. In total, there were about 150 recordings issued on LP, including the complete keyboard works of Franz Liszt, J.S. Bach, and the lesser known composer Ignaz Friedman, among others. All of these releases were subsequently made available for purchase on cassette as well. Although it has been said that some of these recordings suffered from Johansen's inexperience as a recording technician, the performances, particularly the Liszt, were highly praised by reviewers in scholarly journals and music magazines. The recording studio included a Hamburg Steinway ca. 1920 Concert Grand (D) (after ca. 1960), a Moór Steinway Double Keyboard (D), a Moór Bösendorfer Double Keyboard, a harpsichord, spinet, virginal, and clavichord built in the 1950s. Johansen's magnetic tape recordings have been digitized and painstakingly restored by his student, Jonathan Stevens [10]

3.2 Digital Music Editing Tools

Programs do exist for correcting errors in audio at a very fine level. CEDAR (http://www.cedar-audio.com), commercial audio restoration software, lets the user look at a spectrogram of the sound and remove unwanted parts by altering the image of the spectrogram. Even cross-fading and careful equalization in most DAWs can eliminate time-localized clicks, pops, and splice errors. However, this can be both a time-consuming and expensive process, as the engineer has to both recognize all the errors in a recording and then figure out how to correct them. In the music industry, this never happens unless the recording is highly marketable in the first place. Due to financial limitations, many historically important recordings on cylinders, magnetic tape, other analog media, and even digital media don't merit a human cleaning. The time and cost it takes to do this would be greatly reduced by an algorithm that automatically detects errors: there is no effort on the engineer's part to find the errors; they simply have to make the correction at the given time. In addition, if multiple versions of a recording exist, an engineer could select which one would be worth fixing based on the number of flaws that require attention.

Previous work in this field has involved using spectrographic analysis as a basis to de-noise a signal using the wavelet transform. The Stationary Wavelet Transform One-Dimensional De-Noising tool in MATLAB (The MathWorks, Inc.) was used to generate a spectrogram of an input signal followed by the spectrogram of the filtered signal. The wavelet decomposition yields coefficients of "approximations" and "details", or lower frequencies and higher frequencies, which can be individually modified with a certain gain. In a sense, this is a type of equalization that relies on a wavelet transform as opposed to a Fourier transform, as all the coefficients belonging to a certain group will be changed according to their individual gain. Unfortunately, spectrographic analysis was not a very practical tool for automating noise removal. While it was useful in making individual sections of a recording better, large-scale de-noising on the recording did not make it more listenable. For a given moment in time, it was appropriate to kill some of the detail coefficients, while removing them at other moments actually detracted from the musical quality.
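The per-band gain idea is straightforward to prototype. The sketch below is our reconstruction of that kind of filter, not the published tool: it uses the discrete decomposition functions from MATLAB's Wavelet Toolbox (wavedec/wrcoef) to rescale each detail band before reconstruction, and the file names, wavelet choice, and gain values are illustrative assumptions.

```matlab
% Per-band wavelet "equalization" sketch: decompose, rescale each detail
% band, and reconstruct. Requires the Wavelet Toolbox; file names, wavelet,
% and gain values are illustrative assumptions, not the authors' settings.
[x, fs] = audioread('example.wav');    % hypothetical input file
x = mean(x, 2);                        % collapse stereo to mono

nLevels    = 5;
wname      = 'coif3';
detailGain = [0.2 0.4 0.7 1.0 1.0];    % gain per detail level (finest first)

[c, l] = wavedec(x, nLevels, wname);   % multi-level decomposition

y = wrcoef('a', c, l, wname, nLevels); % approximation (low-frequency) band
for k = 1:nLevels
    y = y + detailGain(k) * wrcoef('d', c, l, wname, k);  % scaled details
end

audiowrite('example_filtered.wav', y, fs);   % hypothetical output file
```

Setting a detail gain below one suppresses that band, exactly the "kill some of the detail coefficients" operation described above, which is why the result behaves like a time-invariant equalizer rather than a flaw-aware repair.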
We are using a Graphical User Interface (GUI) in MATLAB implemented by the Non-Destructive Evaluation lab at William & Mary to display and analyze the wavelet fingerprint of filtered data (Fig. 3.3) [11–13]. The continuous wavelet transform is used to evaluate patterns related to frequency and intensity in the music, but the output is not a plot of frequency intensities over time like a spectrogram; rather, it is a two-dimensional binary image. This is a viable model because it is much easier to recognize patterns in a binary image than in a spectrogram, and it's easier to tell a computer how to look at it. By examining raw and filtered fingerprints for many variations of a certain type of input, we can make generalizations about how a flaw in a recording will manifest itself in the wavelet fingerprint. The wavelet fingerprint tool works similarly to MATLAB's built-in wavelet tool: we input sound data from the workspace in MATLAB and view it as a waveform in the user interface. Below that, we see a filtered version of the wave file according to the filter parameters, and then the wavelet fingerprint below that.

Fig. 3.3 Wavelet fingerprint tool. Top plot: the entire input waveform. Middle plot: a selected portion of the filtered waveform. Bottom plot: fingerprint of the selected part

The process of writing a detection algorithm involves several important steps. When working with digital recordings, we reproduced a given flaw via recording or processing in a digital audio workstation. Using Avid's Pro Tools DAW (http://www.avid.com), we recorded coughs and instrumental sounds, and generated other noises one might find in a recording. We gathered multiple variations of a type of recorded data from multiple people; it does not reveal anything if the manifestation of one person's cough is unique, for instance, because then the algorithm would only be valid for finding that one error. The amplitude, panning, and reverb of the error can be edited in Pro Tools and then synchronized in time with the clean audio tracks. The wave files are then exported separately, imported into a workspace in MATLAB, and analyzed using the wavelet fingerprint tool. Files are compared with other files containing the same error as well as a control file without the error. We can then filter the waveform using Fourier or wavelet analysis and examine the fingerprints for patterns and similarities.

When working with cylinder recordings, we took a slightly different approach. In this case, we did not simply insert an error into a clean recording; the recordings we worked with had the error to begin with. This lack of control meant that, initially, we did not know exactly what we were looking for. Fortunately, we have access to files that have been processed through CEDAR's de-clicking, de-crackling, and de-hissing system. This does help make the recording more listenable, but the errors are still present to a smaller degree. The cleaned-up files are also louder and include the proper fades at the beginning and end of the track. Thus, this process is somewhat analogous to the mastering process in digital music, the final step in making a track sound its best. By synchronizing the cleaned-up track with the raw track, we can figure out what we are looking for in the wavelet fingerprint. The errors that are affected by CEDAR will appear differently in the edited track, and the rest of the track should manifest itself in a similar way.
However, synchronizing the files can be rather challenging. In a digital audio workstation, it is relatively simple to get the files lined up within several hundredths of a second. At this point, the files will sound like they overlap almost completely, though the sound will be more "full" due to the delay. However, several hundredths of a second at the industry standard 44.1 kHz sample rate can be up to about 2,000 samples. Due to the visual nature of the wavelet fingerprint, we need the files synchronized much better than that; synchronization within about 100 samples is sufficient.

One method of synchronizing the files is simply to examine them visually in a commercial DAW. By filtering out much of the non-musical information using an equalization program, the similarities in the unedited and edited waveforms become clearer. We can continue to zoom in on the waveforms and synchronize them as much as possible until they are lined up down to the sample. Although this is an effective approach, it is not preferable because it is time-consuming and subject to human error. Thus, we have developed a program in MATLAB to automatically synchronize the two waveforms. Like a commercial DAW, our audio workstation allows the user to first manually line up the waveforms to the extent that is humanly practical. A simultaneous playback function exists so that we can verify aurally that the waveforms are reasonably synchronized. At this point, the user can run an automatic synchronization. This program analyzes the waveforms against each other over a specified region and calculates a Pearson correlation coefficient between them. The second waveform is then shifted by a sample, and another correlation coefficient is calculated. After the desired number of iterations, based on how close the user thinks the initial manual synchronization is, the program displays the maximum correlation coefficient and shifts the second waveform by the appropriate amount, thus automatically synchronizing the two waveforms.
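As a rough illustration of that scan (a minimal sketch of our reconstruction, not the tool's actual source), the function below slides one waveform past the other a sample at a time, scores each lag with a Pearson correlation coefficient over a chosen region, and returns the best lag. The function name, arguments, and search window are illustrative assumptions.

```matlab
% Sketch of the correlation-based synchronization described above; the
% function name and arguments are illustrative, not the authors' code.
function bestLag = syncByCorrelation(raw, edited, regionStart, regionLen, maxShift)
    ref  = raw(regionStart : regionStart + regionLen - 1);   % fixed region
    best = -Inf;
    bestLag = 0;
    for lag = -maxShift:maxShift
        idx = (regionStart : regionStart + regionLen - 1) + lag;
        if idx(1) < 1 || idx(end) > numel(edited)
            continue;                       % this shift runs off the waveform
        end
        R = corrcoef(ref, edited(idx));     % 2x2 Pearson correlation matrix
        if R(1,2) > best
            best = R(1,2);                  % keep the highest coefficient
            bestLag = lag;
        end
    end
end
```

The caller then shifts the second waveform by bestLag samples before loading both into the fingerprint tool.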
Like a commercial DAW, our MATLAB implementation also includes a Fourier-based equalization program, a compression algorithm to cut intensities above a certain decibel level, a gate algorithm to cut intensities below a certain decibel level, gain control, and a trim function to change the length of a waveform by adding zero values. Unlike most DAWs, however, our program allows for a more critical examination and alteration of a waveform, including frequency spectrum analysis by way of a discrete Fourier transform, and spectrograms of the waveforms. The equalization tool, while not a paragraphic equalizer, allows carefully constructed multi-band rectangular filters to be placed on a waveform after Fourier analysis. This is particularly helpful in the synchronization process; by removing the low end of the signal (below anything that is musically important) and the high end (above any of the fundamental pitch frequencies, usually no more than several thousand hertz), we can store the unedited and edited waveforms as temporary variables that end up looking much more like each other than they did initially. By running the synchronization algorithm on the musically similar temporary variables, we know exactly how the actual waveforms should match up. With the unedited and edited files synchronized, we can examine them effectively with the wavelet fingerprint tool.

The more significant differences between the fingerprint of the unedited sound and that of the edited sound will likely indicate an error. After we examine multiple instances, if we have enough reason to relate a visual pattern to an audio error, we can use our detection algorithm to automatically locate a flaw. The detection algorithm we have implemented involves a numerical evaluation of the similarity between a known flaw and a given fingerprint. Because the wavelet fingerprint is in reality just a pseudocolor plot of ones and zeros, we can incrementally shift a smaller flaw matrix over the larger fingerprint matrix and determine how similar the two matrices are. To do this, we simply increase a value representing how similar the matrices are by a given amount every time there is a match. The user is able to decide how many points are awarded for a match of zeroes and how many points are awarded for a match of ones. After the flaw is shifted over the entire waveform from beginning to end, we plot the match values and determine where the error is located. The advantage of this approach is that it not only points out where errors likely are, but also allows the user to evaluate graphically where an error might be, in case there is something MATLAB failed to catch.

In using the detection algorithm to analyze audio data, we found that it is often necessary to increase the width of the ridges and decrease the number of ridges for a reasonable evaluation. Multiple iterations of one error (a pop, for instance) often manifest themselves very differently in the fingerprint, so it is helpful to have a more general picture of what is happening sonically from the fingerprint. The match value technique in the detection algorithm gives us an unrepresentative evaluation of similarity between the given flaw and the test fingerprint if we set the ridge detail too fine.
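Since the fingerprint is literally a 0/1 matrix, the scoring loop is compact. Below is a minimal sketch of the match-value scan, under the assumptions that the known-flaw matrix has the same number of levels (rows) as the full fingerprint and that the weights are user-chosen; all names are illustrative.

```matlab
% Sketch of the match-value scan described above (our reconstruction):
% slide a known flaw fingerprint (a small binary matrix) across the
% signal's fingerprint and score each offset with user-chosen weights
% for matching ones and matching zeros.
function score = flawMatchScan(fp, flaw, wOne, wZero)
    width  = size(flaw, 2);
    nShift = size(fp, 2) - width + 1;
    score  = zeros(1, nShift);
    for s = 1:nShift
        win = fp(:, s : s + width - 1);              % window under the flaw
        score(s) = wOne  * sum(win(:) == 1 & flaw(:) == 1) ...
                 + wZero * sum(win(:) == 0 & flaw(:) == 0);
    end
end

% e.g. plot(flawMatchScan(fp, flaw, 2, 1)); peaks suggest flaw locations
```

Weighting ones more heavily than zeros tends to reward ridge overlap rather than the large empty background, which matches the intent of the point-award scheme described above.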
A Fourier transform is a powerful tool for showing the frequency decomposition of a given signal, but we would like to see frequency, intensity, and time all in one representation. A spectrogram shows time and frequency on two axes and intensity with color; we can see what frequencies are present, and how much, at a given time. Programs exist which allow the user to modify the colors in the spectrogram to edit the sound, but spectrograms are difficult to use for automation: the false-color image has to be segmented first in order to define the boundaries of shapes, which makes it difficult to mathematically associate the image with a particular type of flaw. A wavelet transform lends itself to our goals much better. Essentially, we calculate the similarity between a wavelet function and a section of the input signal, and then repeat the process over the entire signal. The wavelet function is then rescaled, and the process is repeated. This leaves us with many coefficients for each different scale of the wavelet, which we can then interpret as approximation or detail coefficients. Scale is closely related to frequency, since a wavelet transform taken at a low scale corresponds to a higher frequency component. This is easy to understand by imagining how Fourier analysis works; by changing the scale of a sine wave, we are essentially just changing its frequency. At this point, we have a representation of time, frequency, and intensity all at once, and we can plot the coefficients in terms of time and scale simultaneously. The wavelet fingerprint tool we have implemented takes these coefficients and creates a binary image of the signal, much like an actual fingerprint, which is easy for both humans to visually interpret and computers to mathematically analyze (Fig. 3.4).

Fig. 3.4 Wavelet fingerprints of an excerpt generated by the coiflet3 wavelet (top) and Haar wavelet (bottom)

The wavelet fingerprint tool currently runs from a 650-line code in MATLAB, which we modify as needed for practicality and functionality. The function is implemented in a GUI, so users can easily manipulate input parameters. By clicking on a section of the filtered signal, the user can view the signal's fingerprint manifestation, and clicking on the fingerprint will also let the user view what is happening at that point in the signal to cause that particular realization. The mathematically interesting part of the code occurs almost exclusively in the last hundred lines, where it creates the fingerprint from the filtered waveform. It obtains the coefficients of the continuous wavelet transform from MATLAB's built-in cwt function. It takes as input parameters the filtered wave between the left and right bounds selected by the user, the number of levels (which is related to scale) to use, and the wavelet to use for the analysis. This creates a two-dimensional matrix of coefficient values, which are then normalized to 1 by dividing each component of the matrix by the overall maximum value. Initially, the entire fingerprint matrix is set to the value zero, and then certain areas are set to one based on the values of the continuous wavelet transform coefficients, the user-selected parameter for how many "ridges" will appear, and how thick they will be. Throughout the domain of the fingerprint, if a coefficient lies above a ridge's contour level minus half the ridge thickness and below that level plus half the thickness, we set the fingerprint value at that point to one. This outputs a "fingerprint" whose information is contained in a two-dimensional matrix. For each fingerprint, the two-dimensional matrix is displayed as a pseudocolor plot at the bottom of the interface.
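A condensed sketch of that last hundred lines (our reconstruction of the core idea, not the 650-line tool itself) might look as follows, assuming the older scale-vector syntax of the Wavelet Toolbox's cwt; the trailing comment shows the 75-level, 15-ridge, 0.03-thickness convention quoted later in the chapter.

```matlab
% Sketch of the fingerprinting step described above: take cwt coefficients,
% normalize, and mark thin bands around evenly spaced contour levels.
% Assumes the legacy cwt(signal, scales, wname) syntax; names illustrative.
function fp = waveletFingerprint(sig, nLevels, wname, nRidges, thickness)
    coefs = cwt(sig, 1:nLevels, wname);        % nLevels x numel(sig) matrix
    coefs = abs(coefs) / max(abs(coefs(:)));   % amplitude only, scaled to [0, 1]
    fp = zeros(size(coefs));                   % start with an all-zero image
    for r = 1:nRidges
        c = r / nRidges;                       % contour height of ridge r
        fp(abs(coefs - c) < thickness/2) = 1;  % mark a band around the contour
    end
end

% e.g. fp = waveletFingerprint(x, 75, 'coif3', 15, 0.03);
%      imagesc(fp); axis xy;                   % view the binary fingerprint
```

Thresholding around evenly spaced contour levels is what produces the concentric, fingerprint-like ridge bands, and widening the thickness parameter is the "increase the width of the ridges" adjustment mentioned earlier.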
3.3 Method and Analysis of a Non-localized Extra-Musical Event (Coughing)

To analyze extra-musical events as errors in MATLAB, we generated them artificially using Pro Tools and tested many different variations of them with the wavelet fingerprint tool. We exported the music track and the overdubbed cough track together as a WAV file to be used in MATLAB. We always use WAV files because the format is lossless and gives us the best possible digital version of our original recording; we never want to suffer the losses that come from compression and the elimination of higher frequencies in formats like mp3, despite the great reduction in file size. The file was then sent to the workspace in MATLAB. By muting the cough, we created a control file without any extraneous noise and sent that to the workspace as well. We always exported audio files in mono rather than stereo, because currently the wavelet fingerprint tool can only deal with a single vector of input values rather than two (one each for left and right). For each file, we renamed the "data" portion of the file, containing the intensity of the sound at each sample, and removed the "fs" portion, containing the sampling rate (44.1 kHz). This created two variables in the workspace: a vector of intensities for the "uncorrupted" file and a vector for the file with coughing.
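A sketch of that import step is below; the file names and variable names are illustrative, and audioread is the current MATLAB reader (the original workflow loaded equivalent data and fs variables into the workspace).

    % Import the lossless WAV exports and keep a single mono vector each,
    % since the fingerprint tool expects one vector of sample intensities.
    [y, fs] = audioread('take_with_cough.wav');   % fs should be 44100
    coughData = y(:, 1);                          % keep one channel (mono)

    yc = audioread('take_clean.wav');             % control file, cough muted
    cleanData = yc(:, 1);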
When we call the wavelet fingerprint tool in the command window, the loaded waveform appears above its fingerprint (Fig. 3.5). In the top plot, we see the original waveform. The second plot shows an optionally filtered (here still clean) version of the selected part of the first plot. The third image shows the fingerprint, in this case created by the coiflet3 wavelet. The black line on the fingerprint corresponds to the red line on the second plot, so we can see how the fingerprint relates to the waveform. In the third plot, the yellow areas represent ridges, or "ones" in the modified wavelet transform coefficient matrix. We can also load the file with the coughing and examine it in a similar fashion. As a convention, we always set the ridge number to 15 and the ridge thickness to 0.03 for unfiltered data.

Fig. 3.5 Analyzing the data with the wavelet fingerprint tool

Fig. 3.6 Comparing wavelet fingerprints using the wavelet fingerprint tool (no error present)

The code allows us to take a snapshot of all the loaded fingerprints with the "compare fingerprints" button. This lets us see each of the fingerprints within the most recent bounds of the middle plot (shown in Fig. 3.5). Purposely, we have selected an area of the fingerprint that does not contain a cough (Fig. 3.6). There is no visually obvious difference in the fingerprint; as we expect, since if there is no difference in the fingerprint, no error exists in this part of the sample. Shifting the region of interest to a spot with the error (Fig. 3.7), we see in the second fingerprint that the thumbprint looks very skewed at higher levels. The ridges seem to stretch significantly further than they did at the lower levels, but their orientation and position are largely unchanged. As expected, introducing an error also introduces a difference in the waveform. But now we have to test against many other types of errors to determine what the relationship is. Fortunately, Pro Tools makes it easy to keep track of many different tracks and move sound clips to the appropriate location in time.

Fig. 3.7 Difference in fingerprint with (bottom) and without (top) coughing

Fig. 3.8 Comparing many different recorded coughs

We then compared many different coughing samples in MATLAB (Fig. 3.8). While these coughs manifest themselves differently in the music, one similarity we do notice is an exaggeration of some of the upper-level characteristics of the clean sample (the top fingerprint). For instance, in most variations, we notice a distortion in the "hook" that occurs at 500 samples around level 50. While this observation is vague, it is still somewhat significant, since these samples are taken from different "types" of coughs and from different people. Unfortunately, after testing the same coughing samples against several different audio recordings, the fingerprints did not show sufficient visual consistency. Although these results are inconclusive, we think that if we filter the track in some ideal way in future studies, we should be able to identify where a cough is likely to occur. The rest of our analysis focuses on more time-localized errors.

3.4 Errors Associated with Digital Audio Processing

In mixing music, if an engineer wants to put two sounds together in immediate sequence, the clips are lined up next to each other in a DAW and each portion is cross-faded with the other over a short time span: one signal fades out while the other fades in. When this is done incorrectly, it can leave an awkward jump in the sound that sometimes manifests itself as an audible and very un-musical popping noise.
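For illustration, here is a sketch of the two splice types compared in the tests that follow: a naive butt splice with no cross-fade versus a short linear cross-fade. The clips are assumed to be mono column vectors, and the 10 ms fade length is an arbitrary choice.

    % Splice clipA into clipB, without and with a short linear cross-fade.
    fs   = 44100;
    nFad = round(0.010 * fs);          % 10 ms cross-fade (illustrative)

    % Bad splice: butt the clips together, leaving a discontinuity.
    badSplice = [clipA; clipB];

    % Good splice: overlap the boundary samples with complementary ramps,
    % so one signal fades out while the other fades in.
    fadeOut = clipA(end-nFad+1:end) .* linspace(1, 0, nFad)';
    fadeIn  = clipB(1:nFad)         .* linspace(0, 1, nFad)';
    goodSplice = [clipA(1:end-nFad); fadeOut + fadeIn; clipB(nFad+1:end)];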
For this analysis, we worked with three different types of music one might have to splice in the studio: tracks of acoustic guitar strumming, vocal takes spliced together, and the best takes of recorded audio from a live performance spliced together, which is very useful when there are multiple takes and the performers mess some of them up.

First, we examined chord strumming recorded on an acoustic guitar. One part of a song was recorded, then the second part was recorded separately, and the two pieces were put together. The three plots in Fig. 3.9 show the track spliced together correctly, at a musically acceptable moment with a brief cross-fade, followed by two incorrect tracks, which are missing the cross-fade. The last track even has the splice come too late, so there is a very short silence between the two excerpts. Mathematically, the second track just contains a slight discontinuity at the point of the splice, but the third track contains two discontinuities and a region of zero values. The late splice leaves an obvious mark: at the lower levels, the ridges disappear entirely. The flaw in the second graph is less obvious. At about 250 samples, the gap between the two adjacent "thumbprints" is sharper and narrower in the second plot.

Fig. 3.9 Comparing acoustic guitar splices: a correct splice (top), a bad splice (middle), zero data in front of splice (bottom)

We then ran the same analysis with vocal data. We found some similarities, but the plot of the fingerprint is so dependent on the overall volume that it is hard to determine what is directly connected with the flaw and what isn't (Fig. 3.10). The triangular shape still appears prominently in the third plot, due to the zero values, but the second plot is a little harder to analyze now. Although there is no cross-fade, the discontinuity between the first and second waveforms is smaller. However, at about 300 samples, the gap between the two thumbprints is once again complete, narrow, and sharp, although a little more curved this time.

Fig. 3.10 Comparing vocal splices: a correct splice (top), a bad splice (middle), zero data in front of splice (bottom)

Finally, to work ensemble playing into our analysis, we chose to examine a performance of an original composition for piano, guitar, percussion, and tenor saxophone (Fig. 3.11). This piece was performed at a reading session, so the complete recording was made by splicing together different parts of the piece from the best takes; a full-length, un-spliced recording does not exist. Once again, in the third plot, we see the characteristic triangle shape. In the upper plots there is little difference, though there is a slight sharpening of the gap right after 300 samples in the second plot.

Fig. 3.11 Comparing splices of a live recording: a correct splice (top), a bad splice (middle), zero data in front of splice (bottom)

While this result certainly is helpful, it is probably not codeable, since sharpness of a gap is not an artifact unique to a missing cross-fade. However, the presence of similar visual patterns confirmed that further investigation of splices was warranted.

3.5 Automatic Detection of Flaws in Cylinder Recordings Using Wavelet Fingerprints

Using files from the University of California at Santa Barbara's Cylinder Conservation Project [1], we compared the fingerprints of unedited and edited digitized cylinders using the wavelet fingerprint tool. The files provided were in stereo but are actually identical in the left and right channels, because stereo sound did not exist in cylinder recording, so we converted the files to mono by taking only the left channel's data. As before, the unedited file and the edited file are each a vector the length of the waveform in samples. Because a complete song is millions of samples long, it is impractical for the wavelet fingerprint program to process all the data at once. Thus, we examined the complete audio file using our digital audio workstation implemented in MATLAB to look for specific spots we wanted to check for errors. CEDAR's de-clicking program did a relatively good job of removing many of the louder clicks, but residual clicks still affect the sound in a less overbearing way. We compared several fingerprints of clicks in the unedited waveform with the fingerprints of the edited waveform, using a relatively large ridge width of 0.1 and 20 ridges so we could look for more general trends in the fingerprint (Fig. 3.12). At about 2,258,000 samples, we found a noticeable difference in the two images. The flower-like shape that occurs at about 400 samples is common in other instances of clicks as well. From about 256,000–256,750 samples (Fig. 3.13) and at about 103,300 samples (Fig. 3.14), we observed a similar shape.

Fig. 3.12 Comparing wavelet fingerprints of the cylinder data (between 2,257,600 and 2,258,350 samples): edited data (top), unedited data containing pop (bottom)

Fig. 3.13 Instance of a click at 256,000 samples (between 300 and 400 samples in the fingerprint): edited waveform fingerprint (top) and unedited waveform fingerprint (bottom)

Fig. 3.14 Instance of a click at 103,300 samples (slightly less than 400 samples in the fingerprint): edited waveform fingerprint (top), unedited waveform fingerprint (bottom)
We found that this particular manifestation is more visually apparent when the musically important information is quieter than the noise. We then chose a flaw to use as the control, compared it against a larger segment of the wavelet fingerprint, and, using our automatic flaw detection algorithm, located the error. Taking the first error to generate our generic pop fingerprint, we stored the area from 400 to 450 samples in a 75 × 50 matrix. This was shifted over the entire test fingerprint, counting only matching ridge values, since, as the fingerprints show, the placement of the zero values is not very consistent from shape to shape. Assigning one match point for each matching one and no points for matching zeros, the program detected the flaw slightly early, at 296 samples (Fig. 3.15). However, it is visually apparent that the flaw most likely occurs in the neighborhood of 300 samples, so the program did a reasonably good job of locating it.

Fig. 3.15 Using the error at 2,258,000 samples as our generic flaw, we automatically detected the flaw at 256k samples (beginning at 296 samples in the fingerprint) using our flaw detection program

We expanded on this method further by filtering the waveform before the fingerprint was created. Filtering a temporary copy of the original waveform revealed useful information about where important events are located in the waveform. Using the coiflet3 wavelet as our de-noising wavelet, we removed the third through fifth approximation coefficients as well as the second through fifth detail coefficients on the above excerpts. We found that while much of the musical information is lost, information about where the errors occurred is exaggerated. At 256,500 samples, we observed a prominent fork shape in the unedited fingerprint (Fig. 3.16). In taking these data, we normalized each fingerprint individually because of the large difference in intensities between the filtered versions of the clean and unedited wave files. We observed the same shape at 103,300 samples (Fig. 3.17). For all filtered data, we use a ridge number of 20 and a thickness of 0.05.

Fig. 3.16 Using the coiflet3 wavelet to de-noise the input data, we observed the flaw at 256,500 samples with the Haar wavelet (about 400 samples in the fingerprint): filtered edited data (top) and its associated fingerprint (second from top), filtered unedited data (second from bottom) and its associated fingerprint (bottom)

Fig. 3.17 Using the coiflet3 wavelet to de-noise the input data, we observed the flaw at 103,300 samples with the Haar wavelet (about 400 samples in the fingerprint): filtered edited data (top) and its associated fingerprint (second from top), filtered unedited data (second from bottom) and its associated fingerprint (bottom)
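A simplified stand-in for this filtering step is sketched below, using the standard Wavelet Toolbox calls wavedec, wthcoef, and waverec. It zeroes whole detail levels of a coiflet3 decomposition; the chapter's handling of individual approximation levels came from its own custom code and is not reproduced here.

    % Simplified wavelet de-noising: 5-level coiflet3 decomposition,
    % zero the detail coefficients at levels 2-5, then reconstruct.
    [c, l] = wavedec(x, 5, 'coif3');   % multilevel discrete decomposition
    c = wthcoef('d', c, l, 2:5);       % kill detail coefficients, levels 2-5
    xFilt = waverec(c, l, 'coif3');    % reconstruct the filtered waveform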
Taking this fork feature as our representative pop sample, we counted the number of zeros and ones in the first fingerprint that match, to see if we could automatically locate it. Our program caught the flaw perfectly at about 350 samples (Fig. 3.18). The weakness of this method is that it can make a flaw appear more likely, for instance, at zero samples than at 300 samples. For a more accurate search, we need to develop a method that does not count the zero matches unless there is a significant number of ones matches.

Fig. 3.18 We took the shape in Fig. 3.17 as our generic flaw and automatically detected the error in Fig. 3.16 using our flaw detection program

Interestingly, while it is difficult to locate the crackling noise made by cylinders just by looking at the waveform, we believe that we can successfully identify many instances through the same filtering process. Since crackles are so widespread throughout the unedited waveform, we simply selected a portion of the file and examined the fingerprint for the above shape. For instance, we chose the region at approximately 192,800–193,550 samples in the same file (Fig. 3.19) and saw fork shapes appear frequently in both waveforms, though less defined in the edited waveform than in the unedited one. At about 292,000 samples, we observed a similar pattern (Fig. 3.20). While both fingerprints contain the fork shape throughout, the forks are more defined in the unedited waveform. Since the well-defined forks seem to correspond to intensity extremes, we think they relate to the degree of presence of the crackling sounds.

Fig. 3.19 Using wavelet filtering and fingerprinting to find where clicks occur between 192,800 and 193,550 samples: filtered edited data (top) and its associated fingerprint (second from top), filtered unedited data (second from bottom) and its associated fingerprint (bottom)

Fig. 3.20 Using wavelet filtering and fingerprinting to find where clicks occur between approximately 292,600 and 293,350 samples: filtered edited data (top) and its associated fingerprint (second from top), filtered unedited data (second from bottom) and its associated fingerprint (bottom)

Unfortunately, our algorithm for locating these forks is of little use in this situation: we need a program that counts the number of occurrences of forks per a certain number of samples, rather than individual matching points. When we run the fingerprint through the algorithm we already have, our results are only moderately useful. We assign match points for both ones and zeros so that we can account for how well defined the fork shapes are. Figure 3.21 shows the relative likelihood of a flaw in the unedited waveform and in the edited waveform; the peaks in the match values are higher for the unedited waveform, suggesting that there is more crackling present in the unedited version.

Fig. 3.21 Match values for the unedited (left) and edited (right) waveforms between 292,600 and 293,350 samples and the fork-shaped flaw
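The per-block fork counting called for here might look like the following sketch, which thresholds the match-value trace from the flaw scan and counts distinct above-threshold groups in each fixed-length block; the threshold and block length are illustrative choices, not values from the chapter.

    % Count fork-like detections per block of a match-value trace (scores).
    blockLen = 750;                          % samples per block (illustrative)
    thresh   = 0.8 * max(scores);            % detection threshold (illustrative)
    hits     = scores > thresh;              % candidate fork-match points
    starts   = find(diff([0, hits]) == 1);   % rising edges = distinct detections
    nBlocks  = ceil(numel(scores) / blockLen);
    forkCount = zeros(1, nBlocks);
    for b = 1:nBlocks
        lo = (b-1)*blockLen + 1;
        hi = min(b*blockLen, numel(scores));
        forkCount(b) = sum(starts >= lo & starts <= hi);
    end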
3.6 Automatic Detection of Flaws in Digital Recordings

A common issue in editing is the latency of the cross-fade between splices. For example, it is often necessary for an engineer to go through vocalists' recordings and make sure everything is in tune. This can be fixed in a DAW by selecting the part of the file that is out of tune and simply raising or lowering its pitch. However, an extraneous noise such as a click or pop may be created by the change in the waveform: there might be a discontinuity, or the waveform may stay in the positive or negative region for too long. Oftentimes, an automatically generated cross-fade will not solve the problem, so an engineer needs to go through a recording, track by track, and listen for any bad splices. Moreover, after the problem is located, the engineer must make a musical judgment on how to deal with it. The cross-fade cannot be too long, or the separate regions will overlap, and filtering only a portion of the data could make that portion sound too strange. However, using wavelet filtering and fingerprint comparison techniques, it is simple to automatically locate bad splices that require additional attention.

In Pro Tools, simulating this is straightforward. We took the vocal track from a recording session and fully examined it, finding and correcting all the pitch errors. Then, we listened to the track for any sort of error that resulted from the corrections. We duplicated the corrected track and, at each instance of an error, manually fixed it on the control track. The track was exported from Pro Tools and examined using the wavelet fingerprint tool (Fig. 3.22). Fortunately, since we were working in the digital realm, we already knew exactly where the flaws were, and there was no need to use our audio workstation to automatically sync the two tracks.

Fig. 3.22 Instance of a splice at 81,600 samples (about 400 samples in the fingerprint): edited waveform (top) and associated fingerprint (second from top), unedited waveform (second from bottom) and associated fingerprint (bottom). Fingerprint created using the Haar wavelet

At about 81,500 samples, we see a very distinct image corresponding to the flaw: a discontinuity in the waveform that sounds like a small click. The unedited waveform, unlike the control, has a triangular feature in the middle of the fingerprint. Removing the fifth approximation coefficient using the coiflet3 wavelet, as well as the second through fifth detail coefficients, reveals a very distinct fork shape (Fig. 3.23). We already know from our analysis of cylinder flaws that it is possible to code for this shape. In one way, this is a very good thing, because it means that this shape, when it is well defined, is indicative of a flaw, and a computer has a relatively easy time finding it. However, it does not say much about which particular type of flaw it is; for that information, we need to rely on the appearance of unfiltered or less filtered data.

Fig. 3.23 Using the coiflet3 wavelet to de-noise the input data, we observed the flaw at 81,600 samples with the Haar wavelet (about 400 samples in the fingerprint): filtered edited data (top) and its associated fingerprint (second from top), filtered unedited data (second from bottom) and its associated fingerprint (bottom)

At 117,300 samples, we observed another error from pitch correction, this time a loud pop. The waveform stays in the positive region for too long as a result of the splicing. By filtering out the second through fifth detail coefficients using the coiflet3 wavelet, we noticed that while the error is visually identifiable in the fingerprint, it is a different type of error than the previous one we examined. Adding the fifth approximation coefficient to the filter, we see the characteristic fork shape once again (Fig. 3.24). Once more, the heavy filtering with the coiflet3 wavelet successfully marked an error, but we cannot say from the fingerprint exactly what kind of error it is.

Fig. 3.24 Instance of a splice at 117,300 samples (slightly before 400 samples in the fingerprint), with the second through fifth detail coefficients removed as well as the fifth approximation coefficient using the coiflet3 wavelet: edited waveform (top) and associated fingerprint (second from top), unedited waveform (second from bottom) and associated fingerprint (bottom). Fingerprint created using the Haar wavelet

Next, we analyzed the fingerprint of an electric guitar run through a time compression program. An engineer would need to change how long a note or group of notes lasts if a musician was not playing in time, but as with pitch corrections, this can result in flaws. We found that although the instrument is completely different, the error manifests itself in a similar way in the wavelet fingerprint. At about 567,130 samples, there is a discontinuity in the waveform resulting from the time compression (to the left). We made a control track by cross-fading the contracted and normal sections of the waveform and analyzed the differences in the wavelet fingerprint in MATLAB.
Using the Haar wavelet to generate the fingerprint without any pre-filtering, we found that a triangular figure manifests itself in the middle of the fingerprint where the splice occurs (Fig. 3.25). This "discontinuity" in the fingerprint was not apparent in that of the control track. Next, we used the coiflet3 wavelet to filter the waveform (Fig. 3.26). We killed the fifth approximation coefficient as well as the second through fifth detail coefficients and found that the typical pronounced fork shape occurred at the discontinuity in the waveform. In the fingerprint of the control waveform, there was a lack of clarity between the forks, and the forks themselves were once again not well defined. Although the cause of the discontinuity was different in this test, it manifested itself in the fingerprint in the same way. Thus, we were able to identify both that it is a discontinuity error and its location in the sample.

3.7 Discussion

Wavelet fingerprinting proved to be a useful technique for portraying and detecting time-localized flaws. We were able to detect pops and clicks in both digital recordings and digitized cylinders very accurately and effectively through a process of filtering the input waveform and analyzing the fingerprint. Our algorithm precisely located the point on the wavelet fingerprint where the flaw began and provided a useful graphical display of the relative likelihood of an error. We found that the same algorithm could be used to detect many manifestations of the same flaw. Whether it was produced by a bad splice from placement, tuning, or tempo adjustment, or induced by cylinder degradation, we were able to find where the problem occurred. While our filtering process and algorithm did not tell us exactly what type of error occurred or how prominent it was, analysis of the unfiltered fingerprint did reveal a visual difference between flaws.

Wavelet fingerprinting proved relatively ineffective for detecting coughs. We think that because events like this are more sonically diverse and less time-localized, it was not possible for us to create an automatic detection algorithm. With a high level of filtering, it may be possible in future work to locate such errors. We also hope in future studies to further develop a method of detecting less audible flaws, such as the crackles we noticed in cylinder files, with a higher level of mathematical precision and certainty. We look to extend our program's reach to musical flaws such as vocal clicks and the sound of a guitarist's pick hitting a fingerboard. Such a tool could be helpful for engineers in both audio restoration and digital mastering.
Fig. 3.25 Using the Haar wavelet to create the fingerprint without any pre-filtering, we observed the flaw at about 567,130 samples (slightly before 400 samples in the fingerprint): filtered edited data (top) and its associated fingerprint (second from top), filtered unedited data (second from bottom) and its associated fingerprint (bottom)

There are several steps we must take to fully apply wavelet fingerprinting as an error detection technique. First, we must modify our algorithm to search entire tracks for errors and display every instance where one occurs, rather than just the several hundred samples where one most likely occurs. We can do this essentially by placing intensity marks (for instance, a color or a number) at every point along the waveform indicating how likely it is that an error occurs there. When searching for one given type of error, we suspect that these intensity marks will spike at the points of occurrence. Furthermore, we hope to modify our algorithm to detect flaws accurately based on the type of flaw. This involves inventing a more flexible and empirical method of detecting a flaw, which could mean not necessarily looking at all levels of the fingerprint, allowing uncertainty in how close matching points are, contracting or expanding the flaw matrix, or assigning match points more liberally or conservatively. It will be a challenge, but our goal is to make our algorithms more adaptive as we continue to investigate more flaws and improve our current detection methods.

Fig. 3.26 Using the coiflet3 wavelet to de-noise the input data, we observed the flaw at 81,600 samples with the Haar wavelet (between 300 and 400 samples in the fingerprint): filtered edited data (top) and its associated fingerprint (second from top), filtered unedited data (second from bottom) and its associated fingerprint (bottom)

Acknowledgements The authors would like to thank Jonathan Stevens for numerous helpful discussions.

References

1. UC Santa Barbara Library (n.d.) Cylinder recordings: a primer. Retrieved 2010, from the Cylinder Preservation and Digitization Project: http://cylinders.library.ucsb.edu/
2. The National Jukebox project at the Library of Congress, http://www.loc.gov/jukebox/
3. Smithsonian's Time-Based Media and Digital Art Working Group, https://www.si.edu/tbma/about
4. National Digital Information Infrastructure and Preservation Program, http://www.digitalpreservation.gov/about/index.html and https://www.diglib.org/
5. New York Public Library Preservation Division, https://www.nypl.org/collections/nypl-collections/preservation-division
6. Stanford Media Preservation Lab, https://library.stanford.edu/research/digitization-services/labs/stanford-media-preservation-lab
7. Columbia Center for Oral History, https://library.columbia.edu/locations/ccoh.html
8. Media Digitization and Preservation Initiative at Indiana University, https://mdpi.iu.edu/index.php
9. Edison phonograph and wax cylinder. http://en.wikipedia.org/wiki/Phonograph and http://en.wikipedia.org/wiki/Phonograph_cylinders
10. Gunnar Johansen, musician and composer. https://www.nytimes.com/1991/05/28/obituaries/gunnar-johansen-85-a-pianist-composer-and-instructor-is-dead.html; https://www.discogs.com/artist/3119653-Gunnar-Johansen-2
11. Hou J, Hinders MK (2002) Dynamic wavelet fingerprint identification of ultrasound signals. Mater Eval 60(9):1089–1093
12. Graps A (n.d.) An introduction to wavelets. Retrieved 2011, from the Institute of Electrical and Electronics Engineers, Inc.: http://www.amara.com/IEEEwave/IEEEwavelet.html
13. The MathWorks, Inc. (n.d.) Continuous wavelet transform: a new tool for signal analysis. Retrieved 2010, from The MathWorks - MATLAB and Simulink for Technical Computing: http://www.mathworks.com/help/toolbox/wavelet/gs/f3-1000759.html

Chapter 4 Pocket Depth Determination with an Ultrasonographic Periodontal Probe

Crystal B. Acosta and Mark K. Hinders

Abstract Periodontal disease, commonly known as gum disease, affects millions of people. The current method of detecting periodontal pocket depth is painful, invasive, and inaccurate. As an alternative to manual probing, the ultrasonographic periodontal probe is being developed to use ultrasound echo waveforms to measure periodontal pocket depth, which is the main measure of periodontal disease. Wavelet transforms and pattern classification techniques are used in artificial intelligence routines that can automatically detect pocket depth. The main pattern classification technique used here, called a binary classification algorithm, compares test objects with only two possible pocket depth measurements at a time and relies on dimensionality reduction for the final determination. The method correctly identifies up to 90% of the ultrasonographic probe measurements within the manual probe's tolerance.

Keywords Probe · Ultrasonography · Binary classification · Wavelet fingerprint

4.1 Introduction

In the clinical practice of dentistry, radiography is routinely used to detect structural defects such as cavities, but too much ionizing radiation is harmful to the patient. Radiography can also only detect defects parallel to the projection path, so cracks are difficult to detect, and it is useless for identifying conditions such as early-stage gum disease because soft tissues are transparent to X-rays. Medical ultrasound, however, is safe to use as often as indicated, and computer interpretation software can make disease detection automatic. The structure of soft tissues can be analyzed effectively with ultrasound, and even symptoms such as inflammation can be registered with this technology [1–6]. An ultrasonographic periodontal probe was developed as a spinoff of NASA technology [7–15].

Fig. 4.1 The figure compares (a) manual periodontal probing, in which a thin metal probe is inserted in between the tooth and gums, versus (b) the ultrasonographic periodontal probe, in which ultrasound energy propagates through water non-invasively

Periodontal disease is caused by bacterial infections in plaque, and its advanced stages can cause tooth loss when the periodontal ligament holding the tooth in place erodes [16]. Periodontal disease is so widespread that, worldwide, 10–15% of adults have advanced stages of the disease, with deep periodontal pockets that put them at risk of tooth loss [17]. The usual method of detection is with a thin metal probe scribed with gradations marking depth in millimeters (Fig. 4.1a).
The dental hygienist inserts the probe down into the area between the tooth and gum to estimate the depth to the periodontal ligament. At best, this method is accurate to ±1 mm, and the reading depends on the force the hygienist uses to push the probe into the periodontal pocket. Furthermore, this method is somewhat painful and often causes bleeding [18]. The ultrasonographic periodontal probe instead uses high-frequency ultrasound to find the depth of the periodontal ligament non-invasively. An ultrasonic transducer projects high-frequency (10–15 MHz) ultrasonic energy in between the tooth and the gum and detects echoes of the returning wave (Fig. 4.1b). In the usual practice of ultrasonography¹, the time delay of the reflection is converted to a distance measurement using the speed of sound in water (1482 m/s). However, both experimental and simulation waveforms show that the echoes from the anatomy of interest are smaller than the surrounding reflections and noise, so further mathematical techniques, including a wavelet transform and pattern classification, are required to identify pocket depth from these ultrasound waveforms.

¹ In the medical field, ultrasound (ultrasonic) and ultrasonography (ultrasonographic) are used interchangeably to describe MHz-frequency acoustic waves used for diagnostic applications. In dentistry, however, ultrasound or ultrasonic refers to kHz-frequency vibrating tools (scalers) used for cleaning, which is a different application of the same technology. Here, we prefer "sonographic" or "ultrasonographic" to imply the diagnostic application of ultrasound energy.
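For reference, the conventional pulse-echo conversion mentioned above is simple: the echo travels down to the reflector and back, so the one-way distance is half the round-trip delay times the speed of sound in water. A toy example with an illustrative delay:

    % Conventional pulse-echo time-to-distance conversion in water.
    c  = 1482;               % speed of sound in water, m/s
    dt = 2.7e-6;             % example round-trip echo delay, s (illustrative)
    d  = c * dt / 2;         % one-way distance in meters
    fprintf('Depth estimate: %.2f mm\n', d * 1e3);   % about 2.00 mm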
The following sections describe the development of machine learning algorithms for the ultrasonographic periodontal probe. Section 4.2 describes related work on applying wavelet transforms for feature extraction and pattern classification to ultrasound waveforms, as well as other applications of pattern classification in the medical field. Section 4.3 describes the materials and methods for gathering the ultrasound data. Sections 4.4 and 4.5 describe the feature extraction and selection procedures in preparation for the binary classification algorithm described in Sect. 4.6. Section 4.6.2 describes dimensionality reduction, and Sect. 4.6.3 explores classifier combination. The results are presented in Sect. 4.7 and statistically analyzed in Sect. 4.8, with conclusions presented in Sect. 4.9.

4.2 Related Work

Previous researchers have made use of the wavelet transform for pattern classification applications [19, 20]. One option is to integrate wavelets directly by capitalizing on their orthogonality to estimate the class density functions [21]. Most applications of wavelets to pattern recognition, however, focus on feature extraction [22]. One common method involves taking the wavelet transform of a continuous variable (often a signal) and computing the spectral density, or energy, which is the square of the coefficients [23, 24]. Peaks of the spectral density, or the sum of the density, can be used as features; this has been applied to flank wear estimation in turning processes and classifying diseased lung sounds [23], as well as to evaluating astronomical chirps and the equine gait [24]. This technique is similar to finding the cross-correlation but is only one application of wavelets to signal analysis. One alternative is to deconstruct the signal into an orthogonal basis, such as Laguerre polynomials [25]. Another technique is the adaptive wavelet method [26–29], which stems from multiresolution analysis [30]. Multiresolution analysis applies a wavelet transform using an orthogonal basis, yielding filter coefficients in a pyramidal computation scheme, while adaptive wavelet analysis uses a generalized M-band wavelet transform to similarly decompose the signal into coefficients, which are inserted into matrix form for optimization. Adaptive wavelets yield efficient compression and have the advantage of being widely applicable.

Wavelet methods are also often applied for feature extraction in images, such as for shape characterization and finding boundaries [31, 32]. Wavelets are particularly useful for detecting singularities, and in 2D data spaces this gives an ability to identify corners and boundaries. Shape characterization itself is a precursor to template matching in pattern classification, in which outlines of objects are extracted from an image and matched to known shapes from a library. Other techniques are similar to those described above, including multiresolution analysis, which is also similar to the image processing technique of matched filters [33–37]. Either libraries of known wavelets or wavelets constructed from the original signal are used to match the signal of interest [33]. Pattern recognition then proceeds in a variety of ways from the deconstructed wavelet coefficients. The coefficients, with minimal cost, can be used as features [34], or the results from each sequential step can be correlated individually [35]. To reduce dimensionality, projection transforms are sometimes applied before decomposition [36, 37]. Some authors have constructed a rotationally invariant projection that deconstructs an image into sub-images and transforms the mathematical space from 2D to 1D [21, 38, 39]. Constructing a set of orthonormal bases, just as in the case of adaptive wavelets above, remains useful for images [40]. Because of the dimensionality, computing the square of the energy is cumbersome, and so a library of codebook vectors is necessary for classification [41].

There have been many successful applications of pattern recognition in ultrasound, some of which include wavelet feature extraction methods. In the industrial domain, Tansel et al. [42, 43] selected coefficients from wavelet decomposition for feature vectors and were able to use pattern recognition techniques to detect tool failure in drills. Learned and Willsky [44] constructed a wavelet packet approach, in which energy values are calculated from a full wavelet decomposition of a signal, to detect sonar echoes from submarines. Wu and Du [45] also used a wavelet packet description but found that feature selection required knowledge of the physical problem, in this case drill failure. Case and Waag [46] used Fourier coefficients instead of wavelet coefficients for features that successfully identified flaws in pipes. Comparing several techniques, including features selected from the wavelet, time, and spectral domains, Drai et al. [47] identified welding defects using a neural network classifier. Buonsanti et al. [48] compared ultrasonic pulse-echo and eddy-current techniques for detecting flaws in plates using a fuzzy logic classifier and wavelet features relevant to the physical domain.
In the medical field, an early example of applying classification techniques is the work of Momenan et al. [49], in which features selected offline are used to identify changes in tissues as well as clustering in medical ultrasound images. Bankman et al. [50] classified neural waveforms successfully with the careful application of preprocessing techniques such as whitening via autocorrelation. Kalayci et al. [51] detected EEG spikes by selecting eight wavelet coefficients from two different Daubechies wavelet transforms for use in a neural network. Tate et al. [52] similarly extracted wavelet coefficients, along with other prior information, to attempt to identify vegans, vegetarians, and meat eaters from their magnetic resonance spectra. Mojsilovic et al. [53] applied wavelet multiresolution analysis to identify infarcted myocardial tissue in medical ultrasound images. Georgiou et al. [54, 55] used wavelet decomposition to calculate scale-averaged wavelet power up to a threshold and detected the presence of breast cancer in ultrasound waveforms by means of hypothesis testing. Also using multiresolution techniques, Lee et al. [56] further selected fractal features to detect liver disease. Alacam et al. [57] improved existing breast cancer characterization of ultrasonic B-mode images by adding fractional differencing and moving-average polynomial coefficients as features.

Hou [15, 58] developed the dynamic wavelet fingerprint (DWFP) method to transform wavelet coefficients into 2D binary images. The method is general, e.g., [59–65], but when applied directly to data obtained from a fourth-generation ultrasonographic periodontal probe tested on 14 patients [66, 67], the authors were able to resolve at best around 75% of the pocket depths accurately within a tolerance of 1.5 mm. When the tolerance is adjusted to the 1 mm precision of the manual probe, the highest success rate per patient drops to about 60%. The authors did not use pattern classification but instead relied on image recognition techniques to detect pocket depth.

4.3 Data Collection

Previous publications describe the fabrication of several generations of prototype ultrasonographic probes [7–15, 68–70]. The body of the fifth-generation probe is manufactured similarly to other dental handpieces, with the 10 MHz piezoelectric transducer located in the head of the probe. Water is the ultrasonic coupling agent, so the probe is fabricated to funnel water through the custom-shaped tip. The rest of the components used to control the probe, including the general-purpose pulser-receiver and the custom-built water flow interface device, are shown in Fig. 4.2.

Fig. 4.2 All devices necessary to conduct experiments using the ultrasonographic periodontal probe are shown, including (a) laptop computer, (b) interface device, (c) pulser-receiver, (d) ultrasonographic probe, (e) manual probe, and (f) foot pedal to control water flow

Fig. 4.3 Pictured is the apparatus shown in Fig. 4.2 being operated by Prof. Gayle McCombs, with Jonathan Stevens monitoring the computer

Clinical tests were performed at Old Dominion University's (ODU) Dental Hygiene Research Center on 12 patients using both the ultrasonographic periodontal probe and the traditional manual periodontal probe (Fig. 4.3). The human subjects protocol was approved by IRBs at both William & Mary and ODU.
Most of the measurements were for healthy subjects, with 76% of the data measuring 3 mm or less. Figure 4.4 shows the distribution of the data versus manual pocket depth. The ultrasonographic probe measurement was always performed first, and 30 repeated ultrasonic waveforms were recorded at the standard 6 tooth sites on at most 28 teeth per patient.

Fig. 4.4 The histogram shows the distribution of manual periodontal pocket depth measurements for the ODU clinical tests. Note that the majority of the population has healthy (≤3 mm) pocket depths

Simulations using a finite integration technique were also performed by our group prior to the clinical tests [68]. The simulations animate the generation of ultrasonic energy in the transducer and its propagation through the tissues, and their output includes a time-domain waveform recorded by the transducer. Figure 4.5 compares the simulated waveform with a sample filtered experimental waveform. Note that the region of interest, in between the first and second reflections from the tip, does not register any obvious reflections from the periodontal pocket.

Fig. 4.5 Sample waveforms from (a) simulation and (b) experiment are compared here. The large reflections indicated in the boxes are artifacts from the tip geometry. The echo from the bottom of the pocket, which would occur in between the rectangles, is not apparent

To detect pocket depth, further mathematical abstractions are required. The basic steps of the artificial intelligence are as follows:

1. Feature Extraction: Get wavelet fingerprints with the DWFP and find fingerprint properties using image recognition software.
2. Feature Selection: Find μ and σ (Eqs. 4.3 and 4.4) of each property for all waveforms and collect key values where the average property varies per pocket depth.
3. Binary Classification: Compare the selected features in a leave-one-out routine using well-known pattern classification schemes and only two possible pocket depths at a time.
4. Dimensionality Reduction: Evaluate the binary labels using four different methods to collapse each binary choice to one label.
5. Classifier Combination: Combine the predicted labels from the most precise tests to improve accuracy and/or spread of labels.

Each step is explained in further detail in the sections that follow. It is also important to note that, because of the computation time of this task, the computer algorithms were adapted to run on William & Mary's Scientific Computing Cluster (http://www.compsci.wm.edu/SciClone/).

4.4 Feature Extraction

Since reflections from the periodontal ligament are not apparent in the ultrasound waveforms, advanced mathematics is needed to identify pocket depth. The clinical trials yielded ultrasonic waveforms w_{s,k}(t), where there are k = 1, ..., 30 waveforms recorded for each tooth site s = 1, ..., 1470. The continuous wavelet transform can be written as

C(a, b) = \int_{-\infty}^{+\infty} w(t)\, \psi_{a,b}(t)\, dt.   (4.1)

Here, w(t) represents a square-integrable 1D function and ψ(t) represents the mother wavelet. The mother wavelet is translated in time (t) and scaled in frequency (f) using a, b ∈ R, respectively, where a ∝ t and b ∝ f, in order to form the ψ_{a,b}(t) in Eq. 4.1.
To extract features from w_{s,k}(t) for classification, we use the dynamic wavelet fingerprinting technique (DWFP) (Fig. 4.6), which creates binary contour plots of the wavelet transform coefficients C(a, b). The DWFP, with a mother wavelet ψ(t) and scaling and translation parameters a, b, applied to the waveforms w_{s,k}(t), yields an image I (Fig. 4.7):

w_{s,k}(t) \xrightarrow{\mathrm{DWFP}(\psi_{a,b})} I_{s,k}(a, b).   (4.2)

Preliminary tests showed that the mother wavelets Daubechies 3 (db3) and Symlet 5 (sym5) were promising for this application, and so both were used. The resulting image I contains fingerprint-like binary contours of the initial waveform w_{s,k}(t) at tooth site s.

Fig. 4.6 The DWFP technique [58] begins with (a) the ultrasonic signal, from which it generates (b) wavelet coefficients indexed by time and scale. Then (c) the coefficients are sliced and (d) projected onto the time-scale plane

The next step is image processing, through MATLAB's toolbox, to gather properties of the fingerprints in the waveform w_{s,k}. First, the binary image I is labeled with its 8-connected objects (Fig. 4.7d), allowing each individual fingerprint in I to be recognized as a separate object using the procedure in Haralick and Shapiro [71]. Next, properties are measured for each fingerprint. Some of these properties involve counting the on- and off-pixels in the region, but many involve finding the ellipse matching the second moments of the fingerprint and measuring properties of that ellipse, such as eccentricity. In addition to the orientation measure provided by the ellipse, another measure of inclination relative to the horizontal axis was determined by Horn's method for a continuous 2D object [72]. Lastly, further properties were measured by determining the boundary of the fingerprint and fitting second- or fourth-order polynomials. Table 4.11 in the appendix summarizes the features as well as the indices selected in Sect. 4.5.

Fig. 4.7 In (a)-(c), the DWFP is shown applied to a filtered window of the original data: (a) original waveform, (b) windowed and filtered waveform, (c) DWFP. Image recognition software in MATLAB is used to recognize each individual fingerprint (d, labeled fingerprints) and measure its image properties

The image processing routines result in fingerprint properties p_{s,k,n}[t], where n indexes the image-processing-extracted fingerprint properties. These properties are discrete in time because the values of the properties are matched to the time value of the fingerprint's center of mass. Linear interpolation yields a smoothed array of property values, p_{s,k,n}(t). Figure 4.8a shows the sparse values of the DWFP orientation property for one tooth site, while Fig. 4.8b shows the smoothed values.

Fig. 4.8 The image created by the actual wavelet fingerprint properties (a) has been smoothed using linear interpolation of the intervening points (b). The smoothing shown here is indexed by one value of s, ψ, n but all values of k
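A minimal sketch of this labeling-and-measurement step using MATLAB's Image Processing Toolbox is shown below; the property list is a small subset of the chapter's (Table 4.11 has the full set), and the interpolation at the end mirrors the smoothing in Fig. 4.8.

    % Label the 8-connected fingerprints in binary DWFP image I and measure
    % per-fingerprint properties (a subset of those used in the chapter).
    L = bwlabel(I, 8);                 % each fingerprint becomes one object
    props = regionprops(L, 'Area', 'Centroid', 'Eccentricity', 'Orientation');

    % Index each property by the time (column) coordinate of its centroid.
    t   = arrayfun(@(p) p.Centroid(1), props);   % centroid times
    ecc = arrayfun(@(p) p.Eccentricity, props);  % one property as an example
    [t, iu] = unique(t);               % sort and drop duplicate time points
    ecc = ecc(iu);                     % (interp1 needs unique sample points)

    % Linear interpolation onto a uniform grid gives the smoothed p(t).
    tGrid = 1:size(I, 2);
    eccSmooth = interp1(t, ecc, tGrid, 'linear');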
4.5 Feature Selection

It is now possible to begin reducing the dimensionality of the features extracted with the DWFP technique. The reasoning is threefold. First, most of the measurements are unlikely to correspond to pocket depth, since the periodontal pocket is a discrete event in the ultrasonic waveform while the features extracted by the DWFP technique form a 1D array in the time domain. Second, the wavelet fingerprint property dataset is too large to manipulate even on the available computing cluster. Third, the set of extracted features, p_{s,k,n}(t), is too large to use directly in pattern classification routines.

One dimension can be eliminated immediately by averaging over the repeated waveforms, k, so that p = p_{s,n}(t). The remaining dimensionality reduction is performed by selecting values of p at times t_i for particular fingerprint properties n_i, so that the feature vector at tooth site s is formed as f_s(i) = p_{s,n_i}(t_i). The classification matrix then has rows corresponding to s and columns corresponding to f_s(i).

To select features of interest, we look for values of the fingerprint properties that differ, on average, for different measured values of pocket depth. Therefore, for each property n, we find the mean (4.3) and standard deviation (4.4) over that property for all tooth sites:

m_n(t) = \frac{1}{N} \sum_{s=1}^{N} p_{s,n}(t),   (4.3)

\sigma_n(t) = \sqrt{ \frac{1}{N} \sum_{s=1}^{N} \left( p_{s,n}(t) - m_n(t) \right)^2 }.   (4.4)

The selected features correspond to the times t_i at which m_n(t_i) varies greatly for different pocket depths while σ_n(t_i) remains small. For a particular set of properties, n_i, and their corresponding times, t_i, the feature vector for tooth site s becomes f_s = {p_{s,n_i}(t_i)}. Figure 4.9 shows an example, with the regions of interest marked by the red boxes. This feature selection process was performed interactively, not automatically, though it would be possible to apply the technique with thresholds on the mean and standard deviation. It is also important to note that the black vertical lines in Fig. 4.9 mark the boundaries of the window regions used in the original wavelet fingerprinting; the wavelet fingerprints distort along the edges of the window, so the often extreme behavior of the feature vectors near these points is disregarded. In the end, 54 different features were selected from the DWFP properties from two different mother wavelets (see Table 4.11 in the appendix).

Fig. 4.9 The feature vector created by averaging the properties in Fig. 4.8 over the repeated waveforms is plotted here: first the average over k for each pocket depth (Eq. 4.3), and second the standard deviation (Eq. 4.4). The red boxes indicate time values of the property selected for classification (Table 4.11), where the mean differs while the standard deviation is low
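A sketch of the selection criterion in Eqs. 4.3 and 4.4 is below, assuming for one property n a matrix P of smoothed values indexed (site, time) and a vector d of manual depths; the score simply rewards times where the per-depth means spread apart while the pooled standard deviation stays small. (The chapter performed this step interactively by eye rather than with a threshold.)

    % Score candidate feature times for one fingerprint property.
    %   P - nSites-by-nT matrix of smoothed property values p_{s,n}(t)
    %   d - nSites-by-1 vector of manual pocket depths (1..7 mm)
    depths = unique(d);
    classMean = zeros(numel(depths), size(P, 2));
    for j = 1:numel(depths)
        classMean(j, :) = mean(P(d == depths(j), :), 1);  % mean per depth
    end
    spread = max(classMean, [], 1) - min(classMean, [], 1); % mean separation
    sigma  = std(P, 1, 1);                 % sigma_n(t) over all sites, Eq. 4.4
    score  = spread ./ (sigma + eps);      % high where means differ, sigma small
    [~, ti] = max(score);                  % a candidate selected time index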
Two other methods of reducing the dimensionality of the extracted feature space are possible, however. The method described above first averaged over the k repeated waveforms, but it is also possible to postpone the dimensionality reduction until after classification. One option is to build feature vectors for each of the k repeated waveforms and classify them separately. This yields feature vectors of the form f_{s,k} = {p_{s,k,n_i}(t_i)}, which are used for training and testing of the classifier for each of the k = 1, ..., 30 waveforms; the extra kth dimension is collapsed later. Another option is to use the f_s for training and the f_{s,k} for testing the classifier, again reserving dimensionality reduction until after classification. All three of these basic methods were performed:

1. Train and test the classifier with features f_s,
2. Train and test the classifier with f_{s,k} for each k = 1, ..., 30, and
3. Train the classifier with f_s and test it with f_{s,k}.

4.6 Classification: A Binary Classification Algorithm

Once the features have been generated from the wavelet fingerprint properties, we apply various standard pattern classification techniques from the PRTools catalog of MATLAB functions [73]. Many of these are Bayesian techniques (Table 4.1), except that in this case we have set the prior probabilities equal for all classes. Unfortunately, when the classification matrix is used with these classification maps in a leave-one-out technique, no less than 60% error is observed. In all of these tests, the map tends to classify most of the waveforms into the pocket depths that have the largest number of objects, namely 2 and 3 mm, because they are the most populated pocket depths in the clinical dataset (Fig. 4.4).

Table 4.1 A list of the standard classifiers from the PRTools catalog of classification functions for MATLAB [73]. These maps were also later used in the binary classification algorithm

  LDC     - Linear classifier using normal densities
  QDC     - Quadratic classifier using normal densities
  KLLDC   - Linear classifier using Karhunen-Loève expansion of the covariance matrix
  PCLDC   - Linear classifier using principal component analysis
  LOGLC   - Linear logistic classifier
  FISHERC - Linear minimum least square classifier
  NMC     - Nearest mean classifier
  NMSC    - Scaled nearest mean classifier
  POLYC   - Untrained classification with additional polynomial features
  SUBSC   - Subspace classifier
  KNNC    - k-nearest neighbor classifier

In order to counter this predisposition of the standard classification schemes to classify all the objects into the highest volume classes, a binary classification scheme was developed. The procedure is similar to the one-versus-one technique of support vector machines [74].

Table 4.2 A sketch of the steps in the binary classification algorithm. The classification step was performed with [73]

  INPUT:
    C          - classifier map (such as KNN)
    f_{s,k}(i) - array of features
    d_{s,k}    - array of manually measured pocket depth values
  INDICES:
    s - tooth site index; k - repeated waveform index;
    i - feature vector index; r - repetition index
  LOOPS:
    for k = 1, ..., 30 and s = 1, ..., 1470:
      for binary pocket depth pairs pd1, pd2 in 1, ..., 7 with pd1 ≠ pd2:
        pick S1 = {S1 | (s ∉ S1) & (d_{S1,k} = pd1)}
        and  S2 = {S2 | (s ∉ S2) & (d_{S2,k} = pd2)}, with |S1| = |S2|
        repeat over r until 90% of the data have been sampled;
        under these conditions:
          CLASSIFY: train T = C(f_{S1∪S2,k}, d_{S1∪S2,k})
          TEST:     L = C · f_{s,k}(i)
          SAVE:     labels L(k, s, pd1, pd2, r)
The basic idea is to classify the waveform associated with any tooth site against only two possible classes at a time using the standard classifiers from Table 4.1. The training and test sets are divided from the data using a leave-one-out technique. If the number of objects in each class differs, a random sample from the larger class, of equal size to the smaller class, is chosen for training. This process is repeated until at least 90% of the waveforms from the larger class have been sampled, and with each repetition the predicted labels are stored. The procedure is called a binary algorithm because each classification is restricted to only two possible pocket depth values. Table 4.2 shows a flowchart of our binary classification algorithm.
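A compact stand-in for one cell of the loop in Table 4.2 is sketched below. It uses a hand-rolled nearest-mean classifier in place of the PRTools maps, and draws balanced training sets by randomly subsampling the larger class, as described above; F, d, pd1, pd2, s0, and nReps are assumed inputs.

    % One binary (pd1 vs pd2) classification of held-out tooth site s0,
    % with class balancing by random subsampling (nearest-mean stand-in).
    %   F - nSites-by-nFeatures matrix; d - nSites-by-1 manual depths
    idx1 = find(d == pd1); idx1(idx1 == s0) = [];   % leave s0 out
    idx2 = find(d == pd2); idx2(idx2 == s0) = [];
    m = min(numel(idx1), numel(idx2));
    labels = zeros(1, nReps);
    for r = 1:nReps                          % repeat until ~90% of the larger
        S1 = idx1(randperm(numel(idx1), m)); % class has been sampled
        S2 = idx2(randperm(numel(idx2), m));
        mu1 = mean(F(S1, :), 1);             % class means from balanced sets
        mu2 = mean(F(S2, :), 1);
        if norm(F(s0, :) - mu1) < norm(F(s0, :) - mu2)
            labels(r) = pd1;                 % assign the nearer class mean
        else
            labels(r) = pd2;
        end
    end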
The percent of tooth sites classified into each member of the binary pair is displayed in the table. Note that waveforms tend to be labeled with the member of the binary pair closest to their manual pocket depth measurement, and the accuracy increases as the binary pocket depth pairs grow farther apart. The results imply that the binary classification scheme not only accurately classifies the ultrasonic waveforms when their manual measurement is one of the binary choices (Table 4.3), but also, when it is not one of the choices, applies the label closer to the actual value.

Table 4.4 The percent of tooth sites classified into each member of the binary pair is shown for three different binary pair choices

Pair        | Label | Actual pocket depth (mm)
            |       |    1 |    2 |    3 |    4 |    5 |    6 |    7
(pd1, pd2)  |  pd1=1 | 55.4 | 42.8 | 31.2 | 26.5 | 21.6 | 28.9 | 44.4
= (1, 2)    |  pd2=2 | 44.6 | 57.2 | 68.8 | 73.5 | 78.4 | 71.1 | 55.6
(pd1, pd2)  |  pd1=1 | 74.6 | 74.6 | 63.2 | 57.8 | 51.0 | 47.4 | 33.3
= (1, 7)    |  pd2=7 | 25.4 | 25.4 | 36.8 | 42.2 | 49.0 | 52.6 | 66.7
(pd1, pd2)  |  pd1=2 | 83.1 | 70.6 | 50.3 | 38.2 | 26.5 | 18.4 | 22.2
= (2, 5)    |  pd2=5 | 16.9 | 29.4 | 49.7 | 61.8 | 73.5 | 81.6 | 77.8

4.6.2 Dimensionality Reduction

The predicted class labels L returned by the binary classification algorithm are higher dimensional, even if the index k is averaged over before classification, as discussed in Sect. 4.5. Four different methods of dimensionality reduction were performed to yield only one label L_p(s) per tooth site s (the first two voting schemes are sketched in code below):

1. Majority rule: the most frequently assigned pocket depth is declared the predicted pocket depth, L_p(s) = mode(L(k, s, pd1, pd2, r)).
2. Weighted probability 1: the first method unfairly weights the labels from the repetition index r. This method first finds the most frequently assigned pocket depth over the repetition index, and then takes the most frequent label of the result: L'(k, s, pd1, pd2) = mode_r(L(k, s, pd1, pd2, r)), followed by L_p(s) = mode(L').
3. Weighted probability 2: this method creates a normalized vector of weights over the binary pocket depth choices and uses it to find the most probable pocket depth: w(k, s, pd1, pd2) = (1/7) L(k, s, pd1, pd2), followed by L_p(s) = max(w · L).
4. Process of elimination: statistics from the three methods above are combined in a process of elimination to predict the final pocket depth.

4.6.3 Classifier Combination

The binary classification algorithm as discussed above can be configured in three different ways with respect to the waveform index k, and the dimensionality can be reduced in four different ways as discussed in Sect. 4.6.2. All of these methods were applied using the 11 different maps from Table 4.1. To combine the results of these classifiers, the individual classifiers were sorted by average accuracy within 1 mm, and the highest-scoring percentage of them, with the percentage ranging from 10 to 95%, was combined using the same dimensionality reduction methods described above: the mean or mode of the labels can be calculated, or the most probable label can be taken. Of these, the mode and highest probability methods are majority voting methods, while the mean method is a democratic voting method. Combining classifiers can often reduce the error of pattern classifiers, but can never substitute for proper classification techniques applied when the individual classifiers are formed [75].

4.7 Results and Discussion

The binary classification algorithm was applied as described in the procedure above.
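As a concrete illustration of the first two voting schemes in Sect. 4.6.2, the sketch below collapses a label array indexed as L[k, s, pair, r] to one label per tooth site; the indexing convention and names are assumptions made for illustration, not the original MATLAB code.

import numpy as np
from scipy import stats

def majority_rule(L):
    # Method 1: the mode over every index except the site index s
    n_sites = L.shape[1]
    flat = L.transpose(1, 0, 2, 3).reshape(n_sites, -1)
    return stats.mode(flat, axis=1, keepdims=False).mode

def weighted_probability_1(L):
    # Method 2: collapse the repetition index r with a mode of its own first,
    # so that repeated resampling draws do not outvote the other indices
    L_r = stats.mode(L, axis=3, keepdims=False).mode   # shape (k, s, pairs)
    flat = L_r.transpose(1, 0, 2).reshape(L_r.shape[1], -1)
    return stats.mode(flat, axis=1, keepdims=False).mode

With these hypothetical shapes, each function returns one pocket-depth label per tooth site, which is the quantity compared against the manual probe in the results that follow.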
The primary method of measuring the success of each technique is to find the percent of waveforms accurately labeled within 1 mm per pocket depth, and then average that accuracy over the pocket depths. If we instead measured the total number of waveforms accurately labeled within the manual probe's 1 mm precision regardless of pocket depth, we would tend to select for classification techniques that accurately describe only the most populated pocket depths. We performed these tests for all seven possible pocket depths from the manually measured dataset, but we also restricted the datasets to the 1–5 mm pocket depths, since there are so few measurements in the 6–7 mm range in our patient population, and we further restricted the possible pocket depths to the 2–5 mm set, since we felt the 1 mm datasets might be poorly described by the ultrasonographic probe because overlapping reflections from inside the tip seem to cover that range. We show results both before and after classifier combination, but only the highest average percent accurately identified within 1 mm will be shown. The following results give the percent correctly identified per pocket depth (% correct) as well as the percent accurately identified within 1 mm tolerance (% close) (Tables 4.5 and 4.6). We also show the results graphically in charts indexed by the manual periodontal probe and the ultrasonographic periodontal probe. These charts are useful in determining the strengths and weaknesses of each classification routine. Each row was normalized by the number of manual measurements per pocket depth to yield the percent of predicted pocket depth labels for each manual pocket depth. The results displayed in Fig. 4.10a–c show results without combining labels from different binary classification schemes for three different collections of possible pocket depths. Figure 4.10d–f similarly shows results when classifiers are combined.

[Fig. 4.10: ten panels, a–j, of manual versus ultrasonographic pocket depth, each with a 0–100% color scale]

Fig. 4.10 The percent of tooth sites with a given manual pocket depth measurement (mm) is plotted versus their ultrasonographic periodontal probe measurement. Plots a–c and g–h were classifier results obtained without combination, while d–f and i–j demonstrate the classifier combination results. Plots a–f demonstrate the highest possible average accuracy within 1 mm, while g–j were manually adjusted to include more spread of labels. The array of possible pocket depths used in each classifier is evident in the axis labels

[Fig. 4.11: two histograms of US pocket depth labels 1–5, for the best and the revised classification schemes]

Fig. 4.11 A comparison of labeling in the original (Fig. 4.10b) and revised (Fig. 4.10g) classification schemes for pocket depths 1–5 is shown here to illustrate the improved spread of labels

There is a small improvement when classifiers are combined, and the time required for the extra classification is slight as well. Note that only the most accurate results are displayed here, up to 1 mm confidence.
If the goal is to label all the waveforms as closely as possible, regardless of spread, this is sufficient. However, as the figures of manual versus ultrasonographic probe results show, especially in the reduced pocket depth cases, the lowest and highest pocket depth values receive no labels at all in the ultrasonographic probe case. A compromise between accuracy and a large spread of labels may therefore be desirable. Figure 4.10g–j shows results for the restricted pocket depth cases with lower accuracy but more spread in the labels; these were reconfigured to balance high precision against a larger spread of labels. Figure 4.11 shows how the spread of labels changes in these specially revised cases. Table 4.7 summarizes the average percent of measurements accurately classified within 1 mm per pocket depth. The results above show that the highest average percent accuracy for all the data collected from 12 patients in the ODU clinical trials using a fifth-generation prototype is at best 86.6% within a tolerance of 1 mm. Meanwhile, the best accuracy at that tolerance using the fourth-generation probe with previous methods, without pattern classification, was 60% for a single patient [67].

Table 4.5 Percent of tooth sites accurately classified by the ultrasonographic probe

Classifier  | PDs used | % Correct per pocket depth (mm)
combination | (mm)     |    1 |    2 |    3 |    4 |    5 |    6 |     7
No          | 1–7      | 40.7 | 42.2 |  6.6 |  8.3 | 50.0 | 28.9 | 100.0
No          | 1–5      |  0.6 | 43.9 | 24.6 | 56.9 |  0.0 |  0.0 |   0.0
No          | 2–5      |  0.0 |  3.1 | 46.0 | 60.3 |  0.0 |  0.0 |   0.0
Yes         | 1–7      | 35.0 | 36.9 |  8.4 | 20.1 | 48.0 | 34.2 | 100.0
Yes         | 1–5      |  0.0 | 55.8 | 28.7 | 51.5 |  0.0 |  0.0 |   0.0
Yes         | 2–5      |  0.0 |  3.3 | 49.9 | 60.3 |  0.0 |  0.0 |   0.0
No-revised  | 1–5      |  0.6 | 43.9 | 24.6 | 56.9 |  0.0 |  0.0 |   0.0
No-revised  | 2–5      |  0.0 |  3.1 | 46.0 | 60.3 |  0.0 |  0.0 |   0.0
Yes-revised | 1–5      |  0.0 | 55.8 | 28.7 | 51.5 |  0.0 |  0.0 |   0.0
Yes-revised | 2–5      |  0.0 |  3.3 | 49.9 | 60.3 |  0.0 |  0.0 |   0.0

Table 4.6 Percent of tooth sites accurately classified within 1 mm by the ultrasonographic probe

Classifier  | PDs used | % Close per pocket depth (mm)
combination | (mm)     |    1 |    2 |     3 |    4 |    5 |    6 |     7
No          | 1–7      | 71.2 | 67.8 |  46.0 | 53.9 | 75.5 | 76.3 | 100.0
No          | 1–5      | 69.5 | 77.5 |  99.6 | 78.9 | 71.6 |  0.0 |   0.0
No          | 2–5      |  0.0 | 70.9 | 100.0 | 99.0 | 73.5 |  0.0 |   0.0
Yes         | 1–7      | 66.7 | 63.8 |  52.4 | 65.7 | 84.3 | 71.1 | 100.0
Yes         | 1–5      | 71.8 | 81.9 | 100.0 | 81.4 | 64.7 |  0.0 |   0.0
Yes         | 2–5      |  0.0 | 71.7 | 100.0 | 99.0 | 75.5 |  0.0 |   0.0
No-revised  | 1–5      | 69.5 | 77.5 |  99.6 | 78.9 | 71.6 |  0.0 |   0.0
No-revised  | 2–5      |  0.0 | 70.9 | 100.0 | 99.0 | 73.5 |  0.0 |   0.0
Yes-revised | 1–5      | 71.8 | 81.9 | 100.0 | 81.4 | 64.7 |  0.0 |   0.0
Yes-revised | 2–5      |  0.0 | 71.7 | 100.0 | 99.0 | 75.5 |  0.0 |   0.0

Table 4.7 Summary table of accuracy (%) within the tolerance of 1 mm for the most accurate and the revised classification schemes, which involve more widely spread labels than the original

Pocket depths | Classifier combination
used (mm)     | No   | No, revised | Yes  | Yes, revised
1–7           | 70.1 | –           | 72.0 | –
1–5           | 79.4 | 71.3        | 79.9 | 76.8
2–5           | 85.9 | 81.0        | 86.6 | 80.6

[Fig. 4.12: Bland–Altman plot of L − M versus (1/2)(L + M) in mm, showing the mean difference μ and the μ ± 2σ limits]

Fig. 4.12 A sample Bland–Altman statistical plot generated from the best combined classifier results for possible pocket depths 1–7 mm. In this figure, only 6% of the labels fall outside the range μ ± 2σ

4.8 Bland–Altman Statistical Analysis

We compared the binary classification labels with the manual probe labels using the Bland–Altman method, which is recommended for comparing different measurement schemes [76].
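For concreteness, the Bland–Altman quantities reported below can be computed in a few lines. This Python sketch assumes equal-length arrays of predicted and manual labels (the names are illustrative) and reports the mean difference, its 95% confidence interval, and the μ ± 2σ limits of agreement.

import numpy as np

def bland_altman(predicted, manual):
    # predicted, manual: equal-length arrays of pocket-depth labels in mm
    diff = predicted - manual              # L - M, the plot ordinate
    mean = 0.5 * (predicted + manual)      # (1/2)(L + M), the plot abscissa
    mu = diff.mean()
    sigma = diff.std(ddof=1)
    limits = (mu - 2 * sigma, mu + 2 * sigma)   # limits of agreement
    # standard error of the mean difference, for its 95% confidence interval
    se = sigma / np.sqrt(len(diff))
    ci_mu = (mu - 1.96 * se, mu + 1.96 * se)
    return mu, sigma, limits, ci_mu, mean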
Figure 4.12 shows a sample plot of this method, in which the difference between the predicted and manual measurements is plotted against the mean of those measurements. Any trend between the difference and the mean of the labels would indicate bias in the measurements. No trend is visible here, and the grid-like distribution results from the discrete nature of the labels. The numerical results are displayed by pocket depth in Tables 4.8, 4.9, and 4.10. Included are the values and the confidence intervals at 95% of the mean and of the mean plus or minus twice the standard deviation (μ ± 2σ).

Table 4.8 The values and confidence intervals at 95% for the mean and for the limits of agreement, μ ± 2σ, for classifiers using pocket depths 1–7 mm

Pocket depths 1–7 mm | Classifier combination
                     | No    | Yes
μ (mm)               |  0.44 |  0.62
  Lower confidence   |  0.36 |  0.55
  Upper confidence   |  0.51 |  0.69
μ − 2σ               | −3.12 | −2.83
  Lower confidence   | −3.25 | −2.95
  Upper confidence   | −2.98 | −2.70
μ + 2σ               |  3.99 |  4.07
  Lower confidence   |  3.86 |  3.94
  Upper confidence   |  4.12 |  4.20
σ (mm)               |  1.77 |  1.72

Table 4.9 Bland–Altman statistical analysis for classification schemes using pocket depths 1–5 mm

Pocket depths 1–5 mm | Classifier combination
                     | No    | No, revised | Yes   | Yes, revised
μ (mm)               |  0.21 |  0.30       |  0.18 |  0.12
  Lower confidence   |  0.16 |  0.24       |  0.13 |  0.07
  Upper confidence   |  0.25 |  0.36       |  0.22 |  0.17
μ − 2σ               | −1.98 | −2.58       | −1.98 | −2.17
  Lower confidence   | −2.06 | −2.69       | −2.06 | −2.25
  Upper confidence   | −1.89 | −2.47       | −1.90 | −2.08
μ + 2σ               |  2.39 |  3.18       |  2.33 |  2.40
  Lower confidence   |  2.31 |  3.07       |  2.25 |  2.32
  Upper confidence   |  2.47 |  3.29       |  2.42 |  2.49
σ (mm)               |  1.09 |  1.44       |  1.08 |  1.14

Table 4.10 Bland–Altman statistical analysis for classification schemes using pocket depths 2–5 mm

Pocket depths 2–5 mm | Classifier combination
                     | No    | No, revised | Yes   | Yes, revised
μ (mm)               |  0.47 |  0.32       |  0.47 |  0.22
  Lower confidence   |  0.43 |  0.27       |  0.43 |  0.17
  Upper confidence   |  0.52 |  0.37       |  0.51 |  0.28
μ − 2σ               | −1.37 | −1.77       | −1.37 | −1.99
  Lower confidence   | −1.45 | −1.85       | −1.45 | −2.08
  Upper confidence   | −1.30 | −1.68       | −1.30 | −1.90
μ + 2σ               |  2.32 |  2.41       |  2.31 |  2.44
  Lower confidence   |  2.25 |  2.32       |  2.24 |  2.35
  Upper confidence   |  2.39 |  2.49       |  2.39 |  2.53
σ (mm)               |  0.92 |  1.04       |  0.92 |  1.11

The results show that the mean of the difference is low, less than 0.5 mm in most cases, and the standard deviation is just above 1 mm for most classification schemes. However, the limits of agreement μ ± 2σ are large, around 2.5 mm, indicating that some ultrasonographic labels differ widely from the manual labels. It remains to be determined whether the accuracy of the ultrasonographic probe is sufficient. Bassani et al. [76] showed that differences of 1.5 mm in manual pocket depth measurements between different observers are not uncommon, which is closer to the limits of agreement in the 2–5 mm classification schemes. Ahmed et al. [77] compared manual to controlled-force probes using Bland–Altman statistics and found that the limits of agreement (μ − σ to μ + σ) were ±3.31 mm, which is larger than any but the upper limits of agreement of the 1–7 mm ultrasonographic probe configuration described here. Yang et al. [78] compared measurements of a controlled-force probe in four different ways and determined that, in most cases, the measurement error was around 0.5 mm, which is lower than the manual probe's 1 mm precision. However, controlled-force probes tend to be more reproducible than the manual probe, which we are using as our gold standard.
Velden [79] showed that, when using controlled-force probes, which are more accurate than manual probes, the best agreement occurred when 0.75 N of force was applied; since the manual probe was used as the gold standard here, we cannot be entirely sure what force was being applied. Rams and Slots [80] compared controlled-force probes to manual probes and found that the standard deviation between measurements was between 0.9 and 0.95 mm for shallower pockets and between 1.0 and 1.3 mm for deeper pockets, with the manual probe almost always showing a higher standard deviation than the controlled-force probes. These values are close to the standard deviation of the difference in the 2–5 mm and 1–5 mm ultrasonographic probe configurations. The same authors also found that the measurements were reproducible to within 1.0 mm in only 82–89% of the cases, which is within the range of the most accurate ultrasonographic probe classification schemes described here. Mayfield et al. [81] found a higher degree of reproducibility, 92–96%, when testing two different observers. Lastly, Tupta-Veselicky et al. [82] found that the reproducibility to within 1 mm with a conventional probe was 92.8% when tested at 20 sites 2 hours apart by one examiner, again demonstrating the imprecision of the manual probe itself.

4.9 Conclusion

We have described the development of the ultrasonographic periodontal probe and clinical tests on 12 patients. The resulting waveforms were analyzed with DWFP processing, and image recognition techniques were applied to extract numerical measurements from the wavelet fingerprints. These parameters were optimized and averaged to yield training datasets for pattern recognition analysis, with testing datasets configured for a leave-one-out technique. The binary classification algorithm was described and applied to these classification sets, and the labels from this technique were combined to strengthen the results. Different sets of possible pocket depths can be used, since there is only a small number of measurements at several of the pocket depths. The results can be configured either to yield the highest average percent of waveforms correctly identified within 1 mm, or to yield a larger spread in the labels at the cost of approximately a 5% decrease in accuracy within 1 mm. Overall, the classification scheme produces ultrasonographic periodontal probe measurements that closely track the manual probe: 70.1–86.6% of the ultrasonographic measurements are within 1 mm of the manual measurement. These values are close to those in the literature comparing other periodontal probes to the manual probe, and the residual disagreement may be due to the imprecision and low reproducibility of the manual probe itself. We conclude that the ultrasonographic periodontal probe is a viable technology and that sophisticated mathematical techniques can be used to extract pocket depth information. To yield better results, a larger training dataset will be required.

Acknowledgements The authors would like to acknowledge partial funding support from DentSply, Int'l and from the Virginia Space Grant Consortium. We would also like to thank Gayle McCombs of ODU for acquiring the clinical data and Jonathan Stevens for constructing the investigational device.

Appendix

The feature selection method described in Sect. 4.5 was applied to collect sample number indices from which to construct feature vectors from the fingerprint properties.
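Since the properties in Table 4.11 are standard binary-image measurements, a minimal Python sketch of how such properties might be computed for one fingerprint image is given below, using scikit-image as a stand-in for the MATLAB image recognition tools the authors used; the array name, the boundary-fit details, and the omission of the separate "inclination" measurement are illustrative assumptions, not the authors' code.

import numpy as np
from skimage.measure import label, regionprops

def fingerprint_features(fingerprint):
    # fingerprint: 2D boolean array (scale on the vertical axis, time on
    # the horizontal axis), assumed nonempty for this sketch
    regions = regionprops(label(fingerprint))
    r = max(regions, key=lambda p: p.area)   # largest connected ridge
    feats = {
        "orientation": r.orientation,        # ellipse-matched angle
        "eccentricity": r.eccentricity,      # of the matched ellipse
        "area": r.area,                      # number of pixels in region
        "y_center": r.centroid[0],           # scale-axis center of mass
        "euler_number": r.euler_number,      # objects minus holes
    }
    # polynomial fits (degree 2 and 4) to the upper boundary of the print
    rows, cols = np.nonzero(fingerprint)
    cols_u = np.unique(cols)
    top = np.array([rows[cols == c].min() for c in cols_u])
    feats["deg2fit"] = np.polyfit(cols_u, top, 2)   # coefficients p1..p3
    feats["deg4fit"] = np.polyfit(cols_u, top, 4)   # coefficients p1..p5
    return feats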
Table 4.11 describes each property used for both mother wavelets, along with the sample numbers that were used to select the feature vectors.

Table 4.11 The types of features extracted using image recognition techniques from MATLAB, along with the sample number indices resulting from the feature selection process described above, for fingerprints generated using each mother wavelet (db3, sym5)

Feature      | Description                                                                  | db3                    | sym5
Inclination  | Angle between fingerprint and horizontal axis, measured by binary image recognition techniques | 1463, 2300, 2661 | 1200, 2400
Orientation  | Angle between fingerprint and horizontal axis, measured by ellipse matching second moments | 1040, 2300 | 2215
Eccentricity | Eccentricity of ellipse matching second moments of the fingerprint          | 2850                   | 1150, 1500, 2540
Area         | Number of pixels in the fingerprint region                                  | 1200, 2300, 2600       | 2270, 2800
Y-center     | Value of y-axis (scale) of the binary image associated with its center of mass | 1470, 2600          | 1400, 2258, 2550, 2650
Euler number | Number of objects in the fingerprint minus the number of holes in the objects | 1572                 | –
Deg2fit, p1  | First coefficient of second-degree polynomial fitting the boundary of the fingerprint | 1200, 2672     | 1166, 1600, 2266, 2566
Deg2fit, p2  | Second coefficient of second-degree polynomial                               | –                      | 2275
Deg2fit, p3  | Third coefficient of second-degree polynomial                                | –                      | 1146, 2533
Deg4fit, p1  | First coefficient of fourth-degree polynomial fitting the boundary of the fingerprint | 2833           | 2042, 2250, 2550, 2650
Deg4fit, p2  | Second coefficient of fourth-degree polynomial                               | 2831                   | 1500, 2275, 2575
Deg4fit, p3  | Third coefficient of fourth-degree polynomial                                | 1200                   | 1495, 2275, 2575
Deg4fit, p4  | Fourth coefficient of fourth-degree polynomial                               | 2850                   | 2565
Deg4fit, p5  | Fifth coefficient of fourth-degree polynomial                                | 2865                   | –

References

1. Ghorayeb SR, Bertoncini CA, Hinders MK (2008) Ultrasonography in dentistry. IEEE Trans Ultrason Ferr Freq Cont 55:1256–1266
2. Bains VK, Mohan R, Gundappa M, Bains R (2008) Properties, effects and clinical applications of ultrasound in periodontics: an overview. Perio 5:291–302
3. Agrawal P, Sanikop S, Patil S (2012) New developments in tools for periodontal diagnosis. Int Dent J 62:57–64
4. Hayashi T (2012) Application of ultrasonography in dentistry. Jpn Dent Sci Rev 48:5–13
5. Marotti J, Heger S, Tinschert J, Tortamano P, Chuembou F, Radermacher K, Wolfar S (2013) Recent advances of ultrasound imaging in dentistry - a review of the literature. Oral Surg Oral Med Oral Pathol Oral Radiol 115:819–832
6. Evirgen S, Kamburoglu K (2016) Review on the applications of ultrasonography in dentomaxillofacial region. World J Radiol 8:50–58
7. Hinders MK, Companion JA (1998) Ultrasonic periodontal probe. In: Thompson DO, Chimenti DE (eds) 25th review of progress in quantitative nondestructive evaluation 18b. Plenum Press, New York, pp 1609–1615
8. Hinders MK, Guan A, Companion JA (1998) Ultrasonic periodontal probe. J Acoust Soc Am 104:1844
9. Hartman SK (1997) Goodbye gingivitis. Virginia Business 9
10. Companion JA (1998) Differential measurement periodontal structures mapping system. US Patent 5,755,571
11. Farr C (2000) Ultrasonic probing: the wave of the future in dentistry. Dent Today 19:86–91
12. Hinders MK, Lynch JE, McCombs GB (2001) Clinical tests of an ultrasonic periodontal probe.
In: Thompson DO, Chimenti DE (eds) 28th review of progress in quantitative nondestructive evaluation 21b, pp 1880–1890
13. Lynch JE (2001) Ultrasonographic measurement of periodontal attachment levels. Ph.D thesis, Department of Applied Science, College of William and Mary, Williamsburg, VA
14. Lynch JE, Hinders MK (2002) Ultrasonic device for measuring periodontal attachment levels. Rev Sci Instrum 73:2686–2693
15. Hou J (2004) Ultrasonic signal detection and characterization using dynamic wavelet fingerprints. Ph.D thesis, Department of Applied Science, College of William and Mary, Williamsburg, VA
16. Carranza FA, Newman MG (1996) Clinical periodontology, 8th edn. W B Saunders Co, Philadelphia
17. Peterson PE, Ogawa H (2005) Strengthening the prevention of periodontal disease: the WHO approach. J Periodontol 76:2187–2193
18. Lang NP, Corbet EF (1995) Periodontal diagnosis in daily practice. Int Dent J 45:5–15
19. Varitz R (2007) Wavelet transform and pattern recognition method for heart sound analysis. US Patent 20070191725
20. Aussem A, Campbell J, Murtagh F (1998) Wavelet-based feature extraction and decomposition strategies for financial forecasting. J Comp Int Finance 6:5–12
21. Tang YY, Yang LH, Liu J, Ma H (2000) Wavelet theory and its application to pattern recognition. World Scientific, River Edge
22. Brooks RR, Grewe L, Lyengar SS (2001) Recognition in the wavelet domain: a survey. J Electron Imaging 10:757–784
23. Nason GP, Silverman BW (1995) The stationary wavelet transform and some statistical applications. In: Antoniadis A, Oppenheim G (eds) Wavelets and statistics, lecture notes in statistics. Springer, Germany, pp 281–299
24. Pittner S, Kamarthi SV (1999) Feature extraction from wavelet coefficients for pattern recognition tasks. IEEE Trans Pattern Anal Mach Intel 21:83–88
25. Sabatini AM (2001) A digital-signal-processing technique for ultrasonic signal modeling and classification. IEEE Trans Instrum Meas 50:15–21
26. Coifman R, Wickerhauser M (1992) Entropy based algorithms for best basis selection. IEEE Trans Inform Theory 38:713–718
27. Szu HH, Telfer B, Kadambe S (1992) Neural network adaptive wavelets for signal representation and classification. Opt Eng 31:1907–1916
28. Telfer BA, Szu HH, Dobeck GJ, Garcia JP, Ko H, Dubey A, Witherspoon N (1994) Adaptive wavelet classification of acoustic backscatter and imagery. Opt Eng 33:2192–2203
29. Mallet Y, Coomans D, Kautsky J, Vel OD (1997) Classification using adaptive wavelets for feature extraction. IEEE Trans Pattern Anal Mach Intel 19:1058–1066
30. Mallat S (1989) A theory for multiresolution signal decomposition: the wavelet representation. IEEE Trans Pattern Anal Mach Intel 11:674–693
31. Antoine J-P, Barache D, Cesar RM Jr, da Fontoura Costa L (1997) Shape characterization with the wavelet transform. Signal Process 62:674–693
32. Yeh CH (2003) Wavelet-based corner detection using eigenvectors of covariance matrices. Pattern Recogn Lett 24:2797–2806
33. Chapa JO, Raghuveer MR (1995) Optimal matched wavelet construction and its application to image pattern recognition. Proc SPIE 2491:518–529
34. Liang J, Parks TW (1996) A translation-invariant wavelet representation algorithm with applications. IEEE Trans Sig Proc 44:224–232
35. Maestre RA, Garcia J, Ferreira C (1997) Pattern recognition using sequential matched filtering of wavelet coefficients. Opt Commun 133:401–414
36. Murtagh F, Starck J-L, Berry MW (2000) Overcoming the curse of dimensionality in clustering by means of the wavelet transform.
Comput J 43:107–120
37. Yu T, Lam ECM, Tang YY (2001) Feature extraction using wavelet and fractal. Pattern Recogn Lett 22:271–287
38. Tsai D-M, Chiang C-H (2002) Rotation-invariant pattern matching using wavelet decomposition. Pattern Recogn Lett 23:191–201
39. Du T, Lim KB, Hong GS, Yu WM, Zheng H (2004) 2d occluded object recognition using wavelets. In: 4th international conference on computer and information technology, pp 227–232
40. Saito N, Coifman RR (1994) Local discriminant bases. Proc SPIE 2303:2–14
41. Livens S, Scheunders P, de Wouwer GV, Dyck DV, Smets H, Winkelmans J, Bogaerts W (2004) 2d occluded object recognition using wavelets. In: Hlavác V, Sára R (eds) Computer analysis of images and patterns V. Springer, Berlin, pp 538–543
42. Tansel IN, Mekdeci C, Rodriguez O, Uragun B (1993) Monitoring drill conditions with wavelet based encoding and neural networks. Int J Mach Tool Manu 33:559–575
43. Tansel IN, Mekdeci C, McLaughlin C (1995) Detection of tool failure in end milling with wavelet transformations and neural networks (WT-NN). Int J Mach Tool Manu 35:1137–1147
44. Learned RE, Wilsky AS (1995) A wavelet packet approach to transient signal classification. Appl Comput Harmon Anal 2:265–278
45. Wu Y, Du R (1996) Feature extraction and assessment using wavelet packets for monitoring of machining processes. Mech Syst Signal Process 10:29–53
46. Case TJ, Waag RC (1996) Flaw identification from time and frequency features of ultrasonic waveforms. IEEE Trans Ultrason Ferr Freq Cont 43:592–600
47. Drai R, Khelil N, Benchaala A (2002) Time frequency and wavelet transform applied to selected problems in ultrasonics NDE. NDT&E Int 35:567–572
48. Buonsanti M, Cacciola M, Calcagno S, Morabito FC, Versaci M (2006) Ultrasonic pulse-echoes and eddy current testing for detection, recognition and characterisation of flaws detected in metallic plates. In: Proceedings of the 9th European conference on non-destructive testing, Berlin, Germany
49. Momenan R, Loew MH, Insana MF, Wagner RF, Garra BS (1990) Application of pattern recognition techniques in ultrasound tissue characterization. In: Proceedings of 10th international conference on pattern recognition, vol 1, pp 608–612
50. Bankman IN, Johnson KO, Schneider W (1993) Optimal detection, classification, and superposition resolution in neural waveform recordings. IEEE Trans Biomed Eng 40(8):836–841
51. Kalayci T, Özdamar O (1995) Wavelet preprocessing for automated neural network detection of EEG spikes. IEEE Eng Med Biol 14:160–166
52. Tate R, Watson D, Eglen S (1995) Using wavelets for classifying human in vivo magnetic resonance spectra. In: Antoniadis A, Oppenheim G (eds) Wavelets and statistics. Springer, New York, pp 377–383
53. Mojsilovic A, Popovic MV, Neskovic AN, Popovic AD (1995) Wavelet image extension for analysis and classification of infarcted myocardial tissue. IEEE Trans Biomed Eng 44:856–866
54. Georgiou G, Cohen FS (2001) Tissue characterization using the continuous wavelet transform. I. Decomposition method. IEEE Trans Ultrason Ferr Freq Cont 48:355–363
55. Georgiou G, Cohen FS, Piccoli CW, Forsberg F, Goldberg BB (2001) Tissue characterization using the continuous wavelet transform. II. Application on breast RF data. IEEE Trans Ultrason Ferr Freq Cont 48:364–373
56. Lee W-L, Chen Y-C, Hsieh K-S (2003) Ultrasonic liver tissues classification by fractal feature vector based on M-band wavelet transform. IEEE Trans Med Imag 22:382–392
57.
Alacam B, Yazici B, Bilgutay N, Forsberg F, Piccoli C (2004) Breast tissue characterization using FARMA modeling of ultrasonic RF echo. Ultrasound Med Biol 30:1397–1407
58. Hou J, Hinders MK (2002) Dynamic wavelet fingerprint identification of ultrasound signals. Mater Eval 60:1089–1093
59. Hou J, Leonard KR, Hinders MK (2004) Automatic multi-mode lamb wave arrival time extraction for improved tomographic reconstruction. Inverse Prob 20:1873
60. Jones R, Leonard KR, Hinders MK (2007) Wavelet thumbprint analysis of time domain reflectometry signals for wiring flaw detection. Eng Intell Syst 15:65–79
61. Bingham J, Hinders M, Friedman A (2009) Lamb wave detection of limpet mines on ship hulls. Ultrasonics 49:706–722
62. Bertoncini C, Hinders M (2010) Fuzzy classification of roof fall predictors in microseismic monitoring. Measurement 43:1690–1701
63. Bertoncini C, Nousain B, Rudd K, Hinders M (2012) Wavelet fingerprinting of radio-frequency identification (RFID) tags. IEEE Trans Ind Electron 59:4843–4852
64. Miller C, Hinders M (2014) Classification of flaw severity using pattern recognition for guided wave-based structural health monitoring. Ultrasonics 54:247–258
65. Lv H, Jiao J, Meng X, He C, Wu B (2017) Characterization of nonlinear ultrasonic effects using the dynamic wavelet fingerprint technique. J Sound Vib 389:364–379
66. Hinders MK, Hou JR (2004) Ultrasonic periodontal probing based on the dynamic wavelet fingerprint. In: Thompson DO, Chimenti DE (eds) 31st review of progress in quantitative nondestructive evaluation 24b. AIP conference proceedings
67. Hou JR, Rose ST, Hinders MK (2005) Ultrasonic periodontal probing based on the dynamic wavelet fingerprint. EURASIP J Appl Signal Process 7:1137–1146
68. Rudd K, Bertoncini C, Hinders M (2009) Simulations of ultrasonographic periodontal probe using the finite integration technique. Open Acoustics 2:1–19
69. Lynch JE, Hinders MK, McCombs GB (2006) Clinical comparison of an ultrasonographic periodontal probe to manual and controlled-force probing. Measurement 39:429–439
70. Hinders MK, McCombs GB (2006) The potential of the ultrasonic probe. Dim Dent Hygiene 4:16–18
71. Haralick RM, Shapiro LG (1992) Computer and robot vision. Addison-Wesley
72. Horn BKP (1986) Robot vision. MIT Press
73. Duin RPW (2000) PRTools version 3.0: a Matlab toolbox for pattern recognition. Delft University of Technology
74. Platt JC, Cristianini N, Shawe-Taylor J (2000) Large margin DAGs for multiclass classification. In: Advances in neural information processing systems. MIT Press, pp 547–553
75. Kuncheva LI (2004) Combining pattern classifiers: methods and algorithms. Wiley, New York
76. Bassani DG, Miranda LA, Gustafsson A (2007) Use of the limits of agreement approach in periodontology. Oral Health Prev Dent 5:119–124
77. Ahmed N, Watts TLP, Wilson RF (1996) An investigation of the validity of attachment level measurements with an automated periodontal probe. J Clin Periodontol 23:452–455
78. Yang MCK, Marks RG, Magnusson I, Clouser B, Clark WB (1992) Reproducibility of an electronic probe in relative attachment level measurements. J Clin Periodontol 19:306–311
79. Velden U (1979) Probing force and the relationship of the probe tip to the periodontal tissues. J Clin Periodontol 6:106–114
80. Rams TE, Slots J (1993) Comparison of two pressure-sensitive periodontal probes and a manual probe in shallow and deep pockets.
Int J Periodont Rest Dent 13:521–529
81. Mayfield L, Bratthall G, Attström R (1996) Periodontal probe precision using 4 different periodontal probes. J Clin Periodontol 23:76–82
82. Tupta-Veselicky L, Famili P, Ceravolo FJ, Zullo T (1994) A clinical study of an electronic constant force periodontal probe. J Periodontol 65:616–622

Chapter 5 Spectral Intermezzo: Spirit Security Systems

Mark K. Hinders

Abstract Machine learning methods are sometimes applied in ways that make no sense, and then marketed as software or services. This is increasingly an issue as sensors that can acquire enormous amounts of real-time data become very inexpensive, e.g., infrared imagers attached to smartphones. As a light-hearted example of how this could go, we describe using infrared cameras in a security system for ghost detection. It is important to be skeptical of untested and untestable claims made for machine learning systems.

Keywords Infrared imaging · Machine learning

Most camera-based home security systems don't have sufficient sensitivity to pick up the presence of ghosts and other apparitions. Even the very expensive invisible laser systems that are used in museums and bank vaults are no help. Ghosts can pass through walls at will, so locks on the doors and window sensors provide no protection against pernicious hauntings. Our system is built around two new technologies that make it superior to everything else on the market. The first is infrared thermography, a sort of camera that sees heat variations to a sensitivity of 1/1000th of a degree [1–8]. These cameras came to the public's attention during the first Gulf War because the Nighthawk Stealth Fighter used them to sneak undetected into downtown Baghdad and drop bombs on Saddam's key military installations. It's different from the greenish night-vision imagery where available moonlight is simply amplified. Infrared is a sort of invisible light that anything which is even a little bit warm gives off. Contrast comes from even very tiny temperature variations where, for example, the engine of a vehicle or the body heat of a person shows up as "white hot" compared to the colder surroundings. Maintenance engineers can use these infrared cameras to take a simple snapshot of an electrical breaker box and tell by the infrared heat signature whether any of the circuits are overheating just a bit and ready to short out. Building inspectors can walk around the outside of a new building and take infrared video of the outside walls. Any places where the insulation is letting heat (or cold) leak out show up easily as hot (or cold) spots. Veterinarians are starting to use infrared cameras to detect inflammation in horses' legs, since the place where the horse is being caused pain will have locally increased blood flow and will hence be a little hotter than other places. Some doctors are even using infrared thermography to detect breast cancer, since tumors stimulate the growth of new blood vessels to feed themselves, a process called angiogenesis.
This also shows up as a hot spot, and most importantly breast thermography is done without having to squeeze the breast flat as is required in X-ray mammography. A few years ago these IR cameras started showing up as snap-on attachments to smartphones, e.g., FLIR One, which means that the cost of the technology is now low enough that it can be incorporated into residential security systems [9], and for some applications we actually do use dynamic wavelet fingerprints [10]. It's well known that a variety of ghostly presences can be felt by the hauntee as a localized cold spot (Fig. 5.1), although rarely a poltergeist or other hyperactive apparition will manifest as a very subtle transient heat flash. Our security systems employ very sensitive infrared cameras to watch for anomalous hot or cold spots, and then track them using sophisticated computer processing of the imagery. Artificial Intelligence (AI)-based image processing is the second new technology that sets our system apart from all competitors. It turns out that having a human watch infrared imagery for ghostly hot and cold patterns doesn't work. Ghosts are somehow able to modulate and morph their temperature signatures as they move in a way such that muggles will almost always miss it. In our many years of research and development, we have demonstrated scientifically that humans are only able to recognize such transients using peripheral vision. Our studies have also proven conclusively that no human can be expected to concentrate for extended periods using only their peripheral vision. We've had testees able to do it for 8 or 10 minutes at a stretch, but of course you never know when the haunting is going to happen. We have mimicked human peripheral vision in our AI computer system, and then make use of the computer's ability to remain ever vigilant.

Fig. 5.1 The second oldest building at William & Mary is the Brafferton, constructed with funds from the estate of Robert Boyle, the famous English scientist, to house William & Mary's Indian School. The royal charter of 1693 that established W&M stated as one of its goals "that the Christian faith may be propagated amongst the Western Indians, to the glory of Almighty God...." Over 30 years later, William & Mary Statutes reaffirmed the mission to "teach the Indian boys to read, and write, and vulgar Arithmetick.... to teach them thoroughly the Catechism and the Principles of the Christian Religion." The only one of the university's three colonial buildings to have escaped the ravages of fire, the Brafferton nonetheless suffered an almost complete loss of its interior during the Civil War, when the doors and much of the flooring were removed and used for firewood. The window frames and sash are said to have been removed and used in quarters for the Union officers at Fort Magruder. The exterior brick walls of the Brafferton are, however, the most substantially original of the university's three colonial buildings. The exterior was restored to its colonial appearance in 1932 as part of the Rockefeller Restoration of Williamsburg, and today the Brafferton houses the offices of the president and provost of the university. It's a bit haunted: http://flathatnews.com/2013/10/17/braffertons-running-boy/

In our system, each room is outfitted with a small camera unit which can be built into either the ceiling or a light fixture. Only the 2 cm diameter hemispherical omni-directional fisheye lens actually needs to protrude into the room, and each lens has a full 360° field of view. One camera per room covers an entire house with no blind spots and no moving parts. Traditional cameras with their focused directional lenses either have to mechanically pan about the room, or you have to have several cameras looking in different directions in order to get full coverage. Our computer image processors automatically undistort the images at up to several hundred frames per second, so that even the quickest of little beasties (ghosts, vermin, whatever) are captured without motion blurring. Moreover, once an event is detected we pass those image files to a second AI module based on a dynamically adaptive artificial neural network (ANN) that uses wavelet fingerprints to classify the apparition into one of more than a hundred object classes according to their morphing, movements, etc. It then further categorizes the signals using machine learning techniques to identify the particular ghostly presence based on behavior patterns recorded in its database of past sightings. The system is also in continuous Internet contact with our central infrared imagery archival repository so that those ghosts who haunt more than one spatial and/or temporal locality can be tracked over both space and time.

A side benefit of our system is that living pets and humans and even mammalian vermin show up very strongly: a mouse gives a signal about ten thousand times as strong as a typical poltergeist. A black cat in a dark room is even stronger. Our cameras are so sensitive that we can easily track the minute heat print left over on the floor from the cat's paws. All human presences (Fig. 5.2) are tracked and noted with enough detail so that even the stealthiest cat burglar would be immediately identified and an alarm sounded.

Fig. 5.2 IR selfie taken with a FLIR One. Human body heat shows the bright colors. Cool orb shows as darker shades (https://www.flir.com)

Last spring one of our customers had a problem with a raccoon coming in through their doggy door and scarfing down all the pet food. They were initially amused when the intruder was identified, but then subsequently quite concerned about whether the raccoon might be rabid. Review of the imagery by our experts showed conclusively that the body temperature of the raccoon was in the normal range and the animal was healthy, even though it seemed particularly addicted to kibbles and after a few more months would probably have been too fat to fit through the doggy door. I mentioned mice previously, but didn't explain quite what our system can do. Rats and squirrels are warm blooded, so they show up very strongly. Our system automatically detects, counts, and tracks such critters, of course, but it can also spot a dray of squirrel babies nesting inside of walls because their body heat comes through the sheetrock or plaster or paneling or whatever. That allows exterminators to target their poison/traps and thus use the smallest quantity while being assured of complete pest eradication. Not leaving pizza lying around helps also. Now here's the really cool part about our paranormal security system: countermeasures.
Having an alert to the presence of a ghost is, of course, very useful, but only in the movies are there ghostbusters that you can call to come and take the apparitions away. And as you may remember from those movies, trapping and storing the ghosts didn't actually work out all that well for the Big Apple. I guess I should explain what ghosts actually are, because it's key to our apparition remediation program. When we die, that part of us that Deepak Chopra would call "information and energy" separates from our physical body and slips off into a different dimension. All of the various religions describe this otherworldly dimension in some form or another, and each religion has its own explanation for what happens there and why. Of course, they all agree that being a "good" person makes your subsequent extra-dimensional experience much more pleasant. Ghosts are simply stranded bundles of information and energy that for some reason don't make the transition to "the other side" as they are supposed to. Dying isn't so bad I suppose, considering that your fate is precisely what you deserve to get. Eternal damnation and hellfire aren't exactly things you aspire to, but at least over the eons you might just come to accept that it's your own damn fault. Nobody deserves to get stuck in between life and death, though, especially since you aren't actually anybody while in that state because you have no body. You are a nobody, but with full memory of the life you just finished living along with some inkling of what fate has waiting for you just across the chasm. People who think ghosts are able to come and go from "the other side" at will are very much mistaken. Ghosts can't actually tell you much of anything because they haven't really gone there yet. Nothing pisses a ghost off more than to be asked questions about what it's like being dead and whether so-and-so is doing OK there. Probably the best analogy is that Tom Hanks movie "The Terminal," based on the true story of a guy who got trapped at JFK airport when his country's government was overthrown and customs declared his visa invalid. He couldn't go home because his passport was from a country that no longer existed, and he couldn't walk out the door of the airport terminal to New York because he didn't have a visa for that either. He lived at the airport terminal for years, in and among the other travelers, only occasionally glimpsing America out the doors and windows. Initially, he didn't even speak any English, so he could barely communicate with the airport travelers or customs officials, and he had to wash as best he could in the restrooms, sleep on the chairs in quiet parts of the terminal, and return rental baggage carts to the dispenser to get enough money to eat something at the food court. Being a ghost must really suck. What ghosts really want to do is to get on with their travels, and we have figured out a way to get them their visa, so to speak. First I have to explain a bit about the electromagnetic spectrum. The light that we see with our eyes is an electromagnetic wave. It has reflected off of objects with some portion being absorbed, which is what we see as colors. Something that looks green to us is something that absorbs all the other colors of the rainbow but reflects the green part of the spectrum. What's a little weird is that there are more colors in the rainbow than the human eye can see. Infrared cameras merely see electromagnetic waves that are "too red" for our eyes.
At the other extreme is ultraviolet, which you might call "extra purple" because it's more violet than our eyes can see. There are others that are invisible to us, but that we know a lot about. Beyond ultraviolet are X-rays and then gamma rays. In the other direction, beyond infrared are microwaves and radiowaves. You probably never thought about it this way, but the different radio stations are all simply broadcast as different colors of radiowaves, and the tuner crystal in a radio is just a sort of prism that separates out the different colors. Here's what we do to give ghosts a nudge. We record their characteristic signatures in the infrared part of the spectrum using the infrared cameras, and then flip that signal over (what is technically called a frequency inversion) to get the corresponding mirror signal in the ultraviolet. Since ultraviolet is a higher energy part of the electromagnetic spectrum than the infrared, projecting the ultraviolet image back onto the ghost gives it a gentle but firm push across the divide to where it's supposed to have been all along. I should have said that ultraviolet is what most people call black light and is perfectly safe. It's invisible, so you only see it when it makes your psychedelic felt poster glow in that groovy manner hippies liked so much in the 70s. Although it only takes an instant to nudge a ghost across the chasm and the effect is permanent, in most cases it's first necessary to record the sequential hauntings by an individual apparition for several dozen visits. Sometimes an individual ghost will be so lethargic that enough information can be acquired from a single visit's worth of infrared imagery for the AI system to calculate the correct ultraviolet inversion profile, but most hauntings are surprisingly fleeting and you have to wait through almost a month's worth of nightly visits. The "extra purple" inversion image projectors aren't usually installed permanently, but are instead temporarily placed by our expert technicians in the ectoplasmic focal spot identified from the ghost's demonstrated pattern of behavior in the infrared imagery. Once the ghost has been dispatched the haunting will stop, and the unit is discarded much like old-fashioned one-time-use flashbulbs. The IR security system will continue to keep vigilant for other ghosts or mice or cat burglars.

Afterlife Afterward

I hope you figured out that, of course, this is all pseudoscientific bullsh*t intended to sound scientific. There's no such thing as ghosts, despite competing ghost tours in places like Colonial Williamsburg. All the most famous haunted houses have been naked frauds, sorry. In Scooby Doo, it was always someone pretending to be a ghost or a ghoul in order to make money off the gullible. I included this little intermezzo to make the point that machine learning seems like magic to a lot of people who don't have the necessary mathematical background to provide pushback to unscrupulous salesreps spouting pseudoscientific bullsh*t intended to sound scientific.

References

1. Vollmer M, Mollmann K-P (2018) Infrared thermal imaging: fundamentals, research, and applications, 2nd edn. Wiley-VCH, Weinheim
2. Meola C (2018) Infrared thermography: recent advances and future trends. Bentham Science Publishers, Sharjah
3. Wong WK, Tan PN, Loo CK, Lim WS (2009) An effective surveillance system using thermal camera. In: 2009 international conference on signal acquisition and processing
4. Holst GC (2000) Common sense approach to thermal imaging. JCD Publishing
5. Kecskes I, Engle E, Wolfe CM, Thompson G (2017) Low-cost panoramic infrared surveillance system. In: Proceedings of SPIE, vol 10178
6. Corsi C (2012) Infrared: a key technology for security systems. Hindawi Publishing Corporation
7. Gade R, Moeslund TB (2014) Thermal cameras and applications: a survey. Machine Vision and Applications. Springer, Berlin
8. Tan TF, Toeh SS, Fow JE, Yen KS (2016) Embedded human detection system based on thermal and infrared sensors for anti-poaching application. In: 2016 IEEE conference on systems, process, and control
9. Trujillo VE, Hinders MK (2019) Container monitoring with infrared catadioptric imaging and automatic intruder detection. SN Appl Sci 1:1680. https://doi.org/10.1007/s42452-019-1721-8
10. Trujillo VE (2019) Global shipping container monitoring using machine learning with multi-sensor hubs and infrared catadioptric imaging. Ph.D thesis, Department of Applied Science, William & Mary

Chapter 6 Classification of Lamb Wave Tomographic Rays in Pipes to Distinguish Through Holes from Gouges

Crystal B. Acosta and Mark K. Hinders

Abstract Guided ultrasonic waves are often used in structural health monitoring because their dispersive modal behavior results in differing arrival time shifts for modes traveling through flaws. One implementation of Lamb waves for structural health monitoring is helical ultrasound tomography (HUT), in which an array of transducers scans over the perimeter of the sample, resulting in a reconstructed image that accurately depicts the presence of flaws in plates and plate-like objects such as pipes. However, these images cannot predict the type or severity of the flaw, which are needed in order to schedule maintenance for the structure. We describe the incorporation of pattern classification techniques into Lamb wave tomographic reconstructions of pipes in order to predict flaw type. The features used were extracted using image recognition techniques on dynamic wavelet fingerprint (DWFP) transforms of the unfiltered ray paths. Features were selected at the anticipated mode arrival times for tomographic reconstructions of a pipe with two different flaw types. The application of support vector machines (SVM) in a multiclass one-versus-one method resulted in a direct comparison of predicted and known flaw types.

Keywords Helical ultrasound tomography · Pipe flaw classification · Wavelet fingerprint

6.1 Introduction

Structural health monitoring involves the application of sensor technologies to identify and locate defects that develop over time. The addition of pattern recognition techniques to structural health monitoring may help to minimize false positives and false negatives [1]. The feature extraction process used to distinguish between damaged and undamaged samples varies by application, including time-series analysis [2–4], energy [5, 6], Fourier transforms [7], wavelet energy [8, 9], novelty detection [10, 11], and principal component expansion of spectra [12]. Many of these are in situ experiments using classifiers that include discriminants, k-nearest-neighbor, support vector machines, and neural networks. The classifiers are usually able to discriminate between specific flaws in the structure. However, these sensing techniques are one dimensional, in the sense that a single interrogation of a region of the structure is associated with a single decision about whether or not a flaw exists in that region. Tomography, on the other hand, is two dimensional: the sensors gather data from multiple directions over an area of the structure. Lamb wave tomography has previously been used to detect flaws in plates and locally plate-like structures such as pipes [13–17]. In helical ultrasound tomography (HUT), two parallel transducer arrays are installed around the perimeter of the pipe and guided waves travel between every pair of transducers in the arrays [18]. In the laboratory, the two arrays of transducers are approximated by two single transducers that are moved by motors around the perimeter of the area being scanned. The result of the HUT scan is an array of recorded ultrasonic waveforms for each position of the pair of transducers. For all the possible positions of the transmitting and receiving transducers, the result of the tomographic reconstruction is a 2D image where each pixel relates to the local slowness of that ray path. In this way, flaws in the sample under study are localized. However, tomographic reconstructions are not always possible, because producing accurate images requires many ray paths, and access to the structure in question may be limited. In addition, the tomographic reconstructions cannot always predict the cause or type of flaw.

Our research involves applying pattern recognition techniques to Lamb waves generated from a tomographic scan of an aluminum pipe. The application of pattern classification to tomography makes it possible to identify the source of flaws as well as their location using a limited number of ray paths. Figure 6.1 shows a tomographic reconstruction of the pipe used in this study. This pipe had two flaws: an internal gouge flaw, located where the transducers began scanning, which appears at the top of the image, and a through hole that appears in the middle of the image. The hole flaw could be misinterpreted as a circular artifact of the tomographic reconstruction process; such artifacts are common [19]. The addition of pattern classification to this technology will better identify the hole flaw, reducing the risk of identifying artifacts as flaws.

Fig. 6.1 A tomographic reconstruction of the pipe under study. The gouge flaw shows at the top of the 2D image, while the through-hole is visible in the middle of the image

There are several instances of pattern recognition studies of pipes in the literature. Ravanbod [20] used ultrasonic pipeline inspection gauges (PIGs), commonly used for the inspection of oil pipelines; a neural network decision stage with fuzzy logic was applied to simulated and real data, with features chosen to accurately detect the location and type of external and internal flaws, in addition to producing an image of the predicted flaw. Zhao et al. [21] used electromagnetic transducers mounted on a PIG to inspect several steel pipes with artificial flaws introduced. Features were extracted through time-domain correlation analysis, selected with principal component analysis, and classified using discriminant analysis.
Flaws of two different types and at least 2.5 mm deep were successfully identified with this method. Lastly, Sun et al. [22] applied fuzzy pattern recognition to detect weld defects in steel pipes using X-rays. However, to our knowledge there are no instances of applying pattern classification to helical ultrasound tomography in pipes for defect characterization.

6.2 Theory

The structural acoustics of a pipe is commonly regarded as corresponding to Lamb waves in an unrolled plate [23]. Therefore, the theory of Lamb waves propagating in a pipe can be developed by analogy with Lamb waves in plates. Lamb waves occur in a plate or other solid layer with free boundaries, in which the elastic waves propagate both perpendicular to and within the plane of the plate [24]. The solution to the free plate problem yields symmetric and antisymmetric modes. The phase velocity (c_p) and group velocity (c_g) of Lamb waves in aluminum versus the frequency-thickness product f · d are shown in Fig. 6.2 for the first two symmetric (S0, S1) and antisymmetric (A0, A1) modes. This modal behavior of Lamb waves illustrates their usefulness for NDE inspections. As the Lamb waves propagate along the plate, if the plate has a flaw, such as a gouge or corrosion, then the thickness of the plate is reduced at that point. Since the thickness changes, the frequency-thickness product f · d in Fig. 6.2 changes, resulting in a different arrival time of the mode [25]. Depending on the frequency of the ultrasonic transducer used as well as the material and thickness of the plate, it is often sufficient to detect flaws by measuring the arrival time of the mode; if the measured arrival time differs from its anticipated value, a flaw can be suspected. Changing the transducer frequency can allow for improved detection capabilities, since different frequency-thickness products f · d change the expected arrival times at different rates.

Fig. 6.2 The solution to the Rayleigh–Lamb equations for the first two symmetric and antisymmetric modes of c_g and c_p in an aluminum plate. The frequency-thickness product (f · d) is plotted on the abscissa versus group and phase velocity, respectively. The value of f · d used in the experiments is indicated with a line. Note that four modes are present at the selected f · d value, with S1 appearing first since its velocity is higher. Even a small change of thickness at this f · d value will result in a larger change in mode arrival time

The application of pattern classification to tomographic Lamb wave scans of a pipe includes the following steps:

1. Sensing: Helical ultrasound tomography is applied to an aluminum pipe with known flaws, resulting in a 2D array of recorded Lamb waveforms.
2. Feature extraction: An integral transformation, the dynamic wavelet fingerprint (DWFP), is computed on the waveforms. The transformation maps the 1D waveform to a 2D binary image. Image recognition techniques measure properties of those fingerprints over time.
3. Feature selection: Fingerprint properties are chosen at the anticipated mode arrival times for tomographic scans performed at different transducer frequencies.
4. Classification: The dataset is split via a resampling algorithm into training and testing subsets. The classifier is trained on the training set and tested with the testing set.
The application of pattern classification to tomographic Lamb wave scans of a pipe includes the following steps:

1. Sensing: Helical ultrasound tomography is applied to an aluminum pipe with known flaws, resulting in a 2D array of recorded Lamb waveforms.
2. Feature extraction: An integral transformation, the dynamic wavelet fingerprint (DWFP), is computed on the waveforms. The transformation maps each 1D waveform to a 2D binary image. Image recognition techniques measure properties of those fingerprints over time.
3. Feature selection: Fingerprint properties are chosen at the anticipated mode arrival times for tomographic scans performed at different transducer frequencies.
4. Classification: The dataset is split via a resampling algorithm into training and testing subsets. The classifier is trained on the training set and tested with the testing set.
5. Decision: The predicted class of the testing set is finalized and the performance of the classifier is evaluated.

Formally [26], consider a two-class problem with labels ω1, ω2, where the probabilities of each class occurring are given by p(ω1) and p(ω2). Now consider a feature vector x, which is the vector of measurements made by the sensing apparatus for one ray path. Then x is assigned to class ω_j whenever

p(ω_j | x) > p(ω_k | x),  k ≠ j.   (6.1)

Using Bayes' theorem, Eq. (6.1) can be rewritten as

p(x | ω_j) p(ω_j) > p(x | ω_k) p(ω_k),  k ≠ j.   (6.2)

In this way, the sensed object associated with the feature vector x is assigned to the class ω_j with the highest likelihood. In practice, there are several feature vectors x_i, each with an associated class label w_i taking on the value of one of the ω_j. Classification generally involves estimating the posterior probabilities p(ω_j | x) using some mapping. If N is the number of objects to be classified and M is the number of features in the feature vector x, then pattern classification is performed on the features x_i, i = 1, …, N, with associated class labels w_i taking values ω1, ω2, ω3. The most useful classifier for this dataset was the support vector machine (SVM) [27].
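Before turning to the experimental method, a toy numerical example of the decision rule in Eqs. (6.1) and (6.2) may help. This sketch assumes Gaussian class-conditional densities with made-up means and priors purely for illustration; the chapter itself uses an SVM rather than explicit density estimates.

```python
import numpy as np
from scipy.stats import multivariate_normal

# Assign x to the class maximizing p(x|w_j) p(w_j), per Eq. (6.2).
priors = {1: 0.6, 2: 0.3, 3: 0.1}           # p(w_j): no flaw, gouge, hole
means = {1: np.zeros(2), 2: np.ones(2), 3: -np.ones(2)}

def bayes_classify(x):
    scores = {j: multivariate_normal.pdf(x, mean=means[j]) * priors[j]
              for j in priors}
    return max(scores, key=scores.get)      # class with highest likelihood

print(bayes_classify(np.array([0.9, 1.1])))  # -> 2 (nearest the gouge mean)
```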
6.3 Method

6.3.1 Apparatus

The experimental apparatus is shown in Fig. 6.3a. In operating the scanner, the transmitting transducer remains fixed while the receiving transducer steps by motor along the circumference of the pipe until it returns to its starting position; the transmitting transducer then indexes by one unit step, and the process repeats. This gives a dataset with helical criss-cross propagation paths, which allows mode-slowness reconstruction via helical ultrasound tomography.

In this study, tomographic scans of a single aluminum pipe were used (Fig. 6.1). The pipe was 4 mm thick with a circumference of 19 in, and the transducers were placed 12.25 in apart. Tapered delay lines were used between the transducer faces and the pipe surface. Two different kinds of flaws were introduced into the pipe: a shallow, interior-surface gouge flaw approximately 3.5 cm in diameter, and a through-hole 3/8 in in diameter. The positions of the flaws can be seen in Fig. 6.1, where the geometry of the figure corresponds to the unrolled pipe. There were 180 steps of each transducer. Multiple scans of the pipe were performed with the transducer frequency ranging from 0.8 to 0.89 MHz in steps of 0.01 MHz; for classification, only three of these frequencies were selected: 0.80 MHz, 0.84 MHz, and 0.89 MHz. The pipe was scanned under these conditions, then the hole was enlarged to 1/2 in in diameter and the process was repeated.

[Fig. 6.3: (a) The experimental apparatus, in which motors drive a transmitting and a receiving transducer around the perimeter of the pipe, and (b) an illustration of the ray path wrapping on the "unrolled" pipe. In (b), the shortest ray path for the same transducer positions is actually the one that wraps across the 2D boundary.]

6.3.2 Ray Path Selection

In order to perform classification on the ray paths involved, we first calculated which transmitter and receiver index positions, called i1 and i2, respectively, correspond to a ray path intersecting one of the flaws. To this end, the physical specifications of the pipe are mapped to those of a plate, with the positions of the transmitting and receiving transducers on the left- and right-hand sides. Figure 6.3b demonstrates the unwrapping of the pipe into the 2D geometry and shows that some transmitter and receiver positions result in a ray path that wraps around the boundary of the 2D space. The distance the Lamb wave travels on the surface of the pipe can be derived from the geometry shown in Fig. 6.4 and is given by [16]

s = √(L² + a²) = √(L² + γ²r²),

where L is the axial distance between the transducers, a is the arc length subtended between the transducers, and r is the radius of the pipe. The variable γ is the smallest angle between the transducers,

γ = min{ |φ1 − φ2 + 2π|, |φ1 − φ2|, |φ1 − φ2 − 2π| },

where φ1 and φ2 are the respective angles of the transducers.

[Fig. 6.4: The derivation of the equation for the distance a ray path travels on the surface of a pipe, where L is the axial distance between the transducers, s is the distance traveled by the ray path, and a is the arc length subtended between the transducers.]

The transmitting and receiving transducers have indices i1 and i2, respectively, with i1, i2 = 1, …, 180. If L is the distance between the transducers and Y is the circumference of the pipe, both measured in centimeters, then the abscissa positions of the transducers are x1 = 0 and x2 = L, and the radius of the pipe is r = Y/(2π). The indices can be converted to angles using the fact that there are 180 transducer positions in one full rotation around the pipe, giving

φ1 = i1 · (2π/180),

and similarly for φ2. Substituting these into the expression for γ, the minimum angle is

γ = (2π/180) · min{ |i1 − i2 + 180|, |i1 − i2|, |i1 − i2 − 180| },   (6.3)

and the axial distance between the transducers is already given as L. Substituting Eq. (6.3) into the expression for s then yields the helical ray path distance between the transducers.

The positions of the flaws were added to the simulation space, and ray paths were drawn between the transducers using coordinates (x1, y1) and (x2, y2) that depend on i1 and i2, as described above. If a ray path intersected any of the flaws, that ray path was recorded with the class label corresponding to that type of flaw. The labels were: no flaw encountered (w_i = ω1), gouge flaw (w_i = ω2), and hole flaw (w_i = ω3). For the identification of class labels, the flaws were approximated as roughly octagonal in shape, and the ray paths were approximated as lines with a width determined by the smallest pixel size, 0.1 · (Y/180) = 0.2681 mm. In reality, Lamb waves have some lateral spread, so these approximations in the ray path simulation may result in some mislabeled ray paths.
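A direct transcription of this geometry into Python is shown below. It is a sketch under the chapter's stated dimensions (19 in circumference, 12.25 in transducer spacing), with γ reconstructed using absolute values since it is defined as the smallest angle between the transducers.

```python
import numpy as np

def helical_distance(i1, i2, L_cm, Y_cm, n_steps=180):
    """Helical ray-path distance s = sqrt(L^2 + gamma^2 r^2) between
    transmitter index i1 and receiver index i2 (Eq. 6.3 geometry)."""
    r = Y_cm / (2 * np.pi)                      # pipe radius from circumference
    dphi = (i1 - i2) * 2 * np.pi / n_steps      # angular separation
    gamma = min(abs(dphi + 2 * np.pi), abs(dphi), abs(dphi - 2 * np.pi))
    return np.hypot(L_cm, gamma * r)            # sqrt(L^2 + (gamma*r)^2)

# 19 in circumference = 48.26 cm; 12.25 in spacing = 31.115 cm.
print(f"{helical_distance(45, 135, L_cm=31.115, Y_cm=48.26):.2f} cm")
```

Diametrically opposite indices give the longest path, about 39.4 cm, consistent with the maximum path length D91 = 39.38 cm quoted later in this chapter.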
6.4 Classification

As already mentioned, classification is performed on ray paths with a limit on the distance between the transducers. In preliminary tests, it was noted that when all distances were considered, many of the ray paths that actually intersected the hole flaw were labeled as intersecting no flaws. The explanation is that Lamb waves tend to scatter from hole flaws [28]: no features indicate a reduction in plate thickness, but the distance traveled becomes longer. Therefore, limiting the classification to ray paths below a certain distance improves the classification by reducing the influence of scattering effects.

6.4.1 Feature Extraction

Let D represent the distance limit. The next step of classification is to find M-many features for the feature vectors x_i such that s_i ≤ D, i = 1, …, N. Feature extraction used dynamic wavelet fingerprinting (DWFP), while feature selection involved either selecting points at the relevant mode arrival times for a tomographic scan using a single transducer frequency, or selecting points at only one or two mode arrival times for three different transducer frequencies at once.

6.4.2 DWFP

The DWFP technique (Fig. 6.5) applies a wavelet transform to the original time-domain waveform, which results in "loop" features that resemble fingerprints. It has previously shown promise for a variety of applications, including an ultrasonographic periodontal probe [29–32], detection of ultrasonic echoes in thin multilayered structures [33], and structural monitoring with Lamb waves [17, 34–36]. The Lamb wave tomographic waveforms were fingerprinted using the DWFP algorithm without any preprocessing or filtering.

Let φ_i(t) represent a waveform selected from a Lamb wave tomographic scan (i = 1, …, N). The first step of the DWFP (Fig. 6.5a, b) applies a wavelet transform to each of the waveforms. The continuous wavelet transform can be written as

C(a, b) = ∫_{−∞}^{+∞} φ(t) ψ_{a,b}(t) dt.   (6.4)

Here, φ(t) represents a square-integrable 1D function, where we take φ(t) = φ_i(t), and ψ(t) represents the mother wavelet. The mother wavelet is scaled and shifted using a, b ∈ R, respectively, to form the ψ_{a,b}(t) in Eq. (6.4); the scale a is inversely related to frequency f, and the shift b corresponds to time t. The wavelet transform of a single waveform (Fig. 6.5a) results in wavelet coefficients (Fig. 6.5b). Then a slicing algorithm is applied to create an image analogous to the gradient of the wavelet coefficients in the time-scale plane, resulting in a binary image I(a, b). The mother wavelets selected were those that previously showed promise for similar applications, including Daubechies 3 (db3) and Symlet 5 (sym5). The resulting image I contains fingerprint-like binary contours of the initial waveform φ_i(t).

[Fig. 6.5: The DWFP technique begins with (a) the ultrasonic signal, from which it generates (b) wavelet coefficients indexed by time and scale, where scale is related to frequency. The coefficients are then sliced and projected onto the time-scale plane in an operation similar to a gradient, resulting in (c) a binary image that is used to select features for the pattern classification algorithm.]

The next step is to apply image processing routines to collect properties from each fingerprint object in each waveform. First, the binary image I is labeled with its 8-connected objects, allowing each individual fingerprint in I to be recognized as a separate object using the procedure in Haralick and Shapiro [37]. Next, properties are measured from each fingerprint. Some of these properties involve counting the on- and off-pixels in the region, but many involve finding the ellipse matching the second moments of the fingerprint and measuring properties of that ellipse, such as eccentricity. In addition to the orientation measure provided by the ellipse, another measure of inclination relative to the horizontal axis was determined using Horn's method for a continuous 2D object [38]. Lastly, further properties were measured by determining the boundary of the fingerprint and fitting 2nd- or 4th-order polynomials to it.
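To make the pipeline concrete, here is a schematic Python sketch of the fingerprinting and property-measurement steps. It is an approximation, not the chapter's implementation: PyWavelets' CWT does not offer db3/sym5 as continuous wavelets, so a Morlet wavelet stands in, and the "slicing" is emulated by thresholding thin bands of the normalized coefficient surface.

```python
import numpy as np
import pywt
from skimage.measure import label, regionprops

def dwfp(signal, scales=np.arange(1, 64), n_slices=8, eps=0.02):
    """Schematic DWFP: CWT, normalize, then 'slice' the coefficient surface
    into thin bands, giving a binary, fingerprint-like contour image."""
    coef, _ = pywt.cwt(signal, scales, "morl")   # Morlet stand-in for db3/sym5
    c = np.abs(coef) / np.abs(coef).max()        # normalize to [0, 1]
    binary = np.zeros_like(c, dtype=bool)
    for lv in np.linspace(0.1, 0.9, n_slices):   # thin band around each level
        binary |= np.abs(c - lv) < eps
    return binary

t = np.linspace(0, 1, 2048)
sig = np.exp(-((t - 0.5) ** 2) / 0.01) * np.sin(2 * np.pi * 60 * t)
img = dwfp(sig)
rings = regionprops(label(img, connectivity=2))  # 8-connected labeling
print(len(rings), [round(r.eccentricity, 2) for r in rings[:3]])
```

The `regionprops` call illustrates the property-measurement step: each labeled fingerprint object yields measurements such as pixel counts and the eccentricity of its second-moment ellipse.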
The image processing routines result in fingerprint properties F_{i,ν}[t] for the original waveform φ_i(t), where ν indexes the image-processing-extracted fingerprint properties (ν = 1, …, 17). These properties are discrete in time because each property value is matched to the time value of the fingerprint's center of mass. Linear interpolation yields a smoothed array of property values, F_{i,ν}(t).

6.4.3 Feature Selection

The DWFP algorithm results in a 2D array of fingerprint features for each waveform, while only a 1D array of features can be used for classification. For each waveform φ_i(t), the feature selection procedure finds M-many features x_{i,j}, j ≤ M, from the 2D array of wavelet fingerprint features F_{i,ν}(t). In this case, the DWFP features selected were the fingerprint features that occurred at the predicted mode arrival times. At the frequency-thickness product used, four modes are available: S0, S1, A0, and A1.

[Fig. 6.6: A sample waveform for a single pair of transducer positions (i1, i2) = (45, 135), shown along with the predicted Lamb wave mode arrival times for S0, S1, A0, and A1.]

However, as Fig. 6.6 shows, the S1 arrival time often occurs early in the signal, where a DWFP fingerprint feature may not always be available. In addition, because several different transducer frequencies were studied, two different feature selection schemes were used to keep the number of features manageable:

1. All modes: All four mode arrival times for all 17 fingerprint properties from both mother wavelets are used, but only one transducer frequency is studied from the set {0.8, 0.84, 0.89} MHz. This yields M = 136 features.
2. All frequencies: One or two mode arrival times for all 17 fingerprint properties from both mother wavelets are used for all three frequencies at once. When a single mode (S0, A0, or A1) is used, M = 102 features result. Combinations of two mode arrival times (S0 & A0, S0 & A1, and A0 & A1) were also used for all properties, frequencies, and mother wavelets, in which case M = 204 features were selected.

The class imbalance problem must be considered for this dataset [39]. The natural class distribution for the full tomographic scan has a great majority of ray paths intersecting no flaws: only 3% of the data intersects the hole flaw and 10.5% intersects the gouge flaw. These are poor statistics for building a classifier. Instead, ray paths were randomly selected from the no-flaw cases so that |ω1|/|ω2| = 2. In the resulting class distribution used for classification, 9% of the ray paths intersect the hole flaw and 30% intersect the gouge flaw, and the number of ray paths used for classification is reduced from N = 32,400 to N = 11,274 for all ray path distances. One advantage of limiting the ω1 cases is that classification proceeds more rapidly. Randomly selecting the ω1 cases does not adversely affect the results, and the ω1 ray paths not chosen for classification are later used to test the pipe flaw detection algorithm.
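A minimal sketch of this rebalancing step follows, assuming labels 1, 2, 3 for the no-flaw, gouge, and hole classes; the held-out no-flaw indices are returned so they can later exercise the flaw detection algorithm, as the text describes.

```python
import numpy as np

rng = np.random.default_rng(0)

def rebalance(X, w, keep_ratio=2.0):
    """Randomly undersample the no-flaw class (label 1) so that
    |w == 1| / |w == 2| equals keep_ratio."""
    idx1 = np.flatnonzero(w == 1)
    keep = rng.choice(idx1, size=int(keep_ratio * np.sum(w == 2)),
                      replace=False)
    mask = np.ones(len(w), dtype=bool)
    mask[idx1] = False                   # drop all no-flaw rays...
    mask[keep] = True                    # ...except the randomly kept subset
    return X[mask], w[mask], np.flatnonzero(~mask)   # + held-out indices

# For this dataset, N = 32,400 rays reduce to roughly N = 11,274.
```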
6.4.4 Summary of Classification Variables

The list below summarizes the variables involved in the classification design.

1. The same pipe was scanned twice, once when the hole was 3/8 in in diameter and again when the hole was enlarged to 1/2 in in diameter.
2. The pipe had two different flaws, a gouge flaw (ω2) and a hole flaw (ω3). The ray paths that intersected no flaw (ω1) were also noted.
3. A range of transducer frequencies was used, spanning 0.8–0.89 MHz. For classification, only three of these frequencies were selected: 0.80 MHz, 0.84 MHz, and 0.89 MHz.
4. Classification was restricted to ray paths with a maximum path length D such that s ≤ D. There were 91 different path lengths for all transducer positions. Three different values of D were selected to limit the ray paths used for classification; these correspond to the 10th, 20th, and 91st path length distances, or D10 = 31.21 cm, D20 = 31.53 cm, and D91 = 39.38 cm. The latter case considers all the ray path distances.
5. For the feature extraction, two different wavelets were used (db3 and sym5), and 17 different wavelet fingerprint properties were extracted.
6. Two different feature selection schemes were attempted, varying either the modes selected or the frequencies used.
7. One classifier was selected here (SVM). Other classifiers (such as the quadratic discriminant) had lower accuracy.
8. Classification was performed on training and testing datasets drawn from each individual tomographic scan using the reduced class distribution. The resulting flaw detection algorithm was tested with the ω1 ray paths that were excluded from the distribution used for classification.
9. The classification tests on the 1/2 in hole tomographic scan used the 3/8 in hole tomographic scan solely for training the classifier and the 1/2 in hole ray paths solely for testing it. This does not substantially alter the expression of the classification routine in Table 6.1.

6.4.5 Sampling

The SVM classifier was applied to each classifier configuration defined by the options listed in Sect. 6.4.4. However, SVM is a binary classifier, and three classes were considered in this study. Therefore, the one-versus-one approach was used, in which one pair of classes is compared at a time and the remaining class is ignored [40]. The process is repeated until all pairs of the available classes are considered; here, classification compared ω1 versus ω2, ω1 versus ω3, and ω2 versus ω3. For each pair of binary classes, the training and testing sets were split via bagging: roughly twice as many samples were randomly drawn from the more populated class as from the less populated class, and those sets were split in half for training and testing the SVM classifier. The process is repeated until each ray path has been selected several times for training. The results are collapsed by majority rule, normalized by the number of samples drawn in the bagging process. Table 6.1 displays pseudocode for the sampling algorithm that splits the data into training and testing sets and for the means by which the majority vote is decided.
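The following Python sketch mirrors the spirit of Table 6.1 under simplifying assumptions: balanced draws rather than the 2:1 draw described above, a fixed number of resampling rounds, and scikit-learn's SVC standing in for the chapter's SVM implementation.

```python
import numpy as np
from itertools import combinations
from sklearn.svm import SVC

rng = np.random.default_rng(1)

def ovo_bagged_predict(X, w, n_rounds=25):
    """One-versus-one SVM with resampled train/test splits; votes from the
    rounds are collapsed by majority rule into a predicted label per ray."""
    votes = np.zeros((len(w), 3))                  # vote tally for classes 1..3
    for a, b in combinations((1, 2, 3), 2):        # w1-w2, w1-w3, w2-w3
        ia, ib = np.flatnonzero(w == a), np.flatnonzero(w == b)
        n = min(len(ia), len(ib))
        for _ in range(n_rounds):
            sa = rng.choice(ia, n, replace=False)  # balanced draw (simplified)
            sb = rng.choice(ib, n, replace=False)
            tr = np.concatenate([sa[: n // 2], sb[: n // 2]])
            te = np.concatenate([sa[n // 2:], sb[n // 2:]])
            clf = SVC(kernel="rbf").fit(X[tr], w[tr])
            for i, p in zip(te, clf.predict(X[te])):
                votes[i, int(p) - 1] += 1
    return votes.argmax(axis=1) + 1                # majority vote per ray path
```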
6.5 Decision

As Table 6.1 shows, a single configuration of the classifier variables C (described in Sect. 6.4.4) takes an array of feature vectors x_i, i = 1, …, N, and their corresponding class labels w_i ∈ {ω1, ω2, ω3} and produces an array of predicted class labels λ_i after dimensionality reduction. Each index i corresponds to a single ray path between two transducer indices.

[Table 6.1: The sampling algorithm used to split the data into training and testing sets for SVM classification. It is similar to bagging.]

The classifier performance can be evaluated by measuring its accuracy, defined as

A(ω_k) = |{i : w_i = ω_k and λ_i = ω_k}| / |{i : w_i = ω_k}|,  i = 1, …, N;  k = 1, 2, 3.   (6.5)

However, a further step is required to determine whether any flaws are actually present in the pipe scanned by Lamb wave tomography; the predicted labels of individual ray paths are not sufficient on their own. Therefore, we use the ray path drawing algorithm described in Sect. 6.3.2 to superimpose the ray paths that received predicted labels λ_i in each class ω1, ω2, ω3. Where several ray paths intersect at a pixel, their values add, so the higher the value of a pixel, the more ray path intersections occur at that point. This technique averages out the misclassifications that occur in the predicted class labels. The ray paths for the predicted class labels associated with each flaw type are drawn separately, so that the more ray paths predicted to be associated with a particular flaw intersect in the same region, the more likely a flaw exists at that point.
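A compact sketch of the two evaluation steps just described: the per-class accuracy of Eq. (6.5) and the superposition of predicted ray paths into a pixel image. The pixel-coordinate lists per ray are assumed to come from a ray-drawing routine like that of Sect. 6.3.2.

```python
import numpy as np

def per_class_accuracy(w, lam, k):
    """Eq. (6.5): fraction of true class-k rays that were predicted class k."""
    return np.mean(lam[w == k] == k)

def superimpose(ray_pixels, lam, k, shape):
    """Add up the pixels of every ray predicted as class k (Figs. 6.7-6.8);
    ray_pixels[i] is the list of (row, col) pixels crossed by ray i."""
    img = np.zeros(shape)
    for pts, lab in zip(ray_pixels, lam):
        if lab == k:
            for r, c in pts:
                img[r, c] += 1          # intersecting rays accumulate
    return img
```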
[Fig. 6.7: The ray paths that received a predicted class label of (a) no flaw, (b) gouge flaw, and (c) hole flaw, drawn over the unrolled pipe (length L vs. circumference C, in cm). This classifier used all available ray path distances (D = D91).]

[Fig. 6.8: The ray paths that received a predicted class label of (a) no flaw, (b) gouge flaw, and (c) hole flaw. Unlike Fig. 6.7, the ray path length limit used was D20.]

Figures 6.7 and 6.8 show the ray paths drawn for each predicted class label. Both examples were configured for the 3/8 in hole and selected the A0 mode from all three transducer frequencies; however, Fig. 6.7 classified all ray paths regardless of length, while Fig. 6.8 restricted the ray path distances to D20. The ray path intersections were originally drawn at 10× resolution, and the images were later smoothed to 1× resolution by averaging over adjacent cells. The larger the pixel value, the more ray paths intersected in that region, and the more likely a flaw exists at that point. Note that in Fig. 6.8b, c, the pixel values are highest at the locations where the respective flaws actually exist. Figure 6.8a also shows more intersections at those locations: because of the geometry of the pipe and the way the transducers scanned, those regions actually did have more ray path intersections than elsewhere, which is why they were selected for introducing flaws into the pipe. Still, the largest pixel value among the ray paths predicted to intersect no flaw is smaller than among the ray paths predicted to intersect either of the flaws. Figure 6.8a also shows a higher average pixel value, indicating that its intersections are spread throughout the pipe rather than focused in a small region, as in Fig. 6.8b, c. However, due to the scattering of the Lamb waves, Fig. 6.7 is not as precise: Fig. 6.7a does show a higher average pixel value, but Fig. 6.7b, c both seem to indicate hole flaws in the middle of the pipe, and neither indicates the correct location of the gouge flaw.

To automate flaw detection from these ray intersection plots, image recognition routines and thresholds were applied. The process of automatically detecting flaws in the image of ray path intersections (U) is:

1. Apply a threshold h_I to the pixel values of the image. If no pixels satisfy U > h_I, then no flaws are detected. Otherwise, define U′ = U(U > h_I), the image retaining only the above-threshold pixels.
2. Check that the nonzero elements of U′ cover less than half the image: Σ_{i1,i2} U′(i1, i2) < |U|/2.
3. Apply a threshold h_a to the area of U′. If Σ_{i1,i2} U′(i1, i2) ≤ h_a, then no flaws are detected. Otherwise, decide that U′ accurately represents the flaws in the region and return an image of U′.

This algorithm is intended to be performed only on the ray paths that predicted a flaw location. It does not tend to work well on the ray path intersections that predicted no flaw, since those depend on the geometry of the object being scanned. Figure 6.9 shows predicted flaw locations for the sample ray path intersection images given in Figs. 6.7 and 6.8: Fig. 6.9a, b gives the predicted flaw locations from Fig. 6.7b, c, while Fig. 6.9c, d gives those from Fig. 6.8b, c. Note that the classifier designed to accept all ray path distances (Fig. 6.9a, b) yields flaw locations that are more discrete and closer to the size of the actual flaws, but it is not as accurate at predicting whether or not flaws exist as the classifier restricted by path length (Fig. 6.9c, d).

[Fig. 6.9: Results of the automatic pipe flaw detector routine on the ray path intersection images in Figs. 6.7 and 6.8 that predicted flaws, with the approximate flaw locations marked (axes: length vs. circumference of pipe, in cm). Subplots (a) and (b) show the predicted locations of the gouge and hole flaws, respectively, when all distances were used in classification (3 and 1 flaws detected); subplots (c) and (d) show the corresponding locations when the ray path distance was limited to D20 (1 flaw detected in each).]
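The three-step rule above reads almost directly as code. This sketch interprets step 2 as a count of above-threshold pixels against half the image size, which is one plausible reading of the inequality; h_I and h_a are the thresholds quoted with Tables 6.6-6.8.

```python
import numpy as np

def detect_flaws(U, h_I, h_a):
    """Automatic flaw detection on a ray-intersection image U."""
    Up = np.where(U > h_I, U, 0.0)            # step 1: intensity threshold
    if not Up.any():
        return None                           # no flaws detected
    if np.count_nonzero(Up) >= U.size / 2:    # step 2: too diffuse to be a flaw
        return None
    if Up.sum() <= h_a:                       # step 3: area threshold
        return None
    return Up                                 # image of the predicted flaws
```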
6.6 Results and Discussion

6.6.1 Accuracy

The classifier variables described in Sect. 6.4.4 were explored using the classification routine sketched in Table 6.1, with the predicted labels collapsed by majority voting. The accuracy of the classifier is examined first, before the predictions of the pipe flaw detector are given. Each table of accuracy results shows the classifier configuration, including the two tomographic scans of the aluminum pipe with different hole flaw diameters, as well as the frequencies and modes selected for classification. The classifiers were also limited by the maximum distance between the transducers. For ease of comparison, the average accuracy over the class types ω_k is given in the last column.

Table 6.2 shows the accuracy A(ω_k) for both tomographic scans of the pipe when only a single transducer frequency was used at a time, and therefore all the available Lamb wave modes were selected. The table is grouped by the different maximum distance limits used in the classifier. Tables 6.3, 6.4 and 6.5 show the accuracy for classifiers in which features were selected for only certain modes but for all three transducer frequencies at once.

These classification results show that no classifier configuration exceeded 80% average accuracy per class, and many scored below 50%. However, the detection routine does not require high-accuracy classifiers, and as already mentioned, the diffraction of Lamb waves around the hole flaw could explain the lower accuracy of the classifiers for that flaw type. Other patterns emerge in the data: smaller values of the maximum path length yield higher average accuracy, by as much as 20%, and the feature selection strategy that uses only selected modes for all three transducer frequencies works slightly better than the one that selects all modes for a single transducer frequency. Lastly, the tomographic scan of the aluminum pipe with the 3/8 in hole shows higher accuracy than the scan of the pipe with the 1/2 in hole. This makes sense: the 3/8 in classifier was trained and tested on mutually exclusive subsets of ray paths drawn from the same 3/8 in hole tomographic scan, which yields somewhat optimistic results, whereas the 1/2 in hole results were tested with ray paths drawn from the 1/2 in hole scan but trained with ray paths from the 3/8 in hole scan.

6.6.2 Flaw Detection Algorithm

As mentioned in Sect. 6.5, a more promising way of determining whether flaws exist in the pipe is to use the ray path simulation routine described in Sect. 6.3.2. Because the result of the pipe flaw detection routine is an image (Fig. 6.9), and an exhaustive depiction of the resulting flaw prediction images would take an unnecessary amount of space, the results are presented qualitatively. The pipe detection routine was tested on all of the classifier configurations, and a judgment was made about which flaw the routine was predicting. Figure 6.10 shows some of the different ways the flaw detector routine can present results. Figure 6.10a shows a correctly identified flaw location for the gouge flaw, despite the fact that the flaw detector reports the flaw regions drawn at the bottom of the plot as separate.
That's because the pipe in reality wraps the 2D plot into a cylinder, so the regions identified as flaws at the bottom of the plot can in fact be associated with the correct location of the gouge flaw at the top of the plot. However, Fig. 6.10b shows an incorrect assignment of a gouge flaw in the center of the plot, where the hole flaw actually resides. Similarly, Fig. 6.10c shows a correct identification of the hole flaw, despite the fact that the two identified flaw regions in the middle of the plot do not seem connected; this is an artifact of the drawing algorithm. But Fig. 6.10d shows a false-positive region at the top of the plot, predicted to be a hole flaw where the gouge flaw actually exists. This is a failure of specificity rather than sensitivity.

Table 6.2 The classifier accuracy when the feature selection used only one transducer frequency and all modes. The results are grouped by the maximum ray path distance selected; the average accuracy per class is shown in the last column.

Hole Dia. [in] | Frequency used (MHz) | Modes used | D | A(ω1) (%) | A(ω2) (%) | A(ω3) (%) | Average (%)
3/8 | 0.80 | S0, A0, A1 | D10 | 77.7 | 62.4 | 72.4 | 70.8
3/8 | 0.84 | S0, A0, A1 | D10 | 78.5 | 67.3 | 42.9 | 62.9
3/8 | 0.89 | S0, A0, A1 | D10 | 70.8 | 77.6 | 69.5 | 72.6
1/2 | 0.80 | S0, A0, A1 | D10 | 78.5 | 43.2 | 22.8 | 48.2
1/2 | 0.84 | S0, A0, A1 | D10 | 84.0 | 10.9 | 30.9 | 41.9
1/2 | 0.89 | S0, A0, A1 | D10 | 52.3 | 62.4 | 39.8 | 51.5
3/8 | 0.80 | S0, A0, A1 | D20 | 70.5 | 50.6 | 62.8 | 61.3
3/8 | 0.84 | S0, A0, A1 | D20 | 73.9 | 59.9 | 41.9 | 58.5
3/8 | 0.89 | S0, A0, A1 | D20 | 68.3 | 70.0 | 58.1 | 65.5
1/2 | 0.80 | S0, A0, A1 | D20 | 75.8 | 47.7 | 23.3 | 48.9
1/2 | 0.84 | S0, A0, A1 | D20 | 93.0 | 21.2 | 14.6 | 42.9
1/2 | 0.89 | S0, A0, A1 | D20 | 52.6 | 64.9 | 34.4 | 50.6
3/8 | 0.80 | S0, A0, A1 | D91 | 93.8 | 14.1 | 27.4 | 45.1
3/8 | 0.84 | S0, A0, A1 | D91 | 93.8 | 14.6 | 19.8 | 42.7
3/8 | 0.89 | S0, A0, A1 | D91 | 91.7 | 17.0 | 28.2 | 45.6
1/2 | 0.80 | S0, A0, A1 | D91 | 73.7 | 42.7 | 16.1 | 44.1
1/2 | 0.84 | S0, A0, A1 | D91 | 82.9 | 35.7 | 7.6 | 42.0
1/2 | 0.89 | S0, A0, A1 | D91 | 41.9 | 61.1 | 15.4 | 39.5

Tables 6.6, 6.7 and 6.8 show the performance of the pipe flaw detector routine. All of the different frequencies, modes, and hole diameters are displayed in each table; Table 6.6 shows results for a maximum ray path length of D = D10 used in classification, and Tables 6.7 and 6.8 use maximum ray path distances of D20 and D91, respectively, with the latter considering all possible ray path lengths. The results are grouped this way because the threshold values h_I and h_a had to be adjusted for each value of D: the smaller the value of D, the fewer ray paths included in the classification and the smaller the intersection counts among the predicted ray paths. The last two columns show the pipe flaw detector routine applied to ray paths with a predicted class of either gouge flaw (ω2) or hole flaw (ω3), respectively. The type of flaw predicted by the detection routine is displayed in those columns, including the possibilities of no flaw (ω1) or both the gouge and hole flaws (ω2, ω3). These judgments were made according to the guidelines for false positives and true positives shown in Fig. 6.10.
Table 6.3 The classifier accuracy when the feature selection used only one or two modes at once but all transducer frequencies. The maximum ray path distance used here was D10.

Hole Dia. [in] | Frequencies used (MHz) | Modes used | D | A(ω1) (%) | A(ω2) (%) | A(ω3) (%) | Average (%)
3/8 | 0.80, 0.84, 0.89 | S0 | D10 | 71.5 | 65.0 | 61.9 | 66.1
3/8 | 0.80, 0.84, 0.89 | A0 | D10 | 84.2 | 72.6 | 64.8 | 73.8
3/8 | 0.80, 0.84, 0.89 | A1 | D10 | 75.7 | 68.0 | 52.4 | 65.4
3/8 | 0.80, 0.84, 0.89 | S0, A0 | D10 | 87.5 | 79.2 | 73.3 | 80.0
3/8 | 0.80, 0.84, 0.89 | S0, A1 | D10 | 81.0 | 74.9 | 74.3 | 76.7
3/8 | 0.80, 0.84, 0.89 | A0, A1 | D10 | 85.3 | 80.9 | 72.4 | 79.5
1/2 | 0.80, 0.84, 0.89 | S0 | D10 | 82.8 | 34.0 | 16.3 | 44.4
1/2 | 0.80, 0.84, 0.89 | A0 | D10 | 67.5 | 46.9 | 34.1 | 49.5
1/2 | 0.80, 0.84, 0.89 | A1 | D10 | 67.7 | 34.7 | 40.7 | 47.7
1/2 | 0.80, 0.84, 0.89 | S0, A0 | D10 | 82.3 | 40.6 | 29.3 | 50.7
1/2 | 0.80, 0.84, 0.89 | S0, A1 | D10 | 81.8 | 38.6 | 23.6 | 48.0
1/2 | 0.80, 0.84, 0.89 | A0, A1 | D10 | 74.6 | 49.8 | 39.8 | 54.8

Table 6.4 The classifier accuracy when the feature selection used only one or two modes at once but all transducer frequencies. The maximum ray path distance used here was D20.

Hole Dia. [in] | Frequencies used (MHz) | Modes used | D | A(ω1) (%) | A(ω2) (%) | A(ω3) (%) | Average (%)
3/8 | 0.80, 0.84, 0.89 | S0 | D20 | 69.4 | 57.8 | 60.5 | 63.6
3/8 | 0.80, 0.84, 0.89 | A0 | D20 | 76.5 | 64.6 | 54.9 | 70.5
3/8 | 0.80, 0.84, 0.89 | A1 | D20 | 74.2 | 64.1 | 54.9 | 69.2
3/8 | 0.80, 0.84, 0.89 | S0, A0 | D20 | 81.8 | 66.5 | 66.0 | 74.2
3/8 | 0.80, 0.84, 0.89 | S0, A1 | D20 | 79.9 | 70.3 | 64.7 | 75.1
3/8 | 0.80, 0.84, 0.89 | A0, A1 | D20 | 81.1 | 71.7 | 60.9 | 76.4
1/2 | 0.80, 0.84, 0.89 | S0 | D20 | 82.6 | 33.2 | 21.3 | 57.9
1/2 | 0.80, 0.84, 0.89 | A0 | D20 | 65.4 | 51.0 | 31.6 | 58.2
1/2 | 0.80, 0.84, 0.89 | A1 | D20 | 67.8 | 46.0 | 30.4 | 56.9
1/2 | 0.80, 0.84, 0.89 | S0, A0 | D20 | 80.1 | 44.5 | 26.9 | 62.3
1/2 | 0.80, 0.84, 0.89 | S0, A1 | D20 | 79.5 | 43.4 | 25.7 | 61.5
1/2 | 0.80, 0.84, 0.89 | A0, A1 | D20 | 71.0 | 56.2 | 30.0 | 63.6

These results show that Table 6.6, in which D = D10 = 31.21 cm is the maximum permitted path length used for classification, has the highest number of true positives and the fewest false negatives of all the configurations used. Table 6.8 shows the worst discriminative ability, when all possible path lengths (D = D91) were used. However, only one of these classifier configurations needs to perform well in order to select a classifier to be applied in further applications.

Table 6.5 The classifier accuracy when the feature selection used only one or two modes at once but all transducer frequencies. All ray paths were used in this configuration (D = D91).

Hole Dia. [in] | Frequencies used (MHz) | Modes used | D | A(ω1) (%) | A(ω2) (%) | A(ω3) (%) | Average (%)
3/8 | 0.80, 0.84, 0.89 | S0 | D91 | 92.6 | 14.3 | 24.1 | 43.7
3/8 | 0.80, 0.84, 0.89 | A0 | D91 | 93.1 | 14.6 | 25.9 | 44.5
3/8 | 0.80, 0.84, 0.89 | A1 | D91 | 92.6 | 15.2 | 29.7 | 45.8
3/8 | 0.80, 0.84, 0.89 | S0, A0 | D91 | 93.8 | 16.1 | 32.1 | 47.3
3/8 | 0.80, 0.84, 0.89 | S0, A1 | D91 | 93.2 | 16.7 | 31.7 | 47.2
3/8 | 0.80, 0.84, 0.89 | A0, A1 | D91 | 93.5 | 16.9 | 32.7 | 47.7
1/2 | 0.80, 0.84, 0.89 | S0 | D91 | 67.4 | 40.8 | 17.0 | 41.7
1/2 | 0.80, 0.84, 0.89 | A0 | D91 | 59.6 | 44.8 | 20.9 | 41.8
1/2 | 0.80, 0.84, 0.89 | A1 | D91 | 57.8 | 48.8 | 19.9 | 42.2
1/2 | 0.80, 0.84, 0.89 | S0, A0 | D91 | 64.1 | 46.0 | 23.5 | 44.6
1/2 | 0.80, 0.84, 0.89 | S0, A1 | D91 | 64.6 | 48.5 | 19.6 | 44.2
1/2 | 0.80, 0.84, 0.89 | A0, A1 | D91 | 59.4 | 53.0 | 22.5 | 44.9

Table 6.6 The predicted flaw types using the pipe flaw detection routine. The columns show the flaw types predicted from the ray path intersections, judged as in Figs. 6.9 and 6.10; the rows show the different variables used to configure the classifier. All of the classifiers shown here used D = D10, h_I = 1, h_a = 10.
Hole Dia. [in] | Frequencies used (MHz) | Modes used | ω2 tested | ω3 tested
3/8 | 0.8 | S0, A0, A1 | ω2 | ω3
3/8 | 0.84 | S0, A0, A1 | ω2 | ω3
3/8 | 0.89 | S0, A0, A1 | ω2 | ω3
3/8 | 0.8, 0.84, 0.89 | S0 | ω2 | ω3
3/8 | 0.8, 0.84, 0.89 | A0 | ω2 | ω3
3/8 | 0.8, 0.84, 0.89 | A1 | ω2 | ω3
3/8 | 0.8, 0.84, 0.89 | S0, A0 | ω2 | ω3
3/8 | 0.8, 0.84, 0.89 | S0, A1 | ω2 | ω3
3/8 | 0.8, 0.84, 0.89 | A0, A1 | ω2 | ω3
1/2 | 0.8 | S0, A0, A1 | ω2 | ω1
1/2 | 0.84 | S0, A0, A1 | ω1 | ω3
1/2 | 0.89 | S0, A0, A1 | ω2, ω3 | ω3
1/2 | 0.8, 0.84, 0.89 | S0 | ω2 | ω1
1/2 | 0.8, 0.84, 0.89 | A0 | ω2, ω3 | ω3
1/2 | 0.8, 0.84, 0.89 | A1 | ω2 | ω3
1/2 | 0.8, 0.84, 0.89 | S0, A0 | ω2 | ω1
1/2 | 0.8, 0.84, 0.89 | S0, A1 | ω2 | ω1
1/2 | 0.8, 0.84, 0.89 | A0, A1 | ω2, ω3 | ω3

It is not necessary (or possible) that all combinations of the classifier variables perform well; clearly it is possible to find at least one classifier configuration that accurately discriminates between the types of flaws. In addition, the results for the second tomographic scan of the pipe, with the hole diameter increased to 1/2 in, tend to show more false positives and false negatives than those for the original scan with the 3/8 in hole. As described above, the classification for the 1/2 in hole pipe was performed using the 3/8 in hole pipe as a training set, so the accuracy of the classifier tends to be lower than that of the 3/8 in pipe, where mutually exclusive sets of ray paths drawn from the same tomographic scan were used for training and testing. Since the same threshold values h_I, h_a were used for all the classifier variables with the same maximum distance D, it may be possible to adjust these thresholds in the future to optimize for a training set based on a wider variety of data and a testing set drawn from a new tomographic scan.

Table 6.7 Similarly to Table 6.6, the qualitative performance of the pipe flaw detector routine is shown here using D = D20, h_I = 2, h_a = 5.

Hole Dia. [in] | Frequencies used (MHz) | Modes used | ω2 tested | ω3 tested
3/8 | 0.8 | S0, A0, A1 | ω2 | ω3
3/8 | 0.84 | S0, A0, A1 | ω2 | ω3
3/8 | 0.89 | S0, A0, A1 | ω2 | ω3
3/8 | 0.8, 0.84, 0.89 | S0 | ω2 | ω3
3/8 | 0.8, 0.84, 0.89 | A0 | ω2 | ω3
3/8 | 0.8, 0.84, 0.89 | A1 | ω2 | ω3
3/8 | 0.8, 0.84, 0.89 | S0, A0 | ω2 | ω3
3/8 | 0.8, 0.84, 0.89 | S0, A1 | ω2 | ω3
3/8 | 0.8, 0.84, 0.89 | A0, A1 | ω2 | ω3
1/2 | 0.8 | S0, A0, A1 | ω2 | ω3
1/2 | 0.84 | S0, A0, A1 | ω2 | ω1
1/2 | 0.89 | S0, A0, A1 | ω2, ω3 | ω1
1/2 | 0.8, 0.84, 0.89 | S0 | ω2 | ω1
1/2 | 0.8, 0.84, 0.89 | A0 | ω2, ω3 | ω3
1/2 | 0.8, 0.84, 0.89 | A1 | ω2 | ω1
1/2 | 0.8, 0.84, 0.89 | S0, A0 | ω2 | ω3
1/2 | 0.8, 0.84, 0.89 | S0, A1 | ω2 | ω1
1/2 | 0.8, 0.84, 0.89 | A0, A1 | ω2 | ω1

Table 6.8 Similarly to Tables 6.6 and 6.7, the qualitative performance of the pipe flaw detector routine is shown here using D = D91 (so all ray paths were used), h_I = 3, h_a = 20.
Hole Dia. [in] | Frequencies used (MHz) | Modes used | ω2 tested | ω3 tested
3/8 | 0.8 | S0, A0, A1 | ω2 | ω3
3/8 | 0.84 | S0, A0, A1 | ω2, ω3 | ω3
3/8 | 0.89 | S0, A0, A1 | ω2, ω3 | ω3
3/8 | 0.8, 0.84, 0.89 | S0 | ω2, ω3 | ω3
3/8 | 0.8, 0.84, 0.89 | A0 | ω2, ω3 | ω3
3/8 | 0.8, 0.84, 0.89 | A1 | ω2, ω3 | ω3
3/8 | 0.8, 0.84, 0.89 | S0, A0 | ω2, ω3 | ω3
3/8 | 0.8, 0.84, 0.89 | S0, A1 | ω2, ω3 | ω3
3/8 | 0.8, 0.84, 0.89 | A0, A1 | ω2, ω3 | ω3
1/2 | 0.8 | S0, A0, A1 | ω2 | ω2, ω3
1/2 | 0.84 | S0, A0, A1 | ω2, ω3 | ω1
1/2 | 0.89 | S0, A0, A1 | ω2, ω3 | ω2, ω3
1/2 | 0.8, 0.84, 0.89 | S0 | ω2, ω3 | ω2, ω3
1/2 | 0.8, 0.84, 0.89 | A0 | ω2, ω3 | ω2, ω3
1/2 | 0.8, 0.84, 0.89 | A1 | ω2, ω3 | ω2, ω3
1/2 | 0.8, 0.84, 0.89 | S0, A0 | ω2, ω3 | ω2, ω3
1/2 | 0.8, 0.84, 0.89 | S0, A1 | ω2, ω3 | ω2, ω3
1/2 | 0.8, 0.84, 0.89 | A0, A1 | ω2, ω3 | ω2, ω3

Lastly, recall that the distribution of ray paths drawn from the three classes ω1, ω2, ω3 was adjusted so that a smaller number of ω1 cases were randomly selected for classification; the ω1 ray paths not chosen for classification were reserved to test the pipe flaw detection algorithm. The training set used for classification was the same as before, but the classifier was tested using the ω1 ray paths originally excluded from classification in the results above. As expected, some of these ray paths were falsely identified by the classifier as ω2 or ω3. The pipe flaw detection algorithm was performed on these predicted classes to see whether the intersection of these misclassified ray paths would yield a falsely identified flaw. In fact, for all of the classifier variables described in Sect. 6.4.4, the algorithm concluded with an assignment of ω1 (no flaw identified).

[Fig. 6.10: Qualitative evaluation of different flaw images produced by the flaw detector routine for the (a)-(b) gouge and (c)-(d) hole flaws. False positives are shown in subplots (b) and (d), while subplots (a) and (c) show true positives.]

6.7 Conclusion

These results demonstrate classifiers that were able to distinguish between gouge flaws, hole flaws, and no flaw present in an aluminum pipe. The type of image produced by the flaw detection routine is similar to the result of the Lamb wave tomographic scan itself, which is a 2D color plot of the changes in Lamb wave velocity over the surface of the pipe. However, because it utilizes pattern classification and image recognition techniques, the method described here can be more specific, identifying the flaw type, and it may be able to work on smaller flaws. The higher accuracy of the classifier on limited ray path distances demonstrates that pattern classification aids the successful detection of flaws in geometries where full scans cannot be completed due to limited access. The method does require building a training dataset of different types of flaws, and classifying and testing an entire scan can take a long computation time. It may be best used on isolated locations where a flaw is suspected rather than in an exhaustive search over a large area. To be most accurate, the training dataset should include flaws from different individual tomographic scans, but of a structure similar to the intended application of the intelligent flaw detector. The application of pattern classification techniques to Lamb wave tomographic scans may thus improve the detection capability of structural health monitoring.
Acknowledgements The authors would like to thank Dr. Jill Bingham for sharing the data used in this paper and Dr. Corey Miller for assistance with the tomographic reconstructions. Johnathan Stevens constructed the pipe scanner apparatus.

References

1. Mallet L, Lee BC, Staszewski WJ, Scarpa F (2004) Structural health monitoring using scanning laser vibrometry: II. Lamb waves for damage detection. Smart Mater Struct 13:261–269
2. Farrar CR, Nix DA, Duffey TA, Cornwell PJ, Pardoen GC (1999) Damage identification with linear discriminant operators. In: Proceedings of the 17th International Modal Analysis Conference, vol 3727, pp 599–607
3. Lynch JP (2004) Linear classification of system poles for structural damage detection using piezoelectric active sensors. Proc SPIE 5391:9–20
4. Nair KK, Kiremidjian AS, Law KH (2006) Time series-based damage detection and localization algorithm with application to the ASCE benchmark structure. J Sound Vib 291:349–368
5. Kercel SW, Klein MB, Pouet BF (2000) Bayesian separation of Lamb wave signatures in laser ultrasonics. In: Priddy KL, Keller PE, Fogel DB (eds) Proceedings of the SPIE, vol 4055, pp 350–361
6. Kessler SS, Agrawal P (2007) Technical Report, Metis Design Corporation
7. Mita A, Fujimoto A (2005) Active detection of loosened bolts using ultrasonic waves and support vector machines. In: Proceedings of the 5th International Workshop on Structural Health Monitoring, pp 1017–1024
8. Park S, Yun C-B, Roh Y, Lee J-J (2006) PZT-based active damage detection techniques for steel bridge components. Smart Mater Struct 15:957–966
9. Park S, Lee J, Yun C, Inman DJ (2007) A built-in active sensing system-based structural health monitoring technique using statistical pattern recognition. J Mech Sci Technol 21:896–902
10. Worden K, Pierce SG, Manson G, Philp WR, Staszewski WJ, Culshaw B (2000) Detection of defects in composite plates using Lamb waves and novelty detection. Int J Syst Sci 31:1397–1409
11. Sohn H, Allen DW, Worden K, Farrar CR (2005) Structural damage classification using extreme value statistics. J Dyn Syst Meas Contr 127:125–132
12. Mirapeix J, Garcia-Allende P, Cobo A, Conde O, Lopez-Higuera J (2007) Real-time arc-welding defect detection and classification with principal component analysis and artificial neural networks. NDT&E Int 40:315–323
13. Lynch JE (2001) Ultrasonographic measurement of periodontal attachment levels. Ph.D. thesis, Department of Applied Science, College of William and Mary, Williamsburg, VA
14. McKeon JCP, Hinders MK (1999) Parallel projection and crosshole Lamb wave contact scanning tomography. J Acoust Soc Am 106:2568–2577
15. Hou J, Leonard KR, Hinders MK (2005) Multi-mode Lamb wave arrival time extraction for improved tomographic reconstruction. In: Thompson DO, Chimenti DE (eds) Review of Progress in Quantitative Nondestructive Evaluation, vol 24a. Melville, NY, pp 736–743
16. Rudd KE, Leonard KR, Bingham JP, Hinders MK (2007) Simulation of guided waves in complex piping geometries using the elastodynamic finite integration technique. J Acoust Soc Am 121:1449–1458
17. Bingham JP, Hinders MK (2009) Lamb wave detection of delaminations in large diameter pipe coatings. Open Acoust (in press)
18. Leonard KR, Hinders MK (2003) Guided wave helical ultrasonic tomography of pipes. J Acoust Soc Am 114:767–774
19. Malyarenko EV, Hinders MK (2001) Ultrasonic Lamb wave diffraction tomography. Ultrasonics 39:269–281
20. Ravanbod H (2005) Application of neuro-fuzzy techniques in oil pipeline ultrasonic nondestructive testing. NDT&E Int 38:643–653
21. Zhao X, Varma VK, Mei G, Ayhan B, Kwan C (2005) In-line nondestructive inspection of mechanical dents on pipelines with guided shear horizontal wave electromagnetic acoustic transducers. J Pressure Vessel Technol 127:304–309
22. Sun Y, Bai P, Sun H, Zhou P (2005) Real-time automatic detection of weld defects in steel pipe. NDT&E Int 38:522–528
23. Pierce AD, Kil H (1990) Elastic wave propagation from point excitations on thin-walled cylindrical shells. J Vib Acoust 112:399–406
24. Viktorov IA (1967) Rayleigh and Lamb waves: physical theory and applications. Plenum Press, New York
25. Rose JL (1999) Ultrasonic waves in solid media. Cambridge University Press, New York
26. Webb AR (2002) Statistical pattern recognition, 2nd edn. Wiley, Hoboken, NJ
27. Cristianini N, Shawe-Taylor J (2000) An introduction to support vector machines. Cambridge University Press, New York
28. Fromme P, Sayir MB (2002) Measurement of the scattering of a Lamb wave by a through hole in a plate. J Acoust Soc Am 111:1165–1170
29. Hou J, Hinders MK (2002) Dynamic wavelet fingerprint identification of ultrasound signals. Mater Eval 60:1089–1093
30. Hinders MK, Hou JR (2005) Ultrasonic periodontal probing based on the dynamic wavelet fingerprint. In: Thompson DO, Chimenti DE (eds) Review of Progress in Quantitative Nondestructive Evaluation, vol 24b. AIP Conference Proceedings, Melville, NY, pp 1549–1556
31. Hou JR, Rose ST, Hinders MK (2005) Ultrasonic periodontal probing based on the dynamic wavelet fingerprint. EURASIP J Appl Signal Process 7:1137–1146
32. Lynch JE, Hinders MK, McCombs GB (2006) Clinical comparison of an ultrasonographic periodontal probe to manual and controlled-force probing. Measurement 39:429–439
33. Hinders MK, Hou J, McKeon JCP (2005) Ultrasonic inspection of thin multilayers. In: Thompson DO, Chimenti DE (eds) 31st Review of Progress in Quantitative Nondestructive Evaluation, vol 24b (AIP Conference Proceedings 760), pp 1137–1144
34. Hou J, Leonard KR, Hinders MK (2004) Automatic multi-mode Lamb wave arrival time extraction for improved tomographic reconstruction. Inverse Prob 20:1873–1888
35. Bingham JP, Hinders MK (2009) Lamb wave characterization of corrosion-thinning in aircraft stringers: experiment and 3D simulation. J Acoust Soc Am 126:103–113
36. Bingham JP, Hinders MK, Friedman A (2009) Lamb wave detection of limpet mines on ship hulls. Ultrasonics (in press)
37. Haralick RM, Shapiro LG (1992) Computer and robot vision, vol I. Addison-Wesley, Boston, MA
38. Horn BKP (1986) Robot vision. MIT Press, Cambridge, MA
39. Japkowicz N, Stephen S (2002) The class imbalance problem: a systematic study. Intell Data Anal 6:429–449
40. Melgani F, Bruzzone L (2004) Classification of hyperspectral remote sensing images with support vector machines. IEEE Trans Geosci Remote Sens 42:1778–1790

Chapter 7
Classification of RFID Tags with Wavelet Fingerprinting

Corey A. Miller and Mark K. Hinders

Abstract Passive radio frequency identification (RFID) tags lack the resources for standard cryptography but are straightforward to clone. Identifying RF signatures that are unique to an emitter's signal is known as physical-layer identification, a technique that allows for distinction between cloned devices. In this work, we study the effect real-world environmental variations have on the physical-layer fingerprints of passive RFID tags.
Signals are collected for a variety of reader frequencies, tag orientations, and ambient conditions, and pattern classification techniques are applied to automatically identify these unique RF signatures. We show that identically programmed RFID tags can be distinguished using features generated from DWFP representations of the raw RF signals.

Keywords RFID tag · Wavelet fingerprint · Pattern classification

7.1 Introduction

Radio frequency identification (RFID) tags are widespread throughout the modern world, commonly used in retail, aviation, health care, and logistics [1]. As the price of RFID technology decreases with advancements in manufacturing techniques [2], new implementations of RFID technology will continue to appear. The embedding of RFID technology into currency, for example, is being developed overseas to potentially cut down on counterfeiting [3]. Naturally, the security of these RF devices has become a primary concern. Using techniques that range in complexity from simple eavesdropping to reverse engineering [4], researchers have shown authentication vulnerabilities in a wide range of current RFID applications for personal identification and security purposes, with successful cloning attacks made on proximity cards [5], credit cards [6], and even electronic passports [7].

Basic passive RFID tags, which lack the resources to perform most forms of cryptographic security measures, are especially susceptible to privacy and authentication attacks because they have no explicit counterfeiting protections built in. These low-cost tags are the type found in most retail applications, where the low price suits the large quantity of tags required. One implementation of passive RFID tags is as a replacement for barcodes. Instead of relaying a sequence of numbers identifying only the type of object a barcode is attached to, RFID tags use an Electronic Product Code (EPC) containing not only information about the type of object but also a unique serial number used to distinguish the individual object. RFID tags also eliminate the line-of-sight scanning that barcodes require, avoiding orientation constraints during scanning. In a retail setting, these RFID tags are being explored for point-of-sale terminals capable of scanning all items in a passing shopping cart simultaneously [8]. Without security measures, however, it is straightforward to surreptitiously obtain the memory content of these basic RFID tags and reproduce a cloned signal [9].

An emerging subset of RFID short-range wireless communication technology is near-field communication (NFC), operating within the high-frequency RFID band at 13.56 MHz. Compatible with existing RFID infrastructures, NFC involves an initiator that generates an RF field and a passive target, although interactions between two powered devices are possible. The smartphone industry is one of the leading areas for NFC research, as many manufacturers have begun adding NFC technology to their products.
With applications enabling users to pay for items such as groceries and subway tickets by waving their phone in front of a machine, NFC payment systems are an attractive alternative to the multitude of credit cards available today [10]. Similarly, NFC-equipped mobile phones are being explored for use as boarding passes, where the passenger can swipe their handset like a card, even when its batteries are dead [11].

A variety of approaches exist to solve this problem of RFID signal authentication, in which an RFID system identifies an RFID tag as legitimate as opposed to a fraudulent copy. One class of methods introduces alternate tag-reader protocols, including the installation of a random number generator in the reader and tag [12], a physical proximity measure when scanning multiple tags simultaneously [13], or re-purposing the kill PIN in an RFID tag, which normally authorizes the deactivation of the tag [14].

Rather than changing the current tag-reader protocols, we approach the issue of RFID tag authentication by applying a wavelet-based RF fingerprinting technique that utilizes the physical layer of RF communication. The goal is to identify unique signatures in the RF signal that provide hardware-specific information. First pioneered to identify cellular phones by their transmission characteristics [15], RF fingerprinting has recently been explored for wireless networking devices [16], wired Ethernet cards [17], universal software radio peripherals (USRPs) [18], and RFID devices [19–22]. Our work builds on that of Bertoncini et al. [20], in which a classification routine was developed using a novel wavelet-based feature set to identify 150 RFID tags, collected with fixed tag orientation and distance relative to the reader and with RF shielding. That dataset, however, was collected in an artificially protected environment and did not include physical proximity variations relative to the reader, one of the most commonly exploited benefits of RFID technology over existing barcodes. The resulting classifier performance therefore can't be expected to translate to real-world situations. Our goal is to collect signals from a set of 40 RFID tags with identical Electronic Product Codes (EPCs) at a variety of orientations and RFID reader frequencies, as well as over several days, to test the robustness of the classifier. The effects of tag damage in the form of water submersion and physical crumpling are also briefly explored. Unlike Bertoncini et al., we use a low-cost USRP to record the RF signals in an unshielded RF environment, resulting in more realistic conditions and SNR values than previously examined.

7.2 Classification Overview

The application of pattern classification to individual RFID tag identification begins with data collection, where each individual RFID tag is read and the EPC regions are windowed and extracted from all of the tag-reader events. A feature space is formed by collecting a variety of measurements from each of the EPC regions. Feature selection then reduces this feature space to a more optimal subset, removing irrelevant features. Once a dataset has been finalized, it is split into training and testing sets via a resampling algorithm, and the classifier is trained on the training set and tested on the testing set. The classifier output is used to predict a finalized class label for the testing set, and the classifier's performance can be evaluated.
Each tag is given an individual class label; however, we are only interested in whether or not a new signal corresponds to an EPC from the specific tag of interest. The goal for this application is to identify false, cloned signals trying to emulate the original tag. We therefore implement a binary one-against-one classification routine, where we consider one individual tag at a time (the classifier tag), and all other tags (the testing tags) are tested against it one at a time. This assigns one of two labels to a testing signal: either ω = 1, declaring that the signal corresponds to an EPC from the classifier tag, or ω = −1, indicating that it does not.
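A minimal sketch of this one-against-one authentication scheme follows, with scikit-learn's SVC standing in for the classifier actually used; X_tag and X_others are assumed to hold the EPC-derived feature vectors for the classifier tag and for all other tags.

```python
import numpy as np
from sklearn.svm import SVC

def train_authenticator(X_tag, X_others):
    """Label the classifier tag's EPC features +1 and everything else -1."""
    X = np.vstack([X_tag, X_others])
    y = np.concatenate([np.ones(len(X_tag)), -np.ones(len(X_others))])
    return SVC(kernel="rbf").fit(X, y)

# authenticator.predict(new_epc_features) -> +1 (classifier tag) or -1 (other)
```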
Tag transmission events were recorded for 3 seconds for each tag using the USRP2, saving all data in MATLAB format. Each tag was recorded at three different RFID reader operating frequencies (902, 915, and 928 MHz), with three tag orientations relative to the antenna being used at each frequency: parallel (PL), upside-down (UD), and a 45° oblique angle (OB).

The second session of data collection at W&M included a second, independent set of 15 AD-612 RFID tags labeled AD26 − AD40. As before, the same EPC code was first written onto each tag with the Thing Magic Mercury 5e RFID Reader. The second session differed from the first in that the tags were no longer in a fixed position relative to the antenna, but rather simply held by hand near the antenna. This introduces additional variability into the individual tag-reader events throughout each signal recording. Tag transmission events were again recorded for 3 seconds for each tag. Data was collected at a single operating frequency (902 MHz) with a constant orientation relative to the antenna (parallel (PL)); however, data was collected on four separate days, allowing for environmental variation (temperature, humidity, etc.). These tags were then split into two subsets, one of which was used for a water damage study while the other was used for a physical damage study. For the water damage, tags AD26 − AD32 were submerged in water for 3 hours, at which point they were patted dry and used to record data (labeled as Wet). They were then allowed to dry overnight and were again used to record data (Wet-to-Dry). For the physical damage, tags AD33 − AD40 were first lightly crumpled by hand (light damage) and subsequently heavily crumpled (heavy damage). Pictures of the tag orientation variations as well as the tag damage can be seen in Fig. 7.2.

Fig. 7.2 Tag orientations used for data collection. Parallel (PL), oblique (OB), and upside-down (UD) orientations are shown, named for the tag position relative to the antenna. Real-world degradation was also applied to the tags in the form of water submersion, as well as both light and heavy physical deformation

From these datasets, four separate studies were performed. First, a frequency comparison was run in which the three operating frequencies were used as training and testing datasets for the classifiers, collected while maintaining a constant PL orientation. Second, an orientation comparison was performed in which the three tag orientations were used as training and testing datasets, collected at a constant 902 MHz operating frequency. Third, the 4 days' worth of constant PL and 902 MHz handheld recordings were used as training and testing datasets. Finally, the classifiers were trained on the 4 days' worth of recordings, and the additional damage datasets were used as testing sets. The specific tags used for each comparison are summarized in Table 7.1.

Table 7.1 There were 40 individual Avery-Dennison AD-612 RFID tags used for this study, split into z subsets Dz for the various comparisons. Tag numbers τi are given for each comparison

Comparison type             Dz                     Tags used (τi)
Frequency variations        902, 915, 928 MHz      i = 1, . . . , 25
Orientation variations      PL, UD, OB             i = 1, . . . , 25
Different day recordings    Day 1, 2, 3, 4         i = 26, . . . , 40
Water damage                Wet, Wet-to-dry        i = 26, . . . , 32
Physical damage             Light, Heavy damage    i = 33, . . . , 40
7.4 EPC Extraction

In most RFID applications, the RFID reader has only a few seconds to identify a specific tag. For example, consumers would not want a car's keyless entry system that required the user to stand next to the car for half a minute while it interrogated the tag. Rather, the user expects access to the car within a second or two of being within the signal's range. These short transmission times result in only a handful of individual EPCs being transmitted, making it important that each one is extracted efficiently and accurately. In our dataset, each tag's raw recording is a roughly 3-second tag-to-reader communication. During this time there is continuous communication between the antenna and any RFID tags within range. This continuous communication is composed of repeated individual tag-reader (T⇔R) events. The structure and duration of each T⇔R event is pre-defined by the specific protocols used. The AD-612 RFID tags are built to use the EPCGen2 protocols [23], so we can use the inherent structure within these protocols to automatically extract the EPCs within each signal.

Previous attempts at identifying the individual EPC codes within the raw signals involved a fixed-window cross-correlation approach, where a manually extracted EPC region was required for comparison [20]. With smart window sizing, this approach can identify the majority of EPC regions within a signal. As the communication period is shortened and the number of EPCs contained in each recording decreases, however, this technique becomes insufficient. We have developed an alternative technique that automatically identifies components of the EPCGen2 communication protocols. The new extraction algorithm is outlined in Table 7.2, with a detailed explanation to follow. It should be noted that the region identified as [EPC+] is a region of the signal composed of a preamble which initiates the transmission, a protocol-control element, the EPC itself, and a final 16-bit cyclic-redundancy check.

Table 7.2 The EPC extraction routine used to find the EPC regions of interest for analysis

For each tag AD01–AD40 {
  Raw recorded signal is sent to getEPC.m
  ◦ Find and window "downtime" regions between tag/reader communication periods
  ◦ Envelope windowed sections, identify individual T⇔R events
  For each T⇔R event
    • Envelope the signal, locate flat [EPC+] region
    • Set start/finish bounds on [EPC+]
    • Return extracted [EPC+] regions
  Each extracted [EPC+] is sent to windowEPC.m
  ◦ Generate artificial Miller (M=4) modulated preamble
  ◦ Locate preamble in recorded signal via cross-correlation
  ◦ Identify all subsequent Miller (M=4) basis functions via cross-correlation
  ◦ Extract corresponding bit values
  ◦ Verify extracted bit sequence matches known EPC bit sequence
  ◦ Return start/end locations of EPC region
  Save EPC regions
}

The first step in the EPC extraction routine is to window each raw signal by locating the portions that occur between reader transmission repetitions. These periods of no transmission are referred to here as "downtime" regions: the portions of the signal during which the RFID reader is not communicating with the tag at all. An amplitude threshold is sufficient to locate the downtime regions, which divide the raw signal into separate sections, each of which contains several individual T⇔R events. There is another short "dead" zone between each individual T⇔R event where the RFID reader stops transmitting briefly. Because of this, the upper envelope of each active communication section is taken and another amplitude threshold is applied to identify these dead zones, further windowing the signal into its individual T⇔R events.
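This two-stage windowing can be sketched in a few lines of MATLAB. The threshold fraction and the variable name a (the recorded amplitude) are placeholders chosen for illustration, not values from the actual getEPC.m routine.

    % Sketch: split a raw recording into communication sections at the
    % "downtime" regions (illustrative threshold; a is a row vector).
    quiet  = a < 0.1*max(a);             % amplitude threshold for downtime
    edges  = diff([1, quiet, 1]);        % transitions into/out of downtime
    starts = find(edges == -1);          % an active section begins
    stops  = find(edges == 1) - 1;       % an active section ends

    for k = 1:numel(starts)
        section = a(starts(k):stops(k));
        env = abs(hilbert(section));     % upper envelope of the section
        % a second amplitude threshold on env locates the short "dead"
        % zones, windowing the section into individual T<=>R events
    end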
Each individual T⇔R event is then processed to extract the individual [EPC+] region within. First, the envelope of the T⇔R event is taken, which highlights the back-and-forth communication between the tag and the RFID reader. The [EPC+] region, being the longest in time duration of all the communication, is relatively consistent in amplitude compared to the up-and-down structure of the rest of the signal. Therefore, a region is located that meets both flatness and time-duration requirements corresponding to this [EPC+]. Once this [EPC+] region is found, an error check is applied that envelopes the region and checks this envelope for outliers that would indicate an incorrectly chosen area.

The next step in this process is to extract the actual EPC from the larger [EPC+] region. For all Class 1 Gen 2 EPCs, the tags encode the backscattered data using either FM0 baseband or Miller modulation of a subcarrier, with the encoding choice made by the reader. The Thing Magic Mercury 5e RFID Reader uses Miller (M=4) encoding, the basis functions of which can be seen in Fig. 7.3.

Fig. 7.3 Excerpt from the EPC Class 1 Gen 2 protocols showing the Miller basis functions and a generator state diagram [23]

The Miller (M=4) preamble is then simulated and cross-correlated with the [EPC+] region to determine its location within. From the end of the preamble, the signal is broken up into individual bits, and cross-correlation is used to determine which bits are present for the remainder of the signal (positive or negative, 0 or 1). Upon completion, the bit sequence is compared for verification to a second known bit sequence, generated from the output of the RFID reader's serial log, shown in Table 7.3. The bounds of this verified bit sequence are then used to window the [EPC+] region down to the EPC only. A single T⇔R event as well as a close-up of a [EPC+] region can be seen in Fig. 7.4.

The goal of the classifier is to identify individual RFID tags despite the fact that all the tags are of the same type, from the same manufacturer, and written with the same EPC. The raw RFID signal s(t) is complex valued, so an amplitude representation α(t) is used for the raw signal [25]. An "optimal" version of our signal was also reverse engineered using the known Miller (M=4) encoding methods, labeled s0(t). We then subtract the raw signal from this optimal representation, producing an EPC error signal as well, labeled e_EPC. These are summarized by

s(t) = r(t) + ic(t)
α(t) = √(r²(t) + c²(t))
e_EPC(t) = s0(t) − s(t)        (7.1)

This signal processing step of reducing the complex-valued s(t) to either α(t) or e_EPC(t) will be referred to as EPC compression. A signal that has been compressed using either one of these methods will be denoted ŝ(t) for generality. Figure 7.5 compares the different EPC compression results on a typical complex RFID signal.
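In code, Eq. (7.1) amounts to the following, where s is the recorded complex-valued [EPC+] signal and s0 the ideal Miller (M=4) reconstruction, assumed here to be already time-aligned with s:

    % EPC compression (Eq. 7.1): reduce the complex signal for analysis.
    r     = real(s);               % in-phase component r(t)
    c     = imag(s);               % quadrature component c(t)
    alpha = sqrt(r.^2 + c.^2);     % amplitude compression alpha(t)
    eEPC  = s0 - s;                % EPC error signal e_EPC(t)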
Table 7.3 Thing Magic Mercury 5e RFID Reader serial log (the EPC code, in hex, appears in bold in the original)

(17:05:31.625 - TX(63)): 00 29 CRC:1D26
(17:05:31.671 - RX(63)): 04 29 00 00 00 00 00 01 CRC:9756
(17:05:31.671 - TX(64)): 03 29 00 07 00 CRC:F322
(17:05:31.718 - RX(64)): 19 29 00 00 00 07 00 01 07 72 22 00 80 30 00 30 08 33 B2 DD D9 01 40 35 05 00 00 42 E7 CRC:4F31
(17:05:31.734 - TX(65)): 00 2A CRC:1D25
(17:05:31.765 - RX(65)): 00 2A 00 00 CRC:01E8
(17:05:31.765 - TX(66)): 00 2A CRC:1D25
(17:05:31.796 - RX(66)): 00 2A 00 00 CRC:01E8
(17:05:31.796 - TX(67)): 05 22 00 00 00 00 FA CRC:0845
(17:05:32.093 - RX(67)): 04 22 00 00 00 00 00 01 CRC:7BA9
(17:05:32.093 - TX(68)): 00 29 CRC:1D26
(17:05:32.140 - RX(68)): 04 29 00 00 00 00 00 01 CRC:9756
(17:05:32.140 - TX(69)): 03 29 00 07 00 CRC:F322
(17:05:32.187 - RX(69)): 19 29 00 00 00 07 00 01 07 72 22 00 80 30 00 30 08 33 B2 DD D9 01 40 35 05 00 00 42 E7 CRC:4F31
(17:05:32.203 - TX(70)): 00 2A CRC:1D25
(17:05:32.234 - RX(70)): 00 2A 00 00 CRC:01E8
(17:05:32.234 - TX(71)): 00 2A CRC:1D25
(17:05:32.265 - RX(71)): 00 2A 00 00 CRC:01E8

7.5 Feature Generation

For each RFID tag τi, where i is indexed according to the range given in Table 7.1, the EPC extraction routine produces N-many different EPCs ŝi,j(t), j = 1, . . . , N. Four different methods are then used to extract features from these signals: dynamic wavelet fingerprinting (DWFP), wavelet packet decomposition (WPD), higher order statistics, and Mellin transform statistics. Using these methods, M feature values are extracted, which make up the feature vector X = xi,j,k, k = 1, . . . , M. It should be noted here that, due to the computation time required to perform this analysis, the computer algorithms were adapted to run on William & Mary's scientific computing cluster (http://www.compsci.wm.edu/SciClone/).

Fig. 7.4 A single T⇔R event with the automatically determined [EPC+] region highlighted in gray, and a close-up view of a single [EPC+] region with the EPC itself highlighted in gray (amplitude versus time)

7.5.1 Dynamic Wavelet Fingerprint

The DWFP technique is used to generate a subset of the features used for classification. Wavelet-based measurements provide the ability to decompose noisy and complex information and patterns into elementary components. To summarize this process, the DWFP technique first applies a continuous wavelet transform to each original time-domain signal ŝi,j(t) [26]. The resulting coefficients are then used to generate "fingerprint"-type images Ii,j(a, b) that are coincident in time with the raw signal. Mother wavelets used in this study include the Daubechies-3 (db3), Symlet-5 (sym5), and Meyer (meyr) wavelets, chosen based on preliminary results. Since pattern classification uses one-dimensional feature vectors to develop decision boundaries for each group of observations, the dimension of the binary fingerprint images Ii,j(a, b) generated for each EPC signal needs to be reduced. A subset of ν individual values that best represent the signals for classification will be selected. The number ν (ν < M) of DWFP features to select is arbitrary and can be adjusted based on memory requirements and computation-time constraints. For this RFID application, we consider all cases of ν ∈ [1, 5, 10, 15, 20, 50, 75, 100].
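Although the full DWFP implementation is beyond the scope of this summary, its core can be sketched as follows. The scale range, slice levels, and slice thickness are arbitrary illustrative choices, and cwt is called with the older Wavelet Toolbox syntax:

    % Sketch of DWFP image generation for one compressed EPC signal shat.
    C = cwt(shat, 1:50, 'db3');           % continuous wavelet transform
                                          % (older Wavelet Toolbox syntax)
    C = C / max(abs(C(:)));               % normalize coefficients to [-1, 1]

    I = false(size(C));                   % binary fingerprint image I(a, b)
    for v = -0.9:0.2:0.9                  % contour "slice" levels
        I = I | (abs(C - v) < 0.05);      % thin slices become ridge lines
    end
    % I is coincident in time with shat; its concentric ridge patterns
    % are the wavelet fingerprints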
Fig. 7.5 The different EPC compression techniques, displaying the real r(t) and imaginary c(t) components of a raw [EPC+] region (black and gray, respectively, top), the amplitude α(t) (middle), and the EPC error e_EPC(t) (bottom). The EPC portion of the [EPC+] signal is bounded by vertical red dotted lines

Using standard MATLAB routines (from MATLAB's Image Processing Toolbox; The MathWorks, Natick, MA), the feature extraction process consists of several steps:

1. Label each binary image with individual values for all sets of connected pixels.
2. Relabel concentric objects centered around a common area (useful for the ring-like features found in the fingerprints).
3. Apply thresholds to remove any insignificant objects in the images.
4. Extract features from each labeled object.
5. Linearly interpolate in time between individual fingerprint locations to generate a smoothed array of feature values.
6. Identify points in time where the feature values are consistent among individual RFID tags yet separable between different tags.

The binary nature of the images allows us to consider each pixel of the image as having a value of either 1 or 0. The pixels with a value of 0 can be thought of as the background, while pixels with a value of 1 can be thought of as the significant pixels. The first step in feature extraction is to assign individual labels to each set of 8-connected components in the image [27], demonstrated in Fig. 7.6.

Fig. 7.6 An example of 8-connectivity and its application to a binary image

Since the fingerprints are often concentric shapes, different concentric "rings" are often not connected to each other, yet still are components of the same fingerprint object. Therefore, the second step in the process is to relabel groups of concentric objects using their center of mass, which is the average time coordinate of each pixel, demonstrated in Fig. 7.7.

Fig. 7.7 An example of the fingerprint labeling process. The components of the binary image and the resulting 8-connected components, where each label index corresponds to a different index on the "hot" colormap. Concentric objects are then relabeled, resulting in unique labels for each individual fingerprint object, shown as orange and white fingerprint objects for clarity

The third step in the feature extraction process is to remove any fingerprint objects from the image whose area (sum of the pixels) is below a particular threshold. Objects that are too small for the computations in later steps are removed; however, this threshold is subjective and depends on the mother wavelet used. At this point in the processing, the image is ready for features to be generated. Twenty-two measurements are made on each remaining fingerprint object, including the area, centroid, diameter of a circle with the same area, Euler number, convex image, solidity, coefficients of second- and fourth-degree polynomials fit to the fingerprint boundary, as well as the major/minor axis length, eccentricity, and orientation of an ellipse that has the same normalized second central moment as the fingerprint. The property measurements result in a sparse property array Pi,j,n[t], where n = 1, . . . , 22 is the property index, since each extracted value is matched to the time value of the corresponding fingerprint's center of mass. These sparse property vectors are therefore linearly interpolated to produce a smoothed vector of property values, Pi,j,n(t). This process is shown for a typical time-domain EPC signal in Fig. 7.8.
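Steps 1–4 map directly onto Image Processing Toolbox calls; in outline (the area threshold and the concentric-grouping tolerance below are placeholders, not the values used in our processing):

    % Sketch: label fingerprint objects in the binary image I and
    % measure their properties (illustrative thresholds).
    L = bwlabel(I, 8);                            % step 1: 8-connected labels
    stats = regionprops(L, 'Area', 'Centroid');   % per-object measurements

    % step 2: regroup concentric rings whose centers of mass share
    % (nearly) the same time coordinate
    t_cm = arrayfun(@(s) s.Centroid(1), stats);
    [~, ~, group] = uniquetol(t_cm, 5, 'DataScale', 1);  % 5-pixel tolerance

    % step 3: drop objects whose pixel area falls below a threshold
    keep = [stats.Area] >= 10;                    % placeholder threshold

    % step 4: extract the 22 property measurements from the kept objects;
    % steps 5-6 then interpolate them in time (e.g., with interp1)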
Once an array of fingerprint features for each EPC has been generated, it still needs to be reduced to a single vector of ν-many values to be used for classification. Without this reduction, not only is the feature set too large to process even on a computing cluster, but most of the information contained within it is redundant. Since we are implementing a one-against-one classification scheme, where one testing tag (τt) will be compared against features designed to identify one classifier tag (τc), we are looking for feature values that are consistent within each individual RFID tag, yet separable between different tags.

First, the dimensionality of the property array is reduced by calculating the mean property value for each tag τi,

μi,n(t) = (1/|j|) Σj Pi,j,n(t).        (7.2)

Each per-tag mean vector is then normalized to the range [0, 1]. Next, the difference in mean vectors for property n is considered for all binary combinations of tags τi1, τi2,

dn(t) = |μi1,n(t) − μi2,n(t)|  for i1, i2 ∈ i,        (7.3)

for the values of i shown in Table 7.1. We are left with a single vector representing the average inter-class difference in property-n values as a function of time. Similarly, we compute the standard deviation within each class,

σi,n(t) = √( (1/|j|) Σj [Pi,j,n(t) − μi,n(t)]² ).        (7.4)

We next identify the maximum standard deviation among all tags τi at each point in time t, essentially taking the upper envelope of all values of σi,n(t),

σn(t) = maxi σi,n(t).        (7.5)

Times tm, where m = 1, . . . , ν, are then identified for each property n at which the inter-class difference dn(tm) is high while the intra-class standard deviation σn(tm) remains low. The resulting DWFP feature vector for EPC signal ŝi,j(t) is

xi,j,km = Pi,j,nm(tm).        (7.6)

Fig. 7.8 The DWFP applied to an EPC signal ŝi,j(t), shown in gray (a). A close-up of the signal is shown for clarity (b), from which the fingerprint image (c) is generated, shown with white peaks and gray valleys for distinction. Each fingerprint object is individually labeled and localized in both time and scale (d). A variety of measures are extracted from each fingerprint and interpolated in time, including the area of the on-pixels for each object (e), as well as the coefficients of a fourth-order polynomial (p1x⁴ + p2x³ + p3x² + p4x + p5) fit to the boundary of each fingerprint object, with coefficient p3 shown (f)
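For a single property n, the search for the ν best time points can be sketched as below. Here P is assumed to be a tags-by-EPCs-by-samples array of the interpolated property values, and the ranking score (separation minus spread) is one illustrative way of combining Eqs. (7.2)–(7.5); the actual selection logic may differ in detail.

    % Sketch: choose nu time points where tags separate well (one property).
    % P: [numTags x numEPCs x numSamples] interpolated property values.
    mu = squeeze(mean(P, 2));                   % per-tag mean, Eq. (7.2)
    mu = (mu - min(mu, [], 2)) ./ ...
         (max(mu, [], 2) - min(mu, [], 2));     % normalize each tag to [0, 1]
    sg = squeeze(std(P, 1, 2));                 % per-tag std. dev., Eq. (7.4)

    d = zeros(1, size(mu, 2));                  % mean pairwise difference
    for i1 = 1:size(mu, 1)                      % of tag means, Eq. (7.3)
        for i2 = i1+1:size(mu, 1)
            d = d + abs(mu(i1, :) - mu(i2, :));
        end
    end
    d = d / nchoosek(size(mu, 1), 2);

    sigma = max(sg, [], 1);                     % envelope of stds, Eq. (7.5)
    [~, order] = sort(d - sigma, 'descend');    % high separation, low spread
    tm = order(1:nu);                           % the nu selected time indices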
7.5.2 Wavelet Packet Decomposition

Another wavelet-based feature used in classification is generated by wavelet packet decomposition [28]. First, each EPC signal is filtered using a stationary wavelet transform, removing the first three levels of detail as well as the highest approximation level. A wavelet packet transform (WPT) is then applied to the filtered waveform with a specified mother wavelet and number of decomposition levels, generating a tree of coefficients similar in nature to the continuous wavelet transform. From the WPT tree, a vector containing the percentages of energy corresponding to the T terminal nodes of the tree is computed, known as the wavelet packet energy. Because the WPT is an orthonormal transform, the entire energy of the signal is preserved in these terminal nodes [29]. The energy matrix Ei for each RFID tag τi can then be represented as

Ei = [e1,i, e2,i, . . . , eN,i],        (7.7)

where N is the number of EPCs extracted from tag τi and ej,i[b] is the energy from bin number b = 1, . . . , T of the energy map for signal j = 1, . . . , N. Singular value decomposition is then applied to each energy matrix Ei:

Ei = Ui Σi Vi*,        (7.8)

where Ui is composed of T-element left singular column vectors ub,i,

Ui = [u1,i, u2,i, . . . , uT,i].        (7.9)

The Σi matrix is a T × N singular value matrix. The row space and nullspace of Ei are defined in the N × N matrix Vi*, and are not used in the analysis of the energy maps. For the energy matrices Ei, we found that there was a dominant singular value relative to the second-highest singular value, implying that there is a dominant representative energy vector corresponding to the first singular vector u1,i. From the set of all singular vectors ub,i, the significant bins that have energies above a given threshold are identified. The threshold is lowered until all the vectors return a common significant bin. Finally, the WPT elements corresponding to the extracted bin are used as features. In the case of multiple bins being selected, all corresponding WPT elements are included in the feature set. Wavelet packet decomposition uses redundant basis functions and can therefore provide arbitrary time–frequency resolution details, improving upon the wavelet transform when analyzing signals containing close, high-frequency components.

7.5.3 Statistical Features

Several statistical features were generated from the compressed EPC signals ŝi,j(t):

1. The mean of the signal,
   μi,j = (1/|ŝ|) Σt ŝi,j(t),
   where |ŝ| is the length of ŝi,j(t).
2. The maximum cross-correlation of ŝi,j(t) with another EPC from the same tag, ŝi,k(t), k ≠ j,
   maxτ Σt ŝ*i,j(t) ŝi,k(t + τ).
3. The Shannon entropy,
   Σt ŝ²i,j(t) ln(ŝ²i,j(t)).
4. The unbiased sample variance,
   σ²i,j = (1/(|ŝ| − 1)) Σt (ŝi,j(t) − μi,j)².
5. The skewness (third standardized moment),
   (1/(σ³i,j |ŝ|)) Σt (ŝi,j(t) − μi,j)³.
6. The kurtosis (fourth standardized moment),
   κi,j = (1/(σ⁴i,j |ŝ|)) Σt (ŝi,j(t) − μi,j)⁴.

Statistical moments provide insight by highlighting outliers due to any specific flaw-type signatures found in the data.
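These statistics reduce to one-liners in MATLAB (skewness and kurtosis are from the Statistics and Machine Learning Toolbox); shat is one compressed EPC signal and shat2 is assumed to be a second EPC from the same tag:

    % Sketch: statistical features for one compressed EPC signal shat.
    m  = mean(shat);                       % 1. mean of the signal
    xc = max(xcorr(shat, shat2));          % 2. peak cross-correlation with
                                           %    another EPC from the same tag
    H  = sum(shat.^2 .* log(shat.^2));     % 3. Shannon entropy, as defined
                                           %    in the text (sign conventions vary)
    v  = var(shat);                        % 4. unbiased sample variance
    sk = skewness(shat);                   % 5. skewness
    ku = kurtosis(shat);                   % 6. kurtosis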
7.5.4 Mellin Features

The Mellin transform is an integral transform, closely related to the Fourier and Laplace transforms, that can represent a signal in terms of a physical attribute similar to frequency known as scale. The β-Mellin transform is defined as [30]

Mf(p) = ∫0^∞ ŝ(t) t^(p−1) dt,        (7.10)

for the complex variable p = −jc + β, with fixed parameter β ∈ R and independent variable c ∈ R. This variation of the Mellin transform is used because the β parameter allows for the selection of a variety of more specific transforms. In the case of β = 1/2, it becomes a scale-invariant transform, meaning invariant to compression or expansion of the time axis while preserving signal energy, defined on the vertical line p = −jc + 1/2. This scale transform is defined as

Df(c) = (1/√(2π)) ∫0^∞ ŝ(t) e^((−jc−1/2) ln t) dt.        (7.11)

This transform has the key property of scale invariance: if one signal is a time-scaled version of another, the two have the same transform magnitude. Variations in each RFID tag's local oscillator can lead to slight but measurable differences in the frequency of the returned RF signal, effectively scaling the signal. Zanetti et al. call this the time interval error (TIE), and extract the TIE directly to use as a feature for individual tag classification [21]. We observed this slight scaling effect in our data and therefore explore the use of a scale-invariant feature extraction technique.

The Mellin transform's relationship with the Fourier transform can be highlighted by setting β = 0, which results in a logarithmic-time Fourier transform:

Mf(c) = ∫−∞^∞ ŝ(t) e^(−jc ln t) d(ln t).        (7.12)

Similarly, the scale transform of a function ŝ(t) can be defined using the Fourier transform of g(t) = ŝ(e^t):

Mf(c) = ∫−∞^∞ g(t) e^(−jct) dt = F(g(t)).        (7.13)

References [30, 33] discuss the complexities associated with discretizing the fast Mellin transform (FMT) algorithm, and provide a MATLAB-based implementation (http://profs.sci.univr.it/~desena/FMT). The first step in implementing this is to define an exponential sampling step along with the number of samples needed for a given signal in order to exponentially resample it, an example of which can be seen in Fig. 7.9. Once the exponential axis has been defined, an exponential point-by-point multiplication with the original signal is performed. A fast Fourier transform (FFT) is then computed, followed by an energy normalization step. This process is summarized in Fig. 7.10.

Fig. 7.9 An example of uniform sampling (in blue, top) and exponential sampling (in red, bottom)

Once the Mellin transform is computed, features are extracted from the resulting Mellin domain, including the mean of the Mellin transform, as well as the standard deviation, variance, second central moment, Shannon entropy, kurtosis, and skewness of the mean-removed Mellin transform [31, 32].
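Following these steps, a bare-bones fast Mellin transform might look like the sketch below. This is written from the description above rather than from the referenced FMT implementation, and the number of exponential samples K is an arbitrary choice:

    % Sketch of a fast (scale) Mellin transform of signal shat, sampled at fs.
    N    = numel(shat);
    t    = (1:N) / fs;                           % original uniform time axis
    K    = 2^nextpow2(4*N);                      % number of exponential samples
    tau  = linspace(log(t(1)), log(t(end)), K);  % uniform axis in log-time
    texp = exp(tau);                             % exponential sampling instants

    g = interp1(t, shat, texp, 'linear');        % exponentially resample
    g = g .* exp(tau/2);                         % point-by-point exponential
                                                 % weight (the beta = 1/2 factor)
    D = fft(g) / sqrt(2*pi*K);                   % FFT plus energy normalization
    % abs(D) is approximately scale invariant: a time-scaled copy of shat
    % yields (nearly) the same transform magnitude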
7.6 Classifier Design

Since the goal of our classification routine is to distinguish individual RFID tags from nominally identical copies, each individual RFID tag is assigned a unique class label. This results in a multiclass problem, with the number of classes equal to the number of tags being compared. There are two main methods for addressing multiclass problems. The first uses classifiers with multi-dimensional discriminant functions, which often output classification probabilities for each test object that must then be reduced to a final classification decision. The second uses a binary comparison between all possible pairs of classes utilizing a two-class discriminant function, with a voting procedure used to determine the final classification. We have discussed in Sect. 7.2 our choice of the binary classification approach, which allows us to include intrinsically two-class discriminants in our analysis. Therefore, only two tags are considered against each other at a time: a classifier tag τc ∈ DR and a testing tag τt ∈ DT, where DR represents the training dataset used and DT the testing dataset, as outlined in Table 7.1.

For each binary combination of tags (τc, τt), a training set (R) is generated, composed of feature vectors from k-many EPCs associated with tags τc and τt from dataset DR. The corresponding known labels (ωk) are ωk = 1 when k ∈ c and ωk = −1 when k ∈ t. The testing set (T) is composed of feature vectors of tag τt only, from dataset DT, where predicted labels are denoted yk = ±1. In other words, the classifier is trained on data from both tags in the training dataset and tested on the testing tag only, from the testing dataset. When DR = DT, which means the classifier is trained and tested on the same dataset, a holdout algorithm is used to split the data into R and T.

Fig. 7.10 A raw EPC signal si,j(t) (top, left), the exponentially resampled axis (top, right), the signal resampled according to the exponential axis (middle, left), this same signal after point-by-point multiplication with the exponential axis (middle, right), and the resulting Mellin domain representation (bottom)

The problem of class imbalance, where the class labels ωk are unequally distributed (i.e., |ωk = 1| ≪ |ωk = −1|, or vice versa), can affect classifier performance and has been a topic of further study by several researchers [34–36]. While the number of EPCs extracted from each tag here does not present a significant natural imbalance, as all recordings are approximately the same length in time, it is not necessarily true that the natural distribution between classes, or even a perfect 50:50 distribution, is ideal. To explore the effect of class imbalance on classifier performance, a variable ρ is introduced, defined as

ρ = |ωk = −1| / |ωk = 1|,  k ∈ R.        (7.14)

This variable defines the ratio of negative to positive EPC labels in R, with ρ ∈ Z+. When ρ = 1, the training set R contains an equal number of EPCs from tag τc as from τt, with under-sampling used as necessary to enforce equality. As ρ increases, additional tags are included at random from τm, m ≠ c, t, with ωm = −1, until ρ is satisfied. When all of the tags in DR are included in the training set, ρ is denoted as "all."

The process of selecting which classifiers to use is a difficult problem. The No Free Lunch Theorem states that there is no inherently best classifier for a particular application, so in practice several classifiers are compared and contrasted. There exists a hierarchy of possible choices that are application dependent. We have previously determined that supervised, statistical pattern classification techniques using both parametric and nonparametric probability-based classifiers are appropriate for consideration. For parametric classifiers, we include a linear classifier using normal densities (LDC) and a quadratic classifier using normal densities (QDC). For nonparametric classifiers, we include a k-nearest-neighbor classifier (KNNC) for k = 1, 2, 3, and a linear support vector machine (SVM) classifier. The mathematical explanations for these classifiers can be found in [37–59]. For implementation of these classifier functions, we use routines from the MATLAB toolbox PRTools [58].
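With PRTools, one binary (τc, τt) comparison reduces to a few lines. The sketch below uses PRTools 4-style calls (newer releases rename dataset to prdataset) and assumes feature matrices Xc and Xt, with one row of EPC feature values per extracted EPC, for the classifier and testing tags:

    % Sketch: one binary tau_c-versus-tau_t comparison with PRTools.
    X = [Xc; Xt];                                    % EPC feature vectors
    w = [ones(size(Xc,1),1); -ones(size(Xt,1),1)];   % known labels, +/-1
    A = dataset(X, w);                               % PRTools labeled dataset

    [R, T] = gendat(A, 0.5);     % holdout split (used when D_R = D_T)
    W = ldc(R);                  % train, e.g., the linear normal-density
                                 % classifier (LDC)
    d = T*W;                     % apply the trained mapping to the test set
    y = labeld(d);               % predicted labels y = +/-1
    % (classc(d) normalizes such outputs to confidences summing to one)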
For the above classifiers that output densities, a function is applied that converts the output to a proper confidence, where the outcomes sum to one for every test object. This allows comparison between classifier outputs. Since each EPC's feature vector is assigned a confidence value for each class, the final label is decided by the highest confidence among the classes.

7.7 Classifier Evaluation

Since we have implemented a binary classification algorithm, a confusion matrix L(c, t), where τc is the classifier tag and τt is the testing tag, can be used to view the results of a given classifier. Each entry in the confusion matrix represents the fraction of EPCs from the testing tag that are labeled as belonging to the classifier tag, denoted by a label of yt = 1, and is given by

L(c, t) = |yt = 1| / |yt|,  when τc ∈ R, τt ∈ T.        (7.15)

A perfect classifier would therefore have values of L = 1 whenever τc = τt (on the diagonal) and values of L = 0 whenever τc ≠ τt (off-diagonal).

Given the number of classifier configuration parameters used in this study, it does not make sense to compare individual confusion matrices to each other to determine classifier performance. Each entry of the confusion matrix is a measure of the fraction of EPCs from each testing set that is determined to belong to each possible training class. We can therefore apply a threshold h to each confusion matrix, where h lies within the range [0, 1]. All confusion matrix entries above this threshold are identified as positive matches for class membership, and all entries below the threshold as negative matches for class membership. It follows that we can determine the number of false positive (f+), false negative (f−), true positive (t+), and true negative (t−) results for each confusion matrix, given by

f+(h) = |L(c, t) > h|,  c ≠ t
t+(h) = |L(c, t) > h|,  c = t
f−(h) = |L(c, t) ≤ h|,  c = t
t−(h) = |L(c, t) ≤ h|,  c ≠ t.        (7.16)

From these values, we can calculate the sensitivity (χ) and specificity (ψ),

χ(h) = t+(h) / (t+(h) + f−(h))
ψ(h) = t−(h) / (t−(h) + f+(h)).        (7.17)

The concepts of sensitivity and specificity are inherent in binary classification, where testing data is identified as either a positive or negative match for each possible class. High values of sensitivity indicate that the classifier successfully classified most of the testing tags whenever the testing tag and classifier tag were the same, while high values of specificity indicate that the classifier successfully classified most of the testing tags as being different from the classifier tag whenever the testing tag and the classifier tag were not the same. Since sensitivity and specificity are functions of the threshold h, they can be plotted with sensitivity χ(h) on the y-axis and 1 − specificity (1 − ψ(h)) on the x-axis for 0 < h ≤ 1, in what is known as a receiver operating characteristic (ROC) curve [60]. The resulting curve in the ROC plane is essentially a summary of the sensitivity and specificity of a binary classifier as the discrimination threshold changes. Points on the diagonal line y = x represent a result no better than random guessing; classifiers performing better than chance have curves above the diagonal, toward the upper left-hand corner of the plane. The point (0, 1), corresponding to χ = 1 and ψ = 1, represents perfect classification.
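Given a confusion matrix L with entries in [0, 1], Eqs. (7.16) and (7.17) can be computed with a short threshold sweep; a compact sketch, with the sweep resolution chosen arbitrarily:

    % Sketch: sensitivity and specificity versus threshold from matrix L.
    hs = 0.01:0.01:1;                     % threshold sweep over (0, 1]
    onDiag = logical(eye(size(L)));       % entries where tau_c == tau_t
    chi = zeros(size(hs));  psi = zeros(size(hs));
    for k = 1:numel(hs)
        tp = nnz(L(onDiag)  > hs(k));   fn = nnz(L(onDiag)  <= hs(k));
        fp = nnz(L(~onDiag) > hs(k));   tn = nnz(L(~onDiag) <= hs(k));
        chi(k) = tp / (tp + fn);          % sensitivity, Eq. (7.17)
        psi(k) = tn / (tn + fp);          % specificity
    end
    AUC = abs(trapz(1 - psi, chi));       % trapezoidal area under ROC curve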
The area under each classifier's ROC curve (|AUC|) is a common measure of a classifier's performance, and is calculated in practice using simple trapezoidal integration. Higher |AUC| values generally correspond to classifiers with better performance [61]. This is not a strict rule, however, as a classifier with a higher |AUC| may perform worse in specific areas of the ROC plane than another classifier with a lower |AUC| [62]. Several examples of confusion matrices can be seen in Fig. 7.11, where each corresponding |AUC| value is provided to highlight its relationship to performance. It can be seen that the confusion matrix with the highest |AUC| has a clear, distinct diagonal of positive classifications, while that with the lowest |AUC| has positive classifications scattered throughout the matrix.

Fig. 7.11 A comparison of confusion matrices L for classifiers of varying performance, with 0 → black and 1 → white. A perfect confusion matrix has values of L = 1 whenever τc = τt, seen as a white diagonal, and values of L = 0 whenever τc ≠ τt, seen as a black off-diagonal. In general, |AUC| = 1 corresponds to a perfect classifier, while |AUC| = 0.5 performs as well as random guessing. This trend can be seen in the matrices

The use of |AUC| values for directly comparing classifier performance has recently been questioned [63, 64], with the information loss associated with summarizing the ROC curve distribution identified as a main concern. We therefore do not use |AUC| as a final classifier ranking measure. Rather, it is only used here to narrow the results from all possible classifier configurations down to a smaller subset of the "best" ones. The values of χ(h) and ψ(h) are still useful measures of the remaining top classifiers. At this point, however, they have been calculated for a range of threshold values extending over 0 < h ≤ 1. A variety of methods can be used to determine a final decision threshold h for a given classifier configuration, the choice of which depends heavily on the classifier's final application. A popular approach involves sorting by the minimum number of misclassifications, min(f+ + f−); however, this does not account for differences in severity between the different types of misclassifications [65]. Instead, the overall classifier results were sorted here by their position in the ROC space, using the Euclidean distance from the point (0, 1) as a metric. Formally, this is

dROC(h) = √((χ − 1)² + (1 − ψ)²).        (7.18)

For each classifier configuration, the threshold value ĥ corresponding to the minimum distance was determined,

ĥ = arg minh dROC(h) = {h | ∀h′ : dROC(h′) ≥ dROC(h)}.        (7.19)

In other words, ĥ is the threshold value corresponding to the point in the ROC space that is closest to the (0, 1) "perfect classifier" result. The classifier configurations are then ranked by the lowest distance dROC(ĥ). Figure 7.12 shows an example of the ROC curves for the classifiers generated from the confusion matrices in Fig. 7.11; in it, the point corresponding to ĥ is indicated by a circle (◦), and the (0, 1) point by a star (∗).

Fig. 7.12 ROC curves for the classifiers corresponding to the confusion matrices in Fig. 7.11, with |AUC| = 0.9871 (dotted line), |AUC| = 0.8271 (dash-dot line), |AUC| = 0.6253 (dashed line), and |AUC| = 0.4571 (solid line). The "perfect classifier" result at (0, 1) in this ROC space is represented by the black star (∗), and each curve's closest point to this optimal result at threshold ĥ is indicated by a circle (◦)
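Continuing the threshold-sweep sketch above, the final threshold of Eqs. (7.18) and (7.19) follows in two lines:

    % Sketch: pick the threshold closest to the (0, 1) "perfect" ROC corner.
    dROC = sqrt((chi - 1).^2 + (1 - psi).^2);   % Eq. (7.18)
    [dmin, k] = min(dROC);                      % Eq. (7.19)
    h_hat = hs(k);                              % final decision threshold
    % classifier configurations are then ranked by their dmin values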
7.8 Results

7.8.1 Frequency Comparison

The ultra-high-frequency (UHF) range of RFID frequencies spans 868–928 MHz; however, in North America UHF can be used unlicensed from 902–928 MHz (±13 MHz from a 915 MHz center frequency). We test the potential for pattern classification routines to uniquely identify RFID tags at several operating frequencies within this range. Data collected at three frequencies (902, 915, and 928 MHz) while being held at a single orientation (PL) were used as training and testing frequencies for the classifier. Only amplitude (α(t)) signal compression was used in this frequency comparison. Table 7.4 shows the top individual classifier configuration for the RFID reader operating frequency comparison. Results are presented as sensitivity and specificity values for the threshold value ĥ that corresponds to the minimum distance dROC in the ROC space. Similarly, confusion matrices are presented in Fig. 7.13 for each classifier configuration listed in Table 7.4.

Table 7.4 The classifier configurations ranked by dROC(ĥ) over all values of the classifier configuration variables when trained and tested on the frequency parameters 902, 915, and 928 MHz. DR and DT correspond to the training and testing datasets, respectively, and ρ represents the ratio of negative to positive EPCs in the training set. The threshold ĥ corresponding to the minimum distance dROC is presented, along with the values of χ(ĥ) and ψ(ĥ)

DR    DT    ν     Classifier       ρ     |AUC|     χ(ĥ)     ψ(ĥ)     ĥ (%)    Accuracy (%)
902   902   100   QDC (MATLAB)     3     0.9983    1.000    0.997    96.6     99.7
902   915   1     LDC (PRTools)    5     0.9898    1.000    0.943    8.5      94.6
902   928   50    3NN              1     0.9334    0.960    0.950    10.9     95.0
915   902   1     LDC (PRTools)    12    0.4571    0.640    0.543    2.3      54.7
915   915   1     QDC (MATLAB)     9     0.9977    1.000    0.995    82.2     99.5
915   928   10    LDC (MATLAB)     all   0.5195    0.720    0.538    1.9      54.6
928   902   10    1NN              3     0.4737    0.520    0.757    9.1      74.7
928   915   1     LDC (MATLAB)     7     0.6587    0.880    0.498    2.0      51.4
928   928   75    QDC (MATLAB)     2     1.0000    1.000    1.000    86.9     100.0

Fig. 7.13 Confusion matrices (L) for the classifier configurations corresponding to the minimum distance dROC(ĥ) over all combinations of (DR, DT) where DR, DT ∈ {902, 915, 928 MHz}. Values of L range over [0, 1], with 0 → black and 1 → white

The classifier performed well when trained on the dataset collected at 902 MHz, regardless of the frequency at which the testing data was collected. Accuracies were above 94.6% for all three testing frequencies, and sensitivity (χ(ĥ)) and specificity (ψ(ĥ)) values were all above 0.943, very close to the ideal value of 1.000. The corresponding confusion matrices in Fig. 7.13 all display the distinct diagonal line that indicates accurate classification. When the classifier was trained on either the 915 MHz or 928 MHz datasets, however, the classification accuracy was low.
Neither case was able to identify tags from other frequencies very well, even though each did well classifying tags from its own frequency. When DR = 915 MHz and DT = 928 MHz, for example, the |AUC| value was only 0.5195, not much higher than the 0.5000 associated with random guessing. The corresponding confusion matrix shows no diagonal but instead vertical lines at several predicted tag labels, indicating that the classifier simply labeled all of the tags as one of these values.

7.8.2 Orientation Comparison

A second variable inherent in real-world RFID applications is the orientation, relative to the antenna, at which the RFID tags are read. This is one of the main reasons why RFID technology is considered advantageous compared to traditional barcodes; however, antenna design and transmission power variability result in changes in the size and shape of the transmission field produced by the antenna [66]. It follows that changes in the tag orientation relative to this field will result in changes in the pre-demodulated RF signals. To test how the pattern classification routines behave with a changing variable like orientation, data was collected at three different orientations (PL, OB, and UD) while being held at a common operating frequency (902 MHz). This data was used as training and testing sets for the classifiers. Again, only amplitude (α(t)) signal compression was used. Table 7.5 shows the top individual classifier configuration for the RFID tag orientation comparison. Results are presented as sensitivity and specificity values for the threshold value ĥ that corresponds to the minimum distance dROC in the ROC space. Similarly, confusion matrices are presented in Fig. 7.14 for each classifier configuration listed in Table 7.5.

Similar to the frequency results, the classification results again show a single orientation that performs well as a training set regardless of the subsequent tag orientation of the testing set. When trained on data collected at the parallel (PL) orientation, the classification accuracies range from 94.9 to 99.7% across the three testing tag orientations. Values of χ(ĥ) range from 0.880 to 1.000, meaning that over 88% of the true positives are correctly identified, and ψ(ĥ) values range from 0.952 to 0.997, indicating that over 95% of the true negatives are accurately identified as well. These accuracies are verified in the confusion matrix representations found in Fig. 7.14. When the classifiers are trained on either the oblique (OB) or upside-down (UD) orientations, we again see that the classifiers struggle to identify testing data from alternate tag orientations. The best performing of these results is for DR = OB and DT = PL, where χ(ĥ) = 0.920 and ψ(ĥ) = 0.770, suggesting accurate true positive classification with slightly more false positives as well, resulting in an overall accuracy of 77.6%. When DR = UD, the testing results are again only slightly better than random guessing, with |AUC| values of 0.5398 for DT = PL and 0.5652 for DT = OB.

Table 7.5 The classifier configurations ranked by dROC(ĥ) over all values of the classifier configuration variables when trained and tested on the orientation parameters PL, UD, and OB. DR and DT correspond to the training and testing datasets, respectively, and ρ represents the ratio of negative to positive EPCs in the training set.
The threshold ĥ corresponding to the minimum distance dROC is presented, along with the values of χ(ĥ) and ψ(ĥ)

DR   DT   ν    Classifier       ρ    |AUC|     χ(ĥ)     ψ(ĥ)     ĥ (%)    Accuracy (%)
PL   PL   1    QDC (MATLAB)     13   0.9979    1.000    0.997    88.9     99.7
PL   UD   75   3NN              1    0.9489    0.960    0.953    12.1     95.4
PL   OB   20   1NN              1    0.8627    0.880    0.952    2.8      94.9
UD   PL   10   LDC (MATLAB)     19   0.5398    0.680    0.658    2.9      65.9
UD   UD   1    QDC (MATLAB)     5    0.9994    1.000    0.995    73.6     99.5
UD   OB   5    LDC (MATLAB)     15   0.5652    0.680    0.622    1.9      62.4
OB   PL   10   LDC (MATLAB)     13   0.8250    0.920    0.770    5.8      77.6
OB   UD   5    1NN              4    0.6042    0.760    0.622    1.9      62.7
OB   OB   75   QDC (MATLAB)     2    1.0000    1.000    1.000    47.7     100.0

7.8.3 Different Day Comparison

We next present classification results when data recorded on multiple days were used as training and testing datasets. The following analysis provides a better understanding of how signals taken from the same tag, at the same frequency and the same orientation, but in subsequent recordings on multiple days, compare to each other. It is important to note that the data used here was collected with the RFID tag held by hand above the antenna. While it was held as consistently as possible, it was not fixed in position. Additionally, each subsequent recording was made when environmental conditions (humidity, temperature, etc.) were intentionally different from the previous recordings. Data was collected on four different days (Day 1, 2, 3, and 4). This data was used as training and testing sets for the classifiers. Amplitude (α(t)) and EPC error (e_EPC(t)) were both used as signal compression methods. Table 7.6 shows the top individual classifier configuration for the different-day tag recording comparison. Results are presented as sensitivity and specificity values for the threshold value ĥ that corresponds to the minimum distance dROC in the ROC space. Similarly, confusion matrices are presented in Fig. 7.15 for each classifier configuration listed in Table 7.6.

Fig. 7.14 Confusion matrices (L) for the classifier configurations corresponding to the minimum distance dROC(ĥ) over all combinations of (DR, DT) where DR, DT ∈ {PL, UD, OB}. Values of L range over [0, 1], with 0 → black and 1 → white

The first thing to note in these results is the prevalence of the EPC error (e_EPC(t)) signal compression compared to the amplitude (α(t)) signal compression. This suggests that e_EPC(t) is better able to correctly classify the RFID tags than the raw signal amplitude. Unlike the two previous sets of results, where one frequency and one orientation classified well compared to the others, there is no dominant subset here. All the different days classified similarly when tested against each other. This is expected, since there is no reason data trained on a specific day should perform better than any other. |AUC| values were mainly above 0.6700 yet below 0.7500, with accuracies ranging from 63.6 to 80.9% when DR ≠ DT. The confusion matrix representations of these classification results (Fig. 7.15) again indicate that there is no single dominant training subset.
Table 7.6 The classifier configurations ranked by dROC(ĥ) over all values of the classifier configuration variables when trained and tested on the different-day parameters Day 1, 2, 3, and 4. DR and DT correspond to the training and testing datasets, respectively, and ρ represents the ratio of negative to positive EPCs in the training set. The threshold ĥ corresponding to the minimum distance dROC is presented, along with the values of χ(ĥ) and ψ(ĥ)

DR     DT     EPC Comp.   ν    Classifier       ρ    |AUC|     χ(ĥ)     ψ(ĥ)     ĥ (%)    Accuracy (%)
Day 1  Day 1  α, e_EPC    15   QDC (MATLAB)     2    0.9949    1.000    0.986    23.9     98.7
Day 1  Day 2  e_EPC       20   LDC (MATLAB)     4    0.7432    0.867    0.657    39.5     67.1
Day 1  Day 3  e_EPC       10   1NN              12   0.6735    0.800    0.662    9.1      67.1
Day 1  Day 4  e_EPC       10   QDC (MATLAB)     5    0.7287    0.667    0.724    37.6     72.0
Day 2  Day 1  e_EPC       20   LDC (MATLAB)     10   0.7443    0.800    0.748    42.4     75.1
Day 2  Day 2  α, e_EPC    20   QDC (MATLAB)     2    0.9990    1.000    0.986    42.4     98.7
Day 2  Day 3  e_EPC       20   3NN              1    0.7990    0.800    0.790    52.6     79.1
Day 2  Day 4  e_EPC       1    SVM              1    0.7083    0.800    0.733    20.1     73.8
Day 3  Day 1  e_EPC       1    SVM              1    0.7014    0.867    0.619    21.9     63.6
Day 3  Day 2  e_EPC       15   3NN              8    0.6919    0.800    0.719    4.6      72.4
Day 3  Day 3  α, e_EPC    50   QDC (MATLAB)     5    1.0000    1.000    1.000    72.8     100.0
Day 3  Day 4  e_EPC       50   3NN              7    0.6390    0.800    0.648    4.6      65.8
Day 4  Day 1  α, e_EPC    1    3NN              3    0.7705    0.800    0.710    17.9     71.6
Day 4  Day 2  e_EPC       5    1NN              3    0.7395    0.733    0.719    29.8     72.0
Day 4  Day 3  α           1    3NN              1    0.7422    0.667    0.819    57.6     80.9
Day 4  Day 4  α, e_EPC    50   LDC (PRTools)    1    1.0000    1.000    1.000    95.7     100.0

We see that the DR = DT results all show distinct diagonal lines, even with DR, DT = Day 4, where there are additional high off-diagonal entries in the matrix. This is indicated in Table 7.6 by a relatively high threshold (ĥ) value of 95.7. When DR ≠ DT, there are still faint diagonal lines present in some of the confusion matrices. For example, when DR = Day 2 and DT = Day 3, diagonal entries coming out of the lower left-hand corner are somewhat higher in accuracy (closer to white in the confusion matrix) than their surrounding off-diagonal entries. We see in Table 7.6 that this classifier has an |AUC| of 0.7990 and a 79.1% overall accuracy.

Fig. 7.15 Confusion matrices (L) for the classifier configurations corresponding to the minimum distance dROC(ĥ) over all combinations of (DR, DT) where DR, DT ∈ {Day 1, 2, 3, 4}. Values of L range over [0, 1], with 0 → black and 1 → white

7.8.4 Damage Comparison

We next present the results of the RFID tag damage analysis to explore how physical degradation affects the RFID signals and the resulting classification accuracy.
The datasets from Day 1, 2, 3, and 4 are combined and used here as a single training set. The tags which make up this dataset, AD26 − AD40, are split into two subsets: tags AD26 − AD32 were subjected to a water damage study, while tags AD33 − AD40 were subjected to a physical damage study. The AD-612 tags are neither waterproof nor embedded in a rigid shell of any kind, although many RFID tags exist that are sealed against the elements and/or encased in a shell for protection. For the water damage, each tag was submerged in water for 3 hours, at which point it was patted dry to remove any excess water and used to collect data (labeled as Wet). The tags were then allowed to air-dry overnight, and were again used to collect data (Wet-to-dry). For the physical damage, each tag was first gently crumpled by hand (light damage) and subsequently balled up and then somewhat flattened (heavy damage), with data being collected after each stage.

Table 7.7 shows the top individual classifier configuration for the two RFID tag damage comparisons. Results are presented as sensitivity and specificity values for the threshold value ĥ that corresponds to the minimum distance dROC in the ROC space. Similarly, confusion matrices are presented in Fig. 7.16 for each classifier configuration listed in Table 7.7.

Table 7.7 The classifier configurations ranked by dROC(ĥ) over all values of the classifier configuration variables when trained and tested on the tag damage comparisons for both water and physical damage. DR and DT correspond to the training and testing datasets, respectively, and ρ represents the ratio of negative to positive EPCs in the training set. The threshold ĥ corresponding to the minimum distance dROC is presented, along with the values of χ(ĥ) and ψ(ĥ)

DR             DT            EPC Comp.   ν    Classifier       ρ    |AUC|     χ(ĥ)     ψ(ĥ)     ĥ (%)    Accuracy (%)
Day 1, 2, 3, 4  Wet           α           1    SVM              1    0.6361    0.714    0.786    80.1     73.8
Day 1, 2, 3, 4  Wet-to-dry    α           1    3NN              17   0.7789    0.857    0.738    4.8      75.5
Day 1, 2, 3, 4  Light damage  α           5    1NN              16   0.7589    0.750    0.839    17.9     82.8
Day 1, 2, 3, 4  Heavy damage  α           20   LDC (PRTools)    7    0.7980    1.000    0.589    44.5     64.1

Fig. 7.16 Confusion matrices (L) for the classifier configurations corresponding to the minimum distance dROC(ĥ) over all combinations of (DR, DT) where DR = Day 1, 2, 3, and 4, and DT ∈ {wet, wet-to-dry, light damage, heavy damage}. Values of L range over [0, 1], with 0 → black and 1 → white

The RFID tag damage classification results are similar to those of the previous different-day comparison. The water damage did not seem to have a severe effect on the classification accuracy, while the more severe physical damage showed lower classifier accuracy. However, rather than relatively equal χ(ĥ) and ψ(ĥ) values, the heavy damage resulted in values of χ(ĥ) = 1.000 and ψ(ĥ) = 0.589, which means that the classifier was optimistically biased and over-classified positive matches. This lower accuracy was not unexpected, as deformation of the tag's antenna should distort the RF signal and therefore degrade the classifier's ability to identify a positive match for the tag.

7.9 Discussion

The results presented above suggest that a dominant reader frequency, 902 MHz in this case, may exist at which data can be initially collected for the classifier to be trained on and then used to correctly identify tags read at alternate frequencies. In our analysis, we explored reader frequencies that span the North American UHF range, yet cover only part of the full 865–928 MHz UHF range for which the AD-612 tags used here were optimized.
Therefore, the dominant 902 MHz read frequency we observed lies at the center of the tags' actual operating frequency range. It is no surprise that the tags perform best at the center of their optimized frequency range rather than at its upper limit. Similarly, a classifier can be trained on a tag orientation (relative to the reader antenna) that may result in accurate classification of RFID tags regardless of their subsequent orientation to the reader antenna. Antenna design for both readers and tags is an active field of research [2], and it is expected that the RF field will be non-uniform around the antennas. This could explain why only one of the experimental orientations used here performs better than the others. Regardless of the field strength, however, the unique variations in the RF signature of an RFID tag should be present. It is promising that the classifier still had an accuracy of over 60% with these variations, and up to 94.9% accuracy when trained on the parallel (PL) orientation.

Changes in the environmental conditions, like ambient temperature and relative humidity, were also allowed in the different-day study, where the RFID tags were suspended by hand near the antenna (in generally the same spot) for data collection on successive afternoons. It is important to note that the tags were not fixtured for this study, and that slight variations in both distance to the reader and orientation were inherent due to the human element. Even so, the classifier was generally able to correctly identify the majority of the RFID tags as being either a correct match or a correct mismatch when presented with a dataset it had never seen before, with accuracies ranging from 63.6 to 80.9%. This study represents a typical real-world application of RFID tags due to these environmental and human variations.

As previously mentioned, the EPC compression method tended to favor the EPC error signal e_EPC(t), although there was not a large difference in classifier performance between the different-day comparison, which used both α(t) and e_EPC(t) compression, and the frequency/orientation comparisons, which used only α(t) compression. The parameter ρ had a large spread of values across the classifiers, indicating that the classification results may not be very sensitive to class imbalance within the training set. The number of DWFP features also shows no consistent trend in our results, other than often being larger than 1, indicating that there may be room for feature reduction. With any application of pattern classification, a reduction in the feature space through feature selection can lead to improved classification results [44]. Individual feature ranking is one method that can be used to identify features on a one-by-one basis; however, it can overlook the usefulness of combining feature variables. In combination with a nested selection method like sequential backward floating search (SBFS), the relative usefulness of the DWFP features, as well as the remaining features, can be evaluated [51]. The results in Tables 7.4–7.6 where DR = DT are comparable to those observed by Bertoncini et al.
The results in Tables 7.4–7.6 where D_R = D_T are comparable to those observed by Bertoncini et al. [20], with some classifiers achieving 100% accuracy and the rest near 99%. In these instances the classifier was trained on subsets of the testing data, so better performance is expected.

The final decision threshold ĥ can still vary greatly depending on the classifier's application. It is important to note that adjusting the final threshold value does not alter the underlying classifier itself: the |AUC| takes into account all possible threshold values and is therefore fixed for each classifier configuration. The threshold only determines the distribution of the error types, χ versus ψ, within the results. Aside from the minimum-d_ROC metric, weights can be applied to determine an alternate threshold if the application calls for a trade-off between false negative and false positive results. For example, if a user is willing to allow up to five false positives before allowing a false negative, a minimizing function can be used to identify this weighted optimal threshold.

A comparison of different methods of determining a final threshold is given in Table 7.8, where the classifier configuration trained on Day 1 data and tested on Day 4 data from Table 7.6 is presented for several alternate threshold values. First, the threshold is shown for the minimum distance d_ROC(ĥ), as previously presented in Table 7.6. The threshold is then shown for the minimum number of total misclassifications (f+ + f−), followed by the minimum number of false positives (f+) and then the minimum number of false negatives (f−). Several weighting ratios are then shown, in which the cost of returning a false negative (f−) is increased relative to that of returning a false positive (f+). For example, a weighting ratio of 1:5 [f−:f+] means that five f+ cost as much as a single f−, putting more emphasis on reducing the number of f− present.

Fig. 7.17 A plot of sensitivity χ(h) (solid line) and specificity ψ(h) (dashed line) versus threshold for the classifier trained on Day 1 and tested on Day 4 from Table 7.6. Threshold values are shown corresponding to min(f+ + f−) (h = 63.4, dash-dot line) and ĥ (h = 37.6, dotted line). The threshold value determines the values of χ and ψ, and is chosen based on the classifier's final application

It can be seen that the values of χ(h) and ψ(h) change as the threshold h changes. An overly optimistic classifier results from threshold values that are too low, where all classifications are identified as positive matches (χ ≈ 1 and ψ ≈ 0). Conversely, an overly pessimistic classifier results from threshold values that are too high, where all classifications are identified as negative matches (χ ≈ 0 and ψ ≈ 1). The weighting ratio 1:10 [f−:f+] returns the most even values of χ and ψ, matching the ĥ threshold. Figure 7.17 shows an example of the trade-off between the values of χ(h) and ψ(h) as the threshold h is increased, with two of the thresholds from Table 7.8 highlighted.

Table 7.8 The classifier configuration trained on Day 1 and tested on Day 4 data from Table 7.6 (EPC compression e_EPC, 10 DWFP features, QDC (MATLAB), ρ = 5, |AUC| = 0.7287), with several different metrics used to determine the final threshold h. The metrics include min d_ROC(ĥ), min(f+ + f−), min(f+), min(f−), and weights of 1:5, 1:10, 1:15, and 1:20 for [f−:f+]; the last of these allows up to 20 f+ for each f−. The choice of metric used to determine the final threshold value depends on the classifier's final application

Sorted by | χ | ψ | h (%) | Accuracy (%)
min d_ROC(ĥ) | 0.667 | 0.724 | 37.6 | 72.0
min(f+ + f−) | 0.133 | 0.995 | 63.4 | 93.8
min(f+) | 0.067 | 1.000 | 65.3 | 93.8
min(f−) | 1.000 | 0.000 | 0.1 | 6.7
1:5 [f−:f+] | 0.200 | 0.981 | 56.7 | 92.9
1:10 [f−:f+] | 0.667 | 0.724 | 37.6 | 72.0
1:15 [f−:f+] | 0.867 | 0.529 | 30.8 | 55.1
1:20 [f−:f+] | 0.933 | 0.443 | 28.2 | 47.6
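The two selection rules that recur in Table 7.8, the minimum-distance d_ROC threshold and the weighted-cost threshold, are simple enough to sketch directly. The function below assumes a vector of match scores and true binary labels; the names and the exhaustive threshold sweep are illustrative, not the chapter's implementation.

function [hROC, hW] = pick_thresholds(scores, labels, w)
% Choose a decision threshold from match scores and true labels.
% w is the [f-:f+] cost ratio, e.g. w = 20 allows up to 20 false
% positives for each false negative.
h = sort(unique(scores));                    % candidate thresholds
P = sum(labels == 1); N = sum(labels == 0);
chi = zeros(size(h)); psi = zeros(size(h));
for k = 1:numel(h)
    pred   = scores >= h(k);                 % positive match if score clears h
    chi(k) = sum(pred  & labels == 1) / P;   % sensitivity
    psi(k) = sum(~pred & labels == 0) / N;   % specificity
end
dROC = sqrt((1 - chi).^2 + (1 - psi).^2);    % distance to the (1,1) ROC corner
[~, i] = min(dROC);  hROC = h(i);
fneg = (1 - chi) * P;                        % false negatives at each threshold
fpos = (1 - psi) * N;                        % false positives at each threshold
[~, j] = min(w * fneg + fpos);  hW = h(j);   % each f- costs w times an f+
end

With w = 1 the weighted rule reduces to min(f+ + f−), and letting w grow pushes the threshold down toward the optimistic, all-positive regime seen in the lower rows of Table 7.8.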
It is useful to discuss a few examples here to better understand the different threshold results. Consider a security application in which RFID badges are used to control entry into a secure area; here it is most important to minimize the number of f+ results, because allowing a cloned RFID badge access to a secure area could be devastating. In this situation, we would want the value of ψ to be as close to 1.0 as possible. In Table 7.8, this corresponds to a threshold value of h = 65.3. Unfortunately, the value of χ at this threshold is 0.067, which means that almost all of the true ID badges would also be identified as negative matches. That specific classifier is therefore not appropriate for a security application.

An alternate example is the use of RFID-embedded credit cards in retail point of sale. To a store, keeping the business of a repeat customer may be much more valuable than losing some merchandise to a cloned RFID credit card. In this case, it is useful to determine a weight of [f−:f+] that balances the gains and losses of both outcomes. If it were determined that a repeat customer brings in 20 times as much revenue as it would cost to refund a fraudulent charge due to a cloned RFID account, then a weight of 1:20 [f−:f+] could be used to determine the optimal classifier threshold. From Table 7.8, the corresponding threshold is h = 28.2, resulting in values of χ = 0.933 and ψ = 0.443. This classifier would incorrectly identify about 7% of the repeat customers as fraudulent while correctly identifying 44% of the cloned signals as fraudulent, and it could therefore be useful in this retail example.

7.10 Conclusion

The USRP software-defined radio system has been shown to capture signals at a usable level of detail for RFID tag classification applications. Since the signal manipulations are performed in software, we can extract not only the raw RF signal but can also generate our own ideal signal to compare against. A new signal representation has been created this way, namely the difference between the recorded and ideal signal representations, e_EPC(t), and it has proven to be very useful in the classification routines. The binary classification routine has also been explored on more real-world grounds, including exposure to a variety of environmental conditions without the use of RF shielding to boost the SNR. To explore classifier robustness without fixed proximity and orientation relative to the RFID reader, several validation classifications were performed, including an RFID reader frequency comparison, a tag orientation comparison, a multi-day data collection comparison, and physical damage and water exposure comparisons.
The frequency comparison was performed to determine the effect that variability in the inspection frequencies of potential RFID readers would have on the classification routines. The results were promising, although not perfect, and suggest that while it is best to train a classifier on all possible scenarios, a main (i.e., center) frequency could potentially be used for a master classifier training set. A similar comparison was performed for orientation, altering the RFID tag's orientation relative to the antenna. Again, the results showed it was best to train the classifiers on the complete set of data; however, there was again promise for a single main orientation that could be used to train a classifier. In the multi-day collection comparison, data was collected by hand in an identical fashion but on separate days. The results showed that the inconsistency associated with holding an RFID tag near the antenna caused the classifiers to have trouble correctly identifying EPCs as coming from their correct tag.

Two further comparisons were performed to assess the effect that physical degradation had on the RFID tags. When subjected to water, promising classifier configurations were found with accuracies on the same level as those seen for undamaged tags, suggesting that water may not have a significant effect on the RFID classification routines. A separate subset of the RFID tags was subjected to a similar degradation analysis, this time with physical bending as a result of being crumpled by hand. The results show that, as expected, bending of the RFID tag's antenna degraded the raw signal and caused the classifier to misclassify many tags.

Applications of RFID technology that implement a fixed tag position are a potential market for the classification routine we present. One example is ePassports, which are embedded with an RFID chip containing digitally signed biometric information [7]. These passports are placed into a reader that controls the position and distance of the RFID chip relative to the antennas. Additionally, passports are generally protected from the elements and can be replaced if they undergo physical wear and tear. We have demonstrated a specific emitter identification technique that performs well given these restrictions.

Acknowledgements This work was performed using computational facilities at the College of William and Mary which were provided with the assistance of the National Science Foundation, the Virginia Port Authority, Sun Microsystems, and Virginia's Commonwealth Technology Research Fund. Partial support for the project was provided by the Naval Research Laboratory and the Virginia Space Grant Consortium. The authors would like to thank Drs. Kevin Rudd and Crystal Bertoncini for their many helpful discussions.

References

1. Ngai EWT, Moon KKL, Riggins FJ, Yi CY (2008) RFID research: an academic literature review (1995–2005) and future research directions. Int J Prod Econ 112(2):510–520
2. Abdulhadi AE, Abhari R (2012) Design and experimental evaluation of miniaturized monopole UHF RFID tag antennas. IEEE Antennas Wirel Propag Lett 11:248–251
3. Khan MA, Bhansali US, Alshareef HN (2012) High-performance non-volatile organic ferroelectric memory on banknotes. Adv Mater 24(16):2165–2170
4. Juels A (2006) RFID security and privacy: a research survey. IEEE J Sel Areas Commun 24(2):381–394
5. Halamka J, Juels A, Stubblefield A, Westhues J (2006) The security implications of VeriChip cloning. J Am Med Inform Assoc 13(6):601–607
6. Heydt-Benjamin T, Bailey D, Fu K, Juels A, O'Hare T (2007) Vulnerabilities in first-generation RFID-enabled credit cards. In: Dietrich S, Dhamija R (eds) Financial cryptography and data security. Lecture notes in computer science, vol 4886. Springer, Berlin/Heidelberg, pp 2–14
7. Richter H, Mostowski W, Poll E (2008) Fingerprinting passports. In: NLUUG spring conference on security, pp 21–30
8. White D (2005) NCR: RFID in retail. In: RFID: applications, security, and privacy, pp 381–395
9. Westhues J (2005) Hacking the prox card. In: RFID: applications, security, and privacy, pp 291–300
10. Smart Card Alliance (2007) Proximity mobile payments: leveraging NFC and the contactless financial payments infrastructure. Whitepaper
11. Léopold E (2009) The future of mobile check-in. J Airpt Manag 3(3):215–222
12. Wang MH, Liu JF, Shen J, Tang YZ, Zhou N (2012) Security issues of RFID technology in supply chain management. Adv Mater Res 490:2470–2474
13. Juels A (2004) Yoking-proofs for RFID tags. In: Proceedings of the second annual IEEE pervasive computing and communication workshops (PERCOMW '04), pp 138–143
14. Juels A (2005) Strengthening EPC tags against cloning. In: Proceedings of the 4th ACM workshop on wireless security (WiSe '05). ACM, New York, pp 67–76
15. Riezenman MJ (2000) Cellular security: better, but foes still lurk. IEEE Spectr 37(6):39–42
16. Suski WC, Temple MA, Mendenhall MJ, Mills RF (2008) Using spectral fingerprints to improve wireless network security. In: IEEE global telecommunications conference (GLOBECOM 2008). IEEE, New Orleans, pp 1–5
17. Gerdes RM, Mina M, Russell SF, Daniels TE (2012) Physical-layer identification of wired ethernet devices. IEEE Trans Inf Forensics Secur 7(4):1339–1353
18. Kennedy IO, Scanlon P, Mullany FJ, Buddhikot MM, Nolan KE, Rondeau TW (2008) Radio transmitter fingerprinting: a steady state frequency domain approach. In: Proceedings of the IEEE 68th vehicular technology conference (VTC 2008), pp 1–5
19. Danev B, Heydt-Benjamin TS, Čapkun S (2009) Physical-layer identification of RFID devices. In: Proceedings of the 18th USENIX security symposium (SSYM '09). USENIX Association, Berkeley, pp 199–214
20. Bertoncini C, Rudd K, Nousain B, Hinders M (2012) Wavelet fingerprinting of radio-frequency identification (RFID) tags. IEEE Trans Ind Electron 59(12):4843–4850
21. Zanetti D, Danev B, Čapkun S (2010) Physical-layer identification of UHF RFID tags. In: Proceedings of the sixteenth annual international conference on mobile computing and networking (MobiCom '10). ACM, New York, pp 353–364
22. Romero HP, Remley KA, Williams DF, Wang C-M (2009) Electromagnetic measurements for counterfeit detection of radio frequency identification cards. IEEE Trans Microw Theory Tech 57(5):1383–1387
23. EPCglobal Inc. (2008) EPC radio-frequency identity protocols: class-1 generation-2 UHF RFID protocol for communications at 860–960 MHz, version 1.2.0
24. GNU Radio website (2011) Software. http://www.gnuradio.org
25. Ellis KJ, Serinken N (2001) Characteristics of radio transmitter fingerprints. Radio Sci 36(4):585–597
26. Hou JD, Hinders MK (2002) Dynamic wavelet fingerprint identification of ultrasound signals. Mater Eval 60(9):1089–1093
27. Haralick RM, Shapiro LG (1992) Computer and robot vision, vol 1. Addison-Wesley, Boston
28. Learned RE, Willsky AS (1995) A wavelet packet approach to transient signal classification. Appl Comput Harmon Anal 2(3):265–278
29. Feng Y, Schlindwein FS (2009) Normalized wavelet packets quantifiers for condition monitoring. Mech Syst Signal Process 23(3):712–723
30. de Sena A, Rocchesso D (2007) A fast Mellin and scale transform. EURASIP J Adv Signal Process 2007(1):089170
31. Harley JB, Moura JMF (2011) Guided wave temperature compensation with the scale-invariant correlation coefficient. In: 2011 IEEE international ultrasonics symposium (IUS), Orlando, pp 1068–1071
32. Harley JB, Ying Y, Moura JMF, Oppenheim IJ, Sobelman L (2012) Application of Mellin transform features for robust ultrasonic guided wave structural health monitoring. AIP Conf Proc 1430:1551–1558
33. Sundaram H, Joshi SD, Bhatt RKP (1997) Scale periodicity and its sampling theorem. IEEE Trans Signal Process 45(7):1862–1865
34. Japkowicz N, Stephen S (2002) The class imbalance problem: a systematic study. Intell Data Anal 6(5):429–449
35. Weiss GM, Provost F (2001) The effect of class distribution on classifier learning. Technical report ML-TR-44, Department of Computer Science, Rutgers University
36. Kubat M, Matwin S (1997) Addressing the curse of imbalanced training sets: one-sided selection. In: Proceedings of the 14th international conference on machine learning. Morgan Kaufmann, Burlington, pp 179–186
37. Duda RO, Hart PE, Stork DG (2001) Pattern classification, 2nd edn. Wiley, New York
38. Fukunaga K (1990) Introduction to statistical pattern recognition, 2nd edn. Computer science and scientific computing. Academic Press, Boston
39. Kuncheva LI (2004) Combining pattern classifiers. Wiley, New York
40. Bishop CM (2006) Pattern recognition and machine learning. Springer, Berlin
41. Webb AR (2012) Statistical pattern recognition. Wiley, Hoboken
42. Ripley BD (1996) Pattern recognition and neural networks. Cambridge University Press, Cambridge
43. Kanal L (1974) Patterns in pattern recognition: 1968–1974. IEEE Trans Inf Theory 20(6):697–722
44. Jain AK, Duin RPW, Mao J (2000) Statistical pattern recognition: a review. IEEE Trans Pattern Anal Mach Intell 22(1):4–37
45. Watanabe S (1985) Pattern recognition: human and mechanical. Wiley-Interscience, Hoboken
46. Blum AL, Langley P (1997) Selection of relevant features and examples in machine learning. Artif Intell 97(1):245–271
47. Jain AK, Chandrasekaran B (1982) Dimensionality and sample size considerations in pattern recognition practice. In: Krishnaiah PR, Kanal LN (eds) Classification, pattern recognition and reduction of dimensionality. Handbook of statistics, vol 2. Elsevier, Amsterdam, pp 835–855
48. Saeys Y, Inza I, Larrañaga P (2007) A review of feature selection techniques in bioinformatics. Bioinformatics 23(19):2507–2517
49. Dash M, Liu H (1997) Feature selection for classification. Intell Data Anal 1(1–4):131–156
50. Raymer ML, Punch WF, Goodman ED, Kuhn LA, Jain AK (2000) Dimensionality reduction using genetic algorithms. IEEE Trans Evol Comput 4(2):164–171
51. Guyon I, Elisseeff A (2003) An introduction to variable and feature selection. J Mach Learn Res 3:1157–1182
52. Fan J, Lv J (2010) A selective overview of variable selection in high dimensional feature space. Stat Sin 20(1):101–148
53. Romero E, Sopena JM, Navarrete G, Alquézar R (2003) Feature selection forcing overtraining may help to improve performance. In: Proceedings of the international joint conference on neural networks, vol 3. IEEE, Portland, pp 2181–2186
54. Lambrou T, Kudumakis P, Speller R, Sandler M, Linney A (1998) Classification of audio signals using statistical features on time and wavelet transform domains. In: Proceedings of the 1998 IEEE international conference on acoustics, speech and signal processing, vol 6. IEEE, Washington, pp 3621–3624
55. Smith SW (2003) Digital signal processing: a practical guide for engineers and scientists. Newnes, Oxford
56. Devroye L, Györfi L, Lugosi G (1996) A probabilistic theory of pattern recognition. Applications of mathematics. Springer, Berlin
57. Theodoridis S, Koutroumbas K (1999) Pattern recognition. Academic Press, Cambridge
58. Duin RPW, Juszczak P, Paclik P, Pekalska E, de Ridder D, Tax DMJ, Verzakov S (2007) PRTools4.1: a MATLAB toolbox for pattern recognition
59. Cristianini N, Shawe-Taylor J (2000) An introduction to support vector machines and other kernel-based learning methods. Cambridge University Press, Cambridge
60. Fawcett T (2006) An introduction to ROC analysis. Pattern Recognit Lett 27(8):861–874
61. Hanley JA, McNeil BJ (1983) A method of comparing the areas under receiver operating characteristic curves derived from the same cases. Radiology 148(3):839–843
62. Bradley AP (1997) The use of the area under the ROC curve in the evaluation of machine learning algorithms. Pattern Recognit 30(7):1145–1159
63. Lobo JM, Jiménez-Valverde A, Real R (2007) AUC: a misleading measure of the performance of predictive distribution models. Glob Ecol Biogeogr 17(2):145–151
64. Hanczar B, Hua J, Sima C, Weinstein J, Bittner M, Dougherty ER (2010) Small-sample precision of ROC-related estimates. Bioinformatics 26(6):822–830
65. Hand DJ (2009) Measuring classifier performance: a coherent alternative to the area under the ROC curve. Mach Learn 77(1):103–123
66. Rao KVS, Nikitin PV, Lam SF (2005) Antenna design for UHF RFID tags: a review and a practical application. IEEE Trans Antennas Propag 53(12):3870–3876

Chapter 8
Pattern Classification for Interpreting Sensor Data from a Walking-Speed Robot

Eric A. Dieckman and Mark K. Hinders

Abstract In order to perform useful tasks for us, robots must have the ability to notice, recognize, and respond to objects and events in their environment. This requires the acquisition and synthesis of information from a variety of sensors. Here we investigate the performance of a number of sensor modalities in an unstructured outdoor environment, including the Microsoft Kinect, a thermal infrared camera, and a coffee can radar. Special attention is given to acoustic echolocation measurements of approaching vehicles, in which an acoustic parametric array propagates an audible signal to the oncoming target and the Kinect microphone array records the reflected backscattered signal. Useful information about the target is hidden inside the noisy time-domain measurements, so the dynamic wavelet fingerprint (DWFP) is used to create a time–frequency representation of the data. A small-dimensional feature vector is then created for each measurement using an intelligent feature selection process, for use in statistical pattern classification routines. Using experimentally measured data from real vehicles at 50 m, this process is able to correctly classify vehicles into one of five classes with 94% accuracy.
Keywords Acoustic parametric array · Mobile robotics · Wavelet fingerprint · Vehicle classification

8.1 Overview

Useful robots are able to notice, recognize, and respond to objects and events and then make decisions based on this information in real time. I find myself strongly disapproving of drivers checking their smartphones while I'm trying to jaywalk with a cup of coffee from WaWa. I think they should watch out for me, because I'm trying not to spill. Since I'm too lazy to go down to the crosswalk and wait for the light, I usually check for oncoming traffic, note what type of vehicles are coming my way, and estimate my chances of being able to cross safely without spilling on myself. While this may be a simple task for a human, an autonomous robotic assistant must exhibit many human behaviors to successfully complete the task. Now consider having a robotic assistant fetch me a cup of coffee from across the street [1], which I would find useful.

In particular, we will consider the sensing aspects of creating a useful autonomous robot. A robot must be able to sense and react to objects and events occurring over a range of distances. For our case of a mobile walking-speed robot, this includes long-range sensors that can detect dangers, such as oncoming motor vehicles, in time to evade, as well as close-range sensors that provide more information about stationary objects in the environment. In addition, sensors must be able to provide useful information in a variety of environmental conditions. While an RGB camera may provide detailed information in a well-lit environment, it is less useful on a foggy night. The key to creating a useful autonomous robot is to equip the robot with a number of complementary sensors so that it can learn about its environment and make decisions.

Specifically, we are interested in the use of acoustic echolocation as a long-range sensor modality for mobile robotics. While sonar has long been used as a sensor in underwater environments, the short propagation of ultrasonic waves in air has restricted its use elsewhere. Lower frequency acoustic signals in the audible range are able to propagate long distances in air, but traditional methods of creating highly directional audible acoustic signals require very large speaker arrays not feasible for a mobile robot. In addition, the complex interactions of these signals with objects in the environment, together with ubiquitous environmental noise, make the reflected signals very difficult to analyze. We use an acoustic parametric array to generate our acoustic echolocation signal. This is a physically small speaker that uses nonlinear acoustics to create a tight beam of low-frequency sound that can propagate long distances [34–38]. Such a highly directional signal provides good spatial resolution that allows a distinction between the target and environmental clutter. Systematic experimental investigations and simulations allow us to study the propagation of these nonlinear sound beams and their interaction with scatterers [39, 40].
These sensor signals are very noisy, making it difficult for the robot to extract useful information. One common technique that can provide additional insight is to transform the problem into an alternate domain. For the simple case of a one-dimensional time-domain signal, this most commonly takes the form of the Fourier transform. While this converts the signal to the frequency domain and can reveal previously hidden information, all time-domain information is lost in the transformation. A better solution for time-domain data is to transform the original signal into a joint time–frequency domain. This can be accomplished by a number of methods, but there is no single best time–frequency representation. Uncertainty limits restrict simultaneous time and frequency resolution, some methods are very complex and hard to implement, and the resulting two-dimensional images can be even more difficult to analyze than the original signal.
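As a simple point of reference, the sketch below builds one such joint representation, a short-time Fourier spectrogram, for a noisy chirp. The sampling rate, sweep, and window parameters are illustrative choices only, and MATLAB's Signal Processing Toolbox is assumed.

fs = 16e3;                               % assumed 16 kHz rate, as with the Kinect array
t  = 0:1/fs:0.5;
x  = chirp(t, 250, 0.5, 1000) + 0.5*randn(size(t));   % noisy 250-1000 Hz sweep
% One time-frequency representation: a windowed-FFT spectrogram
spectrogram(x, hamming(256), 192, 512, fs, 'yaxis');
title('Time-frequency image of a noisy chirp');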
8.2 rMary

Creating an autonomous robot able to act independently of human control has long been an area of active research in robotics. New low-cost sensors and recent advances in the signal processing necessary to analyze large amounts of streaming data have only increased the number of researchers focusing on autonomous robotics, buoyed by a greater public awareness of the field. In particular, DARPA-funded competitions have enabled focused efforts to create autonomous vehicles and humanoid robots. The 2004 DARPA Grand Challenge and follow-on 2007 DARPA Urban Challenge [15] focused on creating autonomous vehicles that could safely operate over complex courses in rural and urban environments. These competitions rapidly expanded the boundaries of the field, leading to recent near-commercial possibilities such as Google's self-driving cars [16, 17]. Similarly, the recent and rapid rise of unmanned aerial vehicles (UAVs) has led to a large amount of research in designing truly autonomous drone aircraft. Designing autonomous vehicles, whether ground-based or aerial, comes with its own difficulties, chief among them collecting and interpreting data at a fast enough rate to make decisions. This often requires expensive sensors that only large research programs can afford, and commercialization of such technologies will require lower cost alternative sensor modalities.

A more tractable problem is the design of walking-speed robotic platforms. Compared to road or air vehicles, the lower speed of such platforms allows the use of low-cost sensor modalities that take longer to acquire and analyze data, since more time is available to make a decision. Using commercially available wheeled platforms (such as all-terrain vehicles) shifts the focus from the engineering problems of creating a humanoid robot to the types of sensors used and how their data can be combined. For these reasons, we will focus on the analysis of different sensor modalities for a walking-speed robot. The goal in autonomous robotics is to create a robot with the ability to perform tasks normally accomplished by a human; an added bonus is the ability to do tasks that are too dangerous for humans, such as entering dangerous environments in disaster situations.

8.2.1 Sensor Modalities for Mobile Robots

We can classify sensor modalities as "active" or "passive" depending on whether they transmit a signal (e.g., radar) or use information already present in the environment (e.g., an RGB image). The use of passive sensors is often preferred to reduce the possibility of detection in covert operations and to reduce annoyance. Another important consideration is the range at which a given sensor performs. For example, imaging systems can provide detailed information about objects near the sensor but may not detect fast-moving hazards (such as an oncoming vehicle) at a great enough distance to allow the robot time to evade. Long-range sensors such as radar or LIDAR are able to detect objects at a greater distance, giving the robot more time to maneuver out of the way, but this long-range detection often requires expensive sensors that don't provide detailed information about the target. A combination of near- and long-range sensors gives a robot the most information about its environment.

Once a sensor has measured some information about the environment, the robot needs to know how to interact. A real-world example of this difficulty comes from agriculture, where smart machines have the ability to replace human workers in repetitive tasks. One agricultural application in particular is the thinning of lettuce, where human laborers paid by the acre may thin healthy plants unnecessarily. A robotic "Lettuce Bot" is towed behind a tractor, imaging individual lettuce plants as it passes and using computer vision algorithms to compare these images to a database of over a million images to decide which plants to remove by dousing them with a concentrated dose of fertilizer [18]. Though this machine claims 98% accuracy while driving at 2 kph and may be cost-competitive with manual labor, it also highlights issues with image-based analysis on mobile robots. Creating a large enough database for different types of lettuce is a monumental task, given the different colors, shapes, soil types, and other variables. Even the sun creates problems, causing shadows that are difficult for the computer vision software to correctly match. Shielding the sensor and restricting the image database to a particular geographical region (thereby reducing the number of lettuce variants and soil types) allow these techniques to work for this particular application, but the approach is not scalable to more unstructured environments. While the human brain has evolved to process images quickly and easily, automated interpretation of images is a difficult problem. Using non-imaging sensors can ease the signal processing requirements, but requires sophisticated machine learning techniques to deal with large amounts of abstract data.

In addition to the range limitations of different sensors and the difficulty of analyzing the resulting data, individual sensor modalities tend to work better in particular environmental conditions. For example, a webcam can create detailed images in a well-lit environment but fails to provide much useful information on a dark night, while passive infrared imagers can detect subtle changes in emissivity from surfaces in a variety of weather conditions. Because of the limitations of individual sensors, intelligent combinations of complementary sensors must be used to create the most robust awareness of an unstructured environment. The exact manner in which these modalities are combined is referred to as data fusion [19]. Our focus is on the performance of different sensor modalities in real-world, unstructured environments under a variety of environmental conditions. Our robotic sensor platform, rMary, contains a number of both passive and active sensors.
Passive vision-based sensors include a standard RGB webcam and infrared sensors operating in both the near-infrared and long-wave regions of the infrared spectrum. Active sensors include a three-dimensional depth mapping system that uses infrared projection, a simple radar system, and a speaker/microphone combination used to perform acoustic echolocation in the audible range. We will apply machine learning algorithms to these data to automatically detect and classify oncoming vehicles at long distances.

8.3 Investigation of Sensor Modalities Using rMary

To collect data in unstructured outdoor environments, we have created a mobile sensor platform with multiple sensors (Fig. 8.1). This platform, named rMary, was first placed in service in 2003 to collect infrared data from passive objects [20]. The robot is remotely operated using a modified RC aircraft controller and is steered using four independent drive motors synced to allow agile skid-steering. Power is supplied to these motors from a small battery bank built into the base of the platform, where the control electronics are also located. The low center of gravity, inflatable rubber tires, and a custom-built suspension system allow off-road transit to acquire measurements in otherwise inaccessible locations. The current sensors on rMary include the following:

• Raytheon ControlIR 2000B long-wave infrared camera
• Microsoft Kinect (2010)
– Active IR projector
– IR and RGB sensors
– 4-channel microphone array
• Sennheiser Audiobeam acoustic parametric array
• Coffee can FMCW ISM-band radar

A parabolic dish microphone can also be attached, but the Kinect microphone array provides superior audio recordings. Sensor control and data acquisition are accomplished using a low-powered Asus EeePC 1000h Linux netbook. This underpowered laptop was deliberately used to show that data can be easily acquired with commodity hardware. The computer's single internal USB hub does restrict the number of simultaneous data streams, which only became an issue when trying to collect video data from multiple hardware devices using non-optimized drivers. Each of the sensors, whose locations are shown in Fig. 8.1, will be discussed in the sections that follow.

8.3.1 Thermal Infrared (IR)

A Raytheon ControlIR 2000B infrared camera couples a long-wave focal plane array microbolometer detector to a 50 mm lens to provide 320 × 240 resolution at 30 Hz over an 18° × 13.5° field of view. Although thermal imaging cameras are now low cost and portable enough to be used by home inspectors for energy audits, this was one of the few uncooled, portable infrared imaging systems available when first installed in 2006. These first experiments showed that the sensor was able to characterize passive (non-heat-generating) objects through small changes in their thermal signatures [21, 22]. The sensor measures radiance in the long-wave region (8–15 μm) of the infrared spectrum, where radiation from passive objects is maximum.

Fig. 8.1 The rMary sensor platform contains a forward-looking long-wave infrared camera, mounted upright in an enclosure for stability and weather protection, an acoustic parametric array, the Microsoft Kinect sensor bar, and a coffee can radar. All sensors are powered by the on-board battery bank and controlled with a netbook computer running Linux

For stability and protection from the elements, the camera is mounted vertically in an enclosed locker.
A polished aluminum plate with a low emissivity value makes a good reflector of thermal radiation and allows the camera to image objects in front of rMary. Figure 8.2 shows several examples of images of passive objects acquired with the thermal infrared camera, both indoors and outside.

8.3.2 Kinect

Automated interpretation of images from the thermal infrared camera requires segmentation of the images to distinguish areas of interest, which can be a difficult image processing task. In addition, the small field of view and low resolution of the infrared camera used here led us to investigate possible alternatives. While there are still relatively few long-wave thermal sensors with enough sensitivity to measure the small differences in emissivity between passive objects, other electronics now contain infrared sensors. One of the most exciting alternatives was the Microsoft Kinect, released in November 2010 as an accessory to the Xbox 360 gaming console. The Kinect was immensely popular, selling millions of units in the first several months, and integrates active infrared illumination, an IR sensor, and an RGB camera to output 640 × 480 RGB-D (RGB + depth) video at 30 Hz. It also contains a tilt motor, an accelerometer, and a 4-channel microphone array, all at a total cost of less than USD $150.

Fig. 8.2 Examples of passive objects imaged with the thermal IR camera include (clockwise from top left) a car in front of a brick wall, a tree trunk with foliage, a table and chairs in front of a bookcase, and a window in a brick wall

Access to this low-cost suite of sensors is provided by two different open source driver libraries: libfreenect [23], with a focus on audio support and motor controls, and OpenNI [24], with a greater focus on skeletal tracking and object segmentation. Other specialized libraries such as nestk [25] use these drivers to provide high-level functions and ease of use. In June 2011, Microsoft released its own SDK providing access to the raw sensor streams and high-level functions, but these libraries only worked in Windows 7 and are closed-source with restrictive licenses [26]. In addition, Microsoft changed the license agreement in March 2012 to require use of the "Kinect for Windows" sensor instead of the identical but cheaper Kinect sensor for Xbox. We investigate the usefulness and limitations of the Kinect sensor for robotics, particularly the raw images recorded from the infrared sensor and the depth-mapped RGB-D images. Since our application is more focused on acquiring raw data for later processing than on utilizing the high-level skeletal tracking algorithms, we use the libfreenect libraries to synchronously capture RGB-D video and multi-channel audio streams from the Kinect.

The Kinect uses a structured light approach similar in principle to [27] to create a depth mapping. An infrared projector emits a known pattern of dots, allowing the calculation of depth by triangulation from the specific angle between the emitter and the receiver, an infrared sensor with 1280 × 1024 resolution. The projected pattern is visible in some situations in the raw image from the infrared sensor, to which the open-source drivers allow access. To reduce clutter in the depth mapping, the infrared sensor also has a band-stop filter at the projector's output wavelength of 830 nm.
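As a toy illustration of this triangulation, depth can be recovered from the apparent shift (disparity) of a projected dot between the emitter's reference pattern and the sensor image. The focal length, baseline, and disparity below are assumed values for illustration, not the Kinect's actual calibration constants.

% Structured-light depth from dot disparity (illustrative values only)
f_px = 580;               % focal length in pixels (assumed)
b    = 0.075;             % emitter-to-sensor baseline in meters (assumed)
d    = 12.5;              % observed dot disparity in pixels (example)
z    = f_px * b / d;      % triangulated depth in meters
fprintf('Depth: %.2f m\n', z);   % about 3.5 m, the top of the stated range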
The Kinect creates the resulting 640 × 480 resolution, 11-bit depth-mapped images at video frame rates (30 Hz). The stated range of the depth sensing is 1.2–3.5 m, but in the right environments it can extend to almost 6 m. An example of such an image for an indoor environment is shown in Fig. 8.3, along with the raw image from the IR sensor and a separate photograph for comparison. In addition to this colormapped depth image, the depth information can also be overlaid on the RGB image acquired from a 1280 × 1024 RGB sensor to create a three-dimensional point-cloud representation (Fig. 8.4).

Since the Kinect was designed to work as an accessory to a gaming system, it works well in indoor environments, and others have evaluated its applications to indoor robotics, object segmentation and tracking, and three-dimensional scanning. Figure 8.5 shows a sampling of the raw IR and depth-mapped images for several outdoor objects. The most visible feature, compared to images acquired in indoor environments, is that the raw infrared images are very well illuminated, or even over-exposed. Because of this, the projector pattern is difficult to detect in the infrared image, and the resulting depth-mapped images don't tend to have much structure. There is more likely to be useful depth information if the object being imaged is not in direct sunlight and/or is located very close to the sensor. Unlike the thermal IR camera, which operates in the long-wave region of the IR regime, the Kinect's infrared sensor operates in the near-infrared. This is necessary so that a distance can be calculated from the projected image, but the proximity of the near-infrared to the visible spectrum allows the sensor to become saturated. Figure 8.6 shows a series of images of the same scene as the sun emerges from behind a cloud. As there is more sunlight, the infrared sensor becomes saturated and no depth mapping can be constructed.

Although the Kinect's depth mapping is of little use in outdoor environments during the day, it may still be useful outside at night. However, the point-cloud library representation will not work at night because it requires a well-illuminated webcam image on which to overlay the depth information. An example of the usefulness of the Kinect depth mapping at night is shown in Fig. 8.7, where the depth mapping highlights obstacles not visible in normal webcam images. In summary, the Kinect's depth sensor will work outside under certain circumstances. Unfortunately, the Kinect's infrared sensor will not replace the more expensive thermal imaging camera for detecting small signals from passive objects, since it operates in the near-infrared regime instead of the long-wave regime that is more sensitive to such signals. However, very small and inexpensive LWIR cameras are now available as smartphone attachments.

Fig. 8.3 In an indoor environment, the Kinect is able to take a raw infrared image (top) and convert it to a corresponding depth-mapped image (middle), which overlays depth information on an RGB image. The speckle pattern barely visible on the raw infrared image is the projected infrared pattern that allows the Kinect to create this depth mapping. As shown in this mapping, darker colored surfaces, such as the desk chair on the left of the image, are closer to the sensor while lighter colors are farther away. Unexpected infrared reflectors can confuse this mapping and produce erroneous results, such as the light fixture in the center of the image. The bottom image is a photograph of the same scene (with the furniture slightly rearranged) for comparison
Fig. 8.4 Instead of the two-dimensional colormapped images, the Kinect depth-mapping can be overlaid on the RGB image and exported to a point-cloud format. These point-cloud library (PCL) images contain real-world distances and allow for three-dimensional visualization on a computer screen. Examples are shown for an indoor scene (top) and an outdoor scene acquired in low-light conditions (bottom), viewed at an oblique angle to highlight the 3-D representation

8.3.3 Audio

Our main interest in updating rMary is to see how acoustic sensors could be integrated into mobile robotics. Past work with rMary's sibling rWilliam (Fig. 8.8) investigated the use of air-coupled ultrasound in mobile robotics [28, 29], as have others [30–32]. The high attenuation of ultrasound in air limits the use of ultrasound scanning for mobile robots. Instead, we wish to study the use of low-frequency acoustic echolocation for mobile robots. This is similar in principle to how bats navigate, though at much lower frequencies and with much lower amplitude signals. A similar use of this technology is found in the Sonar Ruler app for the iPhone, which attempts to measure distances using the speaker and microphone, with mixed results [33]. Using signals in the audible range reduces the attenuation, allowing for propagation over useful distances. However, there is more background noise in the audible frequency range, requiring the use of coded excitation signals and sophisticated signal processing techniques to find the reflected signal in inherently noisy data.

Fig. 8.5 Images acquired outdoors using the Kinect IR sensor (left) and the corresponding depth-mapped images (right) for a window (top) and tree (bottom) show the difficulties sunlight creates for the infrared sensor

One way to ensure that the reflected signal is primarily backscattering from the target rather than clutter (unwanted reflections from the environment) is to create a tightly spatially controlled beam of low-frequency sound using an acoustic parametric array. We can also use insights gleaned from simulations to improve the analysis methods. Dieckman [2] discusses in detail a method of simulating the propagation of the nonlinear acoustic beam produced by the acoustic parametric array and its scattering from targets. The properties of the acoustic parametric array have been studied in depth [34–38], and the device has been used for area denial, concealed weapons detection, and nondestructive evaluation [39–41]. In brief, the parametric array works by generating ultrasonic signals at frequencies f1 and f2 whose difference is in the audible range. As these signals propagate, the nonlinearity of air causes self-demodulation of the signal, creating signals at the sum (f1 + f2) and difference (f2 − f1) frequencies. Since absorption is proportional to the square of frequency, only the difference frequency remains as the signal propagates away from the array (Fig. 8.9).
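The sketch below illustrates this self-demodulation numerically: a weak square-law nonlinearity applied to two ultrasonic tones produces the sum and difference frequencies, and a lowpass filter standing in for the f²-dependent absorption leaves only the audible difference tone. The sampling rate, tone frequencies, and nonlinearity strength are illustrative assumptions.

fs = 192e3;                          % high sampling rate to represent ultrasound
t  = 0:1/fs:0.05;
f1 = 40e3; f2 = 41e3;                % two ultrasonic tones, 1 kHz apart
p  = cos(2*pi*f1*t) + cos(2*pi*f2*t);
pn = p + 0.1*p.^2;                   % weak quadratic (square-law) nonlinearity of air
[b, a]  = butter(6, 5e3/(fs/2));     % absorption rises as f^2: model it as a lowpass
audible = filtfilt(b, a, pn);
audible = audible - mean(audible);   % remove the DC term produced by squaring
% What remains is dominated by the 1 kHz difference frequency (f2 - f1)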
The acoustic parametric array allows for tighter spatial control of the low-frequency sound beam than a standard loudspeaker of the same size. Directivity of a speaker depends on the ratio of the size of the speaker to the wavelength of the sound produced, with larger speakers able to create more directive low-frequency sound. Line arrays (speakers arranged in a row) are the traditional way to create directional low-frequency sound, but they can take up a great deal of space [42]. Using nonlinear acoustics, the acoustic parametric array is able to create directional low-frequency sound from a normal-sized speaker, as shown in Fig. 8.10.

Fig. 8.6 As the sun emerges from behind a cloud and sunlight increases (top to bottom), the Kinect's infrared sensor (left) becomes saturated and the Kinect is unable to construct a corresponding depth-mapped image (right)

Fig. 8.7 The Kinect depth mapping (A) works well in nighttime outdoor environments, detecting a light pole not visible in the illuminated RGB image (B). The image from the thermal camera (C) also shows the tree and buildings in the background, but has a smaller field of view and lower resolution than the raw image from the Kinect IR sensor (D) (images resized from original resolutions)

Fig. 8.8 A 50 kHz ultrasound scanner mounted on rWilliam is able to detect objects at close range

Fig. 8.9 The acoustic parametric array creates signals at two frequencies f1 and f2 in the ultrasonic range (pink shaded region). As the signals propagate away from the parametric array, the nonlinearity of air allows the signals to self-demodulate, creating signals at the sum and difference frequencies. Because attenuation is proportional to the square of frequency, the higher frequency signals attenuate more quickly, and after several meters only the audible difference frequency remains

For our tests, we have mounted the Sennheiser Audiobeam parametric array to rMary, with power supplied directly from rMary's battery. This commercially available parametric array uses a 40 kHz carrier signal to produce audible sound pressure levels of 75 ± 5 dB at a distance of 4 m from the face of the transducer. The echolocation signals we use are audible in front of the transducer at distances exceeding 50 m in a quiet environment, but they are not necessarily obtrusive to pedestrians passing through the area and are easily masked by low levels of external noise. To record the backscattered echolocation signal, as well as ambient noise from our environment, we use the 4-channel microphone array included in the Kinect. This array is comprised of four spatially separated high-quality capsule microphones with a sampling rate of 16 kHz. The array separation is not large enough to allow implementation of beamforming methods at the distances of interest here, and the low sampling rate means that the recorded acoustic signals are limited to a maximum frequency content of 8 kHz.

Fig. 8.10 The acoustic beam created by the acoustic parametric array has a consistently tighter beam pattern than the physically much larger line array at low frequencies, and fewer sidelobes at higher frequencies

Audio data recorded by the Kinect microphone array was compared to data recorded using a parabolic dish microphone (Dan Gibson EPM, 48 kHz sampling rate), whose reflector dish directs sound onto the microphone. Figure 8.11 shows that the microphones used in the Kinect actually perform better than the parabolic dish microphone [43, 44].
All data used in our subsequent analysis is recorded with the Kinect array.

8.3.4 Radar

The final sensor on rMary is a coffee can radar. A collaboration between MIT and Lincoln Labs in 2011 produced a design for a low-cost radar system that uses two metal coffee cans as antennas [45]. Simple amplifier circuits built on a breadboard power low-cost modular microwave (RF) components to send and acquire signals through the transmit (Tx) and receive (Rx) antennas. The entire system is powered by 8 AA batteries, which allows easy portability, and the total cost of components is less than USD $350. Our constructed coffee can radar is shown in Fig. 8.12.

Fig. 8.11 Even though the 4-channel Kinect microphone array has tiny capsule microphones that only sample at 16 kHz, they provide a cleaner signal than the parabolic dish microphone with a 48 kHz sampling rate

Fig. 8.12 A low-cost coffee can radar was built and attached to rMary to test the capabilities of radar sensors on mobile robots

The signal processing requirements of the coffee can radar system are reduced by using a frequency-modulated continuous wave (FMCW) design. In this setup, the radar transmits an 80 MHz chirped waveform centered at 2.4 GHz (in the ISM band). The same waveform is then used to downconvert, or "de-chirp," the received signal so that the residual bandwidth containing all the information is small enough to digitize with a sound card. This information is saved in .wav files and analyzed in MATLAB. The system as originally designed has 10 mW Tx power with an approximate maximum range of 1 km and can operate in one of three modes: Doppler, range, or synthetic aperture radar (SAR). In Doppler mode, the radar emits a continuous-wave signal at a given frequency; by measuring any frequency shifts in this signal, moving objects are differentiated from stationary ones, and images from this mode show an object's speed as a function of time. In ranging mode, the radar signal is frequency modulated, with the magnitude of this modulation specifying the transmit bandwidth. This allows the imaging of stationary or slowly moving objects, and the resulting images show distance from the radar (range) as a function of time. SAR imaging is basically a set of ranging measurements acquired over a wide area to create a three-dimensional representation of the radar scattering from a target [46–48]. While SAR imaging has the greatest potential application in mobile robotics, since the robotic platform is already moving over time, we look here at ranging measurements in our feasibility tests of the radar.

Figure 8.13 shows a ranging measurement of three vehicles approaching rMary, with initial detection of the vehicles at a distance of 70 m. Since the ranging image is a color-mapped plot of time versus range, the speed of approaching vehicles can also be calculated directly from the image data.

Fig. 8.13 The ranging image acquired using a coffee can radar shows three vehicles approaching rMary. The vehicles' speed can be calculated from the slope of the line

These measurements demonstrate the feasibility of this low-cost radar as a long-range sensor for mobile robotics. Since the radar signal is de-chirped to facilitate processing with a computer sound card, these measurements may not contain information about scattering from the object, unlike the acoustic echolocation signal. However, radar ranging measurements could provide an early detection system for a mobile robot, detecting objects at long range before other sensors are used to classify the object.
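A minimal sketch of the FMCW ranging arithmetic follows, using the radar's 80 MHz sweep; the sweep time, sound-card rate, and synthetic target are assumptions for illustration. De-chirping leaves a beat tone at f_b = 2BR/(cT), so the FFT bin of the strongest tone maps directly back to range.

c  = 3e8;            % speed of light (m/s)
B  = 80e6;           % sweep bandwidth (Hz)
T  = 20e-3;          % sweep duration (s) -- assumed
fs = 44.1e3;         % sound-card sampling rate (Hz) -- assumed
R  = 70;             % simulated target at 70 m
fb = 2*B*R/(c*T);    % beat frequency left after de-chirping (~1.9 kHz)
t  = 0:1/fs:T-1/fs;
x  = cos(2*pi*fb*t) + 0.2*randn(size(t));   % one de-chirped sweep plus noise
X  = abs(fft(x));
f  = (0:numel(x)-1)*fs/numel(x);
[~, k] = max(X(2:floor(end/2)));            % strongest beat component (skip DC)
range = c*T*f(k+1)/(2*B);                   % invert fb back to range in meters
fprintf('Estimated range: %.1f m\n', range);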
This detection distance is dependent upon a number of parameters, the most important of which is the availability of line-of-sight to the target. Over the course of several months, we collected data on and near the William & Mary campus as vehicles approached rMary, as shown in Fig. 8.14, at ranges of up to 50 m. The parametric array projected a narrow-beam, linear chirp signal down the road, which backscattered from oncoming vehicles. We noted in each case whether the oncoming vehicle was a car, SUV, van, truck, bus, motorcycle, or other.

The rMary platform allows us to investigate the capabilities and limitations of a number of low-cost sensors in unstructured outdoor environments. A combination of short- and long-range sensors provides a mobile robot with the most useful information about its environment. Previous work focused on passive thermal infrared and air-coupled ultrasound as possible short-range sensor modalities. Our work looked at the suitability of the Microsoft Kinect as a short-range active infrared depth sensor, as well as the performance of a coffee can radar and of acoustic echolocation via an acoustic parametric array as long-range sensors for mobile robotics. While the low-cost depth sensor on the Microsoft Kinect is of limited use in outdoor environments, the coffee can radar has the potential to provide low-cost long-range detection capability. In addition, the Kinect microphone array can be paired with an acoustic parametric array to provide high-quality acoustic echolocation measurements.

Fig. 8.14 We collected data with rMary on the sidewalk, but with traffic approaching head on

8.4 Pattern Classification

So far we know that our transmitted signal is present in the backscattered reflection from a target at distances exceeding 50 m. The hope is that this reflected signal contains useful information that will allow us to determine the type (class) of vehicle. Since we are using a coded signal, we also expect that a time–frequency representation of the data will prove useful in this classification process. The next step is to use statistical pattern classification techniques to find the useful information in these signals that differentiates between vehicles of different classes. These analyses are written to run in parallel in MATLAB on a computing cluster to reduce computation time.

8.4.1 Compiling Data

To more easily compare the large number of measurements from different classes, we organize the measured data into structures. The more than 4000 individual measurements we have collected are spread over 5926 files, including audio, radar, and image data organized in timestamped directories. Separate plaintext files associate each timestamp with its corresponding vehicle class. If we are to run our analysis routines on computing clusters, providing access to this more than 3.6 GB of original data becomes problematic. Instead, we create smaller data structures containing only the information we require. These datasets range in size from 11 to 135 MB for 108 to 750 measurements and can easily be uploaded to parallel computing resources. Much of the reduction in size is due to the fact that we only require access to the audio data for these tests and can eliminate the large image files. One additional reduction is accomplished by choosing a single channel of the 4-channel Kinect audio data to include in the structure; the array has a small enough spacing that all useful information is present in every channel, as seen in Fig. 8.15. Resampling all data down toward the minimum rate allowed by the Nyquist–Shannon sampling theorem further reduces the size of the data structure.
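The compaction step might look like the following sketch. The file name is hypothetical, and the assumed maximum frequency reflects a transmitted chirp whose content stays near 1 kHz, well below the Kinect's 8 kHz limit.

% Keep one channel of the 4-channel Kinect audio and resample it
[x4, fs] = audioread('run_20120514_car.wav');  % hypothetical recording
x = x4(:, 1);                    % array spacing is small: any channel will do
fmax   = 1000;                   % assumed highest frequency in the chirp (Hz)
target = 2*fmax*1.25;            % Nyquist rate plus a 25% guard band
x = resample(x, target, fs);     % e.g. 16 kHz down to 2.5 kHz
% The compacted vector x is what gets stored in the per-class structure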
Our goal is to differentiate between vehicle classes, so it is natural to create data structures divided by class. Since it doesn't make sense to compare data from different incident signals, we create these structures for a number of data groups. We have also allowed the option to create combination classes; for example, vans, trucks, and buses are combined into the "vtb" class. This allows vehicles with similar frontal profiles to be grouped together to create a larger dataset with which to train our classifier. When creating these structures, data is pulled at random from the entire set of possibilities. The data structure can contain either all possible data or equal amounts of data from each class (or combined class), which can help reduce classification errors due to unequal data distribution. It is also important to note that, due to the difficulty of detecting individual reflections inside a signal, not every measurement inside the data structure is guaranteed to be usable. Tables 8.1 and 8.2, which list the amount of data in each group, therefore only provide an upper limit on the number of usable measurements.

Fig. 8.15 Due to the close spacing of the microphone array on the Kinect, all four channels contain the same information. The parabolic microphone is mounted aft of the Kinect array, causing the slight delay visible here, and is much more sensitive to noise

Table 8.1 Maximum amount of data in each classification group (overlaps possible between groups)

Group | c | s | v | t | b | m | o
5-750 HO | 520 | 501 | 52 | 32 | 15 | 5 | 1
10-750 HO | 191 | 190 | 22 | 18 | 7 | 3 | 0
100-900 HO | 94 | 88 | 13 | 7 | 16 | 1 | 0
250-500 HO | 61 | 74 | 16 | 3 | 3 | 0 | 3
250-500 NHO | 41 | 21 | 1 | 3 | 1 | 0 | 0
250-500 all | 102 | 95 | 17 | 6 | 4 | 0 | 3
250-1000 HO | 552 | 515 | 55 | 42 | 27 | 3 | 3
250-1000 NHO | 589 | 410 | 52 | 28 | 20 | 2 | 6
250-1000 all | 1141 | 925 | 107 | 70 | 47 | 5 | 9
250-comb HO | 613 | 589 | 71 | 45 | 30 | 3 | 6
250-comb NHO | 630 | 431 | 53 | 31 | 21 | 2 | 6
250-comb all | 1243 | 1020 | 124 | 76 | 51 | 5 | 12

Table 8.2 Maximum amount of data in each classification group when binned (overlaps possible)

Group | c | s | vtb
5-750 HO | 520 | 501 | 99
10-750 HO | 191 | 190 | 47
100-900 HO | 94 | 88 | 36
250-500 HO | 61 | 74 | 22
250-500 NHO | 41 | 21 | 5
250-500 all | 102 | 95 | 27
250-1000 HO | 552 | 515 | 124
250-1000 NHO | 589 | 410 | 100
250-1000 all | 1141 | 925 | 224
250-comb HO | 613 | 589 | 146
250-comb NHO | 630 | 431 | 105
250-comb all | 1243 | 1020 | 251

8.4.2 Aligning Reflected Signals

The first step in our pattern classification process is to align the signals in time. This is crucial to ensure that we are comparing the signals reflected from vehicles to each other, rather than comparing a vehicle reflection to a background measurement that contains no useful information. Our control of the transmitted signal gives us several advantages that we can exploit in this analysis. First, since the frequency content of the transmitted signal is known, we can apply a bandpass filter to the reflected signal to reduce noise at other frequencies. In some cases, this will highlight reflections that were previously hidden in the noise floor, allowing for automated peak detection.
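Since the transmit band is known, the filtering step can be as simple as the sketch below; the 250–1000 Hz band and the filter order are illustrative, and zero-phase filtering is used so the filter does not shift the echo timing.

fs = 16e3;                                   % Kinect audio rate
x  = randn(1, fs);                           % stand-in for a recorded measurement
[b, a] = butter(4, [250 1000]/(fs/2), 'bandpass');
xf = filtfilt(b, a, x);                      % zero-phase bandpass around the chirp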
Group          c      s      vtb
5–750 HO       520    501    99
10–750 HO      191    190    47
100–900 HO     94     88     36
250–500 HO     61     74     22
250–500 NHO    41     21     5
250–500 all    102    95     27
250–1000 HO    552    515    124
250–1000 NHO   589    410    100
250–1000 all   1141   925    224
250-comb HO    613    589    146
250-comb NHO   630    431    105
250-comb all   1243   1020   251

8.4.2 Aligning Reflected Signals

The first step in our pattern classification process is to align the signals in time. This is crucial to ensure that we are comparing the signals reflected from vehicles to each other, rather than comparing a vehicle reflection to a background measurement that contains no useful information. Our control of the transmitted signal gives us several advantages that we can exploit in this analysis. First, since the frequency content of the transmitted signal is known, we can apply a bandpass filter to the reflected signal to reduce noise at other frequencies. In some cases, this will highlight reflections that were previously hidden in the noise floor, allowing for automated peak detection.

More often, however, the backscattered reflection remains hidden among the noise even after a bandpass filter is applied. In this case we obtain better results using peak detection on the envelope signal. To create this signal, we take our original signal f(x), which has already been bandpass filtered, and take the absolute value of its Hilbert transform, |f̂(x)|. This is the magnitude of the analytic signal, which discards the negative-frequency components created by the Fourier transform in exchange for dealing with a complex-valued function. The envelope signal is then constructed by applying a very lowpass filter to this magnitude. This process is shown in Fig. 8.16.

Fig. 8.16 The envelope signal is created by taking the original bandpass-filtered data (top), creating the analytic signal (middle), and applying a very lowpass filter (5th order Butterworth, fc = 20 Hz) (bottom)

In some cases, even peak detection on the envelope signal will not give optimal results. Occasionally, signals will have a non-constant DC offset that complicates the envelope signal. This can often be corrected by detrending (removing the mean from) the signal. A more pressing issue is that the envelope signal is not a reliable detection method if reflections aren't visible in the filtered signal. Even when peaks can be detected in the envelope signal, they tend to be very broad. As a general rule, peak detection becomes less sensitive to variations in threshold as the peak grows sharper. Adding a step to the peak detection that finds the mean value of connected points above a certain threshold ameliorates this problem, but since the peak widths of the envelope signal are not uniform, finding the same point on each reflected signal becomes an issue. Some of these issues are highlighted in Fig. 8.17.

Fig. 8.17 Even for multiple measurements from the same stationary vehicle, the envelope (red) has difficulty consistently finding peaks unless they are obviously visible in the filtered signal (blue). The shifted cross-correlation (green) doesn't have this limitation

Instead, we can exploit another feature of our transmitted signal: its shape. All of our pulses are linear frequency chirps, which have well-defined characteristics and, more importantly, maintain a similar shape even after they reflect from a target. By taking the cross-correlation of our particular transmitted signal and the reflected signal, and accounting for the time shift inherent to the process, a sharp peak that can easily be found by an automated peak detection algorithm is created at the time point where the reflected signal begins.
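The following MATLAB fragment sketches both constructions, assuming y is a bandpass-filtered received signal, tx the known transmitted chirp, and fs the sampling rate (all assumed variable names); the lowpass settings are those quoted in Fig. 8.16, but this is an illustrative sketch rather than our exact implementation.

% Envelope signal: magnitude of the analytic signal, then a very lowpass filter.
env_raw = abs(hilbert(y));             % |analytic signal| of the filtered data
[b, a]  = butter(5, 20/(fs/2));        % 5th order Butterworth, fc = 20 Hz
env     = filtfilt(b, a, env_raw);     % zero-phase filtering preserves timing

% Shifted cross-correlation: the lag of the correlation peak marks the
% time point where the reflected chirp begins.
[r, lags] = xcorr(y, tx);
[~, imax] = max(abs(r));
t0 = lags(imax) / fs;                  % reflection start time, in seconds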
Peak detection in any form requires setting a threshold at a level which reduces the number of false peaks detected without disqualifying actual peaks. This is a largely trial-and-error process and can easily introduce a human bias into the results. Setting the threshold as a percentage of the signal's maximum value also improves the performance of the peak detection. Several problems with the automated peak detection are clearly visible in Fig. 8.18, where a detection level set to 70% of the maximum value will only detect one of the three separate reflections that a human eye would identify. Although we could adjust the threshold level to be more inclusive, doing so would also increase the rate of false detection and add more computational load to filter the false peaks out. Adjusting the threshold is also not ideal because it can add a human bias to the procedure.

Fig. 8.18 Individual reflections aren't visible in the bandpass-filtered signal from an oncoming vehicle at 50 m (top) or in its detrended envelope signal (middle). Cross-correlation of the filtered signal and the 100–900 transmitted pulse (bottom) shows clear peaks at the beginning of each reflected pulse, which can be used in an automated detection algorithm

Another issue is due to the shape of the correlated waveform, caused by a vehicle's noise increasing as it nears the microphone. The extra noise in the first part of the signal is above the detection threshold and will lead to false detections. This is an easier problem to solve: our algorithm rejects any peaks that are not separated by a large enough distance. A separation distance of half the length of the cut signals reduces the rate of false detection.

It is also important to note the position of the sensors on the robotic platform at this point. If we were using a regular loudspeaker, the transmitted pulse would be recorded along with the reflected signal, and the cross-correlation would detect both signals, complicating the detection process. Mounting the microphone array behind the speaker could help, but care would have to be taken with speaker selection. Since the acoustic parametric array transmits an ultrasonic signal that demodulates in air, the audible signal only develops at distances beyond the position of the microphone array.

Fig. 8.19 For the simpler case of a stationary van at 25 m, the backscatter reflection is clearly visible once it has been bandpass filtered (blue). Automated peak detection may correctly find the peaks of the envelope signal (red), but is much more sensitive to the threshold level than the shifted cross-correlation signal (green) due to the sharpness of its peaks

The three detection methods (filtered signal, envelope signal, and shifted cross-correlation) are summarized in Fig. 8.19, which uses the simplified situation of data from a stationary vehicle at 25 m to illustrate all three methods. For our analysis we will use the shifted cross-correlation to align the signals in time.
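In MATLAB, the thresholding and minimum-separation rules just described can be expressed in a few lines with findpeaks; the 70% level and the half-signal spacing are the values discussed above, and the variable names are illustrative.

% Automated peak picking on the shifted cross-correlation signal xc.
% cutlen is the length (in samples) of the cut signals; fs is the sample rate.
thresh = 0.7 * max(abs(xc));                   % threshold as a fraction of the max
[pks, locs] = findpeaks(abs(xc), ...
    'MinPeakHeight',   thresh, ...
    'MinPeakDistance', round(cutlen/2));       % reject peaks closer than half a cut
tReflections = locs / fs;                      % candidate reflection start times (s)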
8.4.3 Feature Creation with DWFP

We use the dynamic wavelet fingerprint (DWFP) to represent our time-domain waveforms in a time-scale domain. This analysis has proven useful in past work to reveal subtle features in noisy signals [4–9] by transforming a one-dimensional, time-domain waveform into a two-dimensional time-scale image. An example of the DWFP process is shown in Fig. 8.20 for real-world data.

Fig. 8.20 A one-second long acoustic signal reflected from a bus (top) is filtered (middle) and transformed into a time-scale image that resembles a set of individual fingerprints (bottom). This image is a pre-segmented ternary image that can easily be analyzed using existing image processing algorithms

The main advantage of the DWFP process is that the output is a pre-segmented image that can be analyzed using existing image processing techniques. We use these existing libraries to create a number of one-dimensional parameter waveforms that describe the image and, by extension, our original signal. This analysis yields approximately 25 parameter waveforms. As an overview, our feature extraction process takes a time-domain signal and applies a bandpass filter. A pre-segmented fingerprint image is created using the DWFP process, from which a number of one-dimensional parameter waveforms are extracted. In effect, our original one-dimensional time-domain signal is now represented by multiple parameter waveforms. Most importantly, the time axis is maintained throughout this process, so features of the parameter waveforms are directly correlated to events in the original time-domain signal. A visual representation of the process is shown in Fig. 8.21.

Fig. 8.21 A one-second long 100–900 backscatter signal is bandpass filtered and converted to a ternary image using the DWFP process. Since the image is pre-segmented it is easy to apply existing image analysis techniques and create approximately 25 one-dimensional parameter waveforms that describe the image. Our original signal is now represented by these parameter waveforms, three examples of which are shown here (ridge count, filled area, and orientation). Since the time axis is maintained throughout the entire process, features in the parameter waveforms are directly correlated to events in the original time-domain signal

The user has control of a large number of parameters in the DWFP creation and feature extraction process, which greatly affect the appearance of the fingerprint images, and thus the extracted features. The parameters that most affect the fingerprint image are the wavelets used for pre-filtering and for performing the continuous wavelet transform to create the DWFP image. A list of candidate wavelets is shown in Table 8.3. However, there is no way to tell a priori which combination of parameters will create the ideal representation for a particular application. We use a computing cluster to run this process in parallel for a large number of parameter combinations, combined with past experience in the analysis of DWFP images, to avoid an entirely brute-force implementation.

Table 8.3 List of usable wavelets. For those wavelet families with multiple representations (db, sym, and coif), the default value used is shown

Name             Matlab name   Prefiltering   Transform
Haar             haar          X              X
Daubechies       db3           X              X
Symlets          sym5          X              X
Coiflets         coif3         X              X
Meyer            meyr                         X
Discrete meyer   dmey                         X
Mexican hat      mexh                         X
Morlet           morl                         X
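A schematic MATLAB paraphrase of the fingerprint construction is sketched below, using one of the transform wavelets from Table 8.3 and the older cwt(signal, scales, wavelet) syntax; the slicing loop and settings are illustrative stand-ins for the published DWFP implementation, not a definitive version of it.

% Continuous wavelet transform of the (prefiltered) signal y.
C = cwt(y, 1:50, 'db3');                 % transform wavelet from Table 8.3
C = C / max(abs(C(:)));                  % normalize coefficients to [-1, 1]

% Slice the normalized coefficient surface into a ternary fingerprint image:
% thin contour slices of the peaks (+1) and valleys (-1) on a zero background.
nslices = 15; thick = 0.03;              % slice settings quoted in the text
fp = zeros(size(C), 'int8');
for k = 1:nslices
    level = k / nslices;
    fp(C >=  level - thick & C <  level) =  1;
    fp(C <= -level + thick & C > -level) = -1;
end

% The pre-segmented ridges can be handed straight to image processing tools;
% regionprops supplies the raw material for the ~25 parameter waveforms
% (e.g., FilledArea, Orientation), with Solidity available as a filter.
stats = regionprops(fp ~= 0, 'FilledArea', 'Orientation', 'Centroid', 'Solidity');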
8.4.4 Intelligent Feature Selection

Now that each original one-dimensional backscatter signal is represented by a set of continuous one-dimensional parameter waveforms, we need to determine which will best differentiate between different vehicles. The end goal is to create a small-dimensional feature vector for each original backscatter signal which contains the values of parameter waveforms at particular points in time. By choosing these time points correctly, we create a new representation of the signal that is much more information dense than the original. This feature vector stands in for the original signal and can be used in statistical pattern classification algorithms to classify the data in seconds.

For this analysis we use a variant of linear discriminant analysis to find the points in time where a parameter waveform has the greatest separation between different classes, but also where signals of the same class have a small variance. For each parameter waveform, all of the signals from a single class are averaged to create a mean signal and a corresponding standard deviation signal. Comparing the mean signals to each other and keeping a running average of the differences gives an overall separation distance signal (δ), while a measure of the variance between signals of the same class comes from the maximum standard deviation over all signals (σ). Instead of using iterative methods to simultaneously maximize δ and minimize σ, we create a ratio signal ρ = δ/σ and find its maxima (Fig. 8.22). We save the time point and value of ρ for the top 5–10 points of each extracted feature. When this process has been completed for all parameter waveforms, the list is sorted by decreasing ρ and reduced to the top 25–50 points, keeping track of both the points and the feature names. Restricting the process in this manner tends to create a feature vector with components from many of the features, as shown in Fig. 8.23. The number of top points saved in both steps is a user parameter, shown in Table 8.4 and restricted to mitigate the curse of dimensionality. Feature vectors can then be created for each original backscatter signal by taking the values of the selected features at the given points. This results in a final, dense feature vector representation of each original signal. The entire pattern classification process for data from three classes is summarized in Fig. 8.24.

Fig. 8.22 For each of the parameter waveforms (FilledArea shown here), a mean value is created by averaging all the measurements of that class (top). The distance between classes is quantified by the separation distance (middle left) and tempered by the intraclass variance, represented by the maximum standard deviation (middle right). The points that are most likely to separate the classes are shown as the peaks of the joint separation curve (bottom)
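For a single parameter waveform, the δ, σ, and ρ construction just described might look like the following sketch, where W is a (measurements × time) matrix of one parameter waveform and labs holds numeric class labels (both assumed names).

classes = unique(labs);
nc = numel(classes); nt = size(W, 2);
mu = zeros(nc, nt); sd = zeros(nc, nt);
for i = 1:nc
    Wi = W(labs == classes(i), :);
    mu(i, :) = mean(Wi, 1);              % class-mean waveform
    sd(i, :) = std(Wi, 0, 1);            % within-class standard deviation
end

delta = zeros(1, nt); npairs = 0;        % running average of pairwise mean distances
for i = 1:nc-1
    for j = i+1:nc
        delta = delta + abs(mu(i, :) - mu(j, :));
        npairs = npairs + 1;
    end
end
delta = delta / npairs;

sigma = max(sd, [], 1);                  % worst-case intraclass variation
rho   = delta ./ sigma;                  % ratio signal rho = delta/sigma
[vals, locs] = findpeaks(rho);           % its maxima are candidate time points
[vals, order] = sort(vals, 'descend');   % keep the top 5-10 points per feature
locs = locs(order);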
8.4.5 Statistical Pattern Classification

The final step in our analysis is to test the ability of the feature vector to differentiate between vehicle classes using various pattern classification algorithms. This is often the most time-consuming step in the process, but since we have used intelligent feature selection to create an optimized, small feature vector, it is here the fastest step of the entire process and can be completed in seconds on a desktop machine. Of course, we have simply shifted the hard computational work that requires a computing cluster to the feature selection step. That is not to say that there are no advantages to doing the analysis this way: having such small feature vectors allows us to easily test a number of parameters of the classification.

Before we can run pattern classification routines we must separate our data into training and testing (or validation) datasets. By withholding a subset of the data for testing the classifier's performance, we can eliminate any "cheating" that comes from using training data for testing. We also use equal amounts of data from each class for both testing and training to eliminate bias from unequal-sized datasets.

Fig. 8.23 The lists of top features selected for the 100–900 (left) and 250–500 (right) datasets illustrate how features are chosen from a variety of different parameter waveforms

Table 8.4 List of user parameters in feature selection

Setting        Options           Description
Peakdetect     Joint, separate   Method to choose top points
Viewselected   Binary switch     View selected points
Selectnfeats   Z+                Keep this many top points for each feature
Topnfeats      Z+                Keep this many top points overall

Our classification routines are run in MATLAB, using a number of standard classifiers included in the PRTools toolbox [14]. Because of our small feature vectors and short classification run time, we run the pattern classification many times, randomizing the data used for testing and training on each run. This gives us an average classification performance and allows us to use the standard deviation of correct classifications as a measure of classification repeatability. While this single-valued metric is useful in comparing classifiers, more detailed information about classifier performance comes from the average confusion matrix. For n classes, this is an n × n matrix that plots the estimated class against the known class. The confusion matrix for a perfect classification would resemble the identity matrix, with values of 1 on the diagonal and 0 elsewhere. An example of a confusion matrix is shown in Fig. 8.25.

Fig. 8.24 For each class, every individual measurement is filtered and transformed to a fingerprint image, from which a number of parameter waveforms are extracted. For each of these parameter waveforms, an average is created for each class. A comparison of these average waveforms finds the points that are best separated between the classes, and the feature vector is compiled using the values of the parameter waveform at these points. This image diagrams the process for sample data from three classes (blue, green, and red)

Fig. 8.25 This example of a real confusion matrix shows good classification performance, with high values on the diagonal and low values elsewhere. The confusion matrix allows more detailed visualization of the classification performance for specific classes than the single-value metric of overall percent correct. For example, although this classification has a fairly high accuracy of 86% correct, the confusion matrix shows that most of the error comes from misclassifying SUVs into the van/truck/bus class [udc classifier, 20 runs]
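A condensed sketch of this step using PRTools 4 [14] follows; here feats is an (instances × features) matrix and labs the matching labels (assumed names), and the nmsc classifier and 60/40 split are examples rather than a prescription.

% Repeated random train/test splits with a PRTools classifier.
a = dataset(feats, labs);                % PRTools dataset object
nruns = 20; err = zeros(nruns, 1);
for k = 1:nruns
    [train, test] = gendat(a, 0.6);      % random 60/40 train/test split
    w = nmsc(train);                     % e.g., nearest mean scaled classifier
    err(k) = testc(test * w);            % error on the withheld instances
end
fprintf('mean correct: %.2f +/- %.2f\n', 1 - mean(err), std(err));
cm = confmat(getlabels(test), labeld(test * w));  % confusion matrix, final run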
8.5 Results

We illustrate the use of the pattern classification analyses on data collected from both stationary and oncoming vehicles. Due to the similar frontal profiles of vans, trucks, and buses, and to mitigate the small number of observations recorded from these vehicles, we create a combined class of these measurements. The classes for classification purposes are then "car", "SUV", and "van/truck/bus". For this three-class problem, a classification accuracy of greater than 33% means the classifier is performing better than random guessing.

8.5.1 Proof-of-Concept: Acoustic Classification of Stationary Vehicles

We begin our analysis with data collected from stationary vehicles. The first test is a comparison of classification performance when observations come from a single vehicle as compared to multiple vehicles. Multiple observations were made from vehicles in a parking lot at distances between 5 and 20 m. The orientation is approximately head-on (orthogonal), but with slight repositioning after every measurement to construct a more realistic dataset. The classification accuracy shown in Fig. 8.26 validates our expectation that classification performs better when observations are exclusively from a single vehicle rather than from a number of different vehicles.

Fig. 8.26 The first attempt to classify vehicles based on their backscattered acoustic reflection from a 250–1000 transmitted signal shows very good classification accuracy when all observations come from a single vehicle at 20 m (top). When observations come from multiple vehicles at 5 m, the classification accuracy is much lower but still better than random guessing (bottom) [knnc classifier, 10 runs]

We see that reasonable classification accuracies can be achieved even when the observations come from multiple vehicles. Optimizing the transmitted signal and recognizing the importance of proper alignment of the reflected signals help improve classification performance, as seen in Fig. 8.27. Here, data is collected from multiple stationary vehicles at short distances between 10 and 25 m using both a 10–750 and a 250–1000 transmitted signal. Classification performance seems slightly better for the shorter chirp-length signal, but the real improvement comes from ensuring the signals are aligned in time. For this dataset, alignment was ensured by visual inspection of all observations in the 250–1000 dataset. This labor-intensive manual inspection has been replaced by the cross-correlation methods described earlier in the analysis of data from oncoming vehicles.

Fig. 8.27 Observations from multiple stationary vehicles at distances of 10–25 m show approximately equal classification performance for both the 10–750 (left) and 250–1000 (right) transmitted signals. Manual inspection of the observations in the 250–1000 dataset to ensure clear visual alignment leads to markedly improved classification performance (bottom) [knnc, 10 runs]

While these initial tests exhibit poorer classification performance than the data from oncoming vehicles, it is worth noting that these datasets consist of relatively few observations and are intended as a proof-of-concept. These stationary datasets were used to optimize the analysis procedure for the more interesting data from oncoming vehicles. For example, alignment algorithms weren't yet completely developed, and the k-nearest neighbor classifier used to generate the above confusion matrices has proven to have consistently worse performance than the classifiers used for the results shown from oncoming vehicles. Nevertheless, we see that better-than-random-guessing classification accuracy is possible using only the acoustic echolocation signal.

8.5.2 Acoustic Classification of Oncoming Vehicles

Now that we have seen that it is possible to classify stationary vehicles using only the reflected acoustic echolocation signal, the more interesting problem is trying to classify oncoming vehicles at greater distances.
Since the DWFP feature creation process has a large number of user parameters, the first step was to find which few parameter settings give us the best classification performance. This reduces the parameter space in our analysis and allows us to focus on more interesting details of the classification, such as the effect of the transmitted signal on the classification accuracy. Through previous work we have seen that the choice of wavelet in both the prefiltering and transform stages of the DWFP process causes the greatest change in the fingerprint appearance, and thus in the features extracted from the fingerprint. Table 8.5 shows the classification accuracy for different prefiltering wavelets, while Table 8.6 shows the classification accuracy for different transform wavelets. In both cases, the dataset being classified is from vehicles at 50 m approaching head-on, using the 100–900 transmitted signal. The settings for the other user parameters are: filtering at 5 levels, removing the first 5 details, 15 slices of thickness 0.03, and removing fingerprints that do not have a solidity in the range 0.3–0.6. The mean correct classification rate is created from 20 classification runs for each classifier. These are the parameters for the rest of our analysis unless noted otherwise.
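On the cluster, the sweep over parameter combinations can be organized as a parallel loop. In this sketch the wavelet lists come from Table 8.3 and the settings mirror those just quoted, while the options struct and the helper run_dwfp_classification are hypothetical stand-ins for our processing chain.

prefilt = {'haar', 'db3', 'sym5', 'coif3'};
transf  = {'haar', 'db3', 'sym5', 'coif3', 'meyr', 'dmey', 'mexh', 'morl'};
results = cell(numel(prefilt), numel(transf));
parfor i = 1:numel(prefilt)              % distributed across the cluster workers
    row = cell(1, numel(transf));
    for j = 1:numel(transf)
        opts = struct('pfwavelet', prefilt{i}, 'twavelet', transf{j}, ...
                      'nlevels', 5, 'ndetails', 5, ...      % filtering settings
                      'nslices', 15, 'thickness', 0.03, ... % slice settings
                      'solidity', [0.3 0.6]);               % fingerprint filter
        row{j} = run_dwfp_classification('dataset_100_900.mat', opts); % hypothetical
    end
    results(i, :) = row;
end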
The choice of prefiltering wavelet does not affect the classification accuracy much. The variance measure (given by the standard deviation of repeated classifications) is not shown, but is consistently around 0.07 for all classifiers. With this knowledge, there is no obvious preference of prefiltering wavelet, and we chose coif3 for further analysis.

Table 8.5 A comparison of prefiltering wavelet (PW) choice on classification accuracy. The transform wavelet is db3. Data is from the 100–900 approaching vehicle dataset with a train/test ratio of 0.7 and classification into three classes (c, s, vtb). The differences in performance fall within the measure of variance for a single classifier (not shown here for reasons of space). Since there seems to be no preferred prefiltering wavelet, future analysis will use coif3

Classifier   haar   db3    sym5   coif3   Average
qdc          0.40   0.49   0.41   0.55    0.46
udc          0.81   0.78   0.81   0.85    0.81
ldc          0.83   0.80   0.81   0.77    0.80
klldc        0.78   0.76   0.81   0.80    0.79
pcldc        0.79   0.78   0.81   0.79    0.79
nmc          0.66   0.57   0.54   0.60    0.59
nmsc         0.87   0.86   0.85   0.89    0.87
loglc        0.75   0.69   0.73   0.77    0.73
fisherc      0.82   0.79   0.79   0.80    0.80
knnc         0.56   0.59   0.54   0.57    0.56
parzenc      0.63   0.58   0.52   0.59    0.58
parzendc     0.73   0.78   0.72   0.69    0.73
kernelc      0.59   0.57   0.58   0.66    0.60
perlc        0.86   0.80   0.84   0.86    0.84
svc          0.67   0.72   0.73   0.72    0.71
nusvc        0.70   0.68   0.69   0.71    0.69
treec        0.51   0.49   0.54   0.47    0.50
Average      0.70   0.69   0.69   0.71    0.70

The choice of transform wavelet does seem to affect the classification accuracy somewhat more than the choice of the prefiltering wavelet. Still, the choice of classifier is by far the most important factor in classification accuracy. We select db3 as the default transform wavelet, with dmey and sym5 as alternatives. From this analysis we can also select a few good classifiers for our problem. Pre-selecting classifiers runs against the No Free Lunch theorem, which states that without prior assumptions we should not prefer one classifier over another, but because the underlying physical situation is similar between all of our datasets we are justified in selecting a small number of well-performing classifiers. We will use the top five classifiers from our initial analysis: nmsc, perlc, ldc, fisherc, and udc. The klldc and pcldc classifiers also performed well, but since they are closely related to ldc, we choose other classifiers for diversity and to provide a good mix of parametric and non-parametric classifiers.

Table 8.6 A comparison of transform wavelet (TW) choice on classification accuracy shows very similar classification performance for many wavelet choices. The prefiltering wavelet is coif3. Data is from the 100–900 approaching vehicle dataset with a train/test ratio of 0.7 and classification into three classes (c, s, vtb). Due to space constraints, the variance is not shown

Classifier   haar   db3    sym5   coif3   meyr   dmey   mexh   morl   Average
qdc          0.45   0.55   0.55   0.45    0.43   0.43   0.47   0.47   0.47
udc          0.68   0.85   0.77   0.77    0.65   0.84   0.64   0.68   0.73
ldc          0.73   0.77   0.78   0.68    0.77   0.82   0.76   0.82   0.77
klldc        0.67   0.80   0.79   0.68    0.77   0.80   0.77   0.82   0.76
pcldc        0.68   0.79   0.81   0.67    0.77   0.82   0.73   0.82   0.76
nmc          0.42   0.60   0.53   0.49    0.51   0.50   0.55   0.52   0.51
nmsc         0.81   0.89   0.86   0.76    0.86   0.88   0.78   0.84   0.83
loglc        0.59   0.77   0.72   0.64    0.66   0.81   0.63   0.72   0.69
fisherc      0.66   0.80   0.82   0.64    0.80   0.80   0.76   0.83   0.76
knnc         0.47   0.57   0.53   0.49    0.53   0.47   0.57   0.51   0.52
parzenc      0.47   0.59   0.54   0.51    0.61   0.50   0.51   0.51   0.53
parzendc     0.67   0.69   0.73   0.68    0.68   0.71   0.63   0.76   0.69
kernelc      0.54   0.66   0.61   0.55    0.62   0.54   0.61   0.51   0.58
perlc        0.77   0.86   0.80   0.75    0.77   0.88   0.73   0.85   0.80
svc          0.61   0.72   0.71   0.65    0.69   0.71   0.64   0.70   0.68
nusvc        0.63   0.71   0.66   0.63    0.65   0.62   0.63   0.65   0.65
treec        0.48   0.47   0.49   0.56    0.47   0.52   0.44   0.51   0.49
Average      0.61   0.71   0.69   0.62    0.66   0.69   0.64   0.68   0.70

More in-depth analysis of the effect that physical differences have on classification accuracy can now be explored, using coif3 as the prefiltering wavelet and db3 as the transform wavelet, with the nmsc, perlc, ldc, fisherc, and udc classifiers. Table 8.7 shows the datasets constructed for the following analyses. All datasets are constructed from data pulled at random from the overall datasets for that particular signal type and contain an equal number of instances for each class. Most of the datasets consist of three classes: car (c), SUV (s), and a combined van/truck/bus (vtb), though a few datasets with data from all five classes (car (c), SUV (s), van (v), truck (t), and bus (b)) were created to attempt this individual classification. Requiring an equal number of instances from each class leads to small datasets, even after creating the combined van/truck/bus class to mitigate this effect. In addition, not all of the instances are usable, due to the difficulty of detecting and aligning the signals. This is especially true for the 250 ms signals.
Table 8.7 A survey of the datasets used in this analysis (in order of appearance) shows the number of classes, total number of instances, and distance at which data was acquired. The small size of the datasets is a direct result of requiring the datasets to have an equal number of instances per class and the relatively few observations from vans, trucks, and buses

Dataset        Classes   Instances   Distance (m)
100–900 a      3         108         50
100–900 b      3         108         50
100–900 c      3         108         50
250-comb a     3         251         25, 30, 50
250-comb b     3         251         25, 30, 50
250-comb c     3         251         25, 30, 50
250–500        3         66          30, 50
250–1000       3         66          < 25, 25, 30, 60
250-comb HO    3         98          50
250-comb NHO   3         98          25, 30
5–750          3         108         50
10–750         3         108         50
100–900 (5C)   5         35          50

We will first look at the influence of the train/test ratio on the classification performance. Table 8.8 shows the classification accuracy as a function of train/test ratio for the same 100–900 dataset used in our earlier analysis of wavelets and classifiers shown in Tables 8.5 and 8.6. In general, the classifiers are able to perform well even when only 25% of the dataset is used for training, with the notable exception of the fisherc classifier. Classification accuracy increases with increasing training ratio, but when too much of the dataset is used for training (90% here) not enough data is available for validation, and the variance of the classification accuracy increases. A more in-depth look at this phenomenon comes from the confusion matrices, shown in Fig. 8.28. For future analysis we choose a train/test ratio of 0.6 to ensure we have enough data for validation, with the caveat that our classification performance could be a few points higher if we used a higher training ratio.

Table 8.8 Increasing the amount of data used for training increases the classification accuracy, but reduces the amount of available data for validation. As too much of the available data is used for training, the classifier becomes overtrained and the variance of the accuracy measurement increases. Data is from the 100–900 approaching vehicle dataset with classification into three classes (c, s, vtb)

Train %   nmsc          perlc         ldc           fisherc       udc           Avg
0.25      0.82 ± 0.06   0.77 ± 0.06   0.52 ± 0.08   0.43 ± 0.09   0.72 ± 0.08   0.65
0.5       0.89 ± 0.05   0.85 ± 0.04   0.69 ± 0.07   0.74 ± 0.08   0.80 ± 0.05   0.79
0.6       0.89 ± 0.03   0.84 ± 0.06   0.76 ± 0.06   0.75 ± 0.06   0.82 ± 0.06   0.81
0.7       0.90 ± 0.06   0.87 ± 0.05   0.81 ± 0.06   0.79 ± 0.06   0.83 ± 0.06   0.84
0.8       0.91 ± 0.06   0.89 ± 0.06   0.79 ± 0.11   0.80 ± 0.10   0.82 ± 0.07   0.84
0.9       0.91 ± 0.08   0.86 ± 0.12   0.82 ± 0.13   0.82 ± 0.12   0.88 ± 0.10   0.86

Table 8.9 Classification of datasets whose instances are selected at random from the larger dataset containing all possible observations shows repeatable results for both 100–900 and 250-comb data. The overall lower performance of the 250-comb datasets is likely due to the greater variety in the observations present in this dataset

Dataset      nmsc          perlc         ldc           fisherc       udc           Avg
100–900 a    0.81 ± 0.04   0.84 ± 0.05   0.78 ± 0.05   0.78 ± 0.06   0.76 ± 0.07   0.79
100–900 b    0.84 ± 0.06   0.76 ± 0.09   0.69 ± 0.08   0.74 ± 0.08   0.71 ± 0.08   0.75
100–900 c    0.86 ± 0.05   0.84 ± 0.05   0.79 ± 0.07   0.77 ± 0.06   0.72 ± 0.05   0.77
250-comb a   0.60 ± 0.06   0.55 ± 0.06   0.56 ± 0.06   0.56 ± 0.03   0.53 ± 0.04   0.56
250-comb b   0.58 ± 0.05   0.55 ± 0.06   0.56 ± 0.06   0.54 ± 0.03   0.56 ± 0.05   0.56
250-comb c   0.52 ± 0.06   0.45 ± 0.05   0.51 ± 0.06   0.50 ± 0.05   0.50 ± 0.04   0.50
Fig. 8.28 This example of the classification of 98 instances pulled from the 100–900 dataset shows how increasing the training ratio first improves classification performance and then increases variance due to overtraining and a lack of data to validate the classifier. The mean confusion matrix is shown for the fisherc classifier at 25% training data (top), 60% (middle), and 90% (bottom)

8.5.2.1 Repeatability of Classification Results

Since our code creates a dataset for classification by randomly selecting observations of a given class from among all the possibilities, we would expect some variability between these separate datasets. Table 8.9 shows the classification results for three datasets each compiled from all available data in the 100–900 and 250-comb overall datasets. Classification performance is similar among both the 100–900 and the 250-comb datasets. The 250-comb datasets have an overall lower classification performance, likely due to the greater variety of observations present in the data.

The 250-comb data is a combination of 250–500 and 250–1000 data, created with the assumption that the time between chirps is less important than the length of the chirp. A comparison of the classification performance of all the 250 ms chirps, shown in Table 8.10 and Fig. 8.29, calls this into question. While it is possible that this particular 250-comb dataset, randomly selected from the largest and most diverse overall dataset, just suffered from bad luck, there is clearly a difference in classification performance between the 250–500/250–1000 datasets and the combined 250-comb dataset. This leads us to believe that the entire transmitted signal, including the space between chirps, is important in defining a signal, rather than just the chirp length.

Fig. 8.29 A comparison of the confusion matrices (perlc classifier) of the 250–500 (top), 250–1000 (middle), and 250-comb (bottom) datasets shows that the non-combined datasets have a much higher classification performance than the 250-comb dataset. Both the 250–500 and 250–1000 datasets are small, with 17 and 19 total instances, respectively, compared with the 215 total instances of the 250-comb dataset. The classification performance of the 250–1000 dataset is almost perfect

Table 8.10 A comparison of classification performance between the 250 datasets shows that the spacing between chirps is important in defining our transmitted signal

Dataset      nmsc          perlc         ldc           fisherc       udc           Avg
250–500      0.92 ± 0.09   0.85 ± 0.12   0.72 ± 0.16   0.75 ± 0.14   0.70 ± 0.15   0.79
250–1000     0.99 ± 0.03   0.98 ± 0.05   0.56 ± 0.22   0.68 ± 0.19   0.97 ± 0.07   0.84
250-comb a   0.60 ± 0.06   0.55 ± 0.06   0.56 ± 0.06   0.56 ± 0.03   0.53 ± 0.04   0.56

That said, both the 250–500 and 250–1000 datasets exhibit good classification performance, albeit with large variances. These large variances are caused by the relatively small number of useful observations in each dataset. Figure 8.30 shows an example of how these variances play out and why the confusion matrices are so important to understanding the classification results.
Here, both the ldc and udc classifiers have an average overall correct rate of around 70%, but the udc classifier has difficulty correctly classifying the van/truck/bus class. This example also illustrates the importance of having large datasets to create training and testing sets with a sufficient number of observations for each class. Both the 250–500 and 250–1000 datasets used here have fewer than 10 instances per class, meaning that even at a 60% training percentage, the classification can only be tested on a few instances. We are forced to use these small datasets in this situation because our automated detection routine has a good deal of difficulty locating peaks from the 250 ms signals. The detection rate for this 250–500 dataset is 26% and the rate for the 250–1000 dataset is 29%, compared to a detection rate of 89% for the 100–900 signal. For this reason, and for reasons discussed earlier, the 100–900 signal remains our preferred transmission signal.

Fig. 8.30 A comparison of the confusion matrices from the ldc (top) and udc (bottom) classifiers on data from the 250–500 dataset highlights the importance of using the extra information present in the confusion matrix. Both classifiers have a mean overall correct rate of approximately 70%, but the udc classifier has a much more difficult time classifying vans/trucks/buses into the correct class

8.5.2.2 Head-on Versus Oblique Reflections

Another useful comparison is between data acquired "head-on" and at a slight angle. Results from stationary vehicles confirm that the recorded signal contains reflected pulses, and Table 8.11 shows that both datasets have a similar classification performance. With an average correct classification rate of 58% for both, this 250-comb data isn't an ideal dataset for the reasons that we discussed above, but it was the only dataset containing observations from both orientations.

Table 8.11 A comparison of 250-comb data acquired in a "head-on" orientation and data collected at a slight angle shows that both orientations have a similar classification performance

Orientation   nmsc          perlc         ldc           fisherc       udc           Avg
Head-on       0.64 ± 0.08   0.56 ± 0.11   0.54 ± 0.11   0.52 ± 0.09   0.65 ± 0.12   0.58
Oblique       0.63 ± 0.09   0.56 ± 0.07   0.57 ± 0.10   0.55 ± 0.12   0.58 ± 0.08   0.58

8.5.2.3 Comparison of All Input Signals and Classification into Five Classes

Finally, we make an overall comparison between the different incident signals for the three-class problem, and make an attempt at classifying the data from one dataset into the five original, non-grouped classes.
Table 8.12 A comparison of datasets from all of the transmitted signal types shows very good classification performance for all classifiers except ldc and fisherc. Removing these classifiers gives the average in the last column (avg2). The 100–900 and 250 datasets have been discussed previously in more detail and are included here for completeness. The final row shows the lone five-class classification attempt, with only seven instances per class

Signal       nmsc          perlc         ldc           fisherc       udc           Avg    Avg2
5–750        0.98 ± 0.05   0.99 ± 0.02   0.53 ± 0.19   0.54 ± 0.14   0.96 ± 0.07   0.80   0.98
10–750       0.96 ± 0.06   0.94 ± 0.04   0.61 ± 0.13   0.50 ± 0.17   0.88 ± 0.09   0.78   0.93
100–900      0.89 ± 0.06   0.86 ± 0.05   0.77 ± 0.07   0.80 ± 0.09   0.85 ± 0.06   0.83   0.87
250–500      0.92 ± 0.09   0.85 ± 0.12   0.72 ± 0.16   0.75 ± 0.14   0.70 ± 0.15   0.79   0.82
250–1000     0.99 ± 0.03   0.98 ± 0.05   0.56 ± 0.22   0.68 ± 0.19   0.97 ± 0.07   0.84   0.98
100–900 5C   0.94 ± 0.09   0.92 ± 0.08   0.47 ± 0.14   0.45 ± 0.19   0.82 ± 0.15   0.72   0.89

Table 8.12 shows the results from these comparisons. Even though our reflection detection algorithm has difficulties with both the 5–750 and 10–750 datasets (as well as with the 250 ms datasets, as discussed earlier) and can only detect the reflection in 25% of the observations, we get good classification performance. The ldc and fisherc classifiers give anomalously low mean overall classification rates with high variance. Removing these classifiers, we can calculate an average performance for the remaining three classifiers, shown in the last column of Table 8.12. With mean overall classification rates ranging from 82 to 98%, we can't say much about one signal being preferred over another, except that our algorithms are able to detect the reflections in the 100–900 signal best. Even with the severely limited data available for classification into five classes (only seven instances per class), we surprisingly find good classification performance, with an average classification rate of 89%. The best (nmsc at 94%) and worst (udc at 82%) classifiers for this data are shown in Fig. 8.31.

Fig. 8.31 Classification of 100–900 data into five classes is only possible with a limited dataset of seven instances per class, but manages mean classification performance rates of 82% for the udc classifier (top) and 94% for the nmsc classifier (bottom)

8.6 Conclusions

We have shown that oncoming vehicles can be classified with high accuracy, and at useful distances, using only reflected acoustic pulses. Finding and aligning these reflected signals is a nontrivial but vital step in the process, and one that can be successfully automated, especially if the transmitted signal is optimized for the application. Useful feature vectors that differentiate between vehicles of different classes can be formed by using the Dynamic Wavelet Fingerprint to create alternative time–frequency representations of the reflected signal, and intelligent feature selection algorithms create information-dense representations of our data that allow for very fast and accurate classification. We have found that a 100–900 linear chirp transmitted signal is best optimized for this particular problem. The signal contains enough energy to propagate long distances, while remaining compact enough in time to allow easy automatic detection. With this signal we can consistently attain correct overall classification rates upwards of 85% at distances of 50 m.
We have investigated a number of sensor modalities that may be appropriate for mobile walking-speed robots operating in unstructured outdoor environments. A combination of short- and long-range sensors is necessary for a robot to capture usable data about its environment. Our prior work had focused on passive thermal infrared and air-coupled ultrasound as possible short-range sensor modalities. This work looked at the suitability of the Microsoft Kinect as a short-range active infrared depth sensor, as well as the performance of a coffee can radar and acoustic echolocation via acoustic parametric array as long-range sensors for mobile robotics.

In particular, we have demonstrated that the most exciting feature of the Microsoft Kinect, a low-cost depth sensor, is of limited use in outdoor environments. The active illumination source in the near infrared is both limited to a range of several meters and easily saturated by sunlight, so it is mostly useful in nighttime outdoor environments. The infrared sensor is tuned to this near-infrared wavelength and provides little more information than the included RGB webcam. The Kinect 4-channel microphone array proved to be of high quality. The microphones are not spatially separated enough to allow for implementation of beamforming methods at distances over several meters, and are limited to a relatively low 16 kHz sampling rate by current software, but the design of the capsule microphones and built-in noise cancellation algorithms allow for high-quality recording.

Construction of a coffee can radar showed that such devices are feasible for mobile robotics, providing long-range detection capability at low cost and in a physically small package. Since the radar signal is de-chirped to facilitate processing with a computer sound card, these measurements do not contain much useful information about scattering from the target. However, radar ranging measurements could provide an early detection system for a mobile robot, detecting objects at long range before other sensors are used to classify the object. Another exciting possible use of the radar sensor is the creation of synthetic aperture radar (SAR) images. This method of creating a three-dimensional representation of the radar scattering from a target is essentially a set of ranging measurements acquired over a wide area. Normally this requires either an array of individual radar sensors or a radar that can be steered by beamforming, but it is a natural fit for mobile robotics since the radar sensor is in motion on a well-defined path.

The main focus of our work has been the use of acoustic echolocation as a long-range sensor for mobile robotics. Using coded signals in the audible range increases the range of the signal while still allowing for detection in noisy environments. The acoustic parametric array is able to create a tight beam of this low-frequency sound, directing the majority of the sound energy onto the target. This serves the dual purpose of reducing clutter in the backscattered signal and keeping the noise pollution added to the surrounding environment to a minimum. As a test of this sensor modality, several thousand acoustic echolocation measurements were acquired from approaching vehicles in a variety of environmental conditions. The goal was to classify the vehicle into one of five classes (car, SUV, van, truck, or bus) based on its frontal profile.
To test feasibility as a long-range sensor, vehicles were interrogated at distances up to 50 m. Initial analysis of the measured backscatter data showed that useful information about the target is buried deep in noise. Time–frequency representations of the data, in particular those created using the dynamic wavelet fingerprint (DWFP) process, reveal this hidden information. The formal framework of statistical pattern classification allowed us to intelligently create small-dimensional, information-dense feature vectors that best describe the target. This process was able to correctly classify vehicles using only the backscattered acoustic signal with 94% accuracy.

References

1. Pratkanis AT, Leeper AE, Salisbury JK (2013) Replacing the office intern: an autonomous coffee run with a mobile manipulator. In: IEEE international conference on robotics and automation
2. Dieckman EA (2013) Use of pattern classification algorithms to interpret passive and active data streams from a walking-speed robotic sensor platform. Doctoral dissertation, William & Mary
3. Cohen L (1995) Time-frequency analysis. Prentice Hall, New Jersey
4. Hou J, Hinders M (2002) Dynamic wavelet fingerprint identification of ultrasound signals. Mater Eval 60(9):1089–1093
5. Hinders M, Hou J, Keon JM (2005) Wavelet processing of high frequency ultrasound echoes from multilayers. In: Review of progress in quantitative nondestructive evaluation, vol 24, pp 1137–1144
6. Hou J, Hinders M, Rose S (2005) Ultrasonic periodontal probing based on the dynamic wavelet fingerprint. J Appl Signal Process 7:1137–1146
7. Hinders M, Jones R, Leonard KR (2007) Wavelet thumbprint analysis of time domain reflectometry signals for wiring flaw detection. Eng Intell Syst 15(4):65–79
8. Bertoncini C, Hinders M (2010) Fuzzy classification of roof fall predictors in microseismic monitoring. Measurement 43:1690–1701
9. Miller CA (2013) Intelligent feature selection techniques for pattern classification of time-domain signals. Doctoral dissertation, The College of William and Mary
10. Bingham J, Hinders M (2009) Lamb wave characterization of corrosion-thinning in aircraft stringers: experiment and 3d simulation. J Acous Soc Am 126(1):103–113
11. Bertoncini C, Hinders M (2010) Ultrasonic periodontal probing depth determination via pattern classification. In: Thompson D, Chimenti D (eds) 31st review of progress in quantitative nondestructive evaluation, vol 29. AIP Press, pp 1556–1573
12. Miller CA, Hinders MK (2014) Classification of flaw severity using pattern recognition for guided wave-based structural health monitoring. Ultrasonics 54:247–258
13. Bertoncini C (2010) Applications of pattern classification to time-domain signals. Doctoral dissertation, The College of William and Mary
14. Duin R, Juszczak P, Paclik P, Pekalska E, de Ridder D, Tax D, Verzakov S (2007) PRTools4.1, a Matlab toolbox for pattern recognition. Delft University of Technology, http://prtools.org/
15. DARPA (2004) DARPA grand challenge. http://archive.darpa.mil/grandchallenge04/
16. Guizzo E (2011) How Google's self-driving car works. IEEE Spectrum online. http://spectrum.ieee.org/automaton/robotics/artificial-intelligence/how-google-self-driving-car-works
17. Fisher A (2013) Inside Google's quest to popularize self-driving cars. Popular Science. http://www.popsci.com/cars/article/2013-09/google-self-driving-car
18. The Economist (2012) March of the lettuce bot. The Economist, December 2012. http://economist.com/news/technology-quarterly/21567202-robotics-machine-helps-lettuce-farmers-justone-several-robots/
19. Hall DL, Llinas J (1997) An introduction to multisensor data fusion. Proc IEEE 85:6–23
20. Fehlman WL (2008) Classification of non-heat generating outdoor objects in thermal scenes for autonomous robots. Doctoral dissertation, The College of William and Mary
21. Fehlman WL, Hinders MK (2010) Passive infrared thermographic imaging for mobile robot object identification. J Field Robot 27(3):281–310
22. Fehlman WL, Hinders M (2009) Mobile robot navigation with intelligent infrared image interpretation. Springer, London
23. Open Kinect. http://openkinect.org/
24. OpenNI. http://openni.org/
25. Burrus N, RGBDemo. http://labs.manctl.com/rgbdemo/
26. Microsoft Research, Kinect for Windows SDK. http://www.microsoft.com/en-us/kinectforwindows/
27. Liebe CC, Padgett C, Chapsky J, Wilson D, Brown K, Jerebets S, Goldberg H, Schroeder J (2006) Spacecraft hazard avoidance utilizing structured light. In: IEEE aerospace conference, Big Sky, Montana
28. Gao W, Hinders MK (2005) Mobile robot sonar interpretation algorithm for distinguishing trees from poles. Robot Autonom Syst 53:89–98
29. Hinders M, Gao W, Fehlman W (2007) Sonar sensor interpretation and infrared image fusion for mobile robotics. In: Kolski S (ed) Mobile robots: perception and navigation. Pro Literatur Verlag, Germany, pp 69–90
30. Barshan B, Kuc R (1990) Differentiating sonar reflections from corners and planes by employing an intelligent sensor. IEEE Trans Pattern Anal Mach Intell 12(6):560–569
31. Kleeman L, Kuc R (1995) Mobile robot sonar for target localization and classification. Int J Robot Res 14(4):295–318
32. Hernandez A, Urena J, Mazo M, Garcia J, Jimenez A, Jimenez J, Perez M, Alvarez F, DeMarziani C, Derutin J, Serot J (2009) Advanced adaptive sonar for mapping applications. J Intell Robot Syst 55:81–106
33. Laan Labs, Sonar Ruler. https://itunes.apple.com/app/sonar-ruler/id324621243?mt=8. Accessed 5 Aug 2013
34. Westervelt P (1963) Parametric acoustic array. J Acous Soc Am 35:535–537
35. Bennett MB, Blackstock D (1975) Parametric array in air. J Acous Soc Am 57:562–568
36. Yoneyama M, Fujimoto J, Kawamo Y, Sasabe S (1983) The audio spotlight: an application of nonlinear interaction of sound waves to a new type of loudspeaker design. J Acous Soc Am 73:1532–1536
37. Pompei FJ (1999) The use of airborne ultrasonics for generating audible sound beams. J Audio Eng Soc 47:726–731
38. Gan W-S, Yang J, Kamakura T (2012) A review of parametric acoustic array in air. Appl Acoust 73:1211–1219
39. Achanta A, McKenna M, Heyman J (2005) Non-linear acoustic concealed weapons detection. In: Proceedings of the 34th applied imagery and pattern recognition workshop (AIPR05), pp 7–27
40. Hinders M, Rudd K (2010) Acoustic parametric array for identifying standoff targets. In: Thompson D, Chimenti D (eds) 31st review of progress in quantitative nondestructive evaluation, vol 29. AIP Press, pp 1757–1764
41. Calicchia P, Simone SD, Marcoberardino LD, Marchal J (2012) Near- to far-field characterization of a parametric loudspeaker and its application in non-destructive detection of detachments in panel paintings. Appl Acoust 73:1296–1302
42. Ureda MS (2004) Analysis of loudspeaker line arrays. J Audio Eng Soc 52:467–495
43. Tashev I (2009) Sound capture and processing: practical approaches. Wiley, West Sussex
44. Tashev I (2011) Audio for Kinect: from idea to "Xbox, play!". http://channel9.msdn.com/events/mix/mix11/RES01
45. Charvat GL, Williams JH, Fenn AJ, Kogon S, Herd JS (2011) Build a small radar system capable of sensing range, Doppler, and synthetic aperture radar imaging. Massachusetts Institute of Technology: MIT OpenCourseWare RES.LL-003
46. Skolnik M (2008) Radar handbook. McGraw-Hill, New York
47. Richards M (2005) Fundamentals of radar signal processing. McGraw-Hill, New York
48. Pozar DM (2011) Microwave engineering. Wiley, Hoboken
49. Barker JL (1970) Radar, acoustic, and magnetic vehicle detectors. IEEE Trans Vehicul Technol 19:30–43
50. Mills MK (1970) Future vehicle detection concepts. IEEE Trans Vehicul Technol 19:43–49
51. Palubinskas G, Runge H (2007) Radar signatures of a passenger car. IEEE Geosci Remote Sens Lett 4:644–648
52. Lan J, Nahavandi S, Lan T, Yin Y (2005) Recognition of moving ground targets by measuring and processing seismic signal. Measurement 37:189–199
53. Sun Z, Bebis G, Miller R (2006) On-road vehicle detection: a review. IEEE Trans Pattern Anal Mach Intell 28:694–711
54. Buch N, Velastin SA, Orwell J (2011) A review of computer vision techniques for the analysis of urban traffic. IEEE Trans Intell Transp Syst 12:920–939
55. Thomas DW, Wilkins BR (1972) The analysis of vehicle sounds for recognition. Pattern Recognit 4(4):379–389
56. Nooralahiyan AY, Dougherty M, McKeown D, Kirby HR (1997) A field trial of acoustic signature analysis for vehicle classification. Transp Res C 5(3–4):165–177
57. Nooralahiyan AY, Kirby HR, McKeown D (1998) Vehicle classification by acoustic signature. Math Comput Model 27(9–11):205–214
58. Bao M, Zheng C, Li X, Yang J, Tian J (2009) Acoustic vehicle detection based on bispectral entropy. IEEE Signal Process Lett 16:378–381
59. Guo B, Nixon MS, Damarla T (2012) Improving acoustic vehicle classification by information fusion. Pattern Anal Appl 15:29–43
60. Averbuch A, Zheludev VA, Rabin N, Schclar A (2009) Wavelet-based acoustic detection of moving vehicles. Multidim Syst Sign Process 20:55–80
61. Averbuch A, Zheludev VA, Neittaanmaki P, Wartiainen P, Huoman K, Janson K (2011) Acoustic detection and classification of river boats. Appl Acoust 72:22–34
62. Quaranta V, Dimino I (2007) Experimental training and validation of a system for aircraft acoustic signature identification. J Aircraft 44:1196–1204
63. Wu H, Siegel M, Khosla P (1999) Vehicle sound signature recognition by frequency vector principal component analysis. IEEE Trans Instrum Measur 48:1005–1009
64. Aljaafreh A, Dong L (2010) An evaluation of feature extraction methods for vehicle classification based on acoustic signals. In: Proceedings of the 2010 international conference on networking, sensing and control (ICNSC), pp 570–575
65. Lee J, Rakotonirainy A (2011) Acoustic hazard detection for pedestrians with obscured hearing. IEEE Trans Intell Transp Syst 12:1640–1649
66. Braun ME, Walsh SJ, Horner J, Chuter R (2013) Noise source characteristics in the ISO 362 vehicle pass-by noise test: literature review. Appl Acoust 74:1241–1265
67. Westervelt P (1957) Scattering of sound by sound. J Acous Soc Am 29:199–203
68. Westervelt P (1957) Scattering of sound by sound. J Acous Soc Am 29:934–935
69. Pompei FJ (2002) Sound from ultrasound: the parametric array as an audible sound source. Doctoral dissertation, Massachusetts Institute of Technology
70. Pierce AD (1989) Acoustics: an introduction to its physical principles and applications. The Acoustical Society of America, New York
71. Beyer RT (1960) Parameter of nonlinearity in fluids. J Acous Soc Am 32:719–721
72. Hamilton MF, Blackstock DT (1998) Nonlinear acoustics. Academic, San Diego
73. Beyer RT (1974) Nonlinear acoustics. Brown University Department of Physics, Providence

Chapter 9 Cranks and Charlatans and Deepfakes

Mark K. Hinders and Spencer L. Kirn

Abstract Media gate-keepers who curate information and decide which things regular people should see and read are largely superfluous in the age of social media. There have always been hoaxsters and pranksters and hucksters, but who and which online sources to trust and believe is becoming an urgent problem for us regular people. Snake oil salesmen provide insight into what to watch for when skeptically evaluating the claims of machine learning software salesreps. Digital photos are so easy to manipulate that we can't believe what we see in pictures, of course, but now deepfake videos mean we can't tell when video evidence is deliberately misleading. Classic faked UFO photos instruct us to pay attention to the behavior of the photographer, which can now be done automatically by analyzing tweetstorms surrounding events. Topic modeling gives a way to form time-series plots of tweetstorm subjects that can then be transformed via dynamic wavelet fingerprints to isolate shapes that may be characteristic of organic versus artificial virality.

Keywords Tweetstorm · Digital charlatans · Natural language processing · Wavelet fingerprint

9.1 Digital Cranks and Charlatans

Question: Whose snake oil is more dangerous, the crank's or the charlatan's? First, some historical context. Medicine in the modern sense is a fairly recent phenomenon, with big advances happening when we're at war. During the Civil War, surgeons mostly knew to wash their saws between amputations, but their primary skill was speed because anesthesia (ether) wasn't really used in the field hospitals. Part of the punishment for losing WWI was Germany giving up its aspirin patent. Antibiotics can be thought of as a WWII innovation along with radar and the A-bomb. In our current wars to ensure the flow of oil and staunch the resulting spread of terrorism, we have been doing an excellent job of saving the lives of those who have been gravely wounded on the battlefield, as evidenced by the now common sight of prosthetic arms and legs in everyday America. Same goes for the dramatic drop in murder rates that big-city mayors all took credit for. Emergency medical technology on the battlefield and in the hood is really quite amazing, but it's very recent. In the late nineteenth century there really wasn't what we would now call medicine [1–7]. Nevertheless, there was a robust market for patent medicines, i.e., snake oil. Traveling salesmen would hawk them, you could order them from the Sears and Roebuck catalog, and so on. There was no way to know what was in them because the requirement to list ingredients on the label didn't arise until the Pure Food and Drug Act of 1906.
The modern requirement to prove both safety and efficacy before marketing a drug or treatment was still far in the future. You might think, then, that the soothing syrup that was so effective at getting a teething baby to sleep was mostly alcohol, like some modern cough syrups. Nope. Thanks to recent chemical analyses of samples from the Henry Ford Museum, we now know that many of these snake oils were as much as thirty percent opiates by weight. Rub a bit of that on the baby's gums. Have a swig or three yourself. Everybody has a nice long nap.

Some of the hucksters who mixed up their concoctions in kitchens, barns, etc. knew full well that their medicines didn't actually treat or cure any ailments, but the power of suggestion can and does make people feel better if it's wrapped up properly in a convincing gimmick. This turns out to be how ancient Chinese medicine works. Acupuncture makes you feel better because you think it should make you feel better. Billions of people over thousands of years can't have been wrong! Plus, you see the needles go in and you feel a tingling that must be your Qi unblocking. Don't worry, placebos still work even after you've been told that they're placebos. Similarly, tap water in a plastic bottle sells for way more if there's a picture of nature on the label. You know perfectly well there's no actual Poland Spring in Maine and that your square-bottled Fiji is no more watery than regular water, but their back stories make the water taste so much more natural. Plus, everybody remembers that gross kid in school who put his mouth right on the water fountain, ew!

Hucksters who know perfectly well that their snake oil isn't medicine are charlatans. They are selling you pretend medicine (or tap water) strictly to make a buck. The story they tell is what makes their medicine work. Even making it taste kind of bad is important, because we all know strong medicine has side effects. Oh, and expensive placebos actually do work better than cheap ones, so charlatans don't have to feel bad about their high profit margins. Some of the hucksters have actually convinced themselves that the concoction they've mixed up has curative powers bordering on the magical, kind of like the second-grade teacher whose effervescent vitamin tablets prevent you from getting sick after touching in-flight magazines. These deluded individuals are called cranks. Charlatans are criminals. Cranks are fools. But whose snake oil is more dangerous? Note again that all medicine has side effects, which you hear listed during commercials for medicines on the evening news. Charlatans don't want to go to prison, and they want a reliable base of repeat customers. Hence, they will typically take great care not to kill people off via side effects. Cranks are much more likely to overlook quite dangerous side effects in their fervent belief in the ubiquitous healing powers of their concoctions. That fervent belief can also make cranks surprisingly effective at convincing others. They are, in fact, true believers.

Fig. 9.1 Charlatans know full well their snake oil works via the placebo effect. Cranks may overlook dangerous side effects because they fervently believe in their concoction's magical curative powers. School pills give a student knowledge without having to attend lessons [8–10] so that the student's time can instead be applied to athletic pursuits. School pills are sugar coated and easily swallowed, unlike most machine learning software
Their snake oil is way more dangerous than the charlatan's, even the charlatan who is a pure psychopath that lacks any ability to care whether you live or die (Fig. 9.1).

Raymond Merriman is a financial astrologer and author of several financial market timing books, and is the recipient of the Regulus Award for Enhancing Astrology's Image as a Profession [11]. "Financial markets offer an objective means to test astrological validity. The Moon changes signs every 2–3 days and is valuable for short-term trading. Planetary stations and aspects identify longer term market reversals. Approximately 4–5 times per year, markets will form important highs or lows, which are the most favorable times to buy and sell." If you'd like to pay actual money to attend, his workshop "provides research studies showing the correlation of astrological factors to short-term and longer term financial market timing events in stock markets, precious metals, and Bitcoin." His website does offer a free weekly forecast, so there's that. Guess what he didn't forecast in Q1 2020?

Warren Buffett is neither a crank nor a charlatan. He is the second or third wealthiest, and one of the most influential, business people in America [12], having earned the nickname "Oracle of Omaha" by predicting the future performance of companies and then hand-picking those that he thought were going to do well. Ordinary people who bought shares in his company have done quite well. Nevertheless, Buffett advocates index funds [13] for people who are either not interested in managing their own money or don't have the time. He is skeptical that active management can outperform the market in the long run, and has advised both individual and institutional investors to move their money to low-cost index funds that track broad, diversified stock market indices. Buffett said in one of his letters to shareholders [14] that "when trillions of dollars are managed by Wall Streeters charging high fees, it will usually be the managers who reap outsized profits, not the clients." Buffett won a friendly wager with numerous managers that a simple Vanguard S&P 500 index fund would outperform hedge funds that charge exorbitant fees. Some call that a bet; some call it a controlled prospective experiment.

If you're trying to figure out whether a new miracle cure or magic AI software actually works, you have to figure out some way to test it in order to make sure you aren't being fooled or even fooling yourself. A casual perusal of software platforms for traders, both amateur day-traders and professional money managers, sets off my bullshit detectors, which I have honed over 35 years of learning about and teaching pseudoscience. It seems obvious to me that there are lots of cranks and charlatans in financial services. I suppose that's always been the case, but now that there's so much AI and machine learning snake oil in the mix it has suddenly become important to understand their modus operandi. OK, so the stock market isn't exactly life and death, but sometimes it sure feels like it, and a little soothing syrup would often hit the spot right about 4 pm Eastern time. How do we go about telling the crank from the charlatan when the snake oil they're selling is machine learning software? They're pretty unlikely to list the algorithmic ingredients on their labels, since that might involve open-sourcing their software.
Even if you could get access to the source code, good luck decoding that uncommented jumble of function calls, loops, and what-ifs. The advice of diplomats might be: trust, but verify. Scientists would approach it a bit differently. One of the key foundational tenets of science is that skeptical disbelief is good manners. The burden of proof is always with the one making a claim of some new finding, and the more amazing the claim is, the more evidence the claimant must provide. Politicians (and cranks) always seem to get cross when we don't immediately believe their outlandish promises, so we can use that sort of emotional response as a red flag. Another key aspect of science is that no one experiment, no matter how elegantly executed, is convincing by itself. Replication, replication, replication. Independent verification by multiple laboratories is how science comes around to new findings. Much of the research literature in psychology is suspect right now because of its so-called replication crisis [15]. Standard procedures in psychology have often involved what's called p-hacking, which is basically slicing and dicing the results with different sorts of statistical analysis methods until something notable shows up with statistical significance. In predictive financial modeling it's a trivial matter to back test your algorithms over a range of time slices and market-segment dices until you find a combination where your algorithms work quite well. Charlatans will try to fool you by doing this. Cranks will fool themselves. We can use this as another red flag, and again science has worked out for us a way to proceed.

Let's say you wanted to design a controlled test of urine therapy [16–24], which is the crazy idea that drinking your own tinkle has health benefits. I suppose that loners who believe chugging their first water of the morning will make them feel better do actually feel better, but that's almost certainly just the placebo effect [25]. Hence, any test of urine therapy would have to be placebo-controlled, which brings up the obvious question of what one would use for a placebo. Most college students that I've polled over the last several years agree that warm Natty Light tastes just like piss, so that would probably work. So the way a placebo-controlled study goes is, people are randomly assigned to either drink their own urine or chug a beer, but they can't know which group they're in, so we'll have to introduce a switcheroo of some sort. What might work is that everybody pees into a cup and then gives the cup to an assistant who either hands back that cup of piss or hands over an identical cup full of warm Natty Light according to which group the testee is in. It might even be necessary to give the testee an N95 facemask and/or the tester a blindfold so as to minimize the chances of inadvertently passing some clues as to what the cup contains. Then another research assistant who has no idea whether the testee has been drinking urine or beer will be asked to assess some proposed health benefit, which is the part that makes it double blind. If the person interpreting the results knows who's in which group, that can easily skew their assessment of the results. Come to think of it, the testees should probably get an Altoid before they go get their health assessed. That scheme is pretty standard, but we've left out a very important thing that needs to get nailed down before we start. Namely, what is urine therapy good for?
If we look for ill-defined health benefits we almost certainly will be able to p-hack the results afterwards in order to find some, so we have to be crystal clear up front about what it is that drinking urine might do for us. If there's more than one potential health benefit, we might just have to run the experiment again with another whole set of testees. Don't worry, Natty Light is cheaper than water.

So when somebody has a fancy new bit of Python code for financial modeling or whatever, we're going to insist that they show that it works. The burden of proof lies with them. We are appropriately skeptical of their claims, and the more grandiose their claims are, the more skeptical we are. That's just good manners, after all. And since we're afraid of p-hacking, we're going to insist that a specific benefit be described. We might even phrase that as, "What is it good for?" Obviously we're not going to buy software unless it solves a need, and we're not dumb enough to believe that a magic bit of coded algorithm could solve a whole passel of needs, so pick one and we'll test that. We'll also need something to compare against, which I suppose could be random guessing but probably is going to be some market average or what an above-average hedge fund can do or whatever.1 Then we back test, but we're going to have to blind things first. That probably means having assistants select slices of time and dices of market sectors and running back tests without knowing whether they're using the candidate machine learning software or the old stand-by comparison. Other assistants who weren't involved in running the back tests can then assess the results, or if there's a simple metric of success, then that can be used to automatically score the new software against the standard. Of course, some of those back tests should run forward a bit into the future so that we can get a true test of whether they work or not. If the vendor really believes in their software, they might even be willing to agree to stake a bet on predictions a fair chunk into the future. If they're right, you'll buy more software from them. If they're wrong, you get your money back and they have to drink their own urine every morning for a month.

1 Warren Buffett would say compare to some Vanguard low-load index funds.

9.2 Once You Eliminate the Possible, Whatever Remains, No Matter How Probable, Is Fake News [26]

Harry Houdini was a mamma's boy. He loved his mommy. It was awkward for his wife. Then his mommy died, and he wanted nothing more than to talk to her one last time to make sure she was all right and knew that he still loved her. Eternally awkward for his wife. I know what you're thinking, so here's Houdini in his own words, "If there ever was a son who idolized and worshipped his Mother, that son was myself. My Mother meant my life, her happiness was synonymous with my peace of mind." Notice that it's capital-M. Houdini was perhaps the world's most famous performer. He may have been the most talented magician ever. He was also an amazing physical specimen, and many of his escapes depended on the rare combination of raw talent, physical abilities, and practice, practice, practice. His hotness also helped with at least half of his audience, because of course he had to be half-clothed to prove he wasn't hiding a key or whatnot. That may have been awkward for both Mrs. Houdinis. Sir Arthur Conan Doyle was Houdini's frenemy.
He thought that Houdini wasn't abnormally talented physically, but that he had the paranormal ability to disapparate his physical body from inside the locked chest under water, or whatever, and then re-apparate it back on land, stage, etc. I suppose that could explain all manner of seemingly impossible escapes, but geez. Doyle's mind was responsible for Sherlock Holmes, who said that when you have eliminated the impossible, whatever remains, however improbable, must be the truth. Doyle apparently didn't allow for the possibility of Houdini's rare combination of raw talent, physical abilities, and practice, practice, practice. He did, however, believe in fairies. The cardboard-cutout kind. Not in person, of course, but photos. Notice in Fig. 9.2 that the wings of the hovering fairies are captured crisply but the waterfall in the background is blurred. That's not an Instagram filter, because this was a century ago. Otherwise, those little girls had a pretty strong selfie game.

Fig. 9.2 So you see, comrade, the Russians did not invent fake news on social media

It's always been fun for kids to fool adults. Kate and Margaret Fox were two little girls in upstate New York in 1848 who weren't tired even though it was their bedtime, so they were fooling around. When their mom said to "quiet down up there" they said, of course, that it wasn't them. It was Mr. Slipfoot. A spirit. They asked him questions and he responded in a sort of Morse code by making rapping noises. Had momma Fox been less gullible she might have asked the obvious question about ghosts and spirits, "If they can freely pass through walls and such, how can they rattle a chain or make a rapping noise? Now go to sleep, your father and I worked hard all day and we're tired." They didn't quiet down, because they had merely been playing all day and weren't tired at all. When their mom came to tell them one last time to hush, they had an apple tied to a string so they could make the rapping noises with her standing right there. Mr. Slipfoot backed up their story. Momma Fox was amazed. She told the neighbors. They were similarly amazed, and maybe even a bit jealous because their own special snowflakes hadn't been singled out by the Spirits to communicate on behalf of the living. How long till I die, and then what? Everybody ever has had at least some passing curiosity about that. Kate and Margaret had a married sister who saw the real potential. She took them on tour, which they then did for the rest of their lives. Margaret gave it up for a while and admitted it was all trickery, but eventually had to go back to performing tricks to make a living.2

2 Rapping noises, not those tricks you perv.

What amazes me is that as little kids they figured out how to make the raps by swishing their fingers and then their toes, and do it loudly enough to make a convincing stage demonstration. Margaret says, "The rappings are simply the result of a perfect control of the muscles of the leg below the knee, which govern the tendons of the foot and allow action of the toe and ankle bones that is not commonly known." That sounds painful. Margaret continues, "With control of the muscles of the foot, the toes may be brought down to the floor without any movement that is perceptible to the eye. The whole foot, in fact, can be made to give rappings by the use only of the muscles below the knee.
This, then, is the simple explanation of the whole method of the knock and raps." My knee sometimes makes these noises when I first get out of bed in the morning.

Look up gullible in an old thesaurus and you'll find its antonym is Houdini. You'll also find his picture next to the definition of momma's boy in most good dictionaries from the roaring twenties. To quote one final time from his book [27]: "I was willing to believe, even wanted to believe. It was weird to me and with a beating heart I waited, hoping that I might feel once more the presence of my beloved Mother." Also, "I was determined to embrace Spiritualism if there was any evidence strong enough to down the doubts that have crowded my brain for the last thirty years." As Houdini traveled the world performing, he did two things in each town. First, he gave money to have the graves of local magicians maintained. Second, he sought out the most famous local Spiritualist medium and asked her if she could please get in contact with his beloved Mother, which they all agreed they could easily do. Including Lady Doyle, Sir Arthur's wife. They all failed. Lady Doyle failed on Mother's birthday. Spiritualist mediums were all doing rather rudimentary magic tricks, which Houdini spotted immediately because it takes one to know one. That pissed him off so much that he wrote the book I've been quoting from. To give an idea of the kind of things that Spiritualists were getting away with, imagine what you might think if your favorite medium was found to have your family jewels in his hand.3 You would be within your rights to assume that you were being robbed. But no, it was the Spirits who dematerialized your jewels from your safe and rematerialized them into the hands of the medium while the lights were off for the séance. It's proof that the spirits want your medium to have your valuables, and if you call the police that will piss them off and you'll never get the spirits to go find Mother and bring her around to tell you that she's been thinking of you and watching over you and protecting you from evil spirits as any good Mother would do for her beloved son.

3 Diamonds and such, you perv. Geez.

Most people remember that Houdini left a secret code with his wife, so that after he died she could ask mediums to contact him, and if it was in fact the Great Houdini he could verify his presence with that code. What most people don't know is that Houdini also arranged codes with other close associates who died before he did. No secret codes were ever returned. If anybody could escape from the great beyond it would be the world's (former) greatest escape artist. So much for Spiritualism. But wait, just because Houdini's wife wasn't able to contact her husband after he died, that doesn't mean anything. Like Mother would let him talk to that woman now that she had him all to herself, for eternity.

9.3 Foo Fighters Was Founded by Nirvana Drummer Dave Grohl After the Death of Grunge

The term "flying saucer" is a bit of a misnomer. When private pilot Kenneth Arnold kicked off the UFO craze [28] by reporting unidentifiable flying objects in the Pacific Northwest in 1947, he said they skipped like saucers, not that they were shaped like saucers. It's kind of a shame that he didn't call them flapjacks, because there actually was an experimental aircraft back then with that nickname, see Fig. 9.3.
Many of the current stealth aircraft are also flying wing designs that look quite a lot like flying saucers from some angles, and some of the newest military drones look exactly like flying pancakes.

Fig. 9.3 The Flying Flapjack was an experimental U.S. Navy fighter aircraft designed by Charles H. Zimmerman for Vought during World War II. This unorthodox design consisted of a flying-saucer-shaped body that served as the lifting surface [29]. The proof-of-concept vehicle was built under a U.S. Navy contract and it made its first flight on November 23, 1942. It has a circular wing 23.3 feet in diameter and a symmetrical NACA airfoil section. A huge 16-foot-diameter three-bladed propeller was mounted at the tip of each airfoil, blanketing the entire aircraft in their slipstreams. Power was provided by two 80 HP Continental engines. Although it had unusual flight characteristics and control responses, it could be handled effectively without modern computerized flight control systems. It could almost hover, and it survived several forced landings, including a nose-over, with no serious damage to the aircraft or injury to the pilot. Recently, the Vought Aircraft Heritage Foundation Volunteers donated over 25,000 labor hours to complete a restoration effort, and the aircraft is on long-term loan from the Smithsonian Institution

Fig. 9.4 Most UFO sightings are early aircraft or weather balloons or meteorological phenomena [31]. The planet Venus is commonly reported as a UFO, including by former President Jimmy Carter

So, lots of people started seeing flying saucers and calling up the government to report them. The sightings were collected and collated and evaluated in the small, underfunded Project Bluebook (Fig. 9.4). Then one crashed near Roswell, NM [30] and the wreckage was recovered. Never mind, said the Air Force. "It was just a weather balloon." Actually it was a top-secret Project Mogul balloon, but they couldn't say that in 1947 because it was top secret. If a flying saucer had crashed and been recovered, of course it would have been top secret. Duh. In 1947 we were this world's only nuclear superpower. Our industrial might and fighting spirit (and radar) had defeated the Axis of evil. Now all we had to worry about was our Soviet frenemies. And all those other Commies. But if we could reverse-engineer an interplanetary spacecraft we would be unstoppable. Unless the aliens attacked, in which case we had better hurry up and figure out all their advanced technology so we can defend ourselves. Ergo, any potentially usable physical evidence from a UFO would be spirited away to top-secret government labs where "top men" would be put to work in our national interest. The key thing to know about government secrecy is that it's compartmentalized [32–43]. You don't even know what you don't know because before you can know, somebody who already knows has to decide that you have a need to know what you don't yet know. You will then be read into the program and will know, but you can't ever say what you know to somebody who doesn't have a need to know. There's no way to know when things become knowable, so it's best to make like you don't know. For example, Vice President Truman didn't know about the Manhattan Project [44] because the President and the Generals didn't think he needed to know. Then when Roosevelt died the Generals had to have a slightly awkward conversation about this
top-secret gadget that could end the war and save millions of American lives. It's not too surprising that there's no good evidence of UFOs, that I know of....

In 1950 Paul and Evelyn Trent took a couple of pictures of a flying saucer above their farm near McMinnville, OR. Without wreckage, pictures are the best evidence, except that in the 1950s this is how deepfakes were done, so it's worth critically assessing both the photos themselves and the behavior of those involved. Life magazine [45] published cropped versions of the photos (Fig. 9.5) but misplaced the original negatives. That's highly suspicious, because if real it would be a huge story. In those days the fakery typically happened when prints were made from the negatives, so careful analysis of the original negatives was key. Another oddity is that because the roll of film in Paul Trent's camera was not entirely used up, they did not have the film developed immediately. You never really knew if your pictures had come out until you had them developed, of course, so you'd think the Trents would take a bunch of selfies to finish out the roll and run right down to the drug store. The Trents said that the pictures were taken at 7:30 pm, but analysis of the shadows shows pretty clearly that it was instead morning. Also, the other objects in the photos (oil tank, bush, fencepost, garage) allow the 3D geometry to be reconstructed. Clearly it's a small object hanging from the wires, and the photos were taken from two different lateral positions, rather than a large distant object zooming past with two photos snapped from the same position while panning across as the object passed by. Despite being thoroughly debunked, the McMinnville UFO photographs remain perhaps the best publicized in UFO history. They're most convincing if the overhead wires are cropped out of the pictures.

Fig. 9.5 Trent photos with indications of how they are typically cropped. The Condon Committee observed, "The object appears beneath a pair of wires, as is seen in Plates 23 and 24. We may question, therefore, whether it could have been a model suspended from one of the wires. This possibility is strengthened by the observation that the object appears beneath roughly the same point in the two photos, in spite of their having been taken from two positions." and concluded [46] "These tests do not rule out the possibility that the object was a small model suspended from the nearby wire by an unresolved thread"

Rex Heflin's UFO photos are at least as famous as the Trents' (Fig. 9.6), and again it's the behavior of the photographer that is of most interest. First of all, he is the only one who saw it, but a large UFO in the distance would have been above one of the busiest freeways in southern California at noon on a weekday. He originally produced three Polaroids, then later came up with a fourth. Rather than providing the originals for analysis, he said the "men from NORAD" took them, so all he had was copies.

Fig. 9.6 In 1965 an Orange County, CA highway inspector named Rex Heflin [47] snapped three close-up photos through the windows of his truck of a low-flying UFO with the Polaroid camera he carried for work. They clearly show a round, hat-like object with a dark band around its raised superstructure, and in the side mirror you can see the line of telephone poles it was hanging from
He had snapped the photos from inside the cab of his pickup, so the UFO is framed by that, and the telephone poles are only visible accidentally in the side mirror of one of the photos. Oops. Both the Trent and the Heflin photos have been analyzed by experts on both sides. Ufologists declared them genuine. Others, not so much. Or any. Just, nope. Double nope. And these are the best UFO photos in existence, something both sides agree to stipulate. So there's that. What we needed was a controlled test of this sort of thing. What you might call a hoax. It turns out that ufologists are fun to pwn.

The Trent and Heflin episodes are what I call shallowfakes. Warminster isn't quite a deepfake, but it's at least past your navel. In 1970 Flying Saucer Watch published photographs taken in March of that year at Cradle Hill in Warminster, UK, showing a UFO at night. This was a place where ufologists would gather to watch the skies and socialize. It was before the Internet. Mixed in with the enthusiasts were a couple of hoaxers, one of whom had a camera and excitedly took a picture or three of a UFO across the way. Figure 9.7 shows one of those photos and also what they actually were looking at. Two more of the Warminster photos are shown in Fig. 9.8. Part of the hoax was to shine the purple spotlight and pretend to take a picture of it, and then see if anybody noticed the difference between the UFO in the photo and what they all saw blinking purple on and off on the other hill. The second part of the test was that some of the photos were taken on a different night, so the pattern of lights in the pictures was a bit different. Nobody noticed that either. The hoaxers let the experiment run for a couple of years and with some amusement watched a series of pseudo-scholarly analyses pronounce the photos genuine. See, for example, [48].

Fig. 9.7 A UFO over Cradle Hill in Warminster, UK. The photo on the left was pre-exposed with an indistinct UFO shape. The photo on the right shows what the ufologists were actually seeing. Note that the UFO in the photo, which is indicated by an arrow, isn't a purple spotlight

Fig. 9.8 Two photos of the scene of the UFO over Cradle Hill in Warminster, UK, except that these pictures were taken on different nights (with different patterns of lights) and from different enough perspectives (streetlight spacings are different) that ufologists investigating the sightings should have noticed

At this point you might want to refer back to the Spectral Intermezzo chapter. That was intended to be a bit silly, of course, but now that we've talked explicitly about cranks/charlatans and hoaxes/pwns, you should be better able to spot such obvious attempts to apply machine learning lingo to pseudoscientific phenomena. You might also appreciate Tim Minchin's 9-min beat poem, Storm [49], where he notes that Scooby Doo is like CSI for kids.

9.4 Digital Imaging Is Why Our Money Gets Redesigned so Often

A picture is worth a thousand words, so in 2002 the website worth1000.com started having photoshopping contests. Go ahead and pick your favorites from Figs. 9.9, 9.10, and 9.11. I may have done the middle one. We've all gotten used to the idea that most images we see in print have been manipulated, but people didn't really pay attention until Time magazine added a tear to an existing photo of the Gipper [51]. First there were Instagram filters (Fig.
9.12) and now Chinese smartphones come with automatic image-manipulation software built in, and it's considered bad manners to not "fix" a friend's pic before you upload a selfie. We are now apparently in the age of Instagram Face [52]. "It's a young face, of course, with poreless skin and plump, high cheekbones. It has catlike eyes and long, cartoonish lashes; it has a small, neat nose and full, lush lips." So now we're at this strange moment in technology development when seeing may or may not be believing [54–58]. Figures 9.9 and 9.12 are all in good fun, but you just have to get used to the idea that the news isn't curated text delivered to your driveway in the morning by the paperboy anymore. The news is images that come to that screen you're holding all the damn time and scrolling through when you should be watching where you're going. "Powerful images of current events, of controversies, of abuses have been an important driver of social change and public policy. If the public, if the news consuming, image consuming, picture drenched public loses confidence in the ability of photographers to tell the truth in a fundamental way, then the game is up." [59]

Fig. 9.9 Three of the five stooges deciding the fate of post-war Europe [50]. Shemp and Curly-Joe are not pictured. Joe Besser isn't a real stooge, sorry Stinky

Fig. 9.10 UFO over William and Mary took no talent and almost no time. The lighting direction isn't even consistent between the two component images

Fake news on social media seems to be able to influence elections. Yikes! At some level we're also going to have to get used to the idea that any given picture might be totally fake. I'm willing to stipulate for the record that there is no convincing photographic evidence of alien visitation. The best pictures from the old days were faked, and ufologists showed themselves incapable of even moderate skepticism during the Warminster pwn.4 It would be trivially simple to fake a UFO selfie these days. But what about video? It's bad news. Really bad news, for Hermione Granger in particular.

4 I said pwn here, but it was a reasonably well-controlled prospective test of ufologists.

Fig. 9.11 This took some real talent, and time [50]. Even today, I would expect it to win some reddit gold even though I'm not entirely sure what that means

Fig. 9.12 This Guy Can't Stop Photoshopping Himself Into Kendall Jenner's Instagram Pics, and it's hilarious because she has someone to 'Shop her pics before she uploads them [53]

We probably should have seen it coming, though. In 2007 a series of videos hit the interwebs showing UFOs over various cities around the world [60, 61] made by a 35-year-old professional animator who had attended one of the most prestigious art schools in France and brought a decade of experience with computer graphics and commercial animation to the party. It took a total of 17 h to make the infamous Haiti and Dominican Republic videos, working by himself using a MacBook Pro and a suite of commercially available three-dimensional animation programs. The videos are 100% computer-generated, and may have been intended as a viral marketing ploy, in which case it worked, since they racked up tens of millions of views on YouTube [62].
"No other footage of a UFO sailing through the sky has been watched by as many people."

9.5 Social Media Has Sped Up the Flow of Information

Deepfake is the muggle name for putting the face of a celebrity onto another body in an existing video. The software is free. It runs on your desktop computer. It's getting better fast enough that soon analysis of the deepfake video footage won't be able to tell the difference [63–68]. One strategy is to embed watermarks or some sort of metadata into the video in order to tell if it's been altered [69]. Some young MBA has probably already declared confidently via a thick slide deck that Blockchain is the answer. YouTube's strategy [70] is to "identify authoritative news sources, bring those videos to the top of users' feeds, and support quality journalism with tools and funding that will help news organizations more effectively reach their audiences. The challenge is deciding what constitutes authority when the public seems more divided than ever before on which news sources to trust—or whether to trust the traditional news industry at all."

Recall the lesson from the Trents and Rex Heflin that we should look to the behavior of the photographer to assess the reliability of their photographs. Are they cranks or charlatans? I forgot to mention it before, but the Trents were repeaters who had previously faked UFO photos. Those early attempts didn't go viral via Life magazine, of course, but in my rulebook any previous faking gets you a Pete Rose-style lifetime ban. Any hint of shenanigans gets you banished. We can't afford to play around anymore, so if you've been photoshopping yourself into some random Kardashian's social media we don't ever believe your pics, even if you're trying to be serious this time.

Content moderation at scale isn't practical, but Twitter can be analyzed in bulk to see what people are tweeting about, and if you have enough tweets, topic modeling can go much further than keeping track of what's trending. Tweetstorms are localized in time, so trending isn't really an appropriate metric. Trends are first derivatives. We should at least look at higher derivatives, don't you think? Second derivatives give the rate of change of a trend, i.e., acceleration. That would be better. How fast something trends and then detrends should be pretty useful. Our current @POTUS has figured out that tweeting something outrageous just often enough distracts the mainstream media quite effectively and allows the public narrative to be steered. The mainstream media even use up screentime on their old-fashioned newscasts to read for oldsters the things @POTUS just tweeted. They also show whatever viral video trended on the moms' Facebook a couple of days ago and had been seen by all the digital natives a couple of days before that. These patterns can be used to identify deepfakes before they make it to the screens of people who are too used to believing their own eyes. It's not the videos per se that we analyze, but the pattern of spread of the videos and, most importantly, the cloud of conversations happening during that spread. Social media platforms come and go on a relatively short time scale, so the key is to be able to analyze the text contained in whatever platform people are using to communicate and share.
We're talking about Twitter, but the method described below can be adapted to a wide variety of communication modalities, including voice, because speech-to-text already works pretty well and is getting better.

Fig. 9.13 A group of squirrels that isn't a family unit is called a scurry. A family of squirrels is called a dray. Most squirrels aren't social, though, so they normally don't gather in a group. Oh look: pizza [71, 72]

Unless you downloaded this chapter first and haven't read the rest of this book, you probably can guess what I'm about to say: dynamic wavelet fingerprints. Tweetstorms have compact support, by which I mean that they start and stop. Tweetstorms have echoes via retweeting and such. Wavelets are mathematically suited to analyzing time-domain signals that start and stop and echo in very complicated ways. Wavelet fingerprints allow us to identify patterns in precisely this sort of complex time-domain signal. First we have to be able to extract meaning from tweetstorms, but that's what topic modeling does. Storms of tweets about shared cultural events will be shaped by the natural time scale of the events and whether people are watching live or on tape delay or are waking up the next morning and just then seeing the results. There should be fingerprints that allow us to identify these natural features. Deepfakery-fueled tweetstorms and echoes should leave fingerprints that allow us to identify unnatural features. It should be possible to flag deepfakes before they make it to the moms' Facebook feeds and oldsters' television newscasts (Fig. 9.13).

9.6 Discovering Latent Topics in a Corpus of Tweets

In an age where everyone is rushing to broadcast opinions and inanities and commentaries on the Internet, any mildly notable event can trigger a tweetstorm. We define a tweetstorm as any surge in a specific topic on social media, where tweet is used in a generic sense as a label for any form of unstructured textual data posted to the interwebs. Since tweetstorms generate vast amounts of text data in short amounts of time, text analytics gives us the ability to dissect and analyze Internet chatter in a host of different situations. Companies can analyze transcribed phone calls from their call centers to see what customers call about most often, which is exactly what happens when they say "this call may be monitored for quality control purposes." Journalists can analyze tweets from their region to determine which stories are the most interesting to their audience and will likely have the greatest impact on the local community. Political campaigns can analyze social media posts to discern various voters' opinions on issues and use those insights to refine their messages. Topic modeling is an extraordinarily powerful tool that allows us to process large-scale, unstructured text data to uncover latent topics that would otherwise be obscured from human analysis. According to the Domo Data Never Sleeps study, nearly 475,000 tweets are published every minute [73]. Even with this firehose of data continually streaming in, it is possible to run analysis in real time to examine the popular topics of online conversation at any given instant. Using unsupervised machine learning techniques to leverage large-scale text analytics allows for the discovery of hidden structures, groupings, and themes throughout massive amounts of data.
Although humans possess a natural ability to observe patterns, manual analysis of huge datasets is highly impractical. To extract topics from Twitter we have explored a range of topic modeling and document clustering methods including non-negative matrix factorization, doc2vec, and k-means clustering [74].

9.6.1 Document Embedding

Document and word embeddings are instrumental in running any kind of natural language modeling algorithm because they produce numerical representations of language which can be interpreted by a computer. There are two common methods for creating both document and word embeddings: document-term methods and neural network methods. Document-term methods are the simplest and most common ways to create document embeddings. Each of these produces a matrix with dimensions m × n, where m is the total number of documents in the corpus being analyzed and n is the total number of words in the vocabulary. Every entry in the matrix represents how a word in the vocabulary is related to each document. A weight is assigned to each entry, often according to one of three common weighting styles. The first of these is a simple binary weighting, where a 1 is assigned to all terms that appear in the corresponding document and a 0 is assigned to all other terms. The frequency of each term appearing in a specific document has no bearing on the weight, so a term that appears once in the text has the same weight as a term that appears 100,000 times. The second method, the term frequency weighting method, accounts for the frequent or infrequent use of a term throughout a text. Term frequency counts the occurrences of each term in a document and uses that count as the assigned weight for the corresponding entry of the document-term matrix. The last weighting style is term frequency-inverse document frequency (TF-IDF). This weighting scheme is defined as

tf-idf_{t,d} = tf_{t,d} × log((N + 1)/df_t),  (9.1)

where tf-idf_{t,d} is the weight value for a term t in a document d, tf_{t,d} is the number of occurrences of term t in document d, N is the total number of documents in the collection, and df_t is the total number of documents that term t occurs in. The TF-IDF weight for each term can be normalized by dividing each term by the Euclidean norm of all TF-IDF weights for the corresponding document. This normalization ensures that all entries of the document-term matrix are between zero and one. The TF-IDF weighting scheme weights terms that appear many times in a few documents higher than terms that appear many times in many documents, allowing more descriptive terms to have larger weights.

Document-term methods for word and document embeddings are simple and effective, but they can occasionally create issues with the curse of dimensionality [75] since they project language into spaces that may reach thousands of dimensions. Not only do such large embeddings cause classification issues, but they can cause memory issues because the document-term matrix might have tens or hundreds of thousands of entries for even a small corpus and several millions for a larger corpus. Since large document-term embeddings can lead to issues downstream with uses of document and word vectors, it is beneficial to have embeddings in lower dimensions. This is why methods like word2vec and doc2vec, developed by Le and Mikolov [76, 77], are so powerful. These methods work by training a fully connected neural network to predict context words based on an input word or document.
The weight matrices created in the training stage contain a vector for each term and document in the training set. Each of these vectors is close to semantically similar terms and documents in the embedding space. The greatest benefits of using neural network embedding methods are that we can control the size of the embedding vectors, and that the vectors are not sparse. This allows vectors to be passed to downstream analysis methods without encountering the issues of large dimension size and sparsity that commonly arise in document-term methods. However, they require significantly more data to train a model. While we can make document-term embeddings from only a few documents, it takes thousands or perhaps millions of documents and terms to accurately train these embeddings. Furthermore, the individual dimensions are arbitrary, unlike document-term embeddings where each dimension represents a specific term. Where sufficient training data does exist, doc2vec has been proven to be adept at topic modeling [78–81].

9.6.2 Topic Models

Clustering is the generic term used in machine learning for grouping objects based on similarities. Soft clustering methods allow objects to belong to multiple different clusters and have weights associated with each cluster, while hard clustering assigns objects to a single group. Matrix decomposition algorithms are popular soft clustering methods because they run quickly and produce interpretable results. Matrix methods for topic modeling are used with document-term embedding methods, but they are useless for more sophisticated embedding methods like doc2vec because the dimensions of those embeddings are arbitrary. However, there has been some work on implementing non-negative matrix factorization (NMF) on word co-occurrence matrices to create word and document embeddings like those generated by word2vec and doc2vec [80, 82].

Non-negative matrix factorization was first proposed as a topic modeling technique by Xu et al. [83]. NMF is a matrix manipulation method that decomposes an m × n matrix A into two matrices

A = WH,  (9.2)

where W is an m × k matrix, H is a k × n matrix, and k is a predefined parameter that represents the number of topics. Decomposing a large matrix into two smaller matrices (Fig. 9.14) is useful for analyzing text from a large corpus of documents to find underlying trends and similarities. The matrix A is a document-term matrix, where each row represents one of the m documents in the corpus and each column represents one of the n terms. The decomposition of A is calculated by minimizing the objective function

L = (1/2) ‖A − WH‖,  (9.3)

where both W and H are constrained to non-negative entries [83]. The matrix norm can be any p-norm, but is usually either the 1-norm or the 2-norm or a combination of the two. In topic modeling applications, k is the number of topics to be extracted from the corpus of m documents. Thus, the matrix W is referred to as the document-topic matrix. Each entry in the document-topic matrix is the weight of a particular topic in a certain document, where a higher weight indicates the prevalence of that topic throughout the corresponding document. The matrix H is the topic-term matrix. Weights in H represent the relevance of each term to a topic.

Fig. 9.14 Decomposition of a document-term matrix, A, into a document-topic matrix, W, and a topic-term matrix, H
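To make the document-term pipeline concrete, here is a minimal sketch in Python using scikit-learn, whose NMF implementation is the one we rely on later in this chapter. The toy corpus, the feature cap, and the choice of k are illustrative assumptions only, and note that scikit-learn's smoothed IDF differs slightly from Eq. (9.1) while serving the same purpose.

# Minimal sketch: TF-IDF document-term matrix A factored as A = WH (Eq. 9.2).
# The corpus and parameter choices are toy placeholders.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import NMF

corpus = ["messi misses a penalty #argisl",
          "mexico beats germany #germex #worldcup",
          "iceland holds argentina to a draw #worldcup"]

vectorizer = TfidfVectorizer(max_features=1000, stop_words="english")
A = vectorizer.fit_transform(corpus)          # m x n document-term matrix

k = 2                                         # number of topics to extract
model = NMF(n_components=k, init="nndsvd", random_state=0)
W = model.fit_transform(A)                    # m x k document-topic matrix
H = model.components_                         # k x n topic-term matrix

# Topics are read off as the largest weights in each row of H
terms = vectorizer.get_feature_names_out()
for topic_idx, row in enumerate(H):
    top = [terms[i] for i in row.argsort()[::-1][:5]]
    print(f"Topic {topic_idx}: {' '.join(top)}")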
A popular extension of NMF is the dynamic topic model (DTM), which tracks topics through time, updating the weights of terms in topics and adding new topics when necessary [84]. One of the original DTMs was actually derived for latent Dirichlet allocation (LDA) [85]. Several popular DTM extensions of NMF include [86–88].

Principal component analysis (PCA) is a common method for dimension reduction. PCA uses eigenvectors to find the directions of greatest variance in the data and uses those as features in a lower dimensional space. In topic modeling, we can use these eigenvectors as topics inherent in the dataset. PCA operates on the TF-IDF document-term matrix A. We calculate the covariance matrix of A by pre-multiplying it by its transpose

C = (1/(n − 1)) A^T A,  (9.4)

which gives a symmetric covariance matrix whose dimensions are the size of the total vocabulary, with the 1/(n − 1) term acting as a normalizer. The covariance matrix shows how often each word co-occurs with every other word in the vocabulary. We can then diagonalize it to find the eigenvectors and eigenvalues

C = E D E^{−1},  (9.5)

where the matrix D is a diagonal matrix with the eigenvalues as its entries and E is the eigenvector matrix. The eigenvectors have shape 1 × V, where V is the total number of terms in the vocabulary. We select the eigenvectors that correspond to the largest eigenvalues as our principal components and eliminate all others. These eigenvectors are used as the topics, and the original document-term matrix can now be recast into a lower dimensional document-topic space. The terms defining a topic are found by selecting the largest entries in each eigenvector.

Singular value decomposition (SVD) uses the same process as PCA, but it does not rely on the covariance matrix. In the context of topic modeling, SVD is also referred to as latent semantic indexing (LSI) [89], which decomposes the matrix A into

A = U Σ V^T,  (9.6)

where Σ is the diagonal matrix with the singular values along its diagonal. The matrix U represents the document-topic matrix and the matrix V is the topic-term matrix. The largest values of Σ represent the most common topics occurring in the original corpus. We select the largest values and the corresponding vectors from the topic-term matrix as the topics.

Latent Dirichlet allocation (LDA) is a probabilistic model which assumes that documents can be represented as random mixtures over latent topics, where each topic is a distribution over vocabulary terms [90]. LDA is described as a generative process which draws a multinomial topic distribution θ_d ∼ Dir(α), where α is a corpus-level scaling parameter, for each document d. Then for each word w_n ∈ {w_1, w_2, ..., w_N}, where N is the total number of terms in d, we draw one of k topics, z_{d,n}, from θ_d and choose a word w_{d,n} from p(w_n | z_n, β). This probability is a multinomial distribution of length V, where V is the number of terms in the total vocabulary. In this algorithm α and β are corpus-level parameters that are fit through training. The parameter α is a scaling parameter and β is a k × V matrix representing the word probabilities in each topic. β can be compared to the topic-term matrix observed in the prior methods.
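For comparison, an LSI decomposition per Eq. (9.6) can be sketched with scikit-learn's TruncatedSVD, reusing the TF-IDF matrix A and term list from the NMF sketch above; the number of components is again an arbitrary illustrative choice.

# Minimal LSI sketch (Eq. 9.6): truncated SVD of the TF-IDF matrix A
from sklearn.decomposition import TruncatedSVD

svd = TruncatedSVD(n_components=2, random_state=0)
doc_topic = svd.fit_transform(A)        # rows of U*Sigma: documents in topic space
V_T = svd.components_                   # k x n topic-term matrix V^T
print(svd.singular_values_)             # the largest singular values of Sigma

# As with NMF, topics are the largest-magnitude entries of each row of V^T
for topic_idx, row in enumerate(V_T):
    top = [terms[i] for i in abs(row).argsort()[::-1][:5]]
    print(f"LSI topic {topic_idx}: {' '.join(top)}")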
The k-means algorithm seeks to partition an n-dimensional dataset into a predetermined number of clusters, where the variance of each cluster is minimized. The data are initially clustered randomly, and the mean of each cluster is calculated. These means, called the centroids and denoted μ_j, are used to calculate the variance in successive iterations of clusters. After the centroids have been calculated, the distance between each element of the dataset and each centroid is computed, and samples are re-assigned to the cluster with the closest centroid. New centroids are then calculated for the updated clusters. This process continues until the variance for each cluster reaches a minimum and the iterative process has converged. Minimization of the within-cluster sum-of-squares

Σ_{i=1}^{n} min_{μ_j ∈ C} ‖x_i − μ_j‖²

is the ultimate goal of the k-means algorithm [91]. Here, a set of n datapoints x_1, x_2, ..., x_n is divided into k clusters, each with a centroid μ_j. The set of all centroids is denoted C. Unfortunately, clustering becomes difficult since the concept of distance in very high-dimensional spaces is poorly defined. However, k-means clustering often remains usable because it evaluates the variance of a cluster rather than computing the distance between all points in a cluster. It is generally useful to reduce the dimension using SVD or PCA before applying clustering to document-term embeddings, or to train a doc2vec model and cluster the resulting lower dimensional embeddings.

Evaluation of our topic models requires a way to quantify the coherence of each topic generated. Mimno et al. suggest that there are four varieties of "bad" topics that may arise in topic modeling [92]:

1. Chained: Chained topics occur when two distinct concepts arise in the same topic because they share a common word. For example, in a corpus of documents containing texts about river banks and financial banks, a model might generate a topic containing the terms "river", "financial", and "bank". Although "river bank" and "financial bank" are two very different concepts, they still appear in the same topic due to the shared word "bank".
2. Intruded: Intruded topics contain sets of distinct topic terms that are not related to other sets within that topic.
3. Random: Random topics are sets of terms that make little sense when strung together.
4. Unbalanced: Unbalanced topics have top terms that are logically connected, but also contain a mix of both general terms and very specific terms within the topic.

We want to be able to locate "bad" topics automatically, as opposed to manually sifting through and assigning scores to all topics. To accomplish this, Mimno et al. propose a coherence score

C(t, V^{(t)}) = Σ_{m=2}^{M} Σ_{l=1}^{m−1} log((D(v_m^{(t)}, v_l^{(t)}) + 1)/D(v_l^{(t)})),  (9.7)

where t is the topic, V^{(t)} = [v_1^{(t)}, ..., v_M^{(t)}] is the list of the top M terms in topic t, D(v_l^{(t)}) is the document frequency of term v_l^{(t)}, and D(v_m^{(t)}, v_l^{(t)}) is the co-document frequency of the two terms v_m^{(t)} and v_l^{(t)} [92]. The document frequency is a count of the number of documents in the corpus in which a term appears, and the co-document frequency is the count of documents in which two different terms appear together. A 1 is added in the numerator of the log function to avoid the possibility of taking the log of zero. For highly coherent topics the value of C will be close to zero, while for incoherent topics this value will become increasingly negative. The only way for C to be positive is for the terms v_m^{(t)} to only appear in documents that also contain v_l^{(t)}.
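Equation (9.7) is short enough to transcribe directly into Python; in this sketch, docs (a list of token sets) and top_terms (the top-M terms of one topic) are placeholder inputs standing in for whatever corpus and topic model are at hand.

# Minimal sketch of the Mimno et al. coherence score, Eq. (9.7)
import math

def doc_freq(docs, term):
    # D(v): number of documents containing term
    return sum(1 for d in docs if term in d)

def co_doc_freq(docs, term_a, term_b):
    # D(v_m, v_l): number of documents containing both terms
    return sum(1 for d in docs if term_a in d and term_b in d)

def coherence(docs, top_terms):
    score = 0.0
    for m in range(1, len(top_terms)):        # m = 2..M in Eq. (9.7)
        for l in range(m):                    # l = 1..m-1
            num = co_doc_freq(docs, top_terms[m], top_terms[l]) + 1
            den = doc_freq(docs, top_terms[l])
            score += math.log(num / den)
    return score                              # near zero = coherent

docs = [{"mexico", "germany", "germex"}, {"mexico", "goal"}, {"germany", "win"}]
print(coherence(docs, ["mexico", "germany"]))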
9.6.3 Uncovering Topics in Tweets

As an example showing the application of topic modeling to real Twitter data, we have scraped a set of tweets from the 2018 World Cup in Russia and extracted the topics of conversation. Soccer is the most popular sport worldwide, so the World Cup naturally creates tweetstorms whenever matches occur. Beyond a general interest in the event, national teams and television coverage push conversation on social media by promoting and posting various hashtags during coverage. Two of the common World Cup hashtags were #WorldCup and #Russia2018. Individual games even had their own hashtags, such as #poresp for a game between Portugal and Spain, or the individual team hashtags #por and #esp.

Along with the continued swell of general online activity, the rise of 5G wireless technology will facilitate online chatter at stadiums, arenas, and similar venues. KT Corp., a South Korean telecom carrier, was the first to debut 5G technology commercially at the 2018 Winter Olympics. 5G wireless will allow for increased communication at speeds reaching up to 10 gigabits per second, even in areas with congested networks [93]. Furthermore, 5G will allow for better device-to-device communication. For text analysis, this means that more people will have the ability to post thoughts, pictures, videos, etc. during events, and might even be able to communicate with event officials through platforms like Twitter to render a more interactive experience at events like the World Cup. This will create a greater influx of text data which event coordinators can use to analyze their performance and pinpoint areas that need improvement.

Twitter can be accessed through its API with a Twitter account, and data can be obtained using the Tweepy module. The API provides the consumer key, consumer secret, access token, and access token secret, the credential strings necessary to retrieve Twitter data. Once access to the Twitter API has been established, there are two ways to gather tweets: one can either stream them in real time, or pull them from the past. To stream tweets in real time we use the tweepy.Stream method. To pull older tweets we use the tweepy.Cursor method, which also provides an option to search for specific terms throughout Twitter using the tweepy.API().search function and defining query terms in the argument using q='query words'. Further information about streaming, cursoring, and other related functions can be found in the Tweepy documentation [94]. We pulled tweets from the group stage of the 2018 World Cup by searching for posts containing the hashtag #WorldCup. These tweets were published between June 19 and 25, 2018. Our dataset is composed of about 16,000 tweets, a small number relative to the total volume of Twitter, but a sufficient amount for this proof-of-concept demonstration.
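The data collection just described looks something like the following sketch, written in the Tweepy 3.x style used at the time (newer Tweepy versions renamed api.search to api.search_tweets); the credential strings are placeholders, and Twitter's endpoints and terms of service change over time.

# Sketch of pulling past tweets with tweepy.Cursor
import tweepy

auth = tweepy.OAuthHandler("CONSUMER_KEY", "CONSUMER_SECRET")
auth.set_access_token("ACCESS_TOKEN", "ACCESS_TOKEN_SECRET")
api = tweepy.API(auth, wait_on_rate_limit=True)

# Cursor pages through results of api.search; q= defines the query terms
tweets = [status.text
          for status in tweepy.Cursor(api.search, q="#WorldCup",
                                      lang="en").items(1000)]
print(len(tweets), "tweets collected")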
The topics are as follows:

• Topic 0: russia2018 worldcup, coherence score: −1.099
• Topic 1: memberships escort discounts google website number news great lasele bitcoin, coherence score: −1.265
• Topic 2: 2018 russia, coherence score: −1.386
• Topic 3: world cup russia win, coherence score: −0.510
• Topic 4: mex (Mexico) ger (Germany) germex mexicovsalemania, coherence score: −1.449
• Topic 5: arg (Argentina) isl (Iceland) argisl 11, coherence score: −0.869
• Topic 6: nigeria croatia cronga nga (Nigeria) cro (Croatia) supereagles win match soarsupereagles 20, coherence score: −0.998
• Topic 7: worldcup2018 worldcuprussia2018 rusya2018 rusia2018, coherence score: −0.992
• Topic 8: mexico germany germex game, coherence score: −1.610
• Topic 9: bra (Brazil) brazil brasui switzerland brasil neymar 11 copa2018 coutinho, coherence score: −1.622
• Topic 10: fifa football, coherence score: −1.386
• Topic 11: vs serbia, coherence score: −1.099
• Topic 12: argentina iceland 11, coherence score: −0.903
• Topic 13: england tunisia eng (England) threelions football tun (Tunisia) engtun tuneng go tonight, coherence score: −1.485
• Topic 14: fraaus france australia 21 match pogba, coherence score: −1.196
• Topic 15: fifaworldcup2018 fifaworldcup soccer russia2018worldcup fifa18, coherence score: −0.439
• Topic 16: messi ronaldo penalty argisl lionel miss, coherence score: −1.009

Overall, these topics are intelligible, and there are no obvious chained, intruded, unbalanced, or random topics. Even the least coherent topic, topic 9, still makes good sense, with references to Brazil and Switzerland during their match; Neymar and Coutinho are both star Brazilian players. A few topics overlap, such as topics 4 and 8, which both discuss the Germany–Mexico match, and topics 5, 12, and 16, which all discuss the Argentina–Iceland match. Since these particular matches were highly anticipated, it is not surprising that they prompted enough Twitter activity to produce several topics.

Tweets are significantly shorter than other documents we might run topic modeling on, such as news articles, so there are far fewer term co-occurrences. In our dataset, a portion of the tweets reference only a single occurrence in a match. As a result, some of our extracted topics are extremely specific. For example, topic 16 seems to be primarily about Lionel Messi missing a penalty kick, while topic 5 is about the Argentina–Iceland game in general. One distinction between topics 5 and 12 seems to be the simple difference in how the game is referenced: the tweets represented by topic 5 use the hashtag shortenings "arg" and "isl" for Argentina and Iceland, while the tweets generating topic 12 use the full names of the countries. A similar difference is seen in topics 4 and 8, referencing the Germany–Mexico match. Topics 6, 9, 13, and 14 all also reference specific games that occurred during the time of data collection.

Beyond individual games, we also see topics that reference the hashtags that were used during the event and how these hashtags overlap in the dataset. Specifically, topics 0, 7, and 15 all show different groupings of hashtags. Topic 0 is by far the most popular of these, with over 9000 tweets referencing it, while topics 7 and 15 each have fewer than 2000 tweets.
This makes sense because we searched by #WorldCup, so most tweets should have a reference to it by default.5 These were also the hashtags that were being promoted by the traditional media companies covering the event.

5 It should be noted here that by definition all tweets have the hashtag #WorldCup because that was our query term. However, many tweets that push the character limit are truncated through the API. If #WorldCup appears in the truncated part of the tweet, it does not show up in the information we pull from the API.

Although topics extracted from such a dataset provide insight, they only tell a fraction of the story. With enough tweets, we can use time-series representations to gain a deeper understanding of the topic and a way to measure a topic's relevance by allowing us to identify various trending behaviors. Did a topic build slowly to a peak the way a grassroots movement might? Did it spike sharply, indicating a collective reaction to a singular event? These questions can be answered with time-series representations.

Fig. 9.15 Time-series representation of the World Cup dataset. Each point in the time series represents the total number of tweets occurring within that given five-minute window. All times are in UTC

To create a time series from topics, we use the topic-document matrix, W, and the publish time of each tweet. In our case the tweets in W are in chronological order. Figure 9.15 shows the time-series representation of the full dataset, f. Each data point in this figure represents an integer number of tweets that were collected during a five-minute window. First, W is normalized across topics so that each row represents a distribution over topics. We then convert W to a binary matrix, $W_{bin}$, by setting every entry in W greater than 0.1 to 1. Finally, we iterate through the rows of $W_{bin}$, collecting all the tweets that correspond to a single time window, and sum across each column to get a count for each topic,

$$G(t, k) = \sum_{i=i_0}^{i_0 + f(t)} W_{bin}(i, k), \qquad (9.8)$$

where G is a $T \times k$ matrix representing the time series for each of the discovered topics, $t = 1, \ldots, T$, T is the total length of the time series f, and $i_0$ is calculated as

$$i_0 = \sum_{j=0}^{t-1} f(j). \qquad (9.9)$$

The result is a matrix G, where each column gives the time-series representation of a topic.
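In code this bookkeeping is short. A minimal numpy sketch, assuming W (tweets in chronological order) and the per-window tweet counts f are already in hand, with the 0.1 threshold used above:

```python
import numpy as np

def topic_time_series(W, f, thresh=0.1):
    """Build the T x k matrix G of Eq. (9.8): per-window counts of
    tweets whose topic weight exceeds the threshold."""
    # Normalize each row of W to a distribution over topics; the small
    # constant guards against all-zero rows
    W = W / (W.sum(axis=1, keepdims=True) + 1e-12)
    W_bin = (W > thresh).astype(int)          # binarize the topic weights
    G = np.zeros((len(f), W.shape[1]), dtype=int)
    i0 = 0                                    # i0 = sum_{j<t} f(j), Eq. (9.9)
    for t, count in enumerate(f):
        G[t] = W_bin[i0:i0 + count].sum(axis=0)   # Eq. (9.8)
        i0 += count
    return G
```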
Two examples of topic progression over time can be seen in Fig. 9.16. This figure shows the time-series plots of World Cup topics 4 and 5, which discuss the Germany–Mexico match and the Argentina–Iceland match, respectively. Topic 4 spikes shortly after the match started at 15:00 UTC, with additional peaks at 16:50 and 15:35. The latter two times correspond to events that occurred during the game: at 15:35 Mexico scored the first goal to take a 1-0 lead, and at 16:50 the game ended, signifying Mexico's defeat of the reigning World Cup champions. Topic 5 focuses on the Argentina–Iceland game, which took place on June 16 at 13:00 UTC. The highest peak occurs at the conclusion of the game, while another peak occurs at the start. This was such a popular game because although Iceland was a major underdog, they were able to play the star-studded Argentinian team to a tie. Two other peaks occur at 13:20 and 13:25, the times when the two teams scored their only goals in the 1-1 tie.

Fig. 9.16 Time-series representation for topics 4 and 5. Topic 4 is about the Germany–Mexico match, which occurred on June 17, and topic 5 is about the Argentina–Iceland match, which occurred on June 16

Although we are able to identify important characteristics of each topic using this sort of standard time-series representation, we would like to be able to analyze each topic's behavior at a deeper level. Fourier analysis does not give behavior localized in time, and spectrograms are almost never the most efficient representations of localized frequency content. Thus we look to wavelet analysis, which converts a one-dimensional time series into two dimensions, with coefficients representing both wavelet scale and time. Wavelet scale is the wavelet analog of frequency: high wavelet scales represent low-frequency behavior and low wavelet scales represent high-frequency behavior [95].

The dynamic wavelet fingerprint is a technique developed [96] to analyze ultrasonic waveforms, though it has been found to be an effective tool for a wide array of problems [97–105]. It is also effective for characterizing behaviors exhibited by time series extracted from topic modeling millions of tweets [74]. The key to this method, and why it has been so broadly applicable, is its ability to identify behavior that is buried deeply in noise in a way that traditional Fourier analysis cannot. By casting the time series into higher dimensions and isolating behavior in time, we have shown that we can identify common behavior across diverse topics from different tweetstorms, even if this behavior was not obvious from their time series.

The DWFP casts one-dimensional time-series data into a two-dimensional black and white time-scale image. Every mother wavelet creates a unique fingerprint, which can highlight different features in the time series [95]. The DWFP is performed on a time series w(t), where $t = 1, \ldots, T$, by calculating a continuous wavelet transform

$$C(a, b) = \int_{-\infty}^{+\infty} w(t)\, \psi_{a,b}(t)\, dt, \qquad (9.10)$$

where a and b represent the wavelet scale and time shift of the mother wavelet $\psi_{a,b}$, respectively. The time shift value, b, represents the point in time where the wavelet begins, and allows wavelet analysis to be localized in both time and frequency, similar to a spectrogram [106]. The resulting $n_s \times T$ coefficient matrix from (9.10), C(a, b), represents a three-dimensional surface with a coefficient for each wavelet $\psi_{a,b}$, where $n_s$ is the total number of wavelet scales. This matrix is then normalized between −1 and 1,

$$C_{norm}(a, b) = C(a, b)/\max(|C(a, b)|). \qquad (9.11)$$

With the normalized surface represented by $C_{norm}(a, b)$, a thick contour slice operation is performed to create a binary image I(a, b) such that

$$I(a, b) = \begin{cases} 1, & s - \frac{r_t}{2} \le C_{norm}(a, b) \le s + \frac{r_t}{2} \\ 0, & \text{otherwise,} \end{cases} \qquad (9.12)$$

where s represents the center of each of the S total thick contour slices, $s = \pm\frac{1}{S}, \pm\frac{2}{S}, \ldots, \pm\frac{S}{S}$, and $r_t$ represents the total width of each slice. This gives a binary matrix, I(a, b), which we call a fingerprint.

Figure 9.17 illustrates the steps performed to create a wavelet fingerprint from the topic 4 time series. Shown at the top is the raw time series for topic 4. Noise is removed from the time series using a low-pass filter, resulting in the middle image of Fig. 9.17. Once the noise is removed from a time series it can be processed by the DWFP algorithm, which generates the fingerprint shown at the bottom of Fig. 9.17. It should be noted that the shading present in Fig. 9.17, as well as in other fingerprints in this chapter, is simply for readability.
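The transform, normalize, and slice steps of Eqs. (9.10)–(9.12) translate directly into a short function. The sketch below assumes the PyWavelets package for the CWT step (the chapter does not commit to a particular implementation), using the Mexican Hat wavelet and the $r_t = 0.12$, S = 5 settings quoted in the caption of Fig. 9.17:

```python
import numpy as np
import pywt  # PyWavelets, assumed here for the CWT step

def dwfp(signal, wavelet='mexh', n_scales=50, n_slices=5, rt=0.12):
    """Dynamic wavelet fingerprint: CWT -> normalize to [-1, 1] ->
    thick contour slices -> binary image of shape (scales, time)."""
    scales = np.arange(1, n_scales + 1)
    C, _ = pywt.cwt(signal, scales, wavelet)          # Eq. (9.10)
    C = C / np.abs(C).max()                           # Eq. (9.11)
    I = np.zeros_like(C, dtype=np.uint8)
    centers = np.arange(1, n_slices + 1) / n_slices   # s = 1/S, ..., S/S
    for s in np.concatenate([-centers, centers]):     # peaks and valleys
        I[(C >= s - rt / 2) & (C <= s + rt / 2)] = 1  # Eq. (9.12)
    return I
```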
Fig. 9.17 Illustration of the process to create the DWFP of topic 4. Beginning with the raw waveform (top) we run a low-pass filter to filter out the noise. The filtered waveform (middle) is then passed through the DWFP process to create the fingerprint (bottom). The resulting fingerprint gives a two-dimensional representation of the wavelet transform. In the above example we used the Mexican Hat wavelet, with $r_t = 0.12$ and S = 5. The different shades show the different objects within the fingerprint

Generally, when analyzing fingerprints just the binary image is used. The alternating shades show different objects in a fingerprint. A single object represents either a peak or a valley in the $C_{norm}$ matrix, where peaks refer to areas where all entries are positive and valleys to areas where all entries are negative.

One of the main advantages of using the DWFP for time-series analysis is the ability to create variations in fingerprint representations with different mother wavelets. This is depicted in Fig. 9.18, which shows three different DWFP transformations of the filtered time series in Fig. 9.17. All parameters remain constant; only the mother wavelet is changed. The top image uses the Mexican Hat wavelet, the same as in Fig. 9.17, the middle fingerprint was created using the 4th Daubechies wavelet, and the bottom image was created using the 6th Gaussian wavelet.

Fig. 9.18 Example of the ability to change representations of the same time series by simply changing the mother wavelet. Represented are three DWFP transforms of the filtered time series in Fig. 9.17. The top is the same Mexican Hat representation shown in Fig. 9.17. The middle representation was created using the 4th Daubechies wavelet. The bottom was created using the 6th Gaussian wavelet

From these representations we can extract features such as ridge count, filled area, best-fit ellipses, etc. Detailed explanations of these features can be found in [107]. Each different representation gives a unique description of the behavior of the signal and can be used to identify different characteristics. For example, the middle fingerprint in Fig. 9.18 has more activity at low wavelet scales, while the Mexican Hat DWFP shows more low-scale information from the signal and a much wider fingerprint. Individual objects in a fingerprint tell a story about the temporal behavior of a signal over a short time. Each object in a fingerprint describes behavior in the time series in a manner that is not possible through traditional Fourier analysis.

In [74] we analyzed seven different tweetstorms of various volumes and contexts. Some were political tweetstorms, while some were focused on major sporting events, including this World Cup dataset. Through this analysis, we identified 11 distinct objects that described behavior that differentiated different types of topics. We called these objects characteristic storm cells. Figure 9.19 shows the fingerprints from topics 5 (top), 6 (middle), and 9 (bottom). In each of these topics, as well as the fingerprint for topic 4 at the bottom of Fig. 9.17, the far-left object resembles storm cell 8 as found in [74]. We found this storm cell was descriptive of topics that build up to a peak and maintain that peak for some time. Each of the topics representing individual games shows this slow buildup. Though there are occasionally quick spikes for goals scored, these spikes will be registered in the second object, which describes the behavior during the match.
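Before continuing with the storm-cell taxonomy, note that each such object can be reduced to a feature vector automatically. Here is a minimal sketch of per-object feature extraction, assuming scikit-image's connected-component tools (an assumption on our part, not necessarily the chapter's actual toolchain; property names follow older scikit-image releases):

```python
from skimage.measure import label, regionprops

def fingerprint_features(I):
    """Shape features for each object (connected component) in a
    binary fingerprint image I."""
    feats = []
    for region in regionprops(label(I)):
        feats.append({
            'area': region.area,
            'filled_area': region.filled_area,        # filled area
            'eccentricity': region.eccentricity,      # best-fit ellipse shape
            'major_axis': region.major_axis_length,
            'orientation': region.orientation,
        })
    return feats
```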
Fig. 9.19 Fingerprints for topics 5 (top), 6 (middle), and 9 (bottom). Each topic represents a different match that occurred during data collection, and all three show relatively similar behavior. Different shades in each fingerprint represent different objects. Objects give a description of the temporal behavior of a signal over a short time

These second objects represent various storm cells, such as storm cell 7, which describes behavior after the initial increase in volume. Many of the match topics appear similar in the time-series representation, so it is no surprise that they look similar in the DWFP representation. However, for some of the other topics, the time-series representations look quite different while the underlying behavior is actually quite similar. Figure 9.20 shows two different topics with their time series and corresponding fingerprints. The top two images are for topic 0, which represents tweets that had strong correlation to the hashtags #Russia2018 and #WorldCup, while the bottom two images are for topic 1, which represents mostly ads and spam that ran along with the World Cup coverage.

Fig. 9.20 Time series and DWFP representations for topics 0 (top two) and 1 (bottom two). Even though these two topics emit much different time signatures, we can still identify common patterns in their behavior using the DWFP. Specifically, the objects indicated by the red arrows look quite similar to the characteristic storm cell 0 as found in [74]

The time series for topic 0 mimics the time series for the dataset as a whole, Fig. 9.15, with a rhythmic pattern over the three days of data. This should be expected, because most tweets in the dataset referenced #WorldCup. However, topic 1 does not have this same shape. It exhibits two quick spikes around the same times topic 0 has its local maxima, showing that the advertisers targeted their content for when traffic was at its heaviest, but there is not the same rhythmic pattern over the three days as shown by topic 0. While these time series look quite different in the time domain, they share some key characteristics in the DWFP domain, most notably an object sitting between the two local maxima, marked by the red arrow in both fingerprints, that resembles storm cell 0 in [74]. This shows the underlying behavior of the advertisers, who tailor their ads to World Cup fans and run them when more users are online, maximizing exposure while minimizing the overall costs of running the ads.

As teams and networks continue to push conversation and coverage to social media over time, the amount of data on Twitter will continue to grow. For instance, during the 2018 Olympics in South Korea, NBC began streaming live events online and providing real-time coverage via Twitter [108], and several NFL games and Wimbledon matches were also aired on Twitter in 2016 [109]. The application of topic modeling and DWFP analysis to social media could be far more powerful than what was illustrated above. Disinformation campaigns have been launched in at least 28 different countries [110] using networks of automated accounts to target vulnerable audiences with fake news stories disguised as traditional news [111–114]. Content moderation at scale is an intractable problem in a free society, so analyzing characteristics of individual stories posted to social media is not the answer.
Instead, we need to look for the underlying behavior of the actors operating campaigns. We can use the DWFP to uncover this behavior by identifying key temporal signals buried deeply in the noise of social media. Networks of automated accounts targeting specific audiences will necessarily emit different time signatures than would a network of human users discussing some topic, as illustrated by the ads timed to coincide with high World Cup tweet volume. This can be exploited by identifying the fingerprint pattern that best captures such a time signature. Topics giving this time signature can then be flagged so that preventative action can be taken to slow the dissemination of that disinformation. Paired with spatial methods such as retweet cascades [115] and network analysis [116, 117], this can differentiate bot-driven topics and disinformation campaigns from all other normal topics on social media. Part of the inherent advantage of the DWFP representation is that it can highlight signal features of interest that are buried deeply in noise.

9.7 DWFP for Account Analysis

Much of the spread of inflammatory content and disinformation on social media is pushed by bots, many of which operate as part of disinformation campaigns [118–120]. Thus it is advantageous to be able to identify whether an individual account is a bot. Bot accounts are adept at disguising themselves as humans, so they can be exceedingly difficult to detect [121]. Many bot accounts on social media platforms are harmless and upfront about their nature; examples include news aggregators and customer service bots. Of greater concern to the public are accounts that do not identify themselves as bots, but instead pretend to be real human accounts in order to fool other users. These types of bots have been identified as a large component in spreading misinformation on social media [122, 123]. They were instrumental in the coordinated effort by the Russian-operated Internet Research Agency to sow discord in the United States' voter base in an attempt to influence the 2016 US Presidential Election [124–126]. Concern about malicious bot accounts pushing misinformation and misconstruing public sentiment has led to the need for robust algorithms that can identify bots operating in near real time [127].

We can again use Tweepy to extract metadata from individual Twitter users [94]. Metadata related to an account includes total number of followers/following, total number of tweets, creation date, bio, etc. Metadata is commonly used to detect bots [128–131]. One of the most popular currently available bot detection algorithms, Botometer (originally BotOrNot), uses over 1,000 features extracted from an account's metadata and a random forest classifier to return the likelihood that an account is a bot [132, 133]. However, training a bot detection system purely on metadata will not create an algorithm that is robust through time. Bot creators are constantly evolving their methods to subvert detection [134]. Thus, any bot detection method needs to look beyond surface-level metadata and analyze behavioral characteristics of both bots and their networks, as these will be more difficult to manipulate [135, 136]. We use the wavelet fingerprint to analyze the temporal posting behavior of an account to identify inorganic behaviors.
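The raw material for that temporal analysis is simply a binned count of the account's posts, of the sort plotted at the top of Fig. 9.21. A minimal sketch, again assuming Tweepy 3.x signatures as in [94]:

```python
import numpy as np
import tweepy

def posting_time_series(api, screen_name, bin_minutes=15):
    """Counts of an account's recent tweets in fixed-width time bins.
    The v1.1 timeline endpoint exposes roughly the 3200 most recent tweets."""
    stamps = sorted(status.created_at.timestamp() for status in
                    tweepy.Cursor(api.user_timeline,
                                  screen_name=screen_name).items(3200))
    t = np.array(stamps)
    width = bin_minutes * 60                  # bin width in seconds
    edges = np.arange(t[0], t[-1] + width, width)
    counts, _ = np.histogram(t, bins=edges)
    return counts
```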
Recently we [74] illustrated the basic method by analyzing a corpus of tweets curated from 7 distinct tweetstorms that occurred between 2015 and 2018 that we deemed to have cultural significance. The Twitter API was used to stream tweets as events unfolded for the Brett Kavanaugh confirmation hearings, the Michael Cohen congressional testimony, President Trump's summit with North Korea's Kim Jong Un, the release of the Mueller Report on Russian meddling in the 2016 presidential election, and the 2018 World Cup. Open datasets were found for the 2018 Winter Olympics, the 2017 Unite the Right rally in Charlottesville, VA, and the 2015 riots in Baltimore over the killing of Freddie Gray. In all we collected and analyzed 28,285,124 tweets.

The key to identifying twitterbots is sophisticated analysis of the timing of tweets. Pre-programmed posts will not have the organic signatures characteristic of humans posting thoughts and re-tweeting bons mots. The wavelet fingerprint method can be used to analyze tweetstorms and automatically classify different types of tweetstorms. We first use topic modeling to identify the things that are being tweeted about, and then form time series of the topics of interest. We then perform our particular type of wavelet transform on each time series to return a two-dimensional, time-scale, black-and-white binary image, and extract features from these wavelet fingerprints to use for machine learning in order to classify categories of tweetstorms. Gaussian mixture model (GMM) clustering is used to identify individual objects, or storm cells, that are characteristic of specific local behaviors commonly occurring in topics. The wavelet fingerprint transformation is volume agnostic, meaning we can compare tweetstorms of different intensities. We find that we can identify behavior, localized in time, that is characteristic of how different topics propagate through Twitter.

Organic tweetstorms are driven by human emotions, e.g., outrage or team spirit, and many people are often simultaneously expressing those same, or perhaps the contralateral, emotions via tweets. It's the characteristic fingerprints of those collective human emotions that bots can't accurately mimic, because we're pulling out the topics from the specific groups of words being used. We have found, for example, that fans tweeting about World Cup matches are distinguishable from fans tweeting about Olympic events, presumably because football fans care much more deeply and may be tweeting from a pub. With enough tweets, we can use time-series representations to gain a deeper understanding of a topic and a way to measure its cultural relevance by identifying various trending behaviors. Did a topic build slowly to a peak the way a grassroots movement might? Did it spike sharply, indicating a collective reaction to a singular event? These questions can be answered with time-series representations. Although we are often able to identify important characteristics of each topic using this sort of standard time-series representation, we need to be able to analyze each topic's behavior at a deeper level. Fourier analysis does not give behavior localized in time, and spectrograms are almost never the most efficient representations of localized frequency content.
Thus we look to wavelet analysis, which converts a one-dimensional time series into two dimensions, with coefficients representing both wavelet scale and time. Wavelet scale is the wavelet analog of frequency: high wavelet scales represent low-frequency behavior and low wavelet scales represent high-frequency behavior. Each mother wavelet creates a unique fingerprint, which can highlight different features in the time series. One of the main advantages of using the wavelet fingerprint for time-series analysis is the ability to create variations in fingerprint representations with different mother wavelets.

As an illustration of how we can use the DWFP to identify inorganic behavior emitted by a single account, we analyze the account trvestuff (@trvestuff). At 7:45 am Eastern time on 22 Jan 20, one of our curated sock puppets (@IBNowitall), with 8 known human followers, tweeted the following: "Al Gore tried to warn us all about adverse climate change. He had a powerpoint and everything. Now even South Park admits that manbearpig is real. It's time to get ready for commerce in the Arctic. I'm super serial." It was retweeted almost immediately by a presumed bot programmed to watch for tweets about climate change. The bot's profile pic is clearly a teenager purporting to be in Tacoma, WA, where it would have been rather early. First-period bell there is 7:35 am, and this was a school day. Although this twitter account is only a few months old, that profile pic has been on the interwebs for more than 5 years, mostly in non-English-speaking countries. We ran our topic modeling procedure on that bot's corpus of tweets and formed a time signature of retweets characteristic of bots.

Fig. 9.21 Time-series analysis of the posting behavior of the account trvestuff (@trvestuff). The top plot shows the time series of the raw number of posts every 15 min for two weeks. The middle plot shows the wavelet fingerprint of this time series. The wavelet fingerprint was created using the Gauss 2 wavelet, 50 wavelet scales, 5 slices, and a ridge thickness of 0.12. The bottom plot shows the ridge count for the wavelet fingerprint. There is a clearly un-natural repetition to this signal

Figure 9.21 shows the actual tweeting behavior of trvestuff plotted in time. The top plot in this figure shows the time series of all tweets by trvestuff over a two-week period from 5 Mar 20 until 19 Mar 20. Each point in the time series shows how many tweets the account posted in a 15 min interval. There is clearly a repetitive behavior exhibited by trvestuff: 12 h of tweeting 4–6 tweets every hour, then 12 h off. This is confirmed by looking at the wavelet fingerprint shown in the middle image of Fig. 9.21. Compare this to Fig. 9.22, which shows the same information for a prolific human tweeter (@realDonaldTrump) and looks much less regular, with a few spikes of heavy tweeting activity. Looking at the fingerprint we cannot see any repetitive behavior, outside of maybe a slight diurnal pattern, which would be natural for any human tweeter.

Features are extracted from the fingerprints which describe the tweeting behavior of an account in different ways. For example, at the bottom of both Figs. 9.21 and 9.22 is the ridge count for the corresponding wavelet fingerprint. The ridge count tabulates the number of times each column of pixels in the fingerprint switches from 0 to 1 or 1 to 0.
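Both the ridge count and the normalized autocorrelation used below are a few lines of numpy; the function names here are our own:

```python
import numpy as np

def ridge_count(I):
    """Number of 0<->1 transitions down each column of a binary fingerprint."""
    return np.abs(np.diff(I.astype(int), axis=0)).sum(axis=0)

def autocorrelation(x):
    """Normalized autocorrelation of a 1-D feature vector for lags 0..N-1;
    equals 1.0 at zero lag (assumes x is not constant)."""
    x = x - x.mean()
    ac = np.correlate(x, x, mode='full')[len(x) - 1:]
    return ac / ac[0]
```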
Fig. 9.22 Time-series analysis of the posting behavior of the account Donald Trump (@realDonaldTrump). The top plot shows the time series of the raw number of posts every 15 min for two weeks. The middle plot shows the wavelet fingerprint of this time series. The wavelet fingerprint was created using the Gauss 2 wavelet, 50 wavelet scales, 5 slices, and a ridge thickness of 0.12. The bottom plot shows the ridge count for the wavelet fingerprint

Fig. 9.23 Autocorrelation coefficients for each of the ridge counts shown in Figs. 9.21 (top) and 9.22 (bottom). Each data point shows the correlation coefficient of the ridge count with itself offset by that many data points

We can use the autocorrelation of the ridge counts to unearth any repetitive behaviors. Figure 9.23 shows the normalized autocorrelation. Each plot shows how similar the vector is to itself shifted by some number of time steps, with a maximum of 1 if they are exactly the same. Again we see further evidence of the artificial nature of trvestuff's tweeting behavior. The President's, by contrast, shows no real repetition. There is a small spike when the vector is offset by about three days, which comes from the ridge count spikes on March 9th and March 12th aligning; however, the prominence of that spike is nowhere close to the prominence of the spikes exhibited by trvestuff.

The DWFP was probably unnecessary to see that trvestuff is a bot posing as a human. Sophisticated bots operating on Twitter will not be this obvious, of course, but the goal is to detect bots automatically. The DWFP has shown itself to be adept at identifying behavior that is buried deeply in noise, so it can identify the inorganic signals buried in the noise of a Twitter bot's posting behavior. This is the capability necessary to uncover sophisticated bots spreading spam and disinformation through Twitter. But what else might this be useful for?

9.8 In-Game Sports Betting

Some people think Ultimate Frisbee is a sport. Some people think the earth is shaped like a Frisbee. Some people think soccer is a sport. Some people think the earth is shaped like a football. I think American football is a sport. I think basketball is a sport. I think baseball and golf are excellent games for napping. I think Frisbee is a sport for dogs. I think muggle quidditch is embarrassing. I have no particular opinion on soccer, except that I don't think kids should get a trophy if their feet never touched the ball. But now that sports betting has been legalized nationally (What were the odds of that?) we may have to expand our definition of sports to be pretty much any contest that you can bet on [137]. Americans currently wager a total of $150 billion each year through illegal sports gambling, although it's not clear how much of that is March Madness office pools where most people make picks by their teams' pretty colors or offensive mascots. Some people are worried that sports betting will create ethical issues for the NCAA.
I'm worried that academically troubled student-athletes are encouraged to sign up for so-called paper classes, which are essentially no-show independent studies involving a single paper that allows functionally illiterate football players to prop up their GPAs to satisfy the NCAA's eligibility requirements.6 As many others have pointed out, people who think that most big-time college athletes are at school first and foremost to be educated are fooling themselves. They're there to work and earn money and prestige for the school. That money all stays inside the athletic departments. The English department faculty don't even get free tee shirts or tickets to the games, because that might create a conflict of interest if someday a semi-literate star player needed an A- to keep his eligibility. Profs. do get to share in the prestige, though. There might be a small faculty discount at the university bookstore, which mostly sells tee shirts.

6 Here's an actual UNC Tarheel paper that was written for an actual intro class, in which the student-athlete finished with an actual A-: On the evening of December Rosa Parks decided that she was going to sit in the white people section on the bus in Montgomery, Alabama. During this time blacks had to give up there seats to whites when more whites got on the bus. Rosa Parks refused to give up her seat. Her and the bus driver began to talk and the conversation went like this. "Let me have those front seats" said the driver. She didn't get up and told the driver that she was tired of giving her seat to white people. "I'm going to have you arrested," said the driver. "You may do that," Rosa Parks responded. Two white policemen came in and Rosa Parks asked them "why do you all push us around?" The police officer replied and said "I don't know, but the law is the law and you're under arrest."

Tweetstorms about the Olympics give characteristically different fingerprints than the World Cup. Fans are somewhat less emotionally engaged for the former, and somewhat more likely to be drinking during the latter. Winter Olympics hooliganism simply isn't a thing, unless Canada loses at curling. Both the Olympics and the World Cup do take place over about a fortnight, and whoever paid for the broadcast rights does try hard to generate buildup. Hence, related tweetstorms can be considered to have compact support in time. Cultural events also often happen over about this sort of time scale, because people collectively are fickle and the public is thought to have short attention spans. There's also always some new disaster happening or looming to grab the public's attention and perhaps generate advertising dollars. We can analyze tweetstorms about global pandemics in a manner similar to how we did the Olympics and the World Cup. The minutiae of the wavelet fingerprints are different in ways that should allow us to begin to automatically characterize tweetstorm events as they begin to build and develop so that proper action can be taken. Of particular interest is distinguishing tweetstorms that blow up organically versus those that are inflated artificially. Fake news is now widely recognized as a problem in society, but some politicians have gotten into the habit of accusing anything they disagree with of being fake news. Analyzing the tweetstorms that surround an event should allow us to determine its inherent truthiness. We expect this to be of critical importance during the next few election cycles, now that deepfake videos can be produced without the hardware, software, and talent that used to be confined to Hollywood special effects studios.
These techniques can be used to analyze in real time the minutiae of in-game sports betting in order to adjust odds on the fly faster than human bookies ever could. For a slow-moving sport like major league baseball, fans inside the stadium will be able to hold a beer in one hand and their stadium-WiFi-connected smartphone in the other. Teams will want to provide excellent WiFi and convenient apps that let fans bet on every little thing that happens during the game. They can encourage fans to come to the game by exploiting the demise of net neutrality to give a moment's advantage to fans inside the stadium as compared to those streaming the game at home. The in-stadium betting can be easily monetized by taking a small cut off the top for those using the team's app, but the real issue with in-game betting is adjusting the odds to make sure they are always in Panem's favor. We expect a rich storm of information to develop around any live sporting event as people draw on multiple sources of information, guidance, speculation, etc. in order to improve their odds. As of 2020, that seems likely to be primarily twitter, although we use the terms "tweets" and "tweetstorms" in the general sense, rather than referring to a particular social media platform, because those come and go all the time. Speech-to-text technology is robust enough that radio broadcasts can be entrained into topic modeling systems, and databases such as in-game, in-stadium beer sales can be updated and accessed in real time. Meteorological information and the like can also be incorporated via a comprehensive machine learning approach to fine-tune the odds. Gotta go, the game's about to start.

9.9 Virtual Financial Advisor Is Now Doable

Meanwhile, Warren Buffet and Bill Gates are persuading fellow billionaires to each commit to giving away half of their wealth to good causes before they die. If none of the billionaires listed at givingpledge.org think you're a good cause to give their money to, you might be interested in some fancier investment strategies than diversified, low-load index funds. Econometrics is the field of study where mathematical modeling and statistics are used to figure out why the market(s) just did that thing that nobody except Warren Buffet predicted. Business analytics and machine learning are all the rage right now, and applying artificial intelligence to financial modeling and prediction seems like a no-lose business proposition. As I may have mentioned, there seem to be lots of cranks and charlatans in this business right now. If you page back through this book you should see that the dynamic wavelet fingerprint technique works quite well to identify subtle patterns in time-series data. It would be a trivial matter to form wavelet fingerprints from econometric datasets, and then to look for patterns that you think might predict the future. The human eye is very good at seeing patterns in these kinds of fingerprint images, but will see patterns even if they aren't there, so you have to be very careful about fooling yourself. The good news is you can place bets on yourself and see if your predictions return profit or loss. You could also write a simple wavelet fingerprint app for day traders and see if that hits the jackpot.
If you make billions of dollars doing that sort of thing, be sure to sign on to the giving pledge. Generously supporting The Alma Mater of a Nation would be a worthy cause, and we'll be happy to name a building after you and yours.

Machine learning has rather a lot of potential for doing good in financial services, which we close with to end on a hopeful note, as opposed to deepfakes and twitter bots and such. Structured data is typically composed of well-defined data types whose pattern makes it easily searchable, while unstructured data is usually not as easily searchable, coming in formats like audio, video, and social media postings that require more preprocessing. It's now possible to make use of both structured and unstructured data types in a unified machine learning paradigm. Key to the applications discussed in this book is extracting meaning from time-series data using standard statistical methods as well as time-series and time-frequency/time-scale transformations, which convert time-series signals to higher dimensional representations where image processing and shape identification methods can be brought to bear. The standard tools available via day-trading platforms such as E*Trade, TD Ameritrade, etc.—and even the more sophisticated tools on Bloomberg terminals—can be leveraged to develop advanced signal processing methods for machine learning for developing a virtual financial advisor (VFA).

A VFA could, for example, give advice on purchasing a car that would best fit a person's particular financial situation, accounting for marital status, dependents, professional status, savings, total debt, credit score, etc., as well as previous cars purchased. Rather than a simple credit score decision tree, the goal is a more holistic assessment of the person's situation and financial trajectory and how notional car-buying decisions would affect that financial trajectory over both the short term and much longer. Natural language processing allows the VFA to simply engage the person in a conversation, without the pressure of a salesman trying to close a deal, and without having to type a bunch of information into a stupid webform. Of course, major financial decisions like buying a car should never be made without an assessment of the state of the larger economy and economic outlook. The VFA algorithms should allow for incorporation of micro- and macro-economic data including real-time assessments of markets, trending topics on world financial markets discussion sites (verbal and text-based), and historical data/analysis with an eye toward precursor signals of significant financial events.7 Topic modeling and pattern classification via dynamic wavelet fingerprints allow a machine learning system to incorporate and make sense of patterns in all of these diverse and disparate information sources in order to give neutral, but tailored, guidance to a potential car buyer. The VFA would need to be able to

• Leverage data of people with similar attributes, e.g., education, professional status, life events, financial situations, and previous car ownership. The data types and formats available, especially the time resolution and length of archived data history, will be critical in this algorithm development.
• Understand semantics in conversations via an up-to-date library of car names, nicknames, and related jargon; freely available sources from Wikipedia to specialized subreddits can be scraped and analyzed with topic-modeling methods.
• Present alternatives as suggestions, but not rank-ordered in any way, because most people view their cars as an extension of their own self-image, and part of the "personality" of the VFA is that it doesn't presume to know which car makes you cool or whatever human emotions happen to be driving your car-buying decisions. The recommendations would simply be couched in terms of how various options might affect your financial situation now and in the future.

Of course the VFA would need to be integrated as a module in an intelligent personal assistant technology with an interface to interact using (Warren Buffet's) voice, which means that it could be used to keep track of television shows such as Motorweek and several of the numerous car-repair and rework shows airing currently, automotive podcasts, etc. Similarly, it could monitor financial news television shows, podcasts, etc. with a particular focus on identifying and isolating discussions about the state of the automotive industry. A side benefit of the ability to monitor and classify automotive industry and financial audio streams may be to "bookmark" segments that would be of interest for you and suggest that you might want to listen to or watch current news about the cars you've expressed interest in buying. The TextRank algorithm for automatic keyword extraction and summarization, using Levenshtein distance as the relation between text units, can be leveraged to tl;dr them (a sketch of this appears at the end of this passage).

7 As I'm typing this, oil is at $20 per barrel and some analysts are predicting single digits. Who could have predicted that?

This sort of system could be expanded to give advice for other decisions such as home purchases (e.g., finding a home, mortgages, and insurance) and college savings (e.g., 529 College Savings). The VFA would give advice on how to approach purchasing a home, for example, by helping with mortgages and insurance as well as finding homes that best suit a person's family and financial situation. It would make use of existing home-search tools, like Realtor.com, Zillow.com, etc. Buying a home is a much more significant financial decision with much longer term consequences, so a feature of a VFA personality for this would be entering into a patient, long-term conversation about the range of choices. Also, once the person's dream home appears on the market there is often a sudden urgency to downselecting on all the ancillary financial issues, especially financing options and the like. A virtual financial advisor is ideally suited to all configurations of a hurry-up-and-wait scenario and can patiently and diligently watch both the local housing market and mortgage rates for things to perfectly align. Individuals can be confident that the VFA is driven by what's in their best interests, not just the need to make its sales numbers this quarter, and will present ranges of options with implications for their long-term financial well-being at the center of the calculus. You wouldn't have to worry about whether the VFA might be a crank (fool) or a charlatan (crook) because its sole purpose is to collect information and detect patterns and then give you tailored, unbiased advice.
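Here is a minimal sketch of that TextRank variant, assuming NetworkX for the PageRank step; the similarity normalization and function names are our own illustrative choices rather than a reference implementation:

```python
import networkx as nx

def levenshtein(a, b):
    """Edit distance between two strings via dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]

def textrank_summary(sentences, n=3):
    """Rank sentences by PageRank over a Levenshtein-similarity graph
    and return the top n in document order (the tl;dr)."""
    g = nx.Graph()
    for i, a in enumerate(sentences):
        for j in range(i + 1, len(sentences)):
            b = sentences[j]
            sim = 1.0 - levenshtein(a, b) / max(len(a), len(b), 1)
            if sim > 0:
                g.add_edge(i, j, weight=sim)
    ranks = nx.pagerank(g, weight='weight')
    top = sorted(ranks, key=ranks.get, reverse=True)[:n]
    return [sentences[i] for i in sorted(top)]
```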
Similarly, modules to offer advice concerning other long-term financial planning decisions, such as college savings, could be incorporated. Specialized expertise—such as details about how to maximize benefits under the Post-9/11 GI Bill, or why it's better to have grandparents open a 529 College Savings Plan, or how the typical discount rate at private colleges affects cost of attendance—could be incorporated into the VFA. There is a wealth of insider information about how college works and what it actually costs and how US News rankings, Final Four results, post-college placement statistics, etc. affect everything from the acceptance rate to the cost of attendance and subsequent earning potential for the child. The VFA can help to demystify college for a young parent who needs to know now how to start planning enough in advance to open the right doors 10 or 15 years hence without resorting to bribing a coach or something. We think wavelet fingerprints can help.

References

1. Stromberg J (2013) What's in century-old 'Snake Oil' medicines? Mercury and lead. Smithsonian.com, April 8, 2013. https://www.smithsonianmag.com/science-nature/whats-in-centuryold-snake-oil-medicines-mercury-and-lead-16743639/
2. Silberman S (2012) Are warnings about the side effects of drugs making us sick? PLOS (The Public Library of Science). http://blogs.plos.org/neurotribes/2012/07/16/are-warningsabout-the-side-effects-of-drugs-making-us-sick/. Accessed 16 July 2012
3. Preston E (2014) Say no to nocebo: how doctors can keep patients' minds from making them sicker. Discover Magazine. http://blogs.discovermagazine.com/inkfish/2014/07/09/say-noto-nocebo-how-doctors-can-keep-patients-minds-from-making-them-sicker/. Accessed 9 July 2014
4. Zublin F (2015) Keep your whooping cough to yourself. OZY.com. https://www.ozy.com/immodest-proposal/keep-your-whooping-cough-to-yourself/60310. Accessed 31 May 2015
5. Lewandowsky S, Mann ME, Bauld L, Hastings G, Loftus EF (2013) The subterranean war on science. Association for Psychological Science. https://www.psychologicalscience.org/observer/the-subterranean-war-on-science
6. Mole B (2016) FDA: homeopathic teething gels may have killed 10 babies, sickened 400. Ars Technica. https://arstechnica.com/science/2016/10/fda-homeopathic-teething-gelsmay-have-killed-10-babies-sickened-400/. Accessed 13 Oct 2016
7. Gorski D (2010) The dietary supplement safety act of 2010: a long overdue correction to the DSHEA of 1994? Science-Based Medicine. http://www.sciencebasedmedicine.org/?p=3772. Accessed 8 Feb 2010
8. Summers D (2014) Dr. Oz: world's best snake oil salesman. The Daily Beast. https://www.thedailybeast.com/dr-oz-worlds-best-snake-oil-salesman. Accessed 14 June 2014
9. McCoy T (2014) Half of Dr. Oz's medical advice is baseless or wrong, study says. The Washington Post. http://www.washingtonpost.com/news/morning-mix/wp/2014/12/19/halfof-dr-ozs-medical-advice-is-baseless-or-wrong-study-says/. Accessed 19 Dec 2014
10. Kaplan K (2014) Real-world doctors fact-check Dr. Oz, and the results aren't pretty. LA Times. http://www.latimes.com/science/sciencenow/la-sci-sn-dr-oz-claims-fact-checkbmj-20141219-story.html. Accessed 19 Dec 2014
11. Ray Merriman Workshop at ISAR Conference on "Reimagining the Future". https://isar2020.org/, https://www.mmacycles.com/events/ray-merriman-workshop/. Accessed 9 Sept 2020
12. How Warren Buffet became one of the wealthiest people in America. A chronological history of the Oracle of Omaha: 1930–2019, by Joshua Kennon. The Balance. https://www.thebalance.com/warren-buffett-timeline-356439. Accessed 25 June 2019
13. Yochim D, Voigt K (2019) Index funds: how to invest and best funds to choose. NerdWallet. https://www.nerdwallet.com/blog/investing/how-to-invest-in-index-funds/. Accessed 5 Dec 2019
14. Buffett W (2017) 'Oracle of Omaha' criticises Wall Street and praises immigrants. The Guardian. https://www.theguardian.com/business/2017/feb/25/warren-buffettberkshire-hathaway-wall-street-apple-annual-letter. Accessed 25 Feb 2017
15. Yong E (2016) Psychology's replication crisis can't be wished away. The Atlantic. https://www.theatlantic.com/science/archive/2016/03/psychologys-replication-crisis-cant-bewished-away/472272/. Accessed 4 March 2016
16. Reilly J (2014) Chin chin: urine-drinking hindu cult believes a warm cup before sunrise straight from a virgin cow heals cancer - and followers are queuing up to try it. The Daily Mail. http://www.dailymail.co.uk/news/article-2538520/Urine-drinking-Hindu-cultbelieves-warm-cup-sunrise-straight-virgin-cow-heals-cancer-followers-queuing-try-it.html. Accessed 13 Jan 2014
17. Adams C (2014) Is there a scientifically detectable difference between high-price liquor and regular stuff? The Straight Dope. http://www.straightdope.com/columns/read/3142/is-therea-scientifically-detectable-difference-between-high-price-liquor-and-regular-stuff/. Accessed 3 Jan 2014
18. Adams W (2014) Wine expert reviews cheap beer. Devour.com. http://devour.com/video/wine-expert-reviews-cheap-beer/. Accessed 26 Feb 2014
19. Burke J (2013) Woo-ing wine drinkers. Skepchick. http://skepchick.org/2013/01/guest-postwoo-ing-wine-drinkers/. Accessed 19 Jan 2013
20. The bottled water taste test. BuzzFeedVideo. https://youtu.be/2jIC6MBkjjs. Accessed 30 Nov 2014
21. Peters J (2008) What's the best adult diaper? That depends. Slate Geezers issue. http://www.slate.com/articles/life/geezers/2008/09/whats_the_best_adult_diaper.html. Accessed 10 Sept 2008
22. Barry-Jester AM (2016) What went wrong in Flint. FiveThirtyEight. https://fivethirtyeight.com/features/what-went-wrong-in-flint-water-crisis-michigan/. Accessed 26 Jan 2016
23. Groll E (2015) WHO: to avoid MERS, don't drink camel urine. Foreign Policy. https://foreignpolicy.com/2015/06/08/who-to-avoid-mers-dont-drink-camel-urine/. Accessed 8 June 2015
24. Lee C (2015) Magic placebo more effective than ordinary placebo. Ars Technica. https://arstechnica.com/science/2015/04/a-tale-of-two-placebos/, http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0118440. Accessed 22 April 2015
25. Kubo K, Guillory N (2016) People tried their own urine for the first time and they were disgusted. BuzzFeed. https://www.buzzfeed.com/kylekubo/people-tried-their-own-urine-forthe-first-time-and-they-wer. Accessed 5 Jan 2016
26. This section adapted with permission from: Ignatius B. Nowitall, "Scientific Adulting and BS Detection". W&N Edutainment, 2020
27. Houdini H (1924) A magician among the spirits. Harper, New York
28. Mizokami K (2018) 70 years and counting, the UFO phenomenon is as mysterious as ever. Popular Mechanics. https://www.popularmechanics.com/space/a22025557/world-ufo-day2018/. Accessed 2 July 2018
29. Wright T (2013) Why there will never be another Flying Pancake. The end of Vought V-173. Air & Space Magazine. http://www.vought.org/rest/html/rv-1731.html, https://www.airspacemag.com/history-of-flight/restoration-vought-v-173-7990846/
30. Webster D (2017) In 1947, a high-altitude balloon crash landed in Roswell. The aliens never left. Smithsonian Magazine. https://www.smithsonianmag.com/smithsonian-institution/in1947-high-altitude-balloon-crash-landed-roswell-aliens-never-left-180963917/. Accessed 5 July 2017
31. Project BLUE BOOK - Unidentified Flying Objects. United States Air Force, 1952–1969. https://www.archives.gov/research/military/air-force/ufos.html
32. "Top Secret America" was a project nearly two years in the making that describes the huge national security buildup in the United States after the Sept. 11, 2001, attacks. The project was last updated in September 2010. http://projects.washingtonpost.com/top-secret-america/
33. Kirk M, Gilmore J, Wiser M, Smith M (2014) United States of Secrets. Frontline. https://www.pbs.org/wgbh/frontline/film/united-states-of-secrets/. Accessed 13 May 2014
34. Lundberg J, Pilkington M, Denning R, Kyprianou K (2013) Mirage Men, a journey into paranoia, disinformation and UFOs. Random Media. World premiere at Sheffield Doc/Fest in June 2013. https://www.imdb.com/title/tt2254010/
35. Plackett B (2012) Declassified at last: air force's supersonic flying saucer schematics. Wired. https://www.wired.com/2012/10/the-airforce/. Accessed 05 Oct 2012
36. Cowing K (2014) CIA admits that it owns all of the flying saucers. NASA Watch. http://nasawatch.com/archives/2014/12/cia-admits-that.html. Accessed 29 Dec 2014
37. Dickson C (2015) Obama adviser John Podesta's biggest regret: keeping America in dark about UFOs. Yahoo News. https://www.yahoo.com/news/outgoing-obama-adviser-johnpodesta-s-biggest-regret-of-2014--keeping-america-in-the-dark-about-ufos-234149498.html. Accessed 13 Feb 2015
38. Cooper H, Blumenthal R, Kean L (2017) Glowing auras and 'Black Money': the Pentagon's mysterious U.F.O. program. New York Times. https://www.nytimes.com/2017/12/16/us/politics/pentagon-program-ufo-harry-reid.html. Accessed 16 Dec 2017
39. Tchou A (2011) I saw four green objects in a formation, an interactive map of 15 years of UFO sightings. Slate. http://www.slate.com/articles/news_and_politics/maps/2011/01/i_saw_four_green_objects_in_a_formation.html. Accessed 11 Jan 2011
40. "British UFO files reveal shocking sightings and hoaxes from 1950 to present". Huffington Post, May 25, 2011. https://www.huffingtonpost.com/2011/03/03/uk-releases-8500pages-of_n_830880.html
41. Darrach HB Jr, Ginna R (1952) Have we visitors from space? LIFE Magazine. http://www.project1947.com/shg/csi/life52.html. Accessed 7 April 1952
42. Sofge E (2009) The 10 most influential UFO-inspired books, movies and TV shows. Popular Mechanics. https://www.popularmechanics.com/space/g204/4305349/. Accessed 17 March 2009
43. Mikkelson D (2014) Did Boyd Bushman provide evidence of alien contact? Snopes.com. https://www.snopes.com/fact-check/boyd-bushman-aliens/. Accessed 31 Oct 2014
44. When President Roosevelt died, Truman was informed by Secretary of War Harry Stimson of a new and terrible weapon being developed by physicists in New Mexico. https://www.history.com/this-day-in-history/truman-is-briefed-on-manhattan-project
45. "Farmer Trent's Flying Saucer". Life: 40. https://books.google.com/books?id=50oEAAAAMBAJ. Accessed 26 June 1950
46. Scientific study of unidentified flying objects. Conducted by the University of Colorado under contract No. 44620-67-C-0035 with the United States Air Force, Dr. Edward U. Condon, Scientific Director (1968). Electronic edition 1999 by National Capital Area Skeptics (NCAS). http://files.ncas.org/condon/text/case46.htm
47. Dom Armentano, OC's moment in UFO history. The Orange County Register, October 30, 2009. https://www.ocregister.com/2009/10/30/ocs-moment-in-ufo-history/. See also: https://www.washingtonpost.com/news/morning-mix/wp/2015/01/21/two-decades-ofmysterious-air-force-ufo-files-now-available-online/
48. Bowen C (1970) "Progress at Cradle Hill" in the March/April edition of Flying Saucer Review, an international journal devoted to the study of Unidentified Flying Objects, vol 17, no 2 (1970). http://www.ignaciodarnaude.com/ufologia/FSR%201971%20V%2017%20N%202.pdf
49. Minchin T (2011) Storm, the animated movie. https://www.youtube.com/watch?v=HhGuXCuDb1U. Accessed 7 April 2011
50. Worth1000.com is now part of DesignCrowd.com, which has preserved all the amazing Worth1000 content here so you can search the archives to find old favorites and new contest art. https://blog.designcrowd.com/article/898/worth1000-on-designcrowd
51. Tumulty K (2007) How the right went wrong. Time Magazine. (Photograph by David Hume Kennerly. Tear by Tim O'Brien.) http://content.time.com/time/covers/0,16641,20070326,00.html. Accessed 15 March 2007
52. "The Age of Instagram Face. How social media, FaceTune, and plastic surgery created a single, cyborgian look", by Jia Tolentino. The New Yorker. https://www.newyorker.com/culture/decade-in-review/the-age-of-instagram-face. Accessed 12 Dec 2019
53. This guy can't stop photoshopping himself into Kendall Jenner's Instagram pics. Twisted Sifter. http://twistedsifter.com/2017/01/guy-photoshops-himself-into-kendall-jenners-pics/. Kirby Jenner, fraternal twin of Kendall Jenner, is @KirbyJenner. Accessed 5 Jan 2017
54. Shiffman D (2013) How to tell if a "shark in flooded city streets after a storm" photo is a fake in 5 easy steps. Southern Fried Science. http://www.southernfriedscience.com/how-to-tell-if-ashark-in-flooded-city-streets-after-a-storm-photo-is-a-fake-in-5-easy-steps/. Accessed 23 Jan 2013
55. Geigner T (2013) This week's bad photoshopping lesson comes from Scientology (from the the-thetans-did-it dept). Techdirt. Accessed 16 May 2013
56. Live Science Staff, "10 Paranormal videos debunked". Live Science. https://www.livescience.com/33237-6-paranormal-video-hoaxes.html. Accessed 27 April 2011
57. Boehler P (2013) Nine worst doctored photos of Chinese officials. South China Morning Post. https://www.scmp.com/news/china-insider/article/1343568/slideshow-doctoredphotos-chinese-officials. Accessed 30 Oct 2013
58. Dickson EJ (2014) Plastic surgeons say Three-Boob Girl is a hoax. The Daily Dot. https://www.dailydot.com/irl/three-boob-internet-hoax/. Accessed 23 Sept 2014
59. Bruce Shapiro: What happens when Photoshop goes too far? PBS Newshour. https://www.pbs.org/newshour/show/now-see-exhibit-chronicles-manipulated-news-photos. Accessed 26 July 2015
60. Neil Slade, Haiti UFO DEBUNKED, slow motion and enhanced stills. Video posted on Aug 10, 2007. https://www.youtube.com/watch?v=rrrx9izp0Lc, https://www.snopes.com/fact-check/ufos-over-haiti/
61. Sarno D (2007) It came from outer space. Los Angeles Times. http://www.latimes.com/newsletters/topofthetimes/la-et-ufo22aug22-story.html. Accessed 22 Aug 2007
Newitz A (2012) Why is this the most popular UFO footage on YouTube? io9, Gizmodo. https://io9.gizmodo.com/5912215/why-is-this-the-most-popular-ufo-footage-on-youtube. Accessed 22 May 2012
63. Weiskott E (2016) Before 'fake news' came false prophecy. The Atlantic. https://www.theatlantic.com/politics/archive/2016/12/before-fake-news-came-false-prophecy/511700/. Accessed 27 Dec 2016
64. Holiday R (2012) How your fake news gets made (two quick examples). Forbes. https://www.forbes.com/sites/ryanholiday/2012/05/24/how-your-fake-news-gets-made-two-quick-examples/. Accessed 24 May 2012
65. Achenbach J (2015) Why do many reasonable people doubt science? National Geographic. https://www.nationalgeographic.com/magazine/2015/03/
66. Bergstrom CT, West J (2017) Calling bullshit: data reasoning in a digital world. INFO 198/BIOL 106B, University of Washington, Autumn Quarter 2017. https://callingbullshit.org/syllabus.html
67. Aschwanden C (2015) Science isn't broken, it's just a hell of a lot harder than we give it credit for. FiveThirtyEight. https://fivethirtyeight.com/features/science-isnt-broken/. Accessed 19 Aug 2015
68. Yglesias M (2018) Mark Zuckerberg has been apologizing for reckless privacy violations since he was a freshman. Vox. https://www.vox.com/2018/4/10/17220290/mark-zuckerberg-facemash-testimony. Accessed 11 April 2018
69. Constine J (2018) Truepic raises $8M to expose deepfakes, verify photos for Reddit. TechCrunch. https://techcrunch.com/2018/06/20/detect-deepfake/. Accessed 20 June 2018
70. Building a better news experience on YouTube, together. YouTube Official Blog. https://youtube.googleblog.com/2018/07/building-better-news-experience-on.html. Accessed 9 July 2018
71. Trulove R (2018) Wildlife writer and conservationist of over a half century. Quora. https://top.quora.com/What-is-a-group-of-squirrels-called. Accessed 25 March 2018
72. Blakeslee S (1997) Kentucky doctors warn against a regional dish: squirrels' brains. New York Times. https://www.nytimes.com/1997/08/29/us/kentucky-doctors-warn-against-a-regional-dish-squirrels-brains.html. Accessed 29 Aug 1997
73. Data never sleeps 6.0. https://www.domo.com/learn/data-never-sleeps-6
74. Kirn SL, Hinders MK (2020) Dynamic wavelet fingerprint for differentiation of tweet storm types. Soc Netw Anal Min 10:4. https://doi.org/10.1007/s13278-019-0617-3
75. Raschka S (2015) Python machine learning. Packt Publishing Ltd., Birmingham
76. Le Q, Mikolov T (2014) Distributed representations of sentences and documents. In: International conference on machine learning, pp 1188–1196
77. Mikolov T et al (2013) Distributed representations of words and phrases and their compositionality. Adv Neural Inf Process Syst, pp 3111–3119
78. Jedrzejowicz J, Zakrzewska M (2017) Word embeddings versus LDA for topic assignment in documents. In: Nguyen N, Papadopoulos G, Jedrzejowicz P, Trawinski B, Vossen G (eds) Computational collective intelligence. ICCCI 2017. Lecture notes in computer science, vol 10449. Springer, Cham
79. Lau JH, Baldwin T (2016) An empirical evaluation of doc2vec with practical insights into document embedding generation. In: Proceedings of the 1st workshop on representation learning for NLP, Berlin, Germany, pp 78–86
80. Levy O, Goldberg Y (2014) Neural word embedding as implicit matrix factorization. Adv Neural Inf Process Syst, pp 2177–2185
81. Niu L, Dai X (2015) Topic2Vec: learning distributed representations of topics. CoRR, arXiv:1506.08422
82.
Xun G et al (2017) Collaboratively improving topic discovery and word embeddings by coordinating global and local contexts. KDD '17 research paper, pp 535–543
83. Xu W et al (2003) Document clustering based on non-negative matrix factorization. In: Research and development in information retrieval conference proceedings, pp 267–273
84. Jähnichen P et al (2018) Scalable generalized dynamic topic models. arXiv:1803.07868
85. Blei DM, Lafferty JD (2006) Dynamic topic models. In: Proceedings of the 23rd international conference on machine learning, pp 113–120
86. Saha A, Sindhwani V (2012) Learning evolving and emerging topics in social media: a dynamic NMF approach with temporal regularization. In: Proceedings of the fifth ACM international conference on web search and data mining, pp 693–702
87. Greene D, Cross JP (2017) Exploring the political agenda of the European Parliament using a dynamic topic modeling approach. Polit Anal 25(1):77–94
88. Chen Y et al (2015) Modeling emerging, evolving, and fading topics using dynamic soft orthogonal NMF with sparse representation. In: 2015 IEEE international conference on data mining, pp 61–70
89. Deerwester S et al (1990) Indexing by latent semantic analysis. J Am Soc Inf Sci 41(6):391–407
90. Blei DM, Ng AY, Jordan MI (2003) Latent Dirichlet allocation. J Mach Learn Res 3:993–1022
91. Pedregosa F et al (2011) Scikit-learn: machine learning in Python. J Mach Learn Res 12:2825–2830
92. Mimno D et al (2011) Optimizing semantic coherence in topic models. In: Proceedings of the 2011 conference on empirical methods in natural language processing, pp 262–272
93. Kim S (2018) 5G is making its global debut at Olympics, and it's wicked fast. Bloomberg. https://www.bloomberg.com/news/articles/2018-02-12/5g-is-here-super-speed-makes-worldwide-debut-at-winter-olympics
94. Tweepy documentation. http://docs.tweepy.org/en/v3.5.0/index.html
95. Daubechies I (1992) Ten lectures on wavelets, vol 61. SIAM
96. Hou J, Hinders MK (2002) Dynamic wavelet fingerprint identification of ultrasound signals. Mater Eval 60:1089–1093
97. Bertoncini CA, Hinders MK (2010) Fuzzy classification of roof fall predictors in microseismic monitoring. Measurement 43(10):1690–1701
98. Bertoncini CA, Rudd K, Nousain B, Hinders M (2012) Wavelet fingerprinting of radio-frequency identification (RFID) tags. IEEE Trans Ind Electron 59(12):4843–4850
99. Bingham J, Hinders M (2009) Lamb wave characterization of corrosion-thinning in aircraft stringers: experiment and three-dimensional simulation. J Acoust Soc Am 126(1):103–113
100. Bingham J, Hinders M, Friedman A (2009) Lamb wave detection of limpet mines on ship hulls. Ultrasonics 49(8):706–722
101. Hinders M, Bingham J, Rudd K, Jones R, Leonard K (2006) Wavelet thumbprint analysis of time domain reflectometry signals for wiring flaw detection. In: Thompson DO, Chimenti DE (eds) Review of progress in quantitative nondestructive evaluation, vol 25. American Institute of Physics Conference Series, vol 820, pp 641–648
102. Hou J, Leonard KR, Hinders MK (2004) Automatic multi-mode Lamb wave arrival time extraction for improved tomographic reconstruction. Inverse Probl 20(6):1873–1888
103. Hou J, Rose ST, Hinders MK (2005) Ultrasonic periodontal probing based on the dynamic wavelet fingerprint. EURASIP J Adv Signal Process 2005:1137–1146
104. Miller CA, Hinders MK (2014) Classification of flaw severity using pattern recognition for guided wave-based structural health monitoring. Ultrasonics 54(1):247–258
105.
Skinner E, Kirn S, Hinders M (2019) Development of underwater beacon for Arctic through-ice communication via satellite. Cold Reg Sci Technol 160:58–79
106. Cohen L (1995) Time-frequency analysis, vol 778. Prentice Hall, Upper Saddle River
107. Bertoncini CA (2010) Applications of pattern classification to time-domain signals. PhD dissertation, William & Mary, Department of Physics
108. Pivot to video: inside NBC's social media strategy for the 2018 Winter Games
109. Dillinger A. Everything you need to know about watching the NFL on Twitter. The Daily Dot. https://www.dailydot.com/debug/how-to-watch-nfl-games-on-twitter/
110. Bradshaw S, Howard PN (2018) The global organization of social media disinformation campaigns. J Int Aff 71:23–35
111. Keller FB et al (2019) Political astroturfing on Twitter: how to coordinate a disinformation campaign. Polit Commun, pp 1–25
112. Pierri F et al (2019) Investigating Italian disinformation spreading on Twitter in the context of 2019 European elections. arXiv:1907.08170
113. Yao Y et al (2017) Automated crowdturfing attacks and defenses in online review systems. In: Proceedings of the 2017 ACM SIGSAC conference on computer and communications security, pp 1143–1158
114. Zannettou S et al (2019) Disinformation warfare: understanding state-sponsored trolls on Twitter and their influence on the web. In: Companion proceedings of the 2019 World Wide Web conference, pp 218–226
115. Starbird K, Palen L (2012) (How) will the revolution be retweeted? Information diffusion and the 2011 Egyptian uprising. In: Proceedings of the ACM 2012 conference on computer supported cooperative work, pp 7–16
116. Xiong F et al (2012) An information diffusion model based on retweeting mechanism for online social media. Phys Lett A 376(30–31):2103–2108
117. Zannettou S et al (2017) The web centipede: understanding how web communities influence each other through the lens of mainstream and alternative news sources. In: Proceedings of the 2017 internet measurement conference, pp 405–417
118. Shao C et al (2017) The spread of fake news by social bots. arXiv:1707.07592
119. Woolley SC (2016) Automating power: social bot interference in global politics. First Monday 21(4)
120. Woolley SC (2020) The reality game: how the next wave of technology will break the truth. PublicAffairs. ISBN 9781541768246
121. Schneier B (2020) Bots are destroying political discourse as we know it. The Atlantic
122. Ferrara E et al (2016) The rise of social bots. Commun ACM 59(7):96–104
123. Shao C et al (2018) The spread of low-credibility content by social bots. Nat Commun 9(1):4787
124. Mueller RS (2019) Report on the investigation into Russian interference in the 2016 presidential election. US Department of Justice, Washington, DC
125. Bessi A, Ferrara E (2016) Social bots distort the 2016 US Presidential Election online discussion. First Monday 21(11)
126. Linvill DL et al (2019) "The Russians are hacking my brain!" Investigating Russia's Internet Research Agency Twitter tactics during the 2016 United States presidential campaign. Comput Hum Behav
127. Subrahmanian VS et al (2016) The DARPA Twitter bot challenge. Computer 49(6):38–46
128. Minnich A et al (2017) BotWalk: efficient adaptive exploration of Twitter bot networks. In: Proceedings of the 2017 IEEE/ACM international conference on advances in social networks analysis and mining, pp 467–474
129. Varol O et al (2017) Online human-bot interactions: detection, estimation, and characterization.
In: Eleventh international AAAI conference on web and social media
130. Kudugunta S, Ferrara E (2018) Deep neural networks for bot detection. Inf Sci 467:312–322
131. Yang K et al (2019) Scalable and generalizable social bot detection through data selection. arXiv:1911.09179
132. Davis CA et al (2016) BotOrNot: a system to evaluate social bots. In: Proceedings of the 25th international conference companion on the World Wide Web, pp 273–274
133. Yang K et al (2019) Arming the public with artificial intelligence to counter social bots. Hum Behav Emerg Technol 1(1):48–61
134. Cresci S et al (2017) The paradigm-shift of social spambots: evidence, theories and tools for the arms race. In: Proceedings of the 26th international conference on World Wide Web companion, pp 963–972
135. Cresci S et al (2017) Social fingerprinting: detection of spambot groups through DNA-inspired behavioral modeling. IEEE Trans Dependable Secure Comput 15(4):561–576
136. Cresci S et al (2019) On the capability of evolved spambots to evade detection via genetic engineering. Online Soc Netw Media 9:1–16
137. Stradbrooke S (2018) US Supreme Court rules federal sports betting ban is unconstitutional. CalvinAyre.com. https://calvinayre.com/2018/05/14/business/us-supreme-court-rules-paspa-sports-betting-ban-unconstitutional/. Accessed 14 May 2018