Automated species classification with SonoBat 3

SonoBat uses a decision engine based on the quantitative analysis of approximately 10,000 species-known recordings from across North America. SonoBat automatically recognizes and sorts calls, then processes the calls to extract six dozen parameters that describe the time-frequency and time-amplitude trends of a call.

SonoBat’s intelligent call trending algorithm can recognize the end of calls buried in echo and noise. SonoBat can also successfully establish trends through noise and from low-power signals. SonoBat generates high-resolution continuous trends of time-frequency and time-amplitude content that enable robust parameter extraction.

Including amplitude parameters increases classification performance above that achieved with time-frequency parameters alone. Example classifier discriminant functions for Myotis leibii vs. M. septentrionalis showing the relative dependence of frequency and amplitude parameters selected as important for discrimination.

SonoBat will classify individual calls processed as high-resolution standard views. However, classifying an entire sequence will typically provide more reliable results, as this method benefits from the combined information within the sequence. Press this button to classify a currently opened sequence.

Knowing when not to make a classification decision prevents misclassifications. SonoBat assesses a number of signal quality and reliability indicators, and if any fall below their acceptable level, the classification indicator displays as grayed out to indicate an unreliable result. In this example the quality falls below the default minimum acceptable value of 0.80 (may be changed in the Pref Panel).

SonoBat rejected this example because it has an overloaded signal (a.k.a. saturated or clipped). Overloaded signals provide unreliable amplitude information that could lead to misclassification. Rejected calls do not contribute to sequence decisions.
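The two rejection conditions described above (quality below the minimum, or an overloaded signal) can be illustrated with a minimal Python sketch. This is not SonoBat's code: the names `check_call` and `CallCheck` are hypothetical, and the quality rating and overload flag are assumed to be computed elsewhere. Only the 0.80 default comes from the text.

```python
from dataclasses import dataclass, field

# Default minimum acceptable quality cited in the text (adjustable in the Pref Panel).
DEFAULT_MIN_QUALITY = 0.80


@dataclass
class CallCheck:
    """Hypothetical result record: acceptance flag plus any rejection reasons."""
    accepted: bool
    reasons: list = field(default_factory=list)


def check_call(quality, overloaded, min_quality=DEFAULT_MIN_QUALITY):
    """Accept a call only if no reliability indicator fails.

    `quality` and `overloaded` are assumed inputs; how SonoBat derives
    them is not described here.
    """
    reasons = []
    if quality < min_quality:
        reasons.append("quality below minimum")
    if overloaded:
        reasons.append("overloaded signal")
    return CallCheck(accepted=not reasons, reasons=reasons)


# A call can fail one test, both, or neither.
print(check_call(0.85, overloaded=False))
print(check_call(0.75, overloaded=True))
```

Collecting every failed indicator, rather than stopping at the first, mirrors the examples in which a call is rejected for more than one reason at once.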
SonoBat rejected this example because it both has an overloaded signal and falls below the default minimum acceptable quality of 0.80.

SonoBat classifies calls using an ensemble consensus of hierarchical decision algorithms and reports a single species decision if the result exceeds the discriminant probability threshold set in the Pref Panel. If a classification decision does not exceed the threshold, then SonoBat displays the species or hierarchical groups that sum to the threshold, as shown in this example.

If the processed data from a call falls outside of known or confident data space, SonoBat will display a classification result of “none.” A result of “none” may also indicate, as in this example, a call whose trend reaches either end of its std view window. SonoBat treats this as an incomplete and thereby unreliable set of call data and does not venture a classification. Should this happen when manually selecting a call for std view display and classification, simply select a larger std view window and the call will reprocess. Note: A more generous spacing tolerance reduces truncated call trends during sequence classifications and automated batch runs.

In this example, one of the hierarchical group decisions did not exceed the discriminant probability threshold required to advance to the next decision level. If the decision does not reach the species level, then SonoBat reports the last level that did not satisfy the threshold.

Detectors can record only the strongest portions of calls from bats partially out of range, resulting in call fragments. (Grayed-out result indicates unreliable result.) Some call fragments mimic the data characteristics of other species, and this can result in misclassifications unless recognized as such. E.g., out-of-range little brown bats will leave fragments from the body of the call having a simple curved sweep that mimics red bats.
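The threshold reporting rule described earlier in this section (one species if it exceeds the threshold, otherwise the candidates that together sum to the threshold) can be sketched as follows. This is an illustration under stated assumptions, not SonoBat's algorithm: the function name `report_decision`, the input dictionary of per-candidate probabilities, and the four-letter labels are all hypothetical, and the "none" outcome for out-of-range data is not modeled.

```python
def report_decision(probabilities, threshold=0.90):
    """Return the list of candidate labels to report.

    `probabilities` maps a species or group label to an assumed
    discriminant probability. A single label is returned if the top
    candidate exceeds the threshold; otherwise the highest-ranked
    candidates are accumulated until their sum reaches the threshold.
    """
    ranked = sorted(probabilities.items(), key=lambda kv: kv[1], reverse=True)
    if not ranked:
        return []

    top_label, top_p = ranked[0]
    if top_p > threshold:
        return [top_label]

    reported, total = [], 0.0
    for label, p in ranked:
        reported.append(label)
        total += p
        if total >= threshold:
            break
    return reported


# Illustrative values only.
print(report_decision({"Epfu": 0.96, "Lano": 0.04}))
print(report_decision({"Mylu": 0.55, "Myse": 0.40, "Labo": 0.05}))
```

With the 0.90 default, the first call reports a single label, while the second reports two candidates because neither exceeds the threshold alone.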
SonoBat applies secondary logic tests to classification decisions and rejects calls that do not meet minimum classification criteria.

Another call later in the same sequence as the call fragment in the previous slide. This call had sufficient criteria for this species to output a confident species decision.

Sequence classification performed on the same file showing the decision based on all accepted calls within the sequence. Classifying an entire sequence (i.e., bat pass) typically provides more confident results than individual call classification, as this method benefits from the combined information within the sequence.

For a sequence classification, SonoBat first ranks the calls in a sequence based on apparent time-frequency coverage and amplitude and then classifies the individual calls in descending order of rank, up to the number designated in the preferences for "max # of calls to consider per file." If any of these calls results in a rejected classification, SonoBat moves on to the next call in the ranked order until reaching the "max # of calls to consider per file" or the end of the available acceptable calls in the file. SonoBat then considers the accepted calls by hierarchical group and species and processes them to generate a mean sequence decision.

SonoBat also determines a sequence vote decision based on the individual call decisions. A majority vote requires a minimum of two calls for the majority species (except for species such as L. noctivagans or T. brasiliensis that can have few calls per second) and requires the majority species to have at least twice as many calls as the second and third most prevalent species combined (if classified). Consensus of the sequence vote and the mean sequence decision provides the most confident results.
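The majority vote rule above can be expressed as a short Python sketch. It is an illustration only: the function name `sequence_vote` and the four-letter codes `Lano` and `Tabr` (standing for L. noctivagans and T. brasiliensis) are assumptions, and the text does not say what minimum applies to the excepted species, so the value of 1 used here is a guess.

```python
from collections import Counter

# Species the text names as exceptions to the two-call minimum.
# The codes and the reduced minimum of 1 are assumptions for illustration.
LOW_CALL_RATE_SPECIES = {"Lano", "Tabr"}


def sequence_vote(call_decisions, min_calls=2, low_rate_min_calls=1):
    """Return the majority species for a sequence, or None if no majority.

    `call_decisions` is a list of species labels, one per accepted call.
    """
    counts = Counter(call_decisions).most_common()
    if not counts:
        return None

    top_species, top_count = counts[0]
    required = low_rate_min_calls if top_species in LOW_CALL_RATE_SPECIES else min_calls
    if top_count < required:
        return None

    # Sum of the second and third most prevalent species, if present.
    rivals = sum(n for _, n in counts[1:3])
    if top_count < 2 * rivals:
        return None
    return top_species


print(sequence_vote(["Epfu"] * 4 + ["Lano", "Mylu"]))   # 4 >= 2 * (1 + 1)
print(sequence_vote(["Epfu"] * 3 + ["Lano"] * 2))       # 3 < 2 * 2
```

Returning `None` when the rule is not met keeps "no vote decision" distinct from any species label, which makes it easy to compare the vote against a separately computed mean sequence decision.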
As an example of these sequence decisions, for 1,444 known Northeastern US recordings, the consensus decision correctly classified 98.5% of recordings, the mean sequence decision 98.9%, and the agreement of consensus and mean decisions 99.1%. (Acceptance rates were 58.8%, 47.4%, and 47.3%, respectively, using a 0.90 threshold setting.)

Note: Automated batch runs will only consider the first default segment of files larger than your max segment to process setting. Current file length. Max segment to process.

The quality and accuracy of call trending and classification depend upon the quality of the recorded signals. Recording from the ground, near flat surfaces, or through tubes will render distorted signals such as these. A std view from the sequence in the previous slide reveals the extent of the signal distortion and how it inhibits call trending and the recognition of call parameters. In summary: garbage in, garbage out.

Although distorted and noisy calls typically prevent confident and reliable classification, SonoBat still attempts to distinguish bat calls from noise and tallies sequences with calls during batch processing to facilitate counting total bat passes*.

* To best tally passes, run the SonoBatch utility without first scrubbing files as you would for manual processing.

SonoBat 3 classification preference settings

Acceptable discriminant probability threshold for classification decisions. Recommended and default value: 0.90. In general, this term provides a measure of how closely the data approach the centroid of the multivariate data space of the group or species decision. Note the term “probability” in the name: a high value does not guarantee a correct classification. As a result of species call variability, noise, and recording distortions, species with similar call morphologies can still occasionally generate calls and provide recordings that intrude on the centroid of another species’ data space and result in a high discriminant probability rating despite an incorrect absolute classification.
Minimum acceptable call quality. Recommended and default value: 0.80. SonoBat will reject calls with a quality rating below this value and not consider them for sequence classification or for parameterization output. To decide where to set the acceptable quality for your recordings, you can observe the quality rating of calls manually selected for std view display with analysis enabled. SonoBat synthesizes the quality rating from a combination of signal strength and noise. The rating works for most recording situations; however, some out-of-range calls in very quiet recording environments may nevertheless score disproportionately high quality ratings, i.e., bats at a distance from the microphone that will only yield call fragments.

Max # of calls to consider per file. Six or more calls will generally produce more reliable sequence classifications. Recommended and default value: 8.

When enabled, SonoBat will immediately classify individual calls when rendered as std views.

Batch processing of files for classification decisions and call parameter extraction

Press this button to open the SonoBatch panel. SonoBat will process designated batches of individual files or folders of files to extract call parameters or to classify to species. Set to classify. Set to parameterize. Drop files or folders here to assemble a batch. Or push this button to navigate to files and folders. If processing stereo files, select channel to process.

Dropping a folder will go one directory level deep, i.e., a folder of folders will load all the files within the embedded folders. (Except for files in folders named “Deleted Files” or “Scrubbed Files.”) Click for a do-over.

Clicking on the Directory listing tab will display the folders designated for processing and the number of files in each folder. Total no. of files to process.

Toggles option to append classification decisions to the filename when a decision exceeds the discriminant probability threshold.
Appending the decision to the filename can facilitate vetting classification decisions after a SonoBatch run. E.g., if SonoBat classified the file "BatSpring-29Jul09-2031,12.wav" as Epfu, the SonoBatch processor would change the filename to "BatSpring-29Jul09-2031,12-Epfu.wav".

Files already processed. Batch process in progress. Batch progress status. Total no. of files to process.

Example spreadsheet output from SonoBatch run. Consensus of vote and mean decision. Classification by vote. Rudimentary bat recognition for tallying passes. Mean sequence decision entered in cell if discriminant probability exceeds threshold. Empty cell if decision did not reach species level.

Example spreadsheet section from SonoBatch run. The last columns of the output spreadsheet display, in descending prevalence, all calls in the sequence classified to species with a discriminant probability (DP) of at least 0.75. Individual calls classified to a species of interest will display here even if the sequence did not meet the conditions for a sequence decision; they may also come from a second, less dominant species in the sequence. This enables querying the spreadsheet for particular species to perform manual inspection for species of interest.
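Two of the batch conveniences described above lend themselves to a short Python sketch: appending a species code to a filename, and scanning exported spreadsheet rows for a species of interest. Neither function is part of SonoBat. The names `append_decision` and `rows_with_species` are hypothetical, and the column names in the usage example are invented placeholders, since the actual SonoBatch column headers are not given here.

```python
import os


def append_decision(filename, species_code):
    """Insert '-<code>' before the file extension, as in the example above."""
    stem, ext = os.path.splitext(filename)
    return f"{stem}-{species_code}{ext}"


def rows_with_species(rows, species_code, per_call_columns):
    """Return the rows in which any per-call species column matches the code.

    `rows` is a list of dicts (for instance from csv.DictReader);
    `per_call_columns` names the columns holding per-call species
    decisions. Both are assumptions about the exported layout.
    """
    return [
        row for row in rows
        if any((row.get(col) or "").strip() == species_code
               for col in per_call_columns)
    ]


print(append_decision("BatSpring-29Jul09-2031,12.wav", "Epfu"))

# Placeholder column names, for illustration only.
rows = [
    {"Filename": "a.wav", "call_spp_1": "Epfu", "call_spp_2": "Lano"},
    {"Filename": "b.wav", "call_spp_1": "Mylu", "call_spp_2": ""},
]
print([r["Filename"] for r in rows_with_species(rows, "Lano", ["call_spp_1", "call_spp_2"])])
```

Checking the per-call columns rather than only the sequence decision column matches the point made above: a species of interest can appear in a file even when the sequence as a whole received no decision.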