Supplemental Discussion

Reduction in Activities

Modules 2 and 3 each contain fourteen observational activities designed to elicit specific responses, which are scored in each module's 28 behavioral items. Because our classifiers reduce the number of behavioral items in module 2 and module 3 by 67.86% and 57.14%, respectively (from 28 items to 9 and 12), certain activities in each module may no longer be necessary. However, determining which activities remain valid in a reduced framework is difficult, as the same behavioral item can be measured in multiple activities.

For activities that measure a single item, the decision is clear. In module 2, activities #2 Response to Name and #6 Response to Joint Attention are no longer necessary, as items B4 (Response to Name) and B7 (Response to Joint Attention) were not used in the classifier. The same goes for activity #3, Make-Believe Play, because neither C1 (Functional Play with Objects) nor C2 (Imagination/Creativity) was included. With regard to module 3, activities #2 Make-Believe Play and #9 Emotions can be removed, as C1 (Imagination/Creativity) and B5 (Empathy/Comments on Others’ Emotions) were not part of the twelve features used in the SVM.

Deciding whether to keep an activity that measures multiple behaviors is more challenging. It is easy to eliminate activities for which none of the measured items were used in the classifier (e.g., module 3 activities #1 Construction Task and #4 Demonstration Task), but the decision is much less clear for activities in which only a subset of the measured items is used. For instance, in module 3, activities #10 Social Difficulties and Annoyance, #12 Friends and Marriage, and #13 Loneliness primarily measure B6 (Insight), but also measure 8 of the 11 features in the SVM classifier (A4, A7, A8, B1, B2, B7, B8, B9).
Even though B6 (Insight) was not used in our classifier, we decided that all three activities were worth keeping, since they cover 73% of the items in our classifier. However, whether all three activities are necessary remains a matter of conjecture. The decision of which activities are worth keeping is further muddled because any behavior among the 28 items could conceivably be observed during any activity. The items in section E (Other Abnormal Behaviors) are a prime example. Hence, when we claim that the logistic regression classifier can eliminate three activities from module 2 (Table 4) and the SVM classifier can eliminate six activities from module 3 (Table 5), this merely represents an estimate to the best of our abilities.

Table S1: Co-morbidities in sample

Disorder                                       Module 2 N   Module 3 N
ADHD                                           84           434
Anxiety Disorder                               6            79
Anxiety Disorder NOS                           3            7
Bipolar Disorder                               1            11
Chronic Motor or Vocal Tic Disorder            0            4
Cognitive Disorder NOS                         0            4
Communication Disorder NOS                     1            3
Developmental Coordination Disorder            18           27
Disruptive Behavior Disorder NOS               4            3
Dysthymic Disorder                             1            9
Encopresis                                     0            1
Enuresis                                       4            13
Expressive Language Disorder                   0            12
Generalized Anxiety Disorder                   0            2
Language Delay                                 3            0
Learning Disorder NOS                          0            3
Mathematics Disorder                           3            24
Mixed Receptive Expressive Language Disorder   8            8
Mood Disorder NOS                              0            2
Obsessive Compulsive Disorder                  10           59
Oppositional Defiant Disorder                  3            4
Phonological Disorder                          11           28
Reading Disorder                               3            35
Schizophrenia                                  6            22
Stereotypic Movement Disorder                  0            2
Stuttering                                     0            1
Tic Disorder NOS                               1            0
Tourette Syndrome                              7            39
Written Expression Disorder                    4            36

Abbreviations: NOS, not otherwise specified.

Co-morbidities in the 1799 individuals who were administered module 2 and the 2741 individuals who were administered module 3.

Table S2: Test results of the module 2 logistic regression classifier

Dataset   TP     FN   FP   TN   Sensitivity   Specificity   PPV      NPV      Accuracy
AC        124    3    0    10   0.9764        1.0000        1.0000   0.7692   0.9781
AGRE      337    5    4    19   0.9854        0.8261        0.9883   0.7917   0.9753
SSC       598    4    0    0    0.9934        NA            1.0000   NA       0.9934
SVIP      17     1    3    30   0.9444        0.9091        0.8500   0.9677   0.9216
Total     1076   13   7    59   0.9881        0.8939        0.9935   0.8194   0.9827

Abbreviations: AC, Autism Consortium; AGRE, Autism Genetic Resource Exchange; SSC, Simons Simplex Collection; SVIP, Simons Variance in Individuals Project; TP, true positive; FP, false positive; FN, false negative; TN, true negative; PPV, positive predictive value; NPV, negative predictive value.

Test results for the nine-feature module 2 logistic regression classifier. Specificity and NPV are not available for SSC because it lacks individuals without ASD.

Table S3: Test results of the module 3 Support Vector Machine classifier

Dataset   TP     FN   FP   TN    Sensitivity   Specificity   PPV      NPV      Accuracy
AC        187    10   3    57    0.9492        0.9500        0.9842   0.8507   0.9494
NDAR      127    3    0    27    0.9769        1.0000        1.0000   0.9000   0.9809
SSC       1537   29   0    0     0.9815        NA            1.0000   NA       0.9815
SVIP      29     2    3    124   0.9355        0.9764        0.9063   0.9841   0.9684
Total     1880   44   6    208   0.9771        0.9720        0.9968   0.8254   0.9766

Abbreviations: AC, Autism Consortium; NDAR, National Database of Autism Research; SSC, Simons Simplex Collection; SVIP, Simons Variance in Individuals Project; TP, true positive; FP, false positive; FN, false negative; TN, true negative; PPV, positive predictive value; NPV, negative predictive value.

Test results of the twelve-feature module 3 SVM classifier. Specificity and NPV are not available for SSC because it lacks individuals without autism.
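
For readers who wish to check the derived columns of Tables S2 and S3, the following is a minimal sketch of how sensitivity, specificity, PPV, NPV, and accuracy follow from the TP, FN, FP, and TN counts. It is an illustration only, not the code used in the study. The function names `safe_ratio` and `confusion_metrics` are hypothetical, and returning `None` for specificity and NPV when a dataset has no individuals without ASD is an assumption made here to mirror the NA entries in the SSC rows.

```python
from typing import Dict, Optional


def safe_ratio(numerator: int, denominator: int) -> Optional[float]:
    """Return numerator / denominator, or None when the denominator is zero."""
    return numerator / denominator if denominator else None


def confusion_metrics(tp: int, fn: int, fp: int, tn: int) -> Dict[str, Optional[float]]:
    """Derive the five reported statistics from confusion-matrix counts."""
    # Assumption: when a dataset contains no individuals without ASD
    # (FP + TN == 0), specificity and NPV are treated as unavailable,
    # mirroring the "NA" entries in the SSC rows of the tables.
    has_negatives = (fp + tn) > 0
    return {
        "sensitivity": safe_ratio(tp, tp + fn),
        "specificity": safe_ratio(tn, fp + tn),
        "ppv": safe_ratio(tp, tp + fp),
        "npv": safe_ratio(tn, tn + fn) if has_negatives else None,
        "accuracy": safe_ratio(tp + tn, tp + fn + fp + tn),
    }


# Usage: the "Total" row of Table S2 (module 2 logistic regression classifier).
total_s2 = confusion_metrics(tp=1076, fn=13, fp=7, tn=59)
for name, value in total_s2.items():
    print(name, "NA" if value is None else f"{value:.4f}")
```

Rounded to four decimal places, the counts in the Total row of Table S2 reproduce the tabulated values (0.9881, 0.8939, 0.9935, 0.8194, 0.9827); the same function can be applied to any other row of either table.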