The X1 family of methods that combines B3LYP with neural network corrections for an accurate yet efficient prediction of thermochemistry

Jianming Wu, Yuwei Zhou, and Xin Xu

Correspondence to: Xin Xu (E-mail: xxchem@fudan.edu.cn)

Collaborative Innovation Center of Chemistry for Energy Materials, Shanghai Key Laboratory of Molecular Catalysis and Innovative Materials, MOE Laboratory for Computational Physical Science, Department of Chemistry, Fudan University, Shanghai, 200433, China

Table S1. The training set for the X1 family of methods. The core set includes the G2/97 set and the border set.

G2/97

No.   Formula     Molecule
1     H2          H2
2     HLi         LiH
3     HBe         BeH
4     CH          CH
5     CH2         CH2 (3B1)
6     CH2         CH2 (1A1)
7     CH3         CH3
8     CH4         CH4
9     HN          NH
10    H2N         NH2
11    H3N         NH3
12    HO          HO
13    H2O         HOH
14    HF          HF
15    H2Si        SiH2 (1A1)
16    H2Si        SiH2 (3B1)
17    H3Si        SiH3
18    H4Si        SiH4
19    H2P         PH2
20    H3P         PH3
21    H2S         H2S
22    HCl         HCl
23    Li2         Li2
24    LiF         LiF
25    C2H2        C2H2
26    C2H4        H2C=CH2
27    C2H6        C2H6
28    CN          CN
29    CHN         HCN
30    CO          CO
31    CHO         HCO
32    CH2O        H2C=O
33    CH4O        CH3OH
34    N2          N2
35    H4N2        H2N-NH2
36    NO          NO
37    O2          O2
38    H2O2        HOOH
39    F2          F2
40    CO2         CO2
41    Na2         Na2
42    Si2         Si2
43    P2          P2
44    S2          S2
45    Cl2         Cl2
46    NaCl        NaCl
47    OSi         SiO
48    CS          CS
49    OS          SO
50    OCl         ClO
51    FCl         ClF
52    H6Si2       H3Si-SiH3
53    CH3Cl       CH3Cl
54    CH4S        H3C-SH
55    HOCl        HOCl
56    O2S         SO2
57    BF3         BF3
58    BCl3        BCl3
59    F3Al        AlF3
60    AlCl3       AlCl3
61    CF4         CF4
62    CCl4        CCl4
63    COS         O=C=S
64    CS2         CS2
65    COF2        COF2
66    F4Si        SiF4
67    SiCl4       SiCl4
68    N2O         N2O
69    NOCl        ClNO
70    NF3         NF3
71    F3P         PF3
72    O3          O3
73    OF2         F2O
74    F3Cl        ClF3
75    C2F4        C2F4
76    C2Cl4       C2Cl4
77    C2NF3       CF3CN
78    C3H4        propyne
79    C3H4        allene
80    C3H4        cyclopropene
81    C3H6        propylene
82    C3H6        cyclopropane
83    C3H8        propane
84    C4H6        butadiene
85    C4H6        2-butyne
86    C4H6        methylene cyclopropane
87    C4H6        bicyclobutane
88    C4H6        cyclobutene
89    C4H8        cyclobutane
90    C4H8        isobutene
91    C4H10       trans butane
92    C4H10       isobutane
93    C5H8        spiropentane
94    C6H6        benzene
95    CH2F2       H2CF2
96    CHF3        HCF3
97    CH2Cl2      H2CCl2
98    CHCl3       HCCl3
99    CH5N        Methylamine
100   C2H3N       methyl cyanide
101   CH3NO2      nitromethane
102   CH3NO2      methyl nitrite
103   CH6Si       methyl silane
104   CH2O2       formic acid
105   C2H4O2      methyl formate
106   C2H5NO      acetamide
107   C2H5N       aziridine
108   C2N2        cyanogen
109   C2H7N       Dimethylamine
110   C2H7N       trans ethylamine
111   C2H2O       H2C=C=O (ketene)
112   C2H4O       CH2-O-CH2 (oxirane)
113   C2H4O       CH3CHO (acetaldehyde)
114   C2H2O2      O=CH-CH=O (glyoxal)
115   C2H6O       CH3CH2OH (ethanol)
116   C2H6O       CH3-O-CH3 (dimethylether)
117   C2H4S       CH2-S-CH2 (thiooxirane)
118   C2H6OS      CH3CH3SO (dimethyl sulfoxide)
119   C2H6S       CH3-CH2-SH (ethanethiol)
120   C2H6S       CH3-S-CH3 (dimethyl sulphide)
121   C2H3F       H2C=CHF
122   C2H5Cl      CH3-CH2-Cl (ethyl chloride)
123   C2H3Cl      H2C=CHCl (vinyl chloride)
124   C3H3N       H2C=CHCN (acrylonitrile)
125   C3H6O       CH3-CO-CH3 (acetone)
126   C2H4O2      CH3COOH (acetic acid)
127   C2H3OF      CH3COF (acetyl fluoride)
128   C2H3OCl     CH3COCl (acetyl chloride)
129   C3H7Cl      CH3CH2CH2Cl (propyl chloride)
130   C3H8O       (CH3)2CH-OH (isopropanol)
131   C3H8O       C2H5-O-CH3 (methyl ethyl ether)
132   C3H9N       (CH3)3N (trimethylamine)
133   C4H4O       C4H4O (furan)
134   C4H4S       C4H4S (thiophene)
135   C4H5N       C4H4NH (pyrrole)
136   C5H5N       C5H5N (pyridine)
137   HS          SH
138   C2H         CCH
139   C2H3        C2H3 (2A')
140   C2H3O       CH3CO (2A')
141   CH3O        H2COH (2A)
142   CH3O        CH3O (2A')
143   C2H5O       CH3CH2O (2A")
144   CH3S        CH3S (2A')
145   C2H5        C2H5 (2A')
146   C3H7        (CH3)2CH (2A')
147   C4H9        (CH3)3C
148   NO2         NO2

The border set

No.   Formula     Molecule
211   P4          P4
305   C7H16       3,3-dimethylpentane
306   C7H16       2,2,3-trimethylbutane
382   N3          azide radical
671   Al2Cl6      Aluminum trichloride dimer
687   Na2Cl2      Disodium dichloride
690   Li3Cl3      (LiCl)3
698   H9B5        Pentaborane
715   C6F10       Perfluorocyclohexene
737   C32H66      n-C32H66
740   CN4O8       Tetranitromethane
755   C7H5N3O6    2,4,6-trinitrotoluene
769   S8          sulfur octamer (cyclic)
789   OF3P        phosphorus oxyfluoride
797   H8Si3       Trisilane
833   Be2         Beryllium dimer
838   CH3N5       5-Aminotetrazole
839   Mg2         Magnesium dimer

Other training molecules

No.   Formula     Molecule
149   C4H6        1,2-Butadiene
150   C5H8        Isoprene
151   C5H10       Cyclopentane
152   C5H12       n-pentane
153   C5H12       Neopentane
154   C6H8        1,3-cyclohexadiene
155   C6H8        1,4-cyclohexadiene
156   C6H12       Cyclohexane
157   C6H14       n-Hexane
158   C6H14       Pentane, 3-methyl-
159   C7H8        C6H5CH3 (toluene)
160   C7H16       n-heptane
161   C8H8        1,3,5,7-Cyclooctatetraene
162   C8H18       n-octane
163   C10H8       Naphthalene
164   C10H8       Azulene
165   C3H6O2      CH3COOCH3 (methyl acetate)
166   C4H10O      (CH3)3COH (t-butanol)
167   C6H7N       C6H5NH2 (aniline)
168   C6H6O       C6H5OH (phenol)
169   C4H6O       CH2=CH-O-CH=CH2 (divinyl ether)
170   C4H8O       Tetrahydrofuran
171   C5H8O       Cyclopentanone
172   C6H4O2      [p-]benzoquinone
173   C4H4N2      Pyrimidine
174   C2H6O2S     C2H6O2S (dimethyl sulphone)
175   C6H5Cl      chlorobenzene
176   C4H4N2      1,2-dicyano ethane
177   C4H4N2      pyrazine
178   C4H4O       CH3-CO-CCH (acetyl acetylene)
179   C4H6O       CH3-CH=CH-CHO (crotonaldehyde)
180   C4H6O3      CH3-CO-O-CO-CH3 (acetic anhydride)
181   C4H6S       C4H6S (2,5-dihydrothiophene)
182   C4H7N       isobutane nitrile
183   C4H8O       methyl ethyl ketone
184   C4H8O       isobutanal
185   C4H8O2      1,4-dioxane
186   C4H8S       tetrahydrothiophene
187   C4H9Cl      t-butyl chloride
188   C4H9Cl      n-butyl chloride
189   C4H9N       tetrahydropyrrole
190   C4H9NO2     nitro-s-butane
191   C4H10O      diethyl ether
192   C4H10O2     1,1-dimethoxy ethane
193   C4H10S      t-butanethiol
194   C4H10S2     CH3CH2-S-S-CH2CH3 (diethyl disulfide)
195   C4H11N      (CH3)3C-NH2 (t-butylamine)
196   C4H12Si     Si(CH3)4 (tetramethylsilane)
197   C5H6S       C5H6S (2-methyl thiophene)
198   C5H7N       C5H7N (N-methyl pyrrole)
199   C5H10O      C5H10O (tetrahydropyran)
200   C5H10O      CH3-CH2C(=O)-CH2CH3 (diethyl ketone)
201   C5H10O2     CH3-C(=O)-O-CH(CH3)2 (isopropyl acetate)
202   C5H10S      C5H10S (tetrahydrothiopyran)
203   C5H11N      cyc-C5H10NH (piperidine)
204   C5H12O      (CH3)3-C-O-CH3 (t-butyl methyl ether)
205   C6H4F2      C6H4F2 (1,3-difluorobenzene)
206   C6H4F2      C6H4F2 (1,4-difluorobenzene)
207   C6H5F       C6H5F (fluorobenzene)
208   C6H14O      (CH3)2CH-O-CH(CH3)2 (di-isopropyl ether)
209   PF5         Phosphorus pentafluoride
210   F6S         Sulfur hexafluoride
212   SO3         Sulphur trioxide
213   SCl2        Sulfur dichloride
214   OPCl3       Phosphorus oxychloride
215   PCl5        Phosphorus pentachloride
216   O2SCl2      Sulphuryl dichloride
217   PCl3        Phosphorus trichloride
218   S2Cl2       Sulfur monochloride
219   SiCl2       Dichlorosilylene (singlet)
220   CF3Cl       Chlorotrifluoromethane
221   C2F6        Ethane, hexafluoro-
222   CF3         Trifluoromethyl radical (*CF3)
223   C6H5        Phenyl radical (*C6H5)
227   C2H2F2      Ethene, 1,1-difluoro- (CH2=CF2)
230   C2H3Cl3     1,1,1-trichloroethane
235   C2H5NO3     ethyl nitrate
236   C2H8N2      ethylenediamine (CRC# 1,2-Ethanediamine)
237   C3H3NO      oxazole
242   C3H6S       thiacyclobutane (CRC# Thietane)
250   C3H8S       1-propanethiol
252   C3H8S       Ethyl methyl sulfide
255   C3H10N2     1,2-propanediamine
258   C4H10O      sec-butanol (CRC# 2-Butanol)
259   C4H10O2     1,4-butanediol
274   C5H11Cl     1-chloropentane
286   C6H4Cl2     1,3-dichlorobenzene
287   C6H5NO2     nitrobenzene
292   C6H10       Cyclopentene, 1-methyl-
293   C6H10       1,5-hexadiene
301   C7H6O       benzaldehyde
305   C7H16       Pentane, 3,3-dimethyl-
307   C7H16S      n-heptyl mercaptan
314   C8H16       2,4,4-trimethyl-2-pentene
318   C8H18       2,3,4-trimethylpentane
323   C9H18O      diisobutyl ketone (2,6-Dimethyl-4-heptanone)
324   C9H20       Pentane, 3,3-diethyl-
330   C12H10      acenaphthene
336   C4H6        ethylacetylene (CRC# 1-Butyne)
344   C5H8        bicyclo[2.1.0]pentane (housane)
357   C8H8        styrene
363   C14H10      phenanthrene
364   C3O2        carbon suboxide (O=C=C=C=O)
377   C3H8O2      dimethoxymethane (CRC# methylal)
378   C4H6O2      biacetyl (2,3-butanedione)
379   C4H10O2     diethyl peroxide
380   C7H6O2      benzoic acid
381   C4H2O3      maleic anhydride
383   H2N2        Z-diazene
384   HN3         hydrazoic acid
385   C4N2        Dicyanoacetylene; 2-Butynedinitrile
395   C4H2N2      fumaronitrile; trans-2-Butenedinitrile
398   CH2N4       1H-tetrazole
399   NO3         nitrate radical (O=N(-O*)=O)
403   HNO2        nitrous acid, trans
404   HNO3        nitric acid
412   CH3NO3      methyl nitrate
437   NOF         nitrosyl fluoride
460   C2H3Cl3     1,1,2-trichloroethane
473   COCl        Carbonyl chloride
479   C7H5OCl     2-chlorobenzaldehyde
504   H2S5        hydrogen pentasulfide
518   C12H10S     diphenyl sulfide
523   C2H6S2      2,3-dithiabutane
525   C3H8S2      propane-1,3-dithiol
527   C12H10S2    Disulfide, diphenyl
530   C3H4S3      1,3-dithiolan-2-thione
534   H2SO4       sulfuric acid
536   C2H4OS      thiolacetic acid
537   C4H8OS      s-ethylthioacetate
538   C4H10OS     diethyl sulfoxide
540   C3H8O2S     methylethyl sulfone
544   C2H6O3S     dimethyl sulfite
547   C2H6O4S     dimethyl sulfate
550   C3H4OS2     1,3-dithiolan-2-one
554   C2H5NS      Ethanethioamide
555   C4H5NS      4-methylthiazole
556   C7H5NS      benzothiazole
560   CH4N2S      thiourea
561   CH5N3S      hydrazinecarbothioamide
562   CH6N4S      carbonothioic dihydrazide
566   F4S         sulfur tetrafluoride
567   F5S         sulfur pentafluoride (*SF5)
569   O2F2S       sulfuryl fluoride
571   HO3FS       fluorosulfonic acid (HOSO2F)
590   C18H15P     triphenylphosphine
596   C3H9O3P     trimethyl phosphite
598   C6H15O4P    Triethyl phosphate
623   C2H8Si      dimethylsilane
631   C8H20Si     tetraethylsilane
640   C3H10OSi    trimethylsiliconhydroxide
641   C5H14OSi    Si(CH3)3OC2H5
642   C6H16O2Si   Si(CH3)2(OC2H5)2
643   C6H18OSi2   hexamethyldisiloxane
646   C5H15NSi    (CH3)3SiN(CH3)2
655   CH5SiCl     Methyl chlorosilane
656   C2H7SiCl    chlorodimethylsilane
663   H5SiP       Silylphosphine
664   Al2         aluminum dimer
669   OAl2        dialuminum oxide (:Al-O-Al:)
673   FAlCl2      aluminum dichloride fluoride
674   BeO         BeO
675   LiNa        LiNa
676   NaF         NaF
677   MgF         *Mg-F
678   HLiO        LiOH
679   NaOH        NaOH
680   MgF2        MgF2
681   Na2O        Sodium oxide (Na-O-Na)
682   BeCl2       Beryllium dichloride
683   MgCl2       Magnesium Chloride
684   H3B         Borane
685   B2O2        Oxo(oxoboranyl)borane
686   Li2Cl2      Dilithium dichloride
688   Be2OF2      O(BeF)2
689   B2F4        Diboron Tetrafluoride
691   B2Cl4       Diboron Tetrachloride
692   H6B2        Boranylidyneborane
693   C3H3N3      1,3,5-triazine
694   B3O3F3      2,4,6-trifluoro-1,3,5,2,4,6-trioxatriborinane
695   B3O3Cl3     2,4,6-trichloro-1,3,5,2,4,6-trioxatriborinane
696   C3H6O3      trioxane (1,3,5-Trioxane)
697   CN4F8       Octafluoromethanetetramine
699   C12H8       acenaphthylene
700   C12H10      Biphenyl
701   C12H9N      Carbazole
702   C13H9N      acridine
704   C14H12      (Z)-stilbene
706   C16H10      Fluoranthene
707   C18H12      Benz[a]anthracene
709   C20H12      Perylene
710   C2H4N2O2    Oxamide
711   C3H2N2      malononitrile
712   C3H7NO2     Sarcosine
713   C4H4O2      diketene
716   C8H18S      dibutyl sulfide
718   C10H22S     n-decyl mercaptan
720   C11H24      n-undecane
721   C12H14O4    diethyl phthalate
723   C12H24      1-Dodecene
724   C12H24O2    n-dodecanoic acid
727   C13H26O2    methyl dodecanoate
728   C14H14      1,2-diphenylethane
731   C16H22O4    dibutyl phthalate
732   C16H32O2    n-hexadecanoic acid
733   C16H34      n-hexadecane
734   C16H34O     1-hexadecanol
735   C18H14      p-terphenyl
736   C18H38      n-Octadecane
771   H2S2        Disulfides
774   H2S4        Hydrogen tetrasulfide
777   C2H4N2S2    Ethane dithioamide
785   OP          Phosphorus oxide (*P=O)
787   FP          Phosphorus monofluoride (triplet)
794   H4Si2       Silylidenesilane
796   H5Si2       Disilanyl radical
819   O2Al        Aluminum dioxide (radical)
820   O2Al2       Dialuminum dioxide
824   FAl         Aluminum fluoride (singlet)
826   F6Al2       Aluminum trifluoride dimer
827   AlCl        Aluminum chloride (singlet)
829   FAlCl       Aluminum chloride fluoride (radical)
830   C16H26      n-decylbenzene
831   C8H23N5     tetraethylenepentamine
832   C6H18N4     triethylene tetramine
834   C2H4N4      1H-Tetrazole, 5-methyl-
835   C2H4N4      1H-Tetrazole, 1-methyl-
836   C2H6N2O     Urea, methyl-
837   C3H8N2O     Urea, ethyl-
840   OS2         Disulfur monoxide

Only for X1se

No.   Formula     Molecule
911   C8H18       Butane, 2,2,3,3-tetramethyl-
930   C8H18       Hexane, 3,4-dimethyl-
931   C8H18       Hexane, 2,4-dimethyl-
972   C8H18       Heptane, 2-methyl-
973   C8H18       Heptane, 3-methyl-
974   C8H18       Hexane, 2,2-dimethyl-
975   C8H18       Hexane, 2,5-dimethyl-
976   C8H18       Hexane, 3,3-dimethyl-
977   C8H18       Pentane, 3-ethyl-2-methyl-
978   C8H18       Pentane, 3-ethyl-3-methyl-
979   C8H18       Pentane, 2,2,3-trimethyl-
980   C8H18       Pentane, 2,2,4-trimethyl-
981   C8H18       Pentane, 2,3,3-trimethyl-

The general rule for calculating ECB can be found on the Web (http://www.xdft.org/dft, 2015). Here are some examples.

1. For n-C5H12, the ECB value equals 6. The number assigned to each bond is the number of single bonds connected to it; the four C-C bonds carry the values 1, 2, 2, and 1. The total happens to be exactly twice the number of protobranches in the alkane.

2. In 2,2,4-trimethylhexane (C9H20), one single C-C bond (C6-C8) is connected to the C8-C9 bond, so the ECB value of the C8-C9 bond is 1. Three bonds are connected to the C6-C8 bond, so its ECB value is 3. The total ECB value of the molecule is the sum over all bonds: ECB = 1 + 2 + 3*5 + 4 = 22. Here the numbers following C are atom labels, not atom counts.

3. For (1R,2R,4S,5S)-tricyclo[3.2.1.0^2,4]oct-6-ene (C8H10), ECB = 18.
The bonds of the cyclopropane ring do not contribute to the ECB value. The six contributing bonds carry the values 3, 2, 4, 3, 2, and 4.

4. For (CH3)3C-OCH3, the ECB value is composed of three similar parts. One is ECB(C1-C4) = 3, which indicates that three single bonds are connected to the C1-C4 bond: C2-C4, C3-C4, and C4-O.

The SC descriptor is defined as the difference between the number of singly-occupied electrons (SE) in the constituent atoms and in the molecule:

$$SC = \sum_{\mathrm{atom}} SE - SE_{\mathrm{molecule}}$$

For example, H2O has no singly-occupied electron (SE_H2O = 0), H has one singly-occupied electron, and O has two. Therefore, SC for H2O is (2*1 + 2 - 0) = 4.
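
The worked ECB and SC examples above can be checked with a short script. This is only a sketch under stated assumptions; the general rule is the one given at the cited Web address. It assumes that the ECB value of a single bond between heavy atoms is the number of other single bonds sharing an atom with it, that multiple bonds carry no value and are not counted as neighbours, and that bonds in a three-membered ring carry no value of their own but are still counted as neighbours. These assumptions were chosen because they reproduce the totals 6, 22, and 18 quoted above. The function names `ecb` and `sc`, the `UNPAIRED` table, and the atom numbering in the bond lists are illustrative and do not follow the atom labels used in the original figures.

```python
from collections import defaultdict


def ecb(bonds):
    """Sum per-bond ECB values for (atom_a, atom_b, order) heavy-atom bonds."""
    single = defaultdict(set)    # neighbours reached through single bonds
    any_bond = defaultdict(set)  # neighbours reached through any bond
    for a, b, order in bonds:
        any_bond[a].add(b)
        any_bond[b].add(a)
        if order == 1:
            single[a].add(b)
            single[b].add(a)
    total = 0
    for a, b, order in bonds:
        if order != 1:
            continue  # assumption: multiple bonds carry no ECB value
        if any_bond[a] & any_bond[b]:
            continue  # assumption: three-membered-ring bonds carry no value
        # count the other single bonds attached at either end of this bond
        total += (len(single[a]) - 1) + (len(single[b]) - 1)
    return total


# Unpaired electrons of a few free ground-state atoms (illustrative subset).
UNPAIRED = {"H": 1, "C": 2, "N": 3, "O": 2, "F": 1}


def sc(atom_counts, se_molecule):
    """SC = (sum of atomic SE) - (SE of the molecule)."""
    return sum(UNPAIRED[el] * n for el, n in atom_counts.items()) - se_molecule


# Heavy-atom skeletons; the atom numbers are arbitrary labels for this sketch.
n_pentane = [(1, 2, 1), (2, 3, 1), (3, 4, 1), (4, 5, 1)]
trimethylhexane_224 = [
    (1, 2, 1), (2, 3, 1), (3, 4, 1), (4, 5, 1), (5, 6, 1),  # hexane chain
    (2, 7, 1), (2, 8, 1), (4, 9, 1),                          # methyl groups
]
tricyclooctene = [
    (1, 2, 1), (2, 3, 1), (3, 4, 1), (2, 4, 1),  # includes the cyclopropane ring
    (4, 5, 1), (5, 6, 1), (6, 7, 2), (7, 1, 1),  # C6=C7 is the double bond
    (1, 8, 1), (8, 5, 1),                        # one-carbon bridge
]

print(ecb(n_pentane))            # 6
print(ecb(trimethylhexane_224))  # 22
print(ecb(tricyclooctene))       # 18
print(sc({"H": 2, "O": 1}, 0))   # 4
```

For molecules outside these examples, for instance those with heteroatom bonds such as (CH3)3C-OCH3, for which no total is quoted here, the sketch should not be taken as the authors' rule.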