Data Mining and Graph Network Deep Learning for Band Gap Prediction in Crystalline Borate Materials

pubs.acs.org/IC
Article
Ruihan Wang,∥ Yeshuang Zhong,∥ Xuehua Dong, Meng Du, Haolun Yuan, Yurong Zou, Xin Wang,
Zhien Lin,* and Dingguo Xu*
Cite This: Inorg. Chem. 2023, 62, 4716−4726
ABSTRACT: Crystalline borates are an important class of
functional materials with wide applications in photocatalysis and
laser technologies. Obtaining their band gap values in a timely and
precise manner is a great challenge in material design due to the
issues of computational accuracy and cost of first-principles
methods. Although machine learning (ML) techniques have
shown great successes in predicting the versatile properties of
materials, their practicality is often limited by the data set quality.
Here, by using a combination of natural language processing
searches and domain knowledge, we built an experimental database
of inorganic borates, including their chemical compositions, band
gaps, and crystal structures. We performed graph network deep
learning to predict the band gaps of borates with accuracy, and the
results agreed favorably with experimental measurements from the visible-light to the deep-ultraviolet (DUV) region. For a realistic
screening problem, our ML model could correctly identify most of the investigated DUV borates. Furthermore, the extrapolative
ability of the model was validated against our newly synthesized borate crystal Ag3B6O10NO3, supplemented by the discussion of an
ML-based material design for structural analogues. The applications and interpretability of the ML model were also evaluated
extensively. Finally, we implemented a web-based application, which could be utilized conveniently in material engineering for the
desired band gap. The philosophy behind this study is to use cost-effective data mining techniques to build high-quality ML models,
which can provide useful clues for further material design.
■
INTRODUCTION
Borates, the largest group of oxide minerals, have been widely
employed as non-linear optical (NLO) crystals, photocatalysts,
flame retardants, and luminescent materials.1−4 An understanding and manipulation of the band structures of borate
crystals are important for designing novel advanced functional
materials. For instance, deep-ultraviolet (DUV; wavelength, λ,
< 200 nm) NLO crystals require a band gap of >6.2 eV,5
whereas visible-light photocatalysts require a band gap of <3.2
eV.6 Therefore, the rapid and precise prediction of band gaps
can accelerate the discovery of new borate materials with
favorable properties, and chemists can benefit from predicting
band gaps before synthesis.
Supervised machine learning (ML) is a powerful and
efficient tool for predicting the band gap,7−9 adsorption
volume,10,11 formation energy,12,13 and stability.14,15 In the
context of band gap predictions, Omprakash et al.16 applied
graph neural networks (GNNs) to predict varying perovskite
band gaps in a few milliseconds. The GNN model was trained
using a database of 24,501 perovskites created based on density
functional theory (DFT) calculations. Xie and Grossman17
developed a generalized crystal-graph convolution neural
network (CGCNN) to predict the band gaps of 46,744
inorganic crystals. Once trained, the properties of thousands of
materials could be predicted within seconds. The relatively low
computational demand makes the ML technology an extremely
promising option for novel material discovery paradigms.
The foundation of a successful ML project depends on the
data set. Most public databases have been generated based on
DFT calculations at the less accurate generalized gradient
approximation (GGA) level; these include the Automatic Flow
(AFLOW),18 Open Quantum Materials database (OQMD),19
and Materials Project.20 However, a well-known drawback is
that the band gaps determined by GGA severely underestimate
the actual band gaps.21,22 This underestimation is even more
pronounced when calculating ultrawide band gap compositions, particularly for some DUV borates. He et al.5 compared
the band gaps calculated at the GGA level with the
experimental band gaps for 10 DUV borate crystals, such as β-BaB2O4 (ref 23; 4.30 vs 6.57 eV), KBe2BO3F2 (ref 24; 5.79 vs 8.27 eV),
and KAl2B2O7 (ref 25; 3.98 vs 6.89 eV). Therefore, if we employ the
GGA level in DFT calculations to generate the data set for
subsequent ML model construction, we could miss some
candidate structures in the screening for target properties. A
practical remedy for such GGA underestimation is to use the
many-body-perturbation-theory-based GW approach26 or a
hybrid exchange−correlation (XC) functional that includes exact
Hartree−Fock exchange27 to generate data sets. However,
such computationally demanding methods are
currently unsuitable for high-throughput calculations.
Received: January 19, 2023
Published: March 8, 2023
https://doi.org/10.1021/acs.inorgchem.3c00233
Figure 1. (a) Data mining procedure in this study. (b) Schematic diagram to illustrate the CGCNN model.
Directly using experimental values as the data sets for the
ML training process has emerged as a more efficient approach
to bypass the band gap underestimation problem of the
Perdew−Burke−Ernzerhof (PBE)28 functional or the time-consuming problem of the hybrid XC functional. For example,
Zhuo et al.29 recorded 3896 experimental band gap samples for
inorganic solids into a highly useful database and constructed
an ML model based on the properties of the constituent
elements. However, no structural information was provided in
this database, limiting the development of structure-based
algorithms and the discussion of structure−function relationships. To improve the accuracy of predictions for the
experimental band gap, Chen et al.30 developed multi-fidelity
graph networks (MF-GNNs) that allowed for the efficient
learning of latent structural characteristics from large amounts
of low-fidelity computed data. When using these previously
published ML models to predict specific systems, such as the
band gaps of borates, some inaccuracies still exist (see Figure
S1). Possible reasons for this may include a limited number of
borates in the training set, and the lack of a widely accessible
borate database of experimental band gaps. Therefore, we
propose that frequent updates of databases and system-specific
ML models are essential to fulfill diverse prediction purposes.
In this study, we constructed a borate database of
experimental band gaps by employing data mining techniques
to extract text from scientific literature. Several state-of-the-art
CGCNN models were applied to predict the band gaps of
borate crystals. More importantly, the predictions agreed
reasonably well with experimental measurements, achieving a
mean absolute error (MAE) of 0.40 eV for the test set. A new
borate crystal, Ag3B6O10NO3, was synthesized to validate the
robustness of the workflow. For a realistic screening of borates
with a short ultraviolet (UV) cutoff edge (λcutoff < 200
nm), our prediction accuracy was higher than that
of previously published ML models and DFT calculations. This
excellent extrapolation capability also motivated us to extend
the trained model to a larger computational database
consisting of approximately 2000 borates. Furthermore, the
resulting learned characteristics allowed us to conduct
interesting investigations into the interpretability of crystal-graph networks. Finally, we developed a web tool that could
enable users to conveniently upload borate crystallographic
information files (CIFs) to obtain experiment-based band gaps.
■
METHODOLOGY
Data Set Collation. Numerous studies31−34 have focused
on text extraction from scientific literature based on a
combination of the natural language processing (NLP)35−37
toolkit and full-text publisher application programming
interfaces (APIs).38 Details on the mining procedure can be
found in ref 36, and only a brief summary is provided
herein. From four publishers, namely, Elsevier, the American
Chemical Society, the Royal Society of Chemistry, and
Springer, we compiled a corpus of borates including
approximately 1000 papers. The chemical names and band
gaps were automatically mined from the paragraphs of each
article in this highly specialized borate corpus using keywords
[band gap, ultraviolet−visible (UV−vis) spectroscopy, and
cutoff edges].
Figure 2. (a) Periodic-table heat map of the elements that appear in the borates in the database. (b) Density distribution of
band gaps for all the structures contained in the database.
Despite the efficiency of automated extraction techniques,
additional steps must be considered to acquire a complete data
set. First, several studies have only provided theoretical band
gaps based on DFT calculations. Consequently, in our analysis,
each extracted piece of data was manually examined to ensure
that the results were obtained from experimental measurements. Second, the NLP approach is unable to capture some of
the band gap data from the UV−vis images. Thus, the band
gap must be extracted using domain knowledge. The band gap
corresponds to the absorption limit, which was roughly
estimated by the following equation.39
$$E_g = \frac{hc}{\lambda}\ \text{eV} = \frac{1240}{\lambda}\ \text{eV}$$
where Eg is the band gap (eV), h is Planck’s constant (6.62 ×
10^−34 J s), c is the light velocity (3 × 10^8 m/s), and λ is the
wavelength (nm).
Most band gaps derived from the publications were not
accompanied by adequate crystallographic data; only compositional information was provided.29 Thus, to build a more
comprehensive database, we needed to obtain the structure
files corresponding to each chemical formula of the borates.
We compared the chemical names obtained from the
Cambridge Crystallographic Data Centre to determine the
structural details of each borate component. Another difficult
problem is that some downloaded crystal structures have atom
partial occupancy issues. After the data cleaning operation, the
data set was left with 276 useable data and valid CIF files, as
shown in Figure 1a. CIF formats were selected because of their
superiority in representing material information.
DFT Calculation. The electronic structures of some
borates were determined using the Vienna ab initio simulation
package (VASP).40 GGA with the PBE functional was
applied.28 The core−valence electron interactions were
described using the projector augmented wave method.41
The kinetic energy cutoff for the plane-wave basis set
expansion was set to 520 eV. A Gamma-centered grid of k-points was used for integration in the reciprocal space. During
relaxation, the thresholds for the forces on the atoms were set
to less than 0.05 eV/Å. A convergence criterion of 10^−5 eV was
chosen for the electronic optimization. The quickly use VASP
(qvasp)42 program was used to obtain the band gaps of all
borates.
CGCNN Model and Training. A diagram of the CGCNN
calculation procedure is shown in Figure 1b; more details can
be found elsewhere.17 Essentially, nodes and edges are
characterized by vectors for atoms and bonds, respectively.
The overall architecture of the CGCNN model consists of four
steps. The first step, named the preprocessing layer, encodes
the node and edge features into specified dimension vectors.
Next are the graph convolution layers, which perform the
convolution operation for iterative updates. Once all atom
properties have been aggregated, the pooling layer then offers a
vector for the overall representation of the crystal graph.
Finally, a fully connected layer and output layer are used to
predict the target. To avoid any bias introduced by overfitting,
we used L2 regularization algorithms with a weight decay of
0.0001.
Synthesis and Characterization of Ag3B6O10NO3.
Colorless crystals of Ag3B6O10NO3 were obtained by heating
a mixture of AgNO3 (0.340 g), LiNO3 (0.069 g), and H3BO3
(0.927 g) in a Teflon-lined stainless-steel vessel at 230 °C for 4
d (76% yield based on silver). Single-crystal X-ray diffraction
(XRD) data for this compound were collected using a New
Gemini, Dual, Cu at zero, and EosS2 diffractometer at room
temperature. The crystal structure was determined using a
direct method. The structure was refined on F^2 by full-matrix
least-squares methods using the SHELXTL program package.43,44 The powder XRD pattern of the as-synthesized
compound obtained using a Shimadzu XRD-6100 diffractometer with Cu Kα radiation (λ = 1.5418 Å) was in
agreement with the simulated pattern, confirming its phase
purity (Figure S2). Furthermore, thermogravimetric analysis,
performed using a Mettler Toledo TGA/DSC 2/1600 thermal
analyzer, showed that it remained stable up to 500 °C (Figure
S3). The weight loss between 500 and 650 °C is caused by the
decomposition of NO3 groups, assuming the solid residue to
be a mixture of Ag2O and B2O3 (observed 91.7%; calculated
91.2%). The IR spectrum was measured using a Nicolet Impact
410 FTIR spectrometer, indicating the coexistence of NO3,
BO3, and BO4 groups in the structure (Figure S4). The strong
band at 1379 cm−1 can be attributed to the stretching vibration
of the NO3 groups. The bands at 1339 and 1173 cm−1 are
attributed to asymmetric stretching and symmetric stretching
of the BO3 groups, respectively. The strong band at 1013 cm−1
is attributed to the asymmetric stretching of B−O in the BO4
tetrahedra. The bands at approximately 810−890 cm−1 arise
from the bending vibration of NO3 groups and the B−O
symmetric stretching in the BO4 tetrahedra.
■
RESULTS AND DISCUSSION
Extracted Data Set. Utilizing both human and NLP
searches, we constructed a database that could be as
representative as possible. The database was first analyzed to
capture variations in data dispersion. The collected data
cover 48 different elements. Figure 2a demonstrates that the Ba
element (24%) has the largest probability of appearing in the
database, and alkali metal elements (Li, Na, and K) also
account for a relatively high proportion. However, the
proportion of some elements (Nd, Pd, and Sm) is less than
1%, indicating that the database may have some degree of
imbalance, which could further limit the effectiveness of
subsequent applications of the developed ML model. The
heat map displayed in Figure 2a can be used to check whether a
new borate to be tested contains elements that have previously
been identified in the training set. Band gap predictions should be
more reliable for borates containing the well-represented elements mentioned above.
However, it is essential that more samples be added for the
other types of borates by employing data mining technology.
We believe that this will be an important direction for future
research.
We must also evaluate the distribution of label
values in the regression task. Figure 2b shows that the band
gaps range from 2.43 eV (Pb3OBO3F, ref 45) to 9.92 eV
(SrB4O7, ref 46), and the highest density regions are located at
4.05 and 5.50 eV. In particular, 20% of the
samples have band gaps greater than 6.20 eV (wavelength λ <
200 nm), which implies that this part of the data can help
the ML model correct the underestimation of PBE
calculations. In general, all information in the database,
including chemical formulas, band gaps, structure files (CIF),
and references, is available in the Supporting Information
(Table S1).
Model Training and Evaluation. The CGCNN17
architecture was selected for predicting the borate band gaps.
The entire database was partitioned randomly into two parts,
an 85% training set and a 15% test set, using 10
random seeds, which resulted in similar performance. For a
maximum of 1000 epochs, two graph convolution layers and a
fully connected layer after the pooling operation were trained
using the relevant training data. The hyperparameters were
determined using a Bayesian optimization search strategy, and
all optimized hyperparameters are summarized in Table S2.
The Pearson product−moment correlation coefficient (PCC),
root-mean-squared error (RMSE), and MAE were calculated
to evaluate the performance of the CGCNN models. Notably,
higher PCC values and lower RMSE and MAE values indicate
better prediction accuracies.
Figure 3. Accuracy representations for the CGCNN model. The red
diagonal line indicates a perfect correlation between experimental
band gaps and the values predicted by regression models.
To visualize the model performance, the predictions of the
CGCNN model were obtained by selecting the best among the
10 random seeds. The accuracy of the CGCNN model can be
further verified by visualizing scatter plots, as shown in Figure
3, in which most of the points are observed to be distributed
close to the diagonal line. The statistical PCC, RMSE, and
MAE for the test set were 0.90, 0.54, and 0.40 eV, respectively.
Previously, Zhuo et al.29 obtained an MAE of 0.75 eV and an
RMSE of 1.46 eV for experimental band gaps using a support
vector regression model. Chen et al.30 developed multi-fidelity
graph networks to train with disordered and ordered crystals
for experimental band gaps, and the corresponding MAEs were
0.51 and 0.37 eV, respectively. Compared to these reports, we
believe that our ML model has adequate accuracy.
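The evaluation protocol described above (random 85%/15% splits over 10 seeds, scored by PCC, RMSE, and MAE) can be sketched as follows. This is a minimal illustration using NumPy, not the authors' code: the function names `regression_metrics` and `split_indices`, the seed values 0 to 9, and the toy band gap values are all assumptions made for the example.

```python
import numpy as np


def regression_metrics(y_true, y_pred):
    """Return (PCC, RMSE, MAE) for two equal-length sequences of band gaps in eV."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_pred - y_true
    pcc = float(np.corrcoef(y_true, y_pred)[0, 1])  # Pearson correlation coefficient
    rmse = float(np.sqrt(np.mean(err ** 2)))        # root-mean-squared error
    mae = float(np.mean(np.abs(err)))               # mean absolute error
    return pcc, rmse, mae


def split_indices(n_samples, seed, test_fraction=0.15):
    """Shuffle sample indices with a fixed seed and return (train_idx, test_idx)."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)
    n_test = int(round(n_samples * test_fraction))
    return order[n_test:], order[:n_test]


# 276 entries as in the database; seeds 0..9 are an illustrative choice.
for seed in range(10):
    train_idx, test_idx = split_indices(276, seed)
    print(seed, len(train_idx), len(test_idx))

# Toy values (not from the paper) showing how the three metrics are reported.
pcc, rmse, mae = regression_metrics([4.0, 5.5, 6.9], [4.3, 5.2, 7.4])
print(f"PCC={pcc:.2f}  RMSE={rmse:.2f} eV  MAE={mae:.2f} eV")
```

In practice the model would be retrained on `train_idx` for each seed and the metrics computed on the held-out `test_idx` predictions.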
Robustness of the Model. To demonstrate the robustness of our predictive band gap models, we predicted 24
borates that were not included in the data set, as shown in
Table 1. Notably, only a range of UV cutoff edges was
provided for all 24 borates, rather than the band gap values.
For example, BaBe2BO3F3 (ref 47) has a short UV cutoff edge (λcutoff
< 185 nm), indicating that the band gap is larger than 6.70 eV.
The band gaps of these borates are all greater than 6.2 eV,
belonging to the DUV type, according to the results of UV
spectroscopy.
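The lower bounds quoted here follow from the cutoff-edge relation given in Data Set Collation (Eg ≈ 1240/λ, with λ in nm and Eg in eV). A short sketch of that conversion is shown below; the function name `cutoff_to_band_gap_ev` is a hypothetical helper introduced only for this example.

```python
def cutoff_to_band_gap_ev(wavelength_nm):
    """Approximate band gap (eV) from an absorption cutoff wavelength (nm): Eg = 1240 / wavelength."""
    if wavelength_nm <= 0:
        raise ValueError("wavelength must be positive")
    return 1240.0 / wavelength_nm


# A cutoff edge shorter than the given wavelength implies a band gap larger than the printed value.
for cutoff_nm in (200, 190, 185, 180):
    print(f"cutoff < {cutoff_nm} nm  ->  Eg > {cutoff_to_band_gap_ev(cutoff_nm):.2f} eV")
# cutoff < 200 nm  ->  Eg > 6.20 eV
# cutoff < 190 nm  ->  Eg > 6.53 eV
# cutoff < 185 nm  ->  Eg > 6.70 eV
# cutoff < 180 nm  ->  Eg > 6.89 eV
```

The 185 nm case reproduces the 6.70 eV bound stated for BaBe2BO3F3, and the 200 nm case gives the 6.2 eV DUV threshold.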
For a more comprehensive evaluation, we selected a
commonly used PBE functional in this high-throughput
screening task for comparison. In addition, we selected two
previously reported GNN models, MEGNet (ref 64) and MF-GNN (ref 30). Although MEGNet performed well on the
DFT computational database,64 and the MF-GNN model
performed well on experimental data sets,30 neither study
reported performance on DUV-type crystals. For
instance, experimental measurements indicate that the band
gap of GdBe2B5O11 (ref 63) is larger than 6.20 eV. However, the
calculated value based on the PBE functional was only 5.45 eV.
The values predicted based on the MEGNet and MF-GNN
4719
https://doi.org/10.1021/acs.inorgchem.3c00233
Inorg. Chem. 2023, 62, 4716−4726
Inorganic Chemistry
pubs.acs.org/IC
Article
Table 1. Some DUV-Type Borate Crystals without Specific Values of Band Gapsa
borate materials
exp
PBE
MEGNet
MF-GNN
this work
refs.
Cd3LiNa4Be4B10O24F
Na3B4O7Br
Li5Rb2B7O14
Na3B4O7Cl
BaBe2BO3F3
K2Ba[B4O5(OH)4]2·10H2O
Li3Ba4Sc3(BO3)4(B2O5)2
K2B4O11H8
KBe4B3O9
CsBe4(BO3)3
BaB8O12F2
Li3KB4O8
K5B19O31
K7BaY2(B5O10)3
Li6CuB4O10
LiBeBO3
K2B5O8(OH)·2H2O
BaB2O3F2
GdBe2B5O11
YBe2B5O11
Sr3LiNa4Be4B10O24F
K7SrY2(B5O10)3
LiRbB5O8(OH)·H2O
K7CaY2(B5O10)3
>6.2
>6.2
>6.53
>6.2
>6.7
>6.2
>6.2
>6.89
>6.2
>6.2
>6.89
>6.53
>6.89
>6.53
>6.2
>6.2
>6.22
>6.89
>6.2
>6.2
>6.2
>6.52
>6.22
>6.52
3.55
4.16
4.27
4.53
6.37
0.14
4.44
5.46
5.76
5.98
6.72
5.48
5.43
4.47
<0
6.31
4.72
6.90
5.45
6.72
4.69
4.51
4.58
4.53
3.70
4.29
4.69
4.34
6.39
3.98
4.00
4.04
5.82
5.31
6.08
4.98
5.45
4.61
0.52
6.06
4.00
6.26
3.34
5.65
4.87
4.78
4.89
5.03
4.30
5.65
6.88
6.37
8.72
5.42
5.03
5.50
7.85
7.69
6.36
7.15
6.52
5.47
0.52
8.20
6.16
7.71
3.98
6.07
6.92
5.62
6.89
5.45
6.37
7.19
6.64
6.34
8.20
7.07
6.47
5.87
8.34
8.53
10.07
6.71
5.96
6.05
6.37
8.51
6.86
8.58
6.52
6.33
6.36
5.79
5.85
5.88
48
49
50
49
47
51
52
53
54
55
56
57
58
59
2
60
61
62
63
63
48
59
61
59
a
Exp represents the range experimentally calculated by the cutoff edges. PBE stands for DFT calculation using PBE functionals. MEGNet and MFGNN are GNNs trained by Chen et al. released in the GitHub repository. The numbers in bold represent failed predictions in this work.
models were even worse with corresponding results of 3.34 and
3.98 eV, respectively. The prediction errors were clearly too
large. However, although this crystal was not included in our
data set, our ML model could provide a predicted value of 6.52
eV, which is in good agreement with the experimental
estimation. An additional proof has been provided for
Li6CuB4O10;2 metal characterization can be predicted using
the PBE calculation, MEGNet, and MF-GNN models.
However, such a prediction is inconsistent with the
experimental fact that this material shows insulator characteristics, and the band gap of Li6CuB4O102 is measured to be
greater than 6.2 eV. Impressively, the result predicted by our
model is 6.37 eV, indicating that our model is accurate enough
for DUV-type crystals. Note that we do not emphasize the
merits of the various GNN model algorithms themselves but
only evaluate the practical application of these trained models.
Upon further analysis of Table 1, we infer that the predicted
values obtained from MEGNet are generally close to the PBE
calculation results, which is not unexpected considering that
the best model presented in MEGNet was trained using the
band gap data set constituted by the PBE calculated values
obtained from the Materials Project.20 To evaluate the
prediction models using the compounds listed in Table 1,
for which the band gaps were all greater than 6.2 eV, we
calculated the identification accuracy of the PBE method
calculation to be 13%. However, the MEGNet model failed for
all 24 borates. Compared with the PBE calculation and
MEGNet model, the prediction of the MF-GNN model was
improved to a certain extent (identification accuracy of 42%).
In addition to training using low-precision PBE functionals, the
MF-GNN includes an additional step by considering higher-level computational methods and incorporating experimental
values for inorganic crystalline materials. This clearly
demonstrates that the main bottlenecks in ML are the data
quality and quantity constraints. Therefore, in this study, we
focused on the quality of the data, particularly for borates, by
constructing an experimental band gap database using
human and NLP searches; the resulting model reached an identification accuracy of
75%. At the algorithm level, the CGCNN model still has
potential for improvement, as shown in our previous study11
involving the addition of domain knowledge of material
information to the graph network.
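The identification accuracies discussed above can be recomputed from Table 1. The sketch below assumes that a prediction counts as a correct identification when it exceeds the experimental lower bound of that crystal; this criterion is an assumption, chosen because it reproduces the reported percentages. The rows are transcribed from Table 1 in order, and `NEG`, `ROWS`, and `count_hits` are names introduced only for this example.

```python
NEG = float("-inf")  # stand-in for the "<0" PBE entry reported for Li6CuB4O10

# (experimental lower bound, PBE, MEGNet, MF-GNN, this work), all in eV,
# transcribed row by row from Table 1.
ROWS = [
    (6.20, 3.55, 3.70, 4.30, 6.37),
    (6.20, 4.16, 4.29, 5.65, 7.19),
    (6.53, 4.27, 4.69, 6.88, 6.64),
    (6.20, 4.53, 4.34, 6.37, 6.34),
    (6.70, 6.37, 6.39, 8.72, 8.20),
    (6.20, 0.14, 3.98, 5.42, 7.07),
    (6.20, 4.44, 4.00, 5.03, 6.47),
    (6.89, 5.46, 4.04, 5.50, 5.87),
    (6.20, 5.76, 5.82, 7.85, 8.34),
    (6.20, 5.98, 5.31, 7.69, 8.53),
    (6.89, 6.72, 6.08, 6.36, 10.07),
    (6.53, 5.48, 4.98, 7.15, 6.71),
    (6.89, 5.43, 5.45, 6.52, 5.96),
    (6.53, 4.47, 4.61, 5.47, 6.05),
    (6.20, NEG, 0.52, 0.52, 6.37),
    (6.20, 6.31, 6.06, 8.20, 8.51),
    (6.22, 4.72, 4.00, 6.16, 6.86),
    (6.89, 6.90, 6.26, 7.71, 8.58),
    (6.20, 5.45, 3.34, 3.98, 6.52),
    (6.20, 6.72, 5.65, 6.07, 6.33),
    (6.20, 4.69, 4.87, 6.92, 6.36),
    (6.52, 4.51, 4.78, 5.62, 5.79),
    (6.22, 4.58, 4.89, 6.89, 5.85),
    (6.52, 4.53, 5.03, 5.45, 5.88),
]
METHODS = ("PBE", "MEGNet", "MF-GNN", "this work")


def count_hits(rows, column):
    """Count rows whose prediction in `column` exceeds the experimental lower bound (column 0)."""
    return sum(1 for row in rows if row[column] > row[0])


hits = {name: count_hits(ROWS, i + 1) for i, name in enumerate(METHODS)}
for name, n in hits.items():
    print(f"{name}: {n}/{len(ROWS)} identified")
```

Under this criterion the counts are 3, 0, 10, and 18 out of 24, which correspond to the reported 13%, 0%, 42%, and 75%.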
To assess the degree of deviation of the prediction value, we
used a more expensive hybrid functional, Heyd−Scuseria−
Ernzerhof-06 (HSE06),65,66 to calculate the band gaps, and
this approach is commonly adopted to obtain a more accurate
band gap.67 Considering the computational expense, we chose
BaBe2BO3F3 (ref 47), with 60 atoms in the unit cell, as a template
compound (Figure S5). Previous experimental research has
indicated that the band gap value is larger than 6.70 eV. On a
workstation with a configuration of 192 cores with an Intel
Xeon E5-2692 CPU, the band gap calculation requires a total
of 279,153 s, demonstrating that this computationally
resource-intensive theoretical approach is currently unsuitable
for high-throughput computing research. The calculated result
based on the HSE06 hybrid functional is 8.18 eV, which agrees
well with our predicted value of 8.20 eV. This verifies the
reliability of the proposed model. Impressively, less than 1 s is
required to obtain this value on our local server. In general, the
robustness and low computing costs of our model are adequate
for the application of high-throughput virtual screening in
practical situations.
Experimental Verification. Notably, the 24 selected
borates contained elements that were previously identified in
the database, as illustrated in Figure 2a. To assess the
extrapolative ability of our model, we examined the band gap
of a new borate crystal containing an element (e.g., silver) that
was not included in the original database. Based on this idea,
we introduced AgNO3 into a borate reaction system and
successfully obtained a new borate crystal Ag3B6O10NO3 as the
target compound. Single-crystal XRD analysis revealed that the
compound crystallized in the orthorhombic space group, Pnma
(no. 62). Note that the compound has a three-dimensional
structure containing a B6O13 cluster as the building unit. The
center of the oxo-boron cluster is occupied by one oxygen
atom, which is bonded to three tetrahedral boron atoms. The
remaining three boron atoms in the cluster are surrounded by
three oxygen atoms to provide triangular coordination. Each
B6O13 cluster shares six corner oxygen atoms with the adjacent
clusters, forming a three-dimensional open-framework structure. Silver atoms and nitrate groups were encapsulated within
their free voids (Figure 4a). By considering the B6O13 cluster
as a six-connected node, the borate framework can be
simplified as a pcu net (Figure S6). Such a framework
topology has also been observed in several porous materials
such as MOF-5.68 The experimental band gap of
Ag3B6O10NO3 is approximately 4.16 eV according to the
absorption spectrum converted from the diffuse reflectance
spectrum using the Kubelka−Munk function (Figure 4b).
Figure 4. (a) View of the framework structure of Ag3B6O10NO3
encapsulated with Ag+ ions and NO3− groups. (b) UV−vis diffuse
reflectance spectrum for Ag3B6O10NO3.
We employed a previously published GNN network to
evaluate the band gap of Ag3B6O10NO3, and the MEGNet and
MF-GNN models resulted in predicted values of 1.57 and 3.26
eV, respectively. This demonstrates that these GNN models,
without special considerations of the borate database, cannot
accurately predict newly synthesized borate crystals. Next, we
performed first-principles calculations. The DFT calculated
values using the PBE and HSE06 functionals were 2.21 and
3.91 eV, respectively. It is clear that the DFT calculated values
for the HSE06 functional closely match the experimental
values, as discussed in the previous section. Impressively, our
prediction of 3.88 eV is better than the PBE functional
calculation result, and a difference of only 0.03 eV can be
observed compared to the result of the expensive HSE06
functional. Thus, when presented with newly synthesized
crystals, our model can also provide good prediction accuracy.
Notably, our ML model could clearly distinguish the band
gap difference between Ag3B6O10NO3 and its structural
analogue K3B6O10Cl.69 The two borates had the same pcu-type network constructed from B6O13 clusters, except for their
extra-framework species. UV−vis diffuse reflectance spectra
showed that Ag3B6O10NO3 had a band gap of 4.16 eV, whereas
K3B6O10Cl has a band gap of 6.89 eV. The band gap difference
between the two compounds could arise from their extra-framework species. Our ML model can recognize the effect of
extra-framework species on their band gaps, and it accurately
predicts a value of 3.88 eV for Ag3B6O10NO3 and 6.27 eV for
K3B6O10Cl.
Applications and Interpretability. Chemical substitution
has been widely explored in molecular engineering design for
the discovery of many NLO materials. Considerable efforts
have been made for the replacement of transition metal ions
with main group elements to avoid d−d electronic transitions
and increase the band gap of NLO materials. Lin et al. reported
the design and synthesis of a selenite-based NLO material
Pb2GaF2(SeO3)2Cl with a larger band gap (Eg = 4.32 eV) than
its transition metal analogue Pb2TiOF(SeO3)2Cl (Eg = 3.34
eV).70,71 The extrapolation tests revealed that the predicted
band gap values obtained from our model were closer to the
experimental values than those predicted by the PBE
calculations and other GNN models. In this regard, our results
could improve the exploration of new borate materials with
tunable band gaps using a ML-based band engineering
strategy. For example, our model can quickly show that the
replacement of Ag+ by Na+ in Ag3B6O10NO3 will cause the
band gap to increase by approximately 0.55 eV, which is
consistent with the trend calculated by the HSE06 hybrid
functional (an increase of 0.94 eV). The dual-site substitution
of Ag+ and NO3− with Na+ and Br− in Ag3B6O10NO3 could
produce a hypothetical borate crystal Na3B6O10Br with a larger
band gap of 5.71 eV, which agrees well with the HSE06
calculated value of 6.06 eV. This satisfactory quantitative
prediction further illustrated the feasibility of ML-based metal
substitution strategies.
The next step is to utilize the resulting ML model to predict
more borate structures from a larger database to obtain
experiment-based band gaps and further enrich and expand the
optical field. The Materials Project (www.materialsproject.org)
employs high-throughput computing to uncover the properties
of all known inorganic materials and has built a sizable
materials database that contains computed structural and
electronic data for over 33,000 compounds.20 The Materials
Project API allows anyone to have direct access to current, up-to-date information from the Materials Project database in a
structured manner. Next, we collected 1673 borates in the
chemical form MxByOz (M is a different metal) using the
Materials Project API. The downloaded data also contained
1673 DFT calculated band gaps (PBE level). The green violin
plot in Figure 5a clearly represents the distribution of these
PBE calculated values, and the median of the data is 2.90 eV.
The purple violin plot in Figure 5a shows 276 experimental
band gaps of borates obtained via data mining. The median of
the violin plot for the experimental database was 5.09 eV.
Following this, we used the resulting ML model on 1673
borates and obtained predicted ML results, as illustrated in the
yellow violin plot. Evidently, the distribution of the ML
prediction results was closer to the experimental database than
the distribution of the PBE calculation results.
Figure 5. (a) Probability densities of data distributions are shown via violin plots. The data range is represented by the stretched black line, which
has maximum and lowest values at both ends, while the white point is the median. (b) Comparison of DFT calculation based on PBE functional
and experimentally measured values. (c) Comparison of DFT calculation based on PBE functional and our CGCNN model prediction values. The
colors of the heatmaps correspond to the number of samples, where the red means high density, and blue denotes low density.
It is interesting to compare the band gaps predicted by the
ML model with those obtained using the PBE calculation.
Figure 5b shows that the band gaps calculated by the PBE
functional severely underestimate the experimental band gaps
for most borates. If the ML predictions are close to the
experimental values, then PBE calculated values should be
lower than the ML predictions. As expected, most points in
Figure 5c are below the red diagonal line, which indicates that
most of the PBE calculation results are lower than the ML
predictions. To a certain extent, the ML model trained based
on the experimental values can correct the previous high-throughput PBE-level DFT values toward experiment-based predictions, which helps avoid missing key
candidates in future screening. Further investigations should
be conducted to verify this aspect. We obtained
experiment-based band gaps for the 1673 borates, which were
openly released in the GitHub repository as our initial
contribution to the borate field. Depending on the value of
interest, chemists may filter the chemical data based on certain
values, such as selecting a band gap > 6.2 eV or < 3.2 eV.
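The filtering step mentioned above can be sketched as follows. This is a minimal illustration, not the authors' code: the record names and band gap values are placeholders rather than entries from the released data set, and the function name `select_by_band_gap` is hypothetical. Only the two thresholds (6.2 and 3.2 eV) come from the text.

```python
HIGH_GAP_EV = 6.2  # upper threshold quoted in the text (band gap > 6.2 eV)
LOW_GAP_EV = 3.2   # lower threshold quoted in the text (band gap < 3.2 eV)


def select_by_band_gap(records, low=LOW_GAP_EV, high=HIGH_GAP_EV):
    """Split (name, gap_eV) records into narrow-gap (< low) and wide-gap (> high) lists."""
    narrow = [(name, gap) for name, gap in records if gap < low]
    wide = [(name, gap) for name, gap in records if gap > high]
    return narrow, wide


# Placeholder entries for illustration only; these are NOT values from the released data.
records = [
    ("borate_A", 2.9),
    ("borate_B", 4.5),
    ("borate_C", 6.8),
    ("borate_D", 6.2),  # exactly at the threshold, so excluded by the strict inequality
]

narrow, wide = select_by_band_gap(records)
print("band gap < 3.2 eV:", narrow)
print("band gap > 6.2 eV:", wide)
```

Strict inequalities are used to mirror the "> 6.2 eV or < 3.2 eV" wording; a screening study might prefer inclusive bounds or a tolerance that reflects the model error.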
Another interesting aspect of ML is the exploration of the
interpretability of neural networks. After graph convolution
and pooling operations, each constructed borate crystal graph
finally becomes a 698-dimensional vector in our CGCNN
model. However, 698 dimensions are difficult to analyze
visually. Here, we used the principal component analysis
(PCA) algorithm72 to reduce the features to three dimensions for visualization. Notably,
PCA is an effective multivariate technique for
dimensionality reduction. After PCA, the original [276 × 698] matrix
of 276 borates was reduced to a [276 × 3] matrix. The
experimental band gap of each borate is shown in Figure 6 on a
color scale. The point colors change from red to purple with
increasing band gap. It is evident that data points that
are close together have similar band gaps. For example, the
predicted values based on our ML model for β-BaB2O4
(BBO),5 Ba2Mg(B3O6)2 (BMBO),5 Ba2Ca(B3O6)2 (BCBO),5
and Bi2ZnOB2O6 (BZBO)73 are 6.56, 7.39, 6.97, and 3.34 eV,
while the corresponding experimental band gap values are 6.57,
6.97, 6.96, and 3.75 eV, respectively. Figure 6 clearly
demonstrates that BBO, BMBO, and BCBO are clustered
together, whereas BZBO is far away owing to a low band gap
of 3.34 eV. Interestingly, the former three materials are well-known DUV NLO materials,5 whereas the latter is a polar
material capable of photocatalytic degradation of Rhodamine B.74 This analysis supports the conclusion that the ML
model can distinguish between different types of borates and
learn the characteristics of different atoms and bond
information in the crystal graphs.
Figure 6. Visualization of crystal-graph feature space. PCA plot of the crystal-graph level feature from the pooling layer for CGCNN trained on the
experimental data set of 276 borates, with each point representing an individual borate crystal. Each point is color coded according to its band gap.
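The PCA reduction described above, from a [276 × 698] feature matrix to a [276 × 3] matrix, can be sketched with NumPy. This is a minimal sketch under stated assumptions: the feature matrix here is random placeholder data standing in for the pooled crystal-graph vectors, the function name `pca_project` is hypothetical, and the SVD-based formulation is one standard way to compute PCA, not necessarily the implementation used in this work.

```python
import numpy as np


def pca_project(features, n_components=3):
    """Project the rows of `features` onto the top principal components via SVD."""
    # Center each feature column so the components capture variance, not the mean.
    centered = features - features.mean(axis=0)
    # Rows of vt are the principal axes, ordered by decreasing singular value.
    _, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
    scores = centered @ vt[:n_components].T
    # Fraction of the total variance explained by each retained component.
    explained = singular_values**2 / np.sum(singular_values**2)
    return scores, explained[:n_components]


# Random stand-in for the 276 pooled crystal-graph vectors (698 dimensions each).
rng = np.random.default_rng(0)
features = rng.normal(size=(276, 698))

scores, explained = pca_project(features)
print(scores.shape)  # (276, 3)
```

Each row of `scores` gives the three coordinates of one crystal in the reduced space; coloring those points by band gap would reproduce the style of plot described for Figure 6.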
Web Server. Our group has created a web-based prediction
tool (http://www.predborate.com/) that includes a regression
model based on the CGCNN algorithm. The online prediction
tool enables researchers to submit borate structures (in CIF
format) and obtain prediction results for band gaps.
Furthermore, if millions of structures are to be predicted, the structure
files can be uploaded as a single rar-compressed archive. The predicted
results can be viewed on the webpage or saved directly to
a local workstation for further screening.
Code Availability. All source code for this study is freely
available under the MIT license. The source code and database
are available at https://github.com/ruihwang/bandgapboraterpred.
The ChemDataExtractor version 2.0 code is available at
http://www.chemdataextractor2.org/download.
■ CONCLUSIONS
In summary, we used a data-driven framework to develop a
band gap prediction approach for borates, and its accuracy was
found to approach that of experiment-based models on a time
scale of seconds. A database of inorganic borates including 276
experimental band gaps was built by extraction from scientific
literature. We then trained an ML model to predict the
experimentally measured band gaps of borates. The ML model
achieved an MAE of 0.40 eV and PCC of 0.90 on the test set,
guiding practical applications. For a realistic screening problem
of DUV borates, our model demonstrated a high extrapolation
capacity and low computing cost. To examine the robustness
of the ML model, a new borate crystal, Ag3B6O10NO3, was
synthesized in this study. The prediction error obtained based
on a comparison with experiments was only 0.28 eV (3.88 vs
4.16 eV), which is much better than that for the PBE
functional and is close to the expensive HSE06 functional
calculation result. The resulting ML model was used to obtain
the experiment-based band gaps of approximately 2000 borates
extracted from the Materials Project database. The newly
developed web application may serve as a powerful tool for
chemists who wish to estimate the band gaps of borates. More
importantly, the entire procedure can be easily applied to
other properties or material systems. Therefore, future material
design research will benefit from the computational techniques
described in this paper to predict more properties, such as
birefringence, the second-harmonic generation coefficient, or
carrier mobility, and accelerate the discovery of advanced
functional materials.
■ ASSOCIATED CONTENT
Supporting Information
The Supporting Information is available free of charge at
https://pubs.acs.org/doi/10.1021/acs.inorgchem.3c00233.
Performance of the previous GNN model, experimental
band gap database for borates, characterizations, and
optimized hyperparameters (PDF)
Accession Codes
CCDC 2237125 contains the supplementary crystallographic
data for this paper. These data can be obtained free of charge
via www.ccdc.cam.ac.uk/data_request/cif, or by emailing
data_request@ccdc.cam.ac.uk, or by contacting The Cambridge Crystallographic Data Centre, 12 Union Road,
Cambridge CB2 1EZ, UK; fax: +44 1223 336033.
■ AUTHOR INFORMATION
Corresponding Authors
Zhien Lin − MOE Key Laboratory of Green Chemistry and
Technology, College of Chemistry, Sichuan University,
Chengdu, Sichuan 610064, PR China; orcid.org/0000-0002-5897-9114; Email: zhienlin@scu.edu.cn
Dingguo Xu − MOE Key Laboratory of Green Chemistry and
Technology, College of Chemistry, Sichuan University,
Chengdu, Sichuan 610064, PR China; Research Center for
Materials Genome Engineering, Sichuan University, Chengdu,
Sichuan 610065, PR China; orcid.org/0000-0002-9834-8296; Email: dgxu@scu.edu.cn
Authors
Ruihan Wang − MOE Key Laboratory of Green Chemistry
and Technology, College of Chemistry, Sichuan University,
Chengdu, Sichuan 610064, PR China
Yeshuang Zhong − Department of Physics, School of Biology
and Engineering, Guizhou Medical University, Guiyang,
Guizhou 550025, PR China
Xuehua Dong − MOE Key Laboratory of Green Chemistry
and Technology, College of Chemistry, Sichuan University,
Chengdu, Sichuan 610064, PR China
Meng Du − MOE Key Laboratory of Green Chemistry and
Technology, College of Chemistry, Sichuan University,
Chengdu, Sichuan 610064, PR China
Haolun Yuan − MOE Key Laboratory of Green Chemistry
and Technology, College of Chemistry, Sichuan University,
Chengdu, Sichuan 610064, PR China
Yurong Zou − MOE Key Laboratory of Green Chemistry and
Technology, College of Chemistry, Sichuan University,
Chengdu, Sichuan 610064, PR China
Xin Wang − MOE Key Laboratory of Green Chemistry and
Technology, College of Chemistry, Sichuan University,
Chengdu, Sichuan 610064, PR China
Complete contact information is available at:
https://pubs.acs.org/10.1021/acs.inorgchem.3c00233
Author Contributions
∥R.W. and Y.Z. contributed equally.
Notes
The authors declare no competing financial interest.
■ ACKNOWLEDGMENTS
This study was sponsored by the Natural Science Foundation
of Sichuan, China (no. 2022NSFSC0029), and supported by
the National Natural Science Foundation of China (grant no.
21973064 to D.X. and no. 21971164 to Z.L.) and the Guizhou
Provincial Natural Science Foundation (2022-406). Some of
the results described in this paper were obtained from the
National Supercomputing Center of Guangzhou and the
Supercomputing Center of Sichuan University. We also
thank Leming Bi for web app development.
■ REFERENCES
(1) (a) Becker, P. Borate Materials in Nonlinear Optics. Adv. Mater.
1998, 10, 979−992. (b) Tran, T. T.; Yu, H.; Rondinelli, J. M.;
Poeppelmeier, K. R.; Halasyamani, P. S. Deep Ultraviolet Nonlinear
Optical Materials. Chem. Mater. 2016, 28, 5238−5258.
(2) (a) Mutailipu, M.; Poeppelmeier, K. R.; Pan, S. Borates: A Rich
Source for Optical Materials. Chem. Rev. 2021, 121, 1130−1202.
(b) Shen, Y. G.; Zhao, S. G.; Luo, J. H. The Role of Cations in
Second-Order Nonlinear Optical Materials Based on π-conjugated
[BO3]3‑ Groups. Coord. Chem. Rev. 2018, 366, 1−28.
(3) Wang, G. J.; Jing, Y.; Ju, J.; Yang, D. F.; Yang, J.; Gao, W. L.;
Cong, R. H.; Yang, T. Ga4B2O9: An Efficient Borate Photocatalyst for
Overall Water Splitting without Cocatalyst. Inorg. Chem. 2015, 54,
2945−2949.
(4) Szczeszak, A.; Grzyb, T.; Barszcz, B.; Nagirnyi, V.; Kotlov, A.;
Lis, S. Hydrothermal Synthesis and Structural and Spectroscopic
Properties of the New Triclinic Form of GdBO3:Eu3+ Nanocrystals.
Inorg. Chem. 2013, 52, 4934−4940.
(5) (a) He, R.; Huang, H. W.; Kang, L.; Yao, W. J.; Jiang, X. X.; Lin,
Z. S.; Qin, J. G.; Chen, C. T. Bandgaps in the Deep Ultraviolet Borate
Crystals: Prediction and Improvement. Appl. Phys. Lett. 2013, 102,
231904. (b) Zhao, W. Z.; Zhang, Y. N.; Lan, Y. Z.; Cheng, J. W.;
Yang, G. Y. Ba2B10O16(OH)2·(H3BO3)(H2O): A Possible Deep-Ultraviolet Nonlinear-Optical Barium Borate. Inorg. Chem. 2022, 61,
4246−4250. (c) Zhao, S. G.; Gong, P. F.; Bai, L.; Xu, X.; Zhang, S.
Q.; Sun, Z. H.; Lin, Z. S.; Hong, M. C.; Chen, C. T.; Luo, J. H.
Beryllium-free Li4Sr(BO3)2 for Deep-Ultraviolet Nonlinear Optical
Applications. Nat. Commun. 2014, 5, 4019.
(6) Horiuchi, Y.; Toyao, T.; Saito, M.; Mochizuki, K.; Iwata, M.;
Higashimura, H.; Anpo, M.; Matsuoka, M. Visible-Light-Promoted
Photocatalytic Hydrogen Production by Using an Amino-Functionalized Ti(IV) Metal-Organic Framework. J. Phys. Chem. C 2012, 116,
20848−20853.
(7) Stanley, J.; Gagliardi, A. Machine Learning Bandgaps of
Inorganic Mixed Halide Perovskites. IEEE 18th International Conference on Nanotechnology; IEEE 2018, 1−4.
(8) Pilania, G.; Mannodi-Kanakkithodi, A.; Uberuaga, B. P.;
Ramprasad, R.; Gubernatis, J. E.; Lookman, T. Machine Learning
Bandgaps of Double Perovskites. Sci. Rep. 2016, 6, 19375.
(9) Knøsgaard, N. R.; Thygesen, K. S. Representing Individual
Electronic States for Machine Learning GW Band Structures of 2D
Materials. Nat. Commun. 2022, 13, 468.
(10) Wang, R. H.; Zhong, Y. S.; Bi, L. M.; Yang, M. L.; Xu, D. G.
Accelerating Discovery of Metal-Organic Frameworks for Methane
Adsorption with Hierarchical Screening and Deep Learning. ACS
Appl. Mater. Interfaces 2020, 12, 52797−52807.
(11) Wang, R. H.; Zou, Y. R.; Zhang, C. C.; Wang, X.; Yang, M. L.;
Xu, D. G. Combining Crystal Graphs and Domain Knowledge in
Machine Learning to Predict Metal-Organic Frameworks Performance in Methane Adsorption. Microporous Mesoporous Mater. 2022,
331, 111666.
(12) Liang, Y. Z.; Chen, M. W.; Wang, Y. A.; Jia, H. X.; Lu, T. L.;
Xie, F. K.; Cai, G. H.; Wang, Z. G.; Meng, S.; Liu, M. A Universal
Model for Accurately Predicting the Formation Energy of Inorganic
Compounds. Sci. China Mater. 2022, 66, 343−351.
(13) Ward, L.; Agrawal, A.; Choudhary, A.; Wolverton, C. A
General-Purpose Machine Learning Framework for Predicting
Properties of Inorganic Materials. npj Comput. Mater. 2016, 2, 16028.
(14) Stanley, J. C.; Mayr, F.; Gagliardi, A. Machine Learning
Stability and Bandgaps of Lead-Free Perovskites for Photovoltaics.
Adv. Theory Simul. 2020, 3, 1900178.
(15) Schmidt, J.; Shi, J. M.; Borlido, P.; Chen, L. M.; Botti, S.;
Marques, M. A. L. Predicting the Thermodynamic Stability of Solids
Combining Density Functional Theory and Machine Learning. Chem.
Mater. 2017, 29, 5090−5103.
(16) Omprakash, P.; Manikandan, B.; Sandeep, A.; Shrivastava, R.;
Viswesh, P.; Panemangalore, D. B. Graph Representational Learning
for Bandgap Prediction in Varied Perovskite Crystals. Comput. Mater.
Sci. 2021, 196, 110530.
(17) Xie, T.; Grossman, J. C. Crystal Graph Convolutional Neural
Networks for an Accurate and Interpretable Prediction of Material
Properties. Phys. Rev. Lett. 2018, 120, 145301.
(18) Curtarolo, S.; Setyawan, W.; Hart, G. L. W.; Jahnatek, M.;
Chepulskii, R. V.; Taylor, R. H.; Wang, S. D.; Xue, J. K.; Yang, K. S.;
Levy, O.; Mehl, M. J.; Stokes, H. T.; Demchenko, D. O.; Morgan, D.
AFLOW: An Automatic Framework for High-Throughput Materials
Discovery. Comput. Mater. Sci. 2012, 58, 218−226.
(19) Kirklin, S.; Saal, J. E.; Meredig, B.; Thompson, A.; Doak, J. W.;
Aykol, M.; Rühl, S.; Wolverton, C. The Open Quantum Materials
Database (OQMD): Assessing the Accuracy of DFT Formation
Energies. npj Comput. Mater. 2015, 1, 15010.
(20) Jain, A.; Ong, S. P.; Hautier, G.; Chen, W.; Richards, W. D.;
Dacek, S.; Cholia, S.; Gunter, D.; Skinner, D.; Ceder, G.; Persson, K.
A. Commentary: The Materials Project: A Materials Genome
Approach to Accelerating Materials Innovation. APL Mater. 2013,
1, 011002.
(21) Crowley, J. M.; Tahir-Kheli, J.; Goddard, W. A. Resolution of
the Band Gap Prediction Problem for Materials Design. J. Phys. Chem.
Lett. 2016, 7, 1198−1203.
(22) Lee, J.; Seko, A.; Shitara, K.; Nakayama, K.; Tanaka, I.
Prediction Model of Band Gap for Inorganic Compounds by
Combination of Density Functional Theory Calculations and
Machine Learning Techniques. Phys. Rev. B 2016, 93, 115104.
(23) Chuangtian, C.; Bochang, W.; Aidong, J.; Guiming, Y. A New-Type Ultraviolet SHG Crystal − β-BaB2O4. Sci. China, Ser. B 1985, 28,
235−243.
(24) Chen, C. T.; Wang, G. L.; Wang, X. Y.; Xu, Z. Y. Deep-UV
Nonlinear Optical Crystal KBe2BO3F2-Discovery, Growth, Optical
Properties and Applications. Appl. Phys. B: Lasers Opt. 2009, 97, 9−
25.
(25) Ye, N.; Zeng, W. R.; Jiang, J.; Wu, B. C.; Chen, C. T.; Feng, B.
H.; Zhang, X. L. New Nonlinear Optical Crystal K2Al2B2O7. J. Opt.
Soc. Am. B 2000, 17, 764−768.
(26) Aryasetiawan, F.; Gunnarsson, O. The Gw Method. Rep. Prog.
Phys. 1998, 61, 237−312.
(27) Perdew, J. P.; Ernzerhof, M.; Burke, K. Rationale for Mixing
Exact Exchange with Density Functional Approximations. J. Chem.
Phys. 1996, 105, 9982−9985.
(28) Perdew, J. P.; Burke, K.; Ernzerhof, M. Generalized Gradient
Approximation Made Simple. Phys. Rev. Lett. 1996, 77, 3865−3868.
(29) Zhuo, Y.; Mansouri Tehrani, A. M.; Brgoch, J. Predicting the
Band Gaps of Inorganic Solids by Machine Learning. J. Phys. Chem.
Lett. 2018, 9, 1668−1673.
(30) Chen, C.; Zuo, Y.; Ye, W.; Li, X.; Ong, S. P. Learning
Properties of Ordered and Disordered Materials from Multi-Fidelity
Data. Nat. Comput. Sci. 2021, 1, 46−53.
(31) Kim, E.; Huang, K.; Saunders, A.; McCallum, A.; Ceder, G.;
Olivetti, E. Materials Synthesis Insights from Scientific Literature Via
Text Extraction and Machine Learning. Chem. Mater. 2017, 29,
9436−9444.
(32) Georgescu, A. B.; Ren, P. W.; Toland, A. R.; Zhang, S. T.;
Miller, K. D.; Apley, D. W.; Olivetti, E. A.; Wagner, N.; Rondinelli, J.
M. Database, Features, and Machine Learning Model to Identify
Thermally Driven Metal-Insulator Transition Compounds. Chem.
Mater. 2021, 33, 5591−5605.
(33) Jensen, Z.; Kwon, S.; Schwalbe-Koda, D.; Paris, C.; Gómez-Bombarelli, R.; Román-Leshkov, Y.; Corma, A.; Moliner, M.; Olivetti,
E. A. Discovering Relationships between OSDAs and Zeolites through
Data Mining and Generative Neural Networks. ACS Cent. Sci. 2021, 7,
858−867.
(34) Zhang, Y.; Wang, C.; Soukaseum, M.; Vlachos, D. G.; Fang, H.
Unleashing the Power of Knowledge Extraction from Scientific
Literature in Catalysis. J. Chem. Inf. Model. 2022, 62, 3316−3330.
(35) Hawizy, L.; Jessop, D. M.; Adams, N.; Murray-Rust, P.
Chemicaltagger: A Tool for Semantic Text-Mining in Chemistry. J.
Cheminf. 2011, 3, 17.
(36) Swain, M. C.; Cole, J. M. Chemdataextractor: A Toolkit for
Automated Extraction of Chemical Information from the Scientific
Literature. J. Chem. Inf. Model. 2016, 56, 1894−1904.
(37) Zhu, M.; Cole, J. M. Pdfdataextractor: A Tool for Reading
Scientific Text and Interpreting Metadata from the Typeset Literature
in the Portable Document Format. J. Chem. Inf. Model. 2022, 62,
1633−1643.
(38) Lammey, R. Crossref’s Text and Data Mining Services. Learn.
Publ. 2014, 27, 245−250.
(39) Saif, S.; Tahir, A.; Asim, T.; Chen, Y. S.; Khan, M.; Adil, S. F.
Green Synthesis of Zno Hierarchical Microstructures by Cordia Myxa
and Their Antibacterial Activity. Saudi J. Biol. Sci. 2019, 26, 1364−
1371.
(40) Kresse, G.; Furthmüller, J. Efficient Iterative Schemes for Ab
Initio Total-Energy Calculations Using a Plane-Wave Basis Set. Phys.
Rev. B: Condens. Matter Mater. Phys. 1996, 54, 11169−11186.
(41) Kresse, G.; Joubert, D. From Ultrasoft Pseudopotentials to the
Projector Augmented-Wave Method. Phys. Rev. B: Condens. Matter
Mater. Phys. 1999, 59, 1758−1775.
(42) Yi, W. C.; Tang, G.; Chen, X.; Yang, B. C.; Liu, X. B. Qvasp: A
Flexible Toolkit for Vasp Users in Materials Simulations. Comput.
Phys. Commun. 2020, 257, 107535.
(43) Sheldrick, G. M. A Short History of Shelx. Acta Crystallogr., Sect.
A: Found. Adv. 2008, 64, 112−122.
(44) Sheldrick, G. M. SHELXTL-97, Program for Crystal Structure
Solution; University of Göttingen: Germany, 1997.
(45) Zhao, W. W.; Pan, S. L.; Dong, X. Y.; Li, J. J.; Tian, X. L.; Fan,
X. Y.; Chen, Z. H.; Zhang, F. F. Synthesis, crystal structure and
properties of a new lead fluoride borate, Pb3OBO3F. Mater. Res. Bull.
2012, 47, 947−951.
(46) Pan, F.; Shen, G. Q.; Wang, R. J.; Wang, X. Q.; Shen, D. Z.
Growth, Characterization and Nonlinear Optical Properties of SrB4O7
Crystals. J. Cryst. Growth 2002, 241, 108−114.
(47) Guo, S.; Jiang, X. X.; Liu, L. J.; Xia, M. J.; Fang, Z.; Wang, X. Y.;
Lin, Z. S.; Chen, C. T. BaBe2BO3F3: A KBBF-Type Deep-Ultraviolet
Nonlinear Optical Material with Reinforced [Be2BO3F2](Infinity)
Layers and Short Phase-Matching Wavelength. Chem. Mater. 2016,
28, 8871−8875.
(48) Xiao-Shan, W.; Li-Juan, L.; Ming-Jun, X.; Xiao-Yang, W.;
Chuang-Tian, C. Two Isostructural Multi-Metal Borates: Syntheses,
Crystal Structures and Characterizations of M3LiNa4Be4B10O24F (M =
Sr, Cd). Chin. J. Struct. Chem. 2016, 34, 1617−1625.
(49) Bai, C. Y.; Han, S. J.; Pan, S. L.; Zhang, B. B.; Yang, Y.; Li, L.;
Yang, Z. H. Na3B4O7X (X = Cl, Br): Two New Borate Halides with a
1D Na-X (X = Cl, Br) Chain Formed by the Face-Sharing XNa6
Octahedra. RSC Adv. 2015, 5, 12416−12422.
(50) Yang, Y.; Pan, S. L.; Hou, X. L.; Dong, X. Y.; Su, X.; Yang, Z.
H.; Zhang, M.; Zhao, W. W.; Chen, Z. H. Li5Rb2B7O14: A New
Congruently Melting Compound with Two Kinds of B-O One-Dimensional Chains and Short UV Absorption Edge. CrystEngComm
2012, 14, 6720−6725.
(51) Lin, F.; Dong, Y. P.; Peng, J. Y.; Wang, L. P.; Li, W.; Yang, B. A
New Acentric Borate of K2Ba[B4O5(OH)4]2·10H2O: Synthesis,
Structure and Nonlinear Optical Property. Phase Transitions 2016,
89, 996−1005.
(52) Meng, X. H.; Xia, M. J.; Li, R. K. Li3Ba4Sc3(BO3)4(B2O5)2:
Featuring the Coexistence of Isolated BO3 and B2O5 Units. New J.
Chem. 2019, 43, 11469−11472.
(53) Luo, X. C.; Pan, S. L.; Fan, X. Y.; Wang, J. D.; Liu, G. Crystal
Growth and Characterization of K2B4O11H8. J. Cryst. Growth 2009,
311, 3517−3521.
(54) Wang, S. C.; Ye, N.; Zou, G. H. A New Alkaline Beryllium
Borate KBe4B3O9 with Ribbon Alveolate [Be2BO5](Infinity) Layers
and the Structural Evolution of ABe4B3O9 (A = K, Rb and Cs).
CrystEngComm 2014, 16, 3971−3976.
(55) Huang, H. W.; Yao, W. J.; He, R.; Chen, C. T.; Wang, X. Y.;
Zhang, Y. H. Synthesis, Crystal Structure and Optical Properties of a
New Beryllium Borate, CsBe4(BO3)3. Solid State Sci. 2013, 18, 105−
109.
(56) Zhang, Z. Z.; Wang, Y.; Li, H.; Yang, Z. H.; Pan, S. L.
BaB8O12F2: A Promising Deep-UV Birefringent Material. Inorg. Chem.
Front. 2019, 6, 546−549.
(57) Wu, H. P.; Yu, H. W.; Pan, S. L.; Jiao, A. Q.; Han, J.; Wu, K.;
Han, S. J.; Li, H. Y. New Type of Complex Alkali and Alkaline Earth
Metal Borates with Isolated (B12O24)12‑ Anionic Group. Dalton Trans.
2014, 43, 4886−4891.
(58) Huang, S. Z.; Zhou, C.; Cheng, S. C.; Yu, F. K5B19O31: A Deep-Ultraviolet Congruent Melting Compound. ChemistrySelect 2019, 4,
10436−10441.
(59) Mutailipu, M.; Xie, Z. Q.; Su, X.; Zhang, M.; Wang, Y.; Yang, Z.
H.; Janjua, M. R. S. A.; Pan, S. L. Chemical Cosubstitution-Oriented
Design of Rare-Earth Borates as Potential Ultraviolet Nonlinear
Optical Materials. J. Am. Chem. Soc. 2017, 139, 18397−18405.
(60) Yao, W. J.; Jiang, X. X.; Huang, R. J.; Li, W.; Huang, C. J.; Lin,
Z. S.; Li, L. F.; Chen, C. T. Area Negative Thermal Expansion in a
Beryllium Borate LiBeBO3 with Edge Sharing Tetrahedra. Chem.
Commun. 2014, 50, 13499−13501.
(61) Shi, Y. T.; Luo, M.; Lin, C. S.; Peng, G.; Ye, N. Two Deep
Ultraviolet Hydrated Borate Crystals: Centrosymmetric LiRbB5O8(OH)·H2O and Non-Centrosymmetric K2B5O8(OH)·2H2O.
Cryst. Growth Des. 2019, 19, 3052−3059.
(62) Huang, C. M.; Zhang, F. F.; Li, H.; Yang, Z. H.; Yu, H. H.; Pan,
S. L. BaB2O3F2: A Barium Fluorooxoborate with a Unique 2(Infinity)
[B2O3F]‑ Layer and Short Cutoff Edge. Chem. - Eur. J. 2019, 25,
6693−6697.
(63) Yan, X.; Luo, S. Y.; Lin, Z. S.; Yao, J. Y.; He, R.; Yue, Y. C.;
Chen, C. T. ReBe2B5O11 (Re = Y, Gd): Rare-Earth Beryllium
Borates as Deep-Ultraviolet Nonlinear-Optical Materials. Inorg. Chem.
2014, 53, 1952−1954.
(64) Chen, C.; Ye, W. K.; Zuo, Y. X.; Zheng, C.; Ong, S. P. Graph
Networks as a Universal Machine Learning Framework for Molecules
and Crystals. Chem. Mater. 2019, 31, 3564−3572.
(65) Krukau, A. V.; Vydrov, O. A.; Izmaylov, A. F.; Scuseria, G. E.
Influence of the Exchange Screening Parameter on the Performance of
Screened Hybrid Functionals. J. Chem. Phys. 2006, 125, 224106.
(66) Heyd, J.; Scuseria, G. E.; Ernzerhof, M. Hybrid Functionals
Based on a Screened Coulomb Potential. J. Chem. Phys. 2003, 118,
8207−8215.
(67) Zhang, B. B.; Zhang, X. D.; Yu, J.; Wang, Y.; Wu, K.; Lee, M. H.
First-Principles High-Throughput Screening Pipeline for Nonlinear
Optical Materials: Application to Borates. Chem. Mater. 2020, 32,
6772−6779.
(68) Li, H.; Eddaoudi, M.; O’Keeffe, M.; Yaghi, O. M. Design and
Synthesis of an Exceptionally Stable and Highly Porous Metal-Organic Framework. Nature 1999, 402, 276−279.
(69) Wu, H. P.; Pan, S. L.; Poeppelmeier, K. R.; Li, H. Y.; Jia, D. Z.;
Chen, Z. H.; Fan, X. Y.; Yang, Y.; Rondinelli, J. M.; Luo, H. S.
K3B6O10Cl: A New Structure Analogous to Perovskite with a Large
Second Harmonic Generation Response and Deep UV Absorption
Edge. J. Am. Chem. Soc. 2011, 133, 16317.
(70) You, F. G.; Liang, F.; Huang, Q.; Hu, Z. G.; Wu, Y. C.; Lin, Z.
S. Pb2GaF2(SeO3)2Cl: Band Engineering Strategy by Aliovalent
Substitution for Enlarging Bandgap While Keeping Strong Second
Harmonic Generation Response. J. Am. Chem. Soc. 2019, 141, 748−
752.
(71) Cao, X. L.; Hu, C. L.; Xu, X.; Kong, F.; Mao, J. G.
Pb2TiOF(SeO3)2Cl and Pb2NbO2(SeO3)2Cl: Small Changes in
Structure Induced a Very Large SHG Enhancement. Chem. Commun.
2013, 49, 9965−9967.
(72) Wold, S.; Esbensen, K.; Geladi, P. Principal Component
Analysis. Chemom. Intell. Lab. Syst. 1987, 2, 37−52.
(73) Li, F.; Hou, X. L.; Pan, S. L.; Wang, X. Growth, Structure, and
Optical Properties of a Congruent Melting Oxyborate, Bi2ZnOB2O6.
Chem. Mater. 2009, 21, 2846−2850.
(74) Liu, J.; Zhao, W. W.; Wang, B.; Yan, H. Bi2ZnOB2O6: A Polar
Material Capable of Photocatalytic Degradation of Rhodamine B. J.
Mater. Sci.: Mater. Electron. 2018, 29, 13803−13809.