The story of statistics in geotechnical engineering

Kok-Kwang Phoon
Department of Civil and Environmental Engineering, National University of Singapore, Singapore, Singapore

SPOTLIGHT ARTICLE. Georisk: Assessment and Management of Risk for Engineered Systems and Geohazards, 2020, Vol. 14, No. 1, 3–25. ISSN: 1749-9518 (Print) 1749-9526 (Online). Journal homepage: https://www.tandfonline.com/loi/ngrk20. Published online: 09 Dec 2019.

To cite this article: Kok-Kwang Phoon (2020) The story of statistics in geotechnical engineering, Georisk: Assessment and Management of Risk for Engineered Systems and Geohazards, 14:1, 3-25, DOI: 10.1080/17499518.2019.1700423. To link to this article: https://doi.org/10.1080/17499518.2019.1700423

ABSTRACT
The story of statistics in geotechnical engineering can be traced to Lumb’s classical Canadian Geotechnical Journal paper on “The Variability of Natural Soils” published in 1966. In parallel, the story of risk management in geotechnical engineering has progressed from design by prescriptive measures that do not require site-specific data, to more refined estimation of site-specific response using limited data from site investigation as inputs to physical models, to quantitative risk assessment (QRA) requiring considerable data at regional/national scales. In an era where data is recognised as the “new oil”, it makes sense for us to lean towards decision making strategies that are more responsive to data, particularly if we have zettabytes coming our way.
In fact, we already have a lot of data, but the vast majority is shelved after a project is completed (“dark data”). It does not make sense to reduce one zettabyte to a few bytes describing a single cautious value. It does not make sense to expect big data to be precise and to fit a particular favourite physical model as demanded by the classical deterministic world view. This paper advocates the position that there is value in data of any kind (good or not so good quality, or right or wrong fit to a physical model) and the challenge is for the new generation of researchers to uncover this value by hearing what data have to say for themselves, be it using probabilistic, machine learning, or other data-driven methods including those informed by physics and human experience, and to re-imagine the role of the geotechnical engineer in an immersive environment likely to be imbued by machine intelligence.

ARTICLE HISTORY
Received 8 October 2019; Accepted 30 November 2019

1. Introduction

Errors using inadequate data are much less than those using no data at all.
Charles Babbage

Every story has a beginning. The idea that statistics can be used to quantify uncertainties in the properties of natural soils (an intrinsic characteristic of site data) and this statistical approach can provide a rational basis for the selection of a suitably cautious design value may arguably be traced to Lumb’s classical Canadian Geotechnical Journal paper on “The Variability of Natural Soils” published in 1966. One may view this paper as being ahead of its time. The First International Conference on Applications of Statistics and Probability to Soil and Structural Engineering was organised in Hong Kong in 1971. It was not surprising that Professor Peter Lumb played a key role in launching this important conference series.
The first ICASP in Hong Kong was followed by Aachen (1975), Sydney (1979), Florence (1983), Vancouver (1987), Mexico City (1991), Paris (1995), Sydney (1999), San Francisco (2003), Tokyo (2007), Zurich (2011), Vancouver (2015) and Seoul (2019).

KEYWORDS
Big Indirect Data (BID); generic databases; MUSIC-X; transformation models; site challenge; similarity index

In response to a question on the “importance of statistics as a tool in engineering applications” posed during an interview before his retirement (Lam and Li 1986), Professor Lumb opined that

Traditionally engineering and civil engineering are very deterministic in their teaching and in the attitude of their practitioners. When something goes wrong, it takes them by surprise. And yet all the things they are handling, the raw materials, the input and output, are random processes. If that can be taken seriously the method of design can be improved considerably. Instead of the old fashioned safety factor, the probability of failure type of approach is more satisfactory and practically far more useful.

He further added that

once you think of all these things as being random processes, it does clear up the engineer’s mind as well as improving his design. It makes him realize that he cannot predict what is going to happen precisely. This is what most engineers try to do. That is what they taught in schools and in universities in general: Engineering is precise, it is a science. Yet, in reality, it is vague. It is not a science. It is more an art.

CONTACT Kok-Kwang Phoon, kkphoon@nus.edu.sg, Department of Civil and Environmental Engineering, National University of Singapore, Block E1A, #07-03, 1 Engineering Drive 2, Singapore 117576, Singapore
© 2019 Informa UK Limited, trading as Taylor & Francis Group
It was only in 1995 that a National Research Council report “Probabilistic Methods in Geotechnical Engineering” recommended that probabilistic methods, while not a substitute for traditional deterministic design methods, do offer a systematic and quantitative way of accounting for uncertainties encountered by geotechnical engineers, and they are most effective when used to organize and quantify these uncertainties for engineering designs and decisions. A Recommended Practice DNV-RP-C207 (DNV 2012) provides principles, guidance and recommendations for use of statistical methods for analysis and representation of soil data. The latest 4th edition of the international standard “General Principles on Reliability for Structures” (ISO 2394:2015) includes a new Annex D dedicated to the reliability of geotechnical structures. Annex D recognises that geotechnical reliability-based design should place site investigation and the interpretation of site conditions/profile/data as the cornerstone of the methodology (Phoon et al. 2016).

Despite these notable advances, it is accurate to say that data plays a supporting rather than a leading role in decision making in practice. After all, data have not spoken for themselves. Phoon (2017) pointed out that data scarcity (“curse of small sample size”) is more conspicuous in geotechnical engineering than structural engineering. Decision making strategies have evolved to be effective in such a data poor environment, thus making it even harder to monetise data because these predominant strategies do not need much data; some, such as design by prescriptive measures, require almost none. One could say there is selection pressure against data in evolutionary parlance.

This paper covers the estimation of useful statistics from the original classical univariate setting to a more realistic incomplete multivariate setting encountered in a typical site investigation programme.
It traces the evolution of uncertainties as an inconvenient feature out of step with a deterministic world view that could be mitigated by adopting a judicious precedent-based cautious stance, to something entirely undesirable that should be minimised, to a fact of reality that should be coped with explicitly using statistics, and to an asset that can be exploited using Bayesian machine learning. The focus is not on mathematical gymnastics, but to demonstrate that data can produce valuable insights to support decision making in its own right, over and above its current value in physical modelling, ultimate/proof load testing, and monitoring. The common lament is that we do not have enough data to do this. This is not true. We have a lot of data, but they are shelved in design and regulatory offices after completion of a project. They are stored primarily for compliance purposes, rather than shared and further analysed to provide more insights and to support future decision making. In short, our data is mainly “dark”, a term defined in Gartner’s IT glossary (Gartner 2019). This is not true even for a smaller set of data published in the literature that has been progressively compiled into generic soil/rock databases in recent years. However, this is frequently true for one specific site. In fact, site-specific data are more challenging to deal with than simply “not enough” and “uncertain”. We now understand that there are at least seven rather than two attributes that define our data. Phoon, Ching, and Wang (2019) refer to these attributes as “MUSIC-X”: Multivariate, Uncertain and Unique, Sparse, Incomplete, and potentially Corrupted with “X” denoting the spatial/temporal dimension. The “unique” and “potentially corrupted” (in the sense of data containing outliers) attributes are very hard problems. 
It is important to emphasise here that the term “uncertainty” adopted in this paper refers to both imprecision in the knowledge of a particular physical parameter, say undrained shear strength, and the deeper imprecision in modeling this imprecise knowledge, say the mean and standard deviation of the random variable model or the scale of fluctuation of the random field model. This uncertainty of the uncertainty model, commonly referred to as statistical uncertainty, is notoriously difficult to address for sparse data. This paper shows that reasonable solutions are available even for our exceedingly modest site-specific data. There are also non-probabilistic solutions (Beer et al. 2013). These are outside the scope of this paper.

All sensible engineers know that generic correlation models must be used with caution. It is better to adopt quasi-local correlation models supported by site-specific data and data from “similar” sites possessing comparable geology. At present, the engineer relies almost entirely on his/her experience working on other sites to construct such quasi-local models. One research challenge is how to do this algorithmically. This is called the “site challenge” (Phoon 2018). This will potentially extend the search for “similar” sites to anywhere and anywhen, beyond the regional/municipal databases that the engineer is familiar with over the duration of his/her practice and beyond the deep isolation that human experience encases each and every one of us in because it cannot be shared in full or with ease. Recent research shows that this site challenge is tractable even under full MUSIC constraints. The existence of a quasi-local correlation model that maintains an optimal balance between a generic database (that may not be directly and completely applicable although extremely data rich) and a site database (that is fully applicable but extremely data poor) (cf.
“Goldilocks dilemma” in Phoon 2020) remains an open research question at this point.

2. Role of data in design

Simple calculations based on a range of variables are better than elaborate ones based on limited input.
Professor Ralph Peck’s Legacy Website (Geoengineer 2019)

All engineers make decisions in the face of uncertainty. After all, why would we need a factor of safety if we have omniscient access to perfect knowledge and information? Einstein and Baecher (1982) put this across succinctly:

In thinking about sources of uncertainty in engineering geology, one is left with the fact that uncertainty is inevitable. One attempts to reduce it as much as possible, but it must ultimately be faced. It is a well-recognised part of life for the engineer. The question is not whether to deal with uncertainty but how?

Uncertainty, in its broadest sense, can range from known unknowns where some knowledge and/or data exist for characterisation to unknown unknowns where alternate strategies such as robust or resilient design could be more applicable. Phoon (2017) opined that in between “white swans” (known unknowns) and “black swans” (unknown unknowns), there will be “grey swans” covering events that are foreseeable even in the absence of data or unforeseeable events that do not result in disproportionate consequences. In all likelihood, the factor of safety is intended to cover only white and possibly some grey swans.

Casagrande’s (1965) classic paper and Terzaghi Lecture on “calculated risk” remains relevant to our practice. One should not confuse managing risk in the broad sense articulated by Casagrande with quantitative risk assessment (QRA). One can manage risks cleverly with experience alone. For example, Terzaghi and Peck (1967) cited Kidder-Parker Architects’ and Builders’ Handbook (1931) in their Table 54.1 “Soil pressures allowed by various building codes”. An engineer could select an allowable bearing pressure based on the “character of foundation bed” and the name of a city. Table 1 of BS8004 (1986), “Presumed allowable bearing values under static loading” provides a range of bearing values for each type of rock and soil. Qualitative information such as “strong limestones and strong sandstones” and “schists and slates” for rocks and “firm clays” and “soft clays and silts” for cohesive soils is sufficient. Section 2.5 of Eurocode 7 describes design by prescriptive measures (EN 1997-1:2004). These presumed/prescribed values are conservative, but this is a sensible approach to risk management in the absence of site-specific data and calculation models.

Table 1. Three-tier classification scheme of soil property variability for reliability calibration (Source: Table 9.7, Phoon and Kulhawy 2008).
Geotechnical parameter | Property variability | COV (%)
Undrained shear strength | Low (a) | 10–30
Undrained shear strength | Medium (b) | 30–50
Undrained shear strength | High (c) | 50–70
Effective stress friction angle | Low (a) | 5–10
Effective stress friction angle | Medium (b) | 10–15
Effective stress friction angle | High (c) | 15–20
Horizontal stress coefficient | Low (a) | 30–50
Horizontal stress coefficient | Medium (b) | 50–70
Horizontal stress coefficient | High (c) | 70–90
(a) Typical of good quality direct lab or field measurements.
(b) Typical of indirect correlations with good field data, except for the standard penetration test (SPT).
(c) Typical of indirect correlations with SPT field data and with strictly empirical correlations.

In our current practice, building regulations typically mandate minimum ground investigation at a specific site, for example, the number of boreholes should be the greater of (i) one borehole per 300 m² or (ii) one borehole at every interval between 10 and 30 m, but no less than 3 boreholes in a project site. It is possible to “predict” a reasonable site-specific response (say bearing pressure) based on such limited information through the mediation of a physical model and a healthy dose of engineering judgment.
We need some site-specific data to use this approach. We also need the “right” kind of data, which is chained to the input side of the model. What is “right” for one model may not be “right” for another model. Lambe’s (1973) Rankine Lecture explains the need to calibrate both data and model together. In addition, a factor of safety and experience are still needed as noted by Burland (1987) in his Nash Lecture. One may conclude that engineers are clever in adopting risk management strategies that work with the imperfect knowledge and the limited information they have at hand. There is no doubt that we have been successful. Failures are rare.

So, what has changed? Some have claimed that annual internet traffic exceeded a zettabyte (10^21 bytes, or roughly the number of sand grains on all the beaches on the planet) as of 2016. We may not have a zettabyte currently, but we have a lot of data. They are just not directly useful, for example because they are not specific to the site of interest or need to be transformed to fit the input side of a physical model. Needless to say, they will also be imperfect in the sense of being uncertain, incomplete, and possibly even corrupted to some extent. Phoon, Ching, and Wang (2019) coined the term Big Indirect Data (BID) to refer to any data that are potentially useful but not directly applicable to the decision at hand. All engineering decisions are ultimately black and white, be it choosing the dimensions of a structure, time interval between maintenance, or issuance of an evacuation notice, notwithstanding the imperfect nature of our data, methods, and understanding of reality. The adjective “useful” is used in the context of supporting such real world decisions. The generic soil/rock and load test databases presented in the next section will be one type of BID. Monitoring data is another BID.
It is timely to ask ourselves how existing strategies that are tailored to work effectively in a data poor environment can monetise BID.

First, there is no mechanism to update presumptive bearing pressures or factors of safety using data. There is a formal mechanism to do this for resistance factors that are calibrated from statistics (e.g. Phoon, Kulhawy, and Grigoriu 2003; Paikowsky et al. 2004; Fenton et al. 2016; Tang and Phoon 2018). There is no space to engage in a full discussion on reliability-based design, simplified or otherwise, but it is reasonable to lean towards mechanisms that are responsive to data and self-improve with data, particularly if we have zettabytes coming our way.

Second, a physical model erects a significant computational barrier between input and all other data. For example, monitoring data constitutes the basis for our observational approach. However, it is assuredly “non-input” data. System identification techniques are needed to back-calculate the equivalent input data before updated predictions are possible. For a large 3D finite element model, forward calculations (inputs to outputs) can take days using current computers. Backward calculations (outputs to inputs) will take longer. It is difficult to leverage our powerful physical models to glean deeper insights from monitoring data, particularly for risk management of big projects in real time. System identification in its most general form can link inputs and outputs without the mediation of a physical model (black box approach as opposed to the physics-based white box approach). The most common example of system identification in geotechnical engineering is an artificial neural network (Shahin, Jaksa, and Maier 2001; Jaksa, Maier, and Shahin 2008). Interestingly, machine-learning methods are also data-driven rather than physics-driven, in part because they accommodate all data, whether they are good or not so good quality, or right or wrong fit to a physical model.
In fact, some machine-learning methods have been successful even when the data used have been judged “useless” by a human expert. Such a physics-free and judgment-neutral approach can be exceedingly powerful as demonstrated by Google’s AlphaGo project. It is interesting that our practice is almost entirely dominated by a white box approach, although it is known in many fields that this approach has limits in dealing with complex real world processes. One should be mindful that real data emerge from such complex processes, not from idealised physical models. Another limitation is that a physical model cannot “learn” on its own to be better as data and problem scenarios evolve in real time. Actually, in the author’s opinion, our practice is more accurately described as pure white box, because it does not admit uncertainty explicitly. Although somewhat exaggerated, one could argue that we are philosophically aligned to Laplace’s Demon who famously said (Laplace, not the hypothetical demon) in his “A Philosophical Essay on Probabilities”: We ought then to regard the present state of the universe as the effect of its anterior state and as the cause of the one which is to follow. Given for one instant an intelligence which could comprehend all the forces by which nature is animated and the respective situation of the beings who compose it – an intelligence sufficiently vast to submit these data to analysis – it would embrace in the same formula the movements of the greatest bodies of the universe and those of the lightest atom; for it, nothing would be uncertain and the future, as the past, would be present to its eyes. We do go to some lengths to collect good quality data directly relevant to our physical models and if this is not possible, we apply empirical rules to be conservative. Do we collect only high quality data that are limited in quantity or only lower quality data in larger quantity? Do we combine them? When shouldn’t they be combined? 
The jury is still out on these important questions, but there is no hope to make progress without an explicit uncertainty model. Scott A. Barnhill left the following message on Professor Ralph Peck’s legacy website: “Perhaps engineers trained in geology have an advantage. They are more likely to accept mother nature as she exists, rather than as created in the mind of the engineer” (Geoengineer 2019). This practical wisdom needs reinforcing if we would like to engage emerging digital technologies with greater haste. The Institution of Civil Engineers (UK) State of the Nation Report (2017) observed that “the infrastructure sector has been slow to engage with the uptake of new digital technologies compared with other industries”. Let me call the future of geotechnical engineering (unimaginatively) Geo 4.0. If Geo 4.0 needs to operate in an overwhelmingly data-rich cyber-physical environment, it is reasonable to question if a pure white box approach will continue to be a winning strategy. A black box or grey box (physics-informed data-driven) approach may be more effective. The point is not to be philosophical, but to be pragmatic. The statistician George Box once said: “All models are wrong, but some are useful”. This seems like a good adage to follow when we explore how physics, data, and experience can be combined in even more clever ways to support decision making.

Peck (1980) adopted the intriguing question “Where has all the judgment gone?” as the title for his fifth Laurits Bjerrum memorial lecture. The honest answer nowadays is no one knows. Philosophers are asking the same question as the power of artificial intelligence expands beyond what was previously thought to be reachable only by the human mind. The Leverhulme Centre for the Future of Intelligence (http://lcfi.ac.uk/) was established for this reason.
One can safely say that machine-human interactions will be transformed in unimaginable ways and our human minds will be enhanced (to put it mildly) as part of this transformation. At a more mundane level, it is already possible for Bayesian methods to learn some limited aspects of expert judgment, which will vastly expand the sharing of this “digitized experience” beyond what we can do with conventional “on the job” training at the individual level (Vick 2002).

The next two sections will briefly discuss: (1) generic databases (BID) to provide an overview of the attributes of geotechnical data and (2) preliminary research to address the “site challenge” under the realistic data constraints of MUSIC as an example of what Bayesian machine learning could do.

3. Generic databases

Numbers have an important story to tell. They rely on you to give them a clear and convincing voice.
Stephen Few

Lumb (1966)’s classic paper on “The Variability of Natural Soils” showed that the variations in the properties of four typical Hong Kong soils (a soft marine clay, an alluvial sandy clay, a residual silty sand, and a residual clayey silt) about a mean trend can be characterised as random variables following distributions such as normal, lognormal, and bi-normal distributions. The ensuing body of work on soil properties was published in diverse venues such as ICASP, ASCE symposiums (e.g. Characterisation of soil properties: bridge between theory and practice, Atlanta, Georgia, 1984; Uncertainty in the geologic environment: from theory to practice, Madison, Wisconsin, 1996), and reports (e.g.
Filippas, Kulhawy, and Grigoriu 1988; Orchant, Kulhawy, and Trautmann 1988; Spry, Kulhawy, and Grigoriu 1988; Kulhawy, Birgisson, and Grigoriu 1992; National Research Council 1995; Vanmarcke and Fenton 2003), before culminating in an extensive compilation of univariate statistics by Phoon and Kulhawy (1999a, 1999b) that further generalised the characterisation of natural variations from a random variable model (measurements at different depths are independent) to a random field model (measurements at different depths are correlated). The theoretical application of a random field in geotechnical engineering was popularised in an earlier paper by Vanmarcke (1977). Other seminal contributions were also made by pioneers such as Wilson Tang (Tang 1984; Lacasse, Liu, and Nadim 2017), Tien H Wu (Wu et al. 1989, 1996; Baecher and Christian 2019), Harr (1987), Lacasse and Nadim (1996), Gregory Baecher and John Christian (Baecher 1987; Baecher and Christian 2003), Herbert Einstein (Einstein and Baecher 1983; Einstein et al. 1996), and many others. The author is unable to do even partial justice in this cursory overview of more than five decades of work in trying to coax data that are 100% accurate (within measurement limits) to say something useful about the unknown state of the ground between measured locations. It goes without saying that these ground truths can only be approximate (no free lunch!) and an explicit uncertainty model is again necessary.

Over the years, the terms random field and spatial variability have become synonymous, although the two concepts are distinct. The former is a mathematical model. The latter is a description of a geological reality. There is no guarantee that a random field model, particularly the common second-order (stationary) version fully described by an autocorrelation function, is an adequate representation of this reality for all geologic settings.
The predominance of a second-order field model in the literature arises in part from the difficulty of characterising higher-order fields with limited data. Theoretical higher-order fields do exist (Shields and Kim 2017). Phoon, Ching, and Wang (2019) pointed to several other limitations of the widely used second-order field model. While observing that this model is a closer match to reality compared to the independent and identically distributed (i.i.d.) model, the authors observed that it cannot be applied in its most general non-stationary form because we do not have sufficient site investigation data for statistical characterization. The current practice is to assume a trend function can be removed from the data and the residuals are second-order stationary within a typical site. The reason for this assumption is that pairs of measurements regardless of where they are measured can be used to estimate the autocorrelation function. Needless to say, there is no trend, no stationary residuals, and no autocorrelation function in reality. These concepts exist purely within the stationary random field model. The authors further highlighted that trend removal can be difficult (Ching, Wu, and Phoon 2016; Ching et al. 2017; Ching and Phoon 2017). Estimation of random field parameters is also computationally challenging (Tian et al. 2016; Wang H. et al. 2018; Xiao et al. 2018). Fine details of the autocorrelation function such as sample path “smoothness” are important (Ching and Phoon 2019a). Characterization of site stratigraphy was a major missing feature of past random field studies until quite recently (Wang, Huang, and Cao 2013; Ching et al. 2015; Li et al. 2016; Qi et al. 2016; Wang X. et al. 2016; Wang H. et al. 2017; Wang X. et al. 2018; Cao et al. 2019; Wang H. et al. 2019; Wang X. et al. 2019; Wang, Hu, and Zhao 2019). The difficulties have nothing to do with theory. They have everything to do with statistical characterisation using actual data.
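The detrend-then-estimate practice described above can be illustrated with a minimal sketch. It assumes equally spaced measurements, a linear trend with depth, and the usual moment estimator of the autocorrelation; none of these choices is prescribed by the paper, and the function names and the synthetic profile are hypothetical.

```python
import numpy as np

def detrend_linear(depth, values):
    """Fit values = intercept + slope * depth by least squares.

    Returns (trend, residuals). A linear trend is an assumption of this sketch.
    """
    slope, intercept = np.polyfit(depth, values, deg=1)
    trend = intercept + slope * depth
    return trend, values - trend

def sample_autocorrelation(residuals, max_lag):
    """Moment estimate of the autocorrelation at lags 0..max_lag.

    Assumes equally spaced residuals that are treated as stationary, so every
    pair of points separated by the same lag contributes to the same estimate.
    """
    r = residuals - residuals.mean()
    denom = np.sum(r * r)
    n = len(r)
    return np.array([np.sum(r[: n - k] * r[k:]) / denom for k in range(max_lag + 1)])

# Synthetic profile (illustrative only): linear trend plus correlated noise.
rng = np.random.default_rng(0)
dz = 0.1                      # sampling interval in metres (assumed)
depth = np.arange(200) * dz
phi = 0.8                     # assumed lag-one correlation of the noise
noise = np.empty(depth.size)
noise[0] = rng.standard_normal()
for i in range(1, depth.size):
    noise[i] = phi * noise[i - 1] + np.sqrt(1.0 - phi**2) * rng.standard_normal()
su = 20.0 + 2.0 * depth + 5.0 * noise   # hypothetical strength profile

trend, resid = detrend_linear(depth, su)
rho = sample_autocorrelation(resid, max_lag=30)
print("estimated autocorrelation at the first five lags:", np.round(rho[:5], 2))
```

The estimate depends on the trend that was removed: a different trend function leaves different residuals and hence a different autocorrelation, which is one reason the text stresses that trend removal can be difficult.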
Two statistics are needed to describe a second-order (stationary) random field model, namely the coefficient of variation (COV) and the scale of fluctuation (SOF). The COV is needed to describe the scatter about the mean trend in the basic random variable model, basically a characteristic amplitude of the fluctuations. The SOF is an additional statistic that roughly describes the distance over which the measurements are strongly correlated in a random field model (DeGroot and Baecher 1993; Jaksa 1995; DeGroot 1996). It is a characteristic wavelength of the fluctuations. If the SOF is much larger than the mobilised volume of soil, the basic random variable model can be adopted. It is accurate to say that measurements sampled at a depth interval larger than the SOF can be modelled as independent random variables. Or to put this in another way, information on spatial variability cannot be captured by a sparse sampling grid. From this practical perspective, the random field model merely allows measurements sampled at any depth interval, including near continuous cone penetration test soundings, to be modelled with greater realism. The most complete compilation of COVs for both soils and rocks to date is given by Phoon et al. (2016). An updated compilation for the SOF is currently in progress (Cami, Javankhoshdel, and Phoon 2020). The practical value of characterising natural variations in the form of a COV is that resistance factors can be calibrated more realistically based on our knowledge of soil parameters, such as the three-tier classification scheme of soil property variability shown in Table 1. A similar scheme appears in the 2014 edition of the Canadian Highway Bridge Design Code (CAN/CSA-S6-14:2014) that presents different resistance factors depending on the “degree of understanding” (low, typical, high) (Fenton et al. 2016) and others (e.g. Paikowsky et al. 2004; Bathurst, Javankhoshdel, and Allen 2017).
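The roles of the COV and the SOF described above can be made concrete with a small numerical sketch. It assumes the single exponential autocorrelation model, rho(tau) = exp(-2|tau|/SOF), which is one common choice and is not prescribed by the paper, together with the corresponding variance function from random field theory for an average taken over a length L. The function names and the input values are hypothetical.

```python
import numpy as np

def exp_autocorrelation(lag, sof):
    """Assumed single exponential model: rho(tau) = exp(-2 |tau| / SOF)."""
    return np.exp(-2.0 * np.abs(lag) / sof)

def variance_reduction(length, sof):
    """Variance function for a spatial average over `length`.

    Closed form for the single exponential model above:
    Gamma^2 = 2 * (x - 1 + exp(-x)) / x^2, with x = 2 * length / SOF.
    """
    x = 2.0 * np.asarray(length, dtype=float) / sof
    return 2.0 * (x + np.expm1(-x)) / x**2

def cov_of_spatial_average(cov_point, length, sof):
    """COV of the average over `length`, given the point COV (same units)."""
    return cov_point * np.sqrt(variance_reduction(length, sof))

cov_point = 30.0   # point COV in percent (illustrative value)
length = 5.0       # mobilised length in metres (illustrative value)
for sof in (0.5, 5.0, 500.0):
    cov_avg = cov_of_spatial_average(cov_point, length, sof)
    print(f"SOF = {sof:6.1f} m -> COV of spatial average = {cov_avg:5.1f} %")

# Correlation between two measurements spaced exactly one SOF apart.
print("rho at a spacing of one SOF:", round(float(exp_autocorrelation(1.0, 1.0)), 3))
```

Under this assumed model, the variance function tends to 1 when the SOF is much larger than the averaging length, which corresponds to the basic random variable model, and the correlation at a spacing of one SOF is exp(-2), about 0.14, which is why wider spacings are often treated as nearly independent.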
For the first time, we can establish a defensible link between site investigation efforts and the economy of the design, which gives us a fair chance of swaying more business-oriented clients to accept site investigation as an investment rather than a cost (Ching, Phoon, and Yu 2014).

The practical value of characterising spatial correlations in the form of a SOF is that the COV of a spatial average can be reduced, thus permitting higher resistance factors to be used for problems governed by spatial averages. Another practical value is that soil properties at unsampled locations can be interpolated more accurately using kriging or general regression (Yuen, Ortiz, and Huang 2016; Yuen and Ortiz 2016, 2018) when spatial correlations are available.

This brief review is not intended to be up-to-date or comprehensive on what we know about spatial variability, its value, and its impact on design. Research has advanced considerably beyond Vanmarcke’s classic paper in 1977. The interested reader can refer to the Joint TC205/TC304 Working Group Report (2017) on “Discussion of statistical/reliability methods for Eurocodes”, which has been made available at the ISSMGE TC304 website: http://140.112.12.21/issmge/tc304.htm.

The characterisation of geotechnical data has become even more realistic with the compilation of multivariate databases over the past decade. Momentum is gathering worldwide to screen, organise, and share our valuable data to hasten the pace of our digital transformation, for example through Project 304dB (TC304 2019). Ching, Li, and Phoon (2016) provided a useful overview of generic multivariate databases on soil/rock properties. Table 2 shows an updated summary of these databases, labelled as (geo-material type)/(number of parameters of interest)/(number of data points). For example, the CLAY/10/7490 database consists of 7490 records from 251 studies carried out in 30 countries.
Each record contains ten clay parameters measured at roughly the same depth, although some may be missing. The CLAY/10/7490 database is global in coverage. In contrast, the SHCLAY/11/4051 municipal database covers 50 sites in Shanghai (Zhang et al. 2019). Another source of information frequently collected comes from pile load tests. The performance databases for other geotechnical structures (in addition to piles) are available, but less commonly reported in the literature. A comprehensive survey of these databases was recently carried out by Phoon and Tang (2019). Table 3 includes further updates. The following geotechnical structures are covered: (1) shallow and deep foundations, (2) offshore spudcans, (3) mechanically stabilised earth and soil nail walls, (4) pipes and anchors (plate, helical, and shoring), (5) slopes and base heave, (6) cantilever walls, and (7) braced excavations. Details are given elsewhere (Phoon and Tang 2019). Case studies are even more informative, but no systematic compilation has been Table 2. Summary of some soil/rock property databases (updated from Phoon and Ching 2017). Range of parameters Database Reference Parameters of interest # Data points # Sites/studies Ching and Phoon (2012) Ching, Phoon, and Yu (2014) CLAY/7/6310 su from 7 different test procedures 6310 164 studies CLAY/10/7490 Ching and Phoon (2013, 2015) Ching and Phoon (2014) 7490 251 studies CLAY/9/249 D’Ignazio et al. (2019) LL, PI, LI, s′ v /Pa , St, Bq, s′ p /Pa , su /s′ v , (qt − sv )/s′ v , (qt − u2 )/s′ v s′ v /Pa , σv/Pa, s′ v /Pa , qt/Pa, u2/Pa, u0/Pa, PI, wn, St FG/KSAT/4/1358 FI-CLAY/7/216a Feng and Vardanega (2019) D’Ignazio et al. (2016) e, LL, wn/LL, −ln(ksat) ′ ′ sFV u , s v , s p , wn, LL, PL, St 1358 216 JS-Clay/5/124b Liu et al. (2016) Mr, qc, fs, wn, γd 124 16 RFG/TXCU-278 Beesley and Vardanega (2019) Hov et al. 
(2019) su /s′ v , γ50 CIU, OCR, γ50 CKU 278 21 studies 499 Sweden c SE-CLAY/4/499 DSS sFV u , su , sv )/s′ v , (qt − u2 )/s′ v , (u2 − u0 )/s′ v , Bq ′ s p , LL 345 535 249 37 sites 40 sites 18 sites 33 studies 24 sites SH-CLAY/11/ 4051 Zhang et al. (2019) LL, PI, LI, e, K0, s′ v /Pa Su /s′ v(UCST) , St(UCST), Su /s′ v(VST) , St(VST), ps /s′ v 4051 50 sites (Shanghai) SAND/7/2794 ROCK/9/4069 Ching et al. (2017) Ching et al. (2018) D50, Cu, Dr, s′ v /Pa , fʹ, qt1, (N1)60 n, γ, RL, Sh, σbt, Is50, Vp, σc, E 2794 4069 176 studies 184 studies a 1–4 1–6 PI St – Low to very high plasticity Sensitive to quick clays Insensitive to quick clays 1–10 Low to very high plasticity Insensitive to quick clays 1–10 Low to very high plasticity Insensitive to quick clays 1–10 Low to very high plasticity Insensitive to quick clays – Low to very high plasticity – 1– Low to very high plasticity Insensitive to quick 7.5 clays Soft to stiff clayey soils and silty clay soils with high variability of the strength and stiffness characteristicsMr = 12.54–95.82 MPa, qc = 0.22–3.93 MPa, fs = 0.03–0.14 MPa, wn (%) = 6.91–78.11, γd = 10.47–19.92 kN/m3 1–32 Low to medium-high plasticity 10–20 s′ p = 13–505 kPa; sFC u = 5–101 kPa; sDSS u = 6–53 kPa; LL = 22–145% Normal consolidated to slightly over-consolidated clay; Very soft clay (LI = 0.49–2.19) with slight to medium plasticity (PI = 10.4– 26.5) and with medium to high sensitivity (St = 2.7–7.8) 1–15 D50 = 0.1–40 mm, Cu = 1–1000 + Dr = −0.1–117% γ = 15–35 kN/m3, n = 0.01–55%σc = 0.7–380 MPa, E = 0.03– 120 GPa F-CLAY renamed as FI-CLAY to follow internet domain for Finland (FI). J-CLAY renamed as JS-CLAY to follow phonetics abbreviation of Jiangsu (JS). c SE-CLAY/4/499 based on S. Larsson (personal communications, 2019). 
Notes: LL = liquid limit; PL = plastic limit; PI = plasticity index; LI = liquidity index; wn = natural water content; e = void ratio; ksat=saturated hydraulic conductivity; Mr = resilient modulus; qc = cone tip resistance; fs = sleeve friction; γd = dry density; D50 = median grain size; Cu = coefficient of uniformity; Dr = relative density; σv = vertical total stress; s′ v = vertical effective stress; s′ p = preconsolidation stress; su = undrained shear strength; sFV u = ′ ′ undrained shear strength from field vane; sre u = remoulded su; fʹ = effective friction angle; St = sensitivity; OCR = overconsolidation ratio, (qt − sv )/s v = normalised cone tip resistance; (qt − u2 )/s v = effective cone tip resistance; u0 = hydrostatic pore pressure; (u2 − u0 )/s′ v = normalised excess pore pressure; Bq = pore pressure ratio = (u2-u0)/(qt-σv); Pa = atmospheric pressure = 101.3 kPa; qt1 = (qt/Pa) × CN (CN is the correction factor for overburden stress); (N1)60 = N60×CN (N60 is the N value corrected for the energy ratio); n = porosity; γ = unit weight; R = Schmidt hammer hardness (RL = L-type Schmidt hammer hardness); Sh = Shore scleroscope hardness; σbt = Brazilian tensile strength; Is = point load strength index (Is50 = Is for diameter 50 mm); Vp = P-wave velocity; σc = uniaxial compressive strength; E = Young’s modulus; γ50 CIU = shear strain to mobilise 0.5su under isotropically-consolidated undrained conditions; γ50 CKU = shear strain to mobilise 0.5(su – τ0); τ0 = initial shear stress; ps = cone tip resistance from CPT which is unique in China without the measurement of pore pressure; SDSS u = undrained shear strength from direct simple shear test. b GEORISK: ASSESSMENT AND MANAGEMENT OF RISK FOR ENGINEERED SYSTEMS AND GEOHAZARDS CLAY/5/345 CLAY/6/535 ′ ′ LI, su, sre u , s p, s v su /s′ v , OCR, (qt − OCR 9 10 K.-K. PHOON Table 3. Summary of performance databases for some geotechnical structures (updated from Table 1; Phoon and Tang 2019). 
Geotechnical structures Shallow foundations Offshore spudcans Drilled shafts (vertical load) Drilled shafts (lateral load) Augered cast-in-place piles Driven piles Helical piles Driven cast-in-situ piles Pile foundations Micropiles Foundations Mechanically stabilised earth walls Soil nail walls Multi-anchor walls Slopes Excavations (base heave) Pipes Plate anchors Plate anchors Helical anchors Database/reference Data source Test type Geomaterial N UML-GTR ShalFound07 (Paikowsky et al. 2010) UML-GTR RockFound07 (Paikowsky et al. 2010) Akbas (2007) Samtani and Allen (2018) SpreadFound/1026 (Tang et al. 2019) Tang and Phoon (2019a) Ng et al. (2001) AbdelSalam, Baligh, and El-Naggar (2015) Asem, Long, and Gardoni (2018) DSHAFT (Garder et al. 2012) Motamed, Elfass, and Stanton (2016) Stark et al. (2017) TxDOT (Moghaddam et al. 2018) Tang, Phoon, and Chen (2019) EPRI (Chen and Kulhawy 1994) Chen and Lee (2010) Chen, Lin, and Kulhawy (2011) Marcos and Chen (2013) Reddy and Stuedlein (2017) McVay et al. (2016) AAU-NGI (Augustesen 2006) Zhang et al. (2006) Long et al. (2009) PILOT (Roling, Sritharan, and Suleiman 2011) PSU (Smith et al. 2011) Long and Anderson (2014) ZJU-ICL (Yang et al. 2016) Long (2016) Lehane et al. (2017) Adhikari et al. (2018) TxDOT (Moghaddam et al. 2018) Tang and Phoon (2018a, 2018b, 2019b) Tang and Phoon (2018c, 2019c) Long (2013) Flynn (2014) FHWA DFTLD (Abu-Hejleh et al. 2015) Dithinde et al. (2011) IFSTTAR (Burlon et al. 2014) Niazi (2014) Galbraith, Farrell, and Byrne (2014) AUT-CPT (Moshfeghi and Eslami 2018) WBPLT (Chen et al. 2014) LADOTD (Rauser and Tsai 2016) Nanazawa et al. (2019) Almeida and Liu (2018) EPRI (Kulhawy et al. 1983) Huang and Bathurst (2009) Miyata and Bathurst (2012a) Miyata and Bathurst (2012b) Miyata, Bathurst, and Allen (2014) Miyata and Bathurst (2015) Miyata and Bathurst (2019) Allen and Bathurst (2018) Miyata, Yu, and Bathurst (2018) Wood et al. 
(2012a, 2012b) Lazarte (2011) Cheung and Shum (2012) Lin, Bathurst, and Liu (2017) Liu et al. (2018) Yuan et al. (2019) Miyata, Bathurst, and Konami (2011) Travis, Schmeeckle, and Sebert (2011) Bahsan et al. (2014) Wu, Ou, and Ching (2014) White, Cheuk, and Bolton (2008) Stuyts, Cathie, and Powell (2016) Ismail, Najjar, and Sadek (2018) White, Cheuk, and Bolton (2008) Stuyts, Cathie, and Powell (2016) Tang and Phoon (2016) Global Global Global USA/Europe Worldwide – Hong Kong Egypt Global Iowa, USA Las Vegas Valley Illinois, USA Texas Global Global Global Global Global USA Florida, USA Global Hong Kong Wisconsin, USA Iowa, USA Global Illinois, USA Global Wisconsin, USA Global Wyoming, USA Texas Global Canada/USA Wisconsin, USA United Kingdom Mainly in USA South Africa France Global Ireland Global Global Louisiana, USA Japan Canada USA – Japan Japan Japan Japan Global – – Texas, USA – Hong Kong Global – China Japan Global – Global – – – – – – Laboratory/field Field Field Field Prototype Centrifuge Field Field Field Field Field Field Field Field Field Field Field Field Field Field Field Field (static/dynamic) Field (dynamic) Field Field Field (dynamic) Field Field (static/dynamic) Field Field Field Field Field Field Field Field Field Field Field Field Field Field Field (static/dynamic) Field Field Field Laboratory Laboratory/in situ Laboratory Laboratory Field In situ Field In situ/laboratory Laboratory Field Field In situ In situ In situ In situ Field Field In situ Small/full-scale Small/full-scale Small scale/centrifuge Small/full-scale Small/full-scale Laboratory Field Cohesionless Rock Cohesionless Cohesionless Various Clay with sand Rock/saprolite Various Soft rock Various Caliche Weak rock Various Various Clay/sand Clay/sand Clay/sand Gravel Cohesionless Various Various Weathered granite Various Various Various Various Sand IGM Various Soft rock Various Various Various Various Sand Various Various Various Various Various Various Various Various Various Ontario 
soils Various Cohesionless Cohesionless Various N/A Various Cohesionless Various Various Cohesionless – CDG/CDV – – Various Various Various Clay Cohesive Sand Sand Sand Sand Sand Cohesive 549 122 400 80 1026 159 38 318 190 38 41 155 27 320 88 99 40 24 112 78 420 1514 316 275 322 111 117 215 120 25 33 783 1010 182 116 1567 174 174 330 175 466 613 1465 441 47 804 318 652 503 362 520 113 378 202 650 166 913 123 95 144 28 157 43 24 61 108 143 54 192 78 25 (Continued ) GEORISK: ASSESSMENT AND MANAGEMENT OF RISK FOR ENGINEERED SYSTEMS AND GEOHAZARDS 11 Table 3. Continued. Geotechnical structures Database/reference Data source Test type Shoring anchors Chahbaz, Sadek, and Najjar (2019) Beirut Field Cantilever wall Excavation (stability) Phoon et al. (2009) Marsland (1953) – – Excavation (wall displacement) Long (2001) Moormann (2004) Wang, Xu, and Wang (2010) Wu, Ching, and Ou (2013) Global Global Shanghai Taipei Centrifuge Small-scale Large-scale Field Field Field Field Geomaterial Clay/marl/ limestone Sand Loose/dense sand Various Soft soil Soft soil Soft clay N 70 20 23 10 296 530 300 22 Notes: CDG = completely decomposed granite; CDV = completely decomposed volcanic; IGM = intermediate geomaterial; N = number of load tests; NUS = National University of Singapore; UWA = University of Western Australia; ZJU = Zhejiang University; ICL = Imperial College London. carried out perhaps with the exception of liquefaction (Andrus, Stokoe, and Chung 1999; Cetin et al. 2004; Moss et al. 2006; Idriss and Boulanger 2010; Ku et al. 2012; Juang, Ching, and Luo 2013; Kayen et al. 2013). We have a lot of data, but our data is mainly “dark” because it is typically not exploited to provide insights or to support decision making once the project that produced the data is completed. For soil/rock properties, the most basic design decision in current geotechnical practice is to estimate their values from other test results, typically field test results. 
Phoon and Kulhawy (1999a) identified at least three sources of uncertainties in a comprehensive statistical study of a broad range of laboratory and field test data: (1) spatial variability, (2) measurement errors (including statistical uncertainty due to limited measurements), and (3) transformation uncertainty. The third source of uncertainty, arising from transforming a soil/rock parameter such as the overconsolidation ratio to a design parameter such as the normalised undrained shear strength, can be significant as shown in the data scatter in Figure 1. Although it is widely known that local transformation models (dashed lines in Figure 1) are preferred to those calibrated from a generic database such as CLAY/10/7490, there are no methods to quantify this “site effect” for data routinely collected in a typical project (in contrast to data specially collected for a research study). Nonetheless, the variety of dashed lines, each referring to a local transformation model, clearly shows that this site effect is important. The need for building regulations to mandate a site investigation in every project is a recognition that every site is unique to some degree. Clearly, geotechnical data are “uncertain” and “unique” to some extent. The former characteristic is more familiar and better studied. Building regulations do not permit site investigation efforts to vary as a function of how much is known at neighbouring/comparable sites, even if the sites were to be adjacent to the site of interest, possibly because there are no statistical methods that can manage uncertain, sparse, and somewhat unique data from different sites in an acceptable way. The million-dollar question (literally, if one were to consider how many mandatory site investigations are carried out worldwide in any time period) is whether we can get more value from site data beyond establishing generic transformation models.

4. Value of data

Data is not information, information is not knowledge, knowledge is not understanding, understanding is not wisdom.
Clifford Stoll

Geotechnical data are often referred to as “uncertain” in the qualitative sense of “I don’t know”, rather than with a mathematical formalism in mind. In fact, it is accurate to say that the majority harbours the sentiment that formalism such as statistics is not possible, because of data scarcity. One may venture to guess that the slow progress in geotechnical reliability-based design or other formal risk-informed design methodology is partially impeded by the lack of an accurate understanding of the attributes of geotechnical data beyond broad generalities such as “uncertain” and “scarce”. In fact, many practitioners do not appreciate the power of statistics in its unusual ability to quantify even the uncertainties in the models it posits. In short, statistics can quantify the precision limits of its own models consistently based on the available data at hand. The National Research Council (1995) made this point rather clearly: “The lack of a large data set does not preclude the use of probability theory. Probability theory can be used to evaluate the uncertainties involved in working with meager information”. Nonetheless, this deep insight has been lost, as practitioners continue to express some reservations under the misconception that there is insufficient data to characterise a probability model, such as its parameters (mean, COV) and its shape (normal, lognormal, beta, etc.). Many miss the point that even ignorance can be approximately quantified. This is exceedingly powerful, because it allows an engineer to weigh the cost of making a decision against the cost of collecting more information to reduce imprecision in some aspects of the problem.

Figure 1. Correlation between normalised undrained shear strength (su/σ′v) and overconsolidation ratio (OCR) (Ching and Phoon 2019b).
This paper does not cover sensitivity analysis, but its usefulness to decision making is clear. A naïve one-at-a-time deterministic sensitivity analysis can produce misleading results for a number of reasons, but one simple reason would be the inability to take care of dependencies in deterministic analysis. Dependency is a feature of all multivariate real world data (Ching, Li, and Phoon 2016). It is most commonly captured by a correlation coefficient in basic statistics (Ching, Phoon, and Li 2016). Phoon (2018) suggested that the attributes of geotechnical data can be succinctly described as MUSIC: Multivariate, Uncertain and Unique, Sparse, and InComplete. It is useful to clarify in passing that “scarce” refers to a small number of measurements, while “sparse” refers to a small number of measurements widely distributed in space. Given that all site data are situated in space (and sometimes, in time), the term “sparse” is more descriptive as it is unlikely for measurements to be taken at one corner of a site. Phoon, Ching, and Wang (2019) further suggested that MUSIC can be re-interpreted to cover extremes: Multivariate, Uncertain and Unique, Sparse, Incomplete, and potentially Corrupted. Ching and Phoon (2019b) subsequently extended MUSIC to MUSIC-X, where “X” denotes the spatial/temporal context of the data. The screening for extremes or outliers and spatial variability are clearly important, but these aspects are not covered in this paper. Table 4 is a site-specific example of an actual MUSIC database from Taipei. With the exception of the mobilised undrained shear strength [su(mob)], each column contains the results from a different and independent test. Good practice requires a suite of tests to be conducted for cross-validation, identification of layer boundaries, estimation of design properties, and others. Hence, geotechnical data are intrinsically “multivariate” in nature. 
There is an obvious tradeoff between conducting different tests in different locations and conducting different tests in the same location. The former strategy collects more information on the spatial variability of the site. The latter strategy collects information on the cross-correlations among all tests. In practice, it is common to adopt an intermediate strategy that involves conducting different test combinations at different depths and locations. The greyed out cells in Table 4 denote absent measurements. Hence, geotechnical data are typically “incomplete”. A data table without missing entries is an exception rather than the norm in geotechnical engineering. It is useful to observe that the greyed out cells are not randomly distributed. They occur more frequently in columns where measurements are more costly and the percentage of absent measurements can be very high in these columns.

Table 4. Site investigation data for a silty clay layer at a Taipei site (Ou and Liao 1987).

| Depth (m) | Test | su (kN/m²) | su(mob) (kN/m²) | LL | PI | LI | σ′v/Pa | σ′p/Pa | su(mob)/σ′v | qt1 |
|---|---|---|---|---|---|---|---|---|---|---|
| 12.8 | UU | 55.2 | 46.9 | 30.1 | 9.1 | 1.20 | 1.26 | 1.71 | 0.37 | 3.35 |
| 14.8 | VST | 50.7 | 52.9 | 32.8 | 12.8 | 1.43 | 1.43 | – | 0.36 | 3.34 |
| 16.1 | UU | 61.9 | 51.7 | 36.4 | 14.5 | 1.24 | 1.54 | – | 0.33 | 3.15 |
| 17.8 | UU | 54.2 | 42.8 | 41.9 | 18.9 | 0.90 | 1.68 | 1.79 | 0.25 | 2.74 |
| 18.3 | VST | 59.5 | 59.3 | – | – | – | 1.72 | – | 0.34 | 2.76 |
| 20.2 | UU | 73.1 | 60.5 | 38.1 | 17.3 | 0.70 | 1.88 | – | 0.32 | 2.73 |
| 22.7 | VST | 63.3 | 64.4 | 37.0 | 16.0 | 0.58 | 2.08 | – | 0.31 | 2.97 |
| 24.0 | UU | 82.2 | 67.5 | 38.0 | 16.2 | 0.75 | 2.19 | 2.19 | 0.30 | 2.80 |
| 26.6 | UU | 98.1 | 82.1 | 34.8 | 13.8 | 0.80 | 2.41 | – | 0.34 | 3.92 |

Note: LL = liquid limit; PI = plasticity index; LI = liquidity index; σ′v = vertical effective stress; σ′p = preconsolidation stress; Pa = atmospheric pressure = 101.3 kPa; qt = (corrected) cone tip resistance; qt1 = (qt − σv)/σ′v; su = undrained shear strength; su(mob) = mobilised su values (Mesri and Huvaj 2007); – = absent measurement (greyed out cell).
Each row (record) in a MUSIC database such as Table 4 refers to data collected at the same depth from different tests conducted in close proximity. There are only n = 9 rows in Table 4. Hence, geotechnical data are “sparse”. However, this is not true for a generic database such as CLAY/10/7490. The CLAY/10/7490 database consists of n = 7490 records from 30 countries for ten clay parameters. In short, site-specific data can be sparse, but generic data are not sparse, although they may not be complete and directly applicable to a specific site. To counter the prevalent sentiment that there is no big data in geotechnical engineering, Phoon, Ching, and Wang (2019) explicitly refer to any data that are potentially useful but not directly applicable to the decision at hand as Big Indirect Data (BID). A generic database will be one type of BID. A compilation of case studies can be regarded as another type of BID. Although site effects are well known, they are mainly characterised in research studies through a testing programme that is more detailed than what is routinely carried out in practice and for rather distinctive geomaterials. Kulhawy and Mayne (1990) pointed out that “comprehensive characterization of the soil at a particular site would require an elaborate and costly testing programme, well beyond the scope of most project budgets”. To the knowledge of the author, no one has quantified site effects numerically based on more routine data such as those shown in Table 4 commonly collected at a project level. In practice, site effects are broadly appreciated based on geology, soil mechanics, and experiences at comparable sites, rather than characterised quantitatively through a detailed multivariate analysis of the site data that meets MUSIC constraints. 
The typical caveat included in design guides would include a general statement such as:

“caution must always be exercised when using broad, generalized correlations of index parameters or in-situ test results with soil properties. The source, extent, limitations of each correlation should be examined carefully before use to ensure that extrapolation is not being done beyond the original boundary conditions. ‘Local’ calibrations, where available, are to be preferred over the broad, generalized correlations.” (Kulhawy and Mayne 1990)

Notwithstanding this sensible caveat, the engineer is typically left with no recourse but to use these generalised correlations in the absence of “local” versions. Hence, BID is already routinely used in practice in the form of Figure 1. One could surmise that it has some real value. The research challenge is to distil more value out of BID.

4.1. Bayesian machine learning

Can we address some attributes of MUSIC-X based on the meagre information we have at hand? This is arguably the central question that practitioners are most interested to know. The short answer is yes. Recently, Ching and Phoon (2019c) proposed a novel Bayesian machine learning method to do this, namely to construct a site-specific distribution function for a MUSIC database such as that shown in Table 4. Each database is effectively a table with m columns representing soil parameters (Y1, Y2, … , Ym) and n rows representing measurements at different depths. The observed data are denoted by Y^o and the unobserved data by Y^u. Because soil parameters can be highly non-normal, Ching and Phoon (2015) adopted an analytical transformation based on the Johnson distribution to convert (Y1, … , Ym) to approximately normal data. The approximately normal data are denoted by x = (X1, … , Xm)^T, where “T” refers to vector/matrix transpose.
A key assumption made in Ching and Phoon (2019c) is that x at a certain depth follows the multivariate normal probability density function (PDF):

$$f(x \mid \mu_s, C_s) = |C_s|^{-1/2} (2\pi)^{-m/2} \exp\left[-\tfrac{1}{2}(x - \mu_s)^{T} C_s^{-1} (x - \mu_s)\right] \tag{1}$$

The multivariate normal PDF has mean vector = μs and covariance matrix = Cs; the subscript “s” is to highlight that μs and Cs are “site-specific”. Because site-specific data are sparse (small n), it is technically challenging to estimate μs and Cs using conventional methods such as matching moments or maximising likelihood. It is also very challenging to estimate the statistical uncertainties associated with μs and Cs, which are significant for a fairly typical record size of around 10. Ching and Phoon (2014) pointed out that it is impossible to guarantee Cs to be positive definite when the multivariate data is incomplete. To address these critical limitations, Ching and Phoon (2019c) developed a novel Gibbs sampler (GS) to overcome this long standing challenge. The key idea is to treat μs, Cs, and x^u (transformed from Y^u) as unknown random quantities and to sequentially sample one random quantity at a time from distributions conditioned on the rest of the quantities and the observed data x^o (transformed from Y^o) using GS. Simulation is practical because these conditional probabilities are available in closed-form for suitably chosen conjugate priors. While Bayesian methods are known to be very powerful, there is little acknowledgment in the literature that these methods are very complex and computationally intensive. Excessive emphasis on what works in principle rather than what works in practice will not attract more users. This GS has been generalised to MUSIC-X recently (Ching and Phoon 2019b). Consider properties at a new depth (xnew) that does not appear in the training data previously used in the GS.
Based on the total probability theorem, the conditional multivariate PDF f(xnew|X^o) is a mixture of multivariate normal PDFs:

$$f(x_{new} \mid X^{o}) = \iint f(x_{new} \mid \mu_s, C_s)\, f(\mu_s, C_s \mid X^{o})\, d\mu_s\, dC_s \approx \frac{1}{T - t_b} \sum_{t = t_b + 1}^{T} N(x_{new} \mid \mu_{s,t}, C_{s,t}) \tag{2}$$

where (μs,t, Cs,t) are the GS samples at time step = t; tb is the end of the burn-in period; and T is the total number of GS time steps or samples. The simulation of a site-specific probability distribution appears very complicated to the average engineer, but it can support a critical design decision on how to choose soil/rock properties at a particular site by “learning” from site-specific data alone. An example based on actual data from a Taipei site (Table 4) is shown in Figure 2. Although Table 4 contains 9 records, note that σ′p/Pa is only measured at 3 depths. It is not surprising that the statistical uncertainties in Figure 2(b) are large. For Figure 2(c or d) where 9 site-specific data points are available, it can be seen that the site-specific PDF (solid grey markers for MUSIC-X and open black markers for MUSIC) is less scattered and arguably more informative for this particular site than the generic PDF (blue markers). The MUSIC and MUSIC-X PDFs are similar, because the sampling intervals in Table 4 are large and spatial variability is thus not well captured. Figure 2(d) further shows that su/σ′v is less correlated to qt1 at the Taipei site than the generic version. The generic correlations in CLAY/10/7490 are 0.91, −0.57, −0.50, and 0.73 for the transformation models shown in Figure 2(a–d), respectively, in standard normal space (Ching and Phoon 2014). To the author’s knowledge, this learning algorithm (MUSIC or MUSIC-X) is the first of its kind. Even when used by itself, the site-specific PDF can guide the engineers to select conservative design values more appropriate for a particular site by using the approximate lower bounds of the solid grey markers (rather than generic blue markers) shown in Figure 2.
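The sampling idea behind Equations (1) and (2) can be sketched with a textbook Gibbs sampler for a multivariate normal model with missing entries. This is not the algorithm of Ching and Phoon (2019c): it assumes the data are already transformed to approximately normal space, uses a generic normal-inverse-Wishart conjugate prior with illustrative hyperparameters, ignores spatial correlation between depths, and all function names are hypothetical.

```python
import numpy as np


def sample_inv_wishart(rng, dof, scale):
    """Draw from InvWishart(dof, scale); dof must be an integer >= dimension."""
    m = scale.shape[0]
    prec = np.linalg.inv(scale)
    z = rng.multivariate_normal(np.zeros(m), (prec + prec.T) / 2, size=dof)
    draw = np.linalg.inv(z.T @ z)  # inverse of a Wishart(dof, scale^-1) draw
    return (draw + draw.T) / 2


def gibbs_mvn_missing(x, n_steps=2000, burn_in=500, seed=0):
    """Sample (mu, C) and the missing cells (NaN) of x in turn."""
    rng = np.random.default_rng(seed)
    x = np.array(x, dtype=float)  # copy, so the caller's table is untouched
    n, m = x.shape
    missing = np.isnan(x)
    # Weak conjugate prior: illustrative values, an assumption of this sketch
    mu0, kappa0, nu0, psi0 = np.zeros(m), 0.1, m + 2, np.eye(m)
    # Start the missing cells at the column means of the observed values
    x[missing] = np.take(np.nanmean(x, axis=0), np.nonzero(missing)[1])
    mus, covs = [], []
    for t in range(n_steps):
        # Step 1: draw (mu, C) given the completed table
        xbar = x.mean(axis=0)
        d = x - xbar
        dm = (xbar - mu0)[:, None]
        psi_n = psi0 + d.T @ d + kappa0 * n / (kappa0 + n) * (dm @ dm.T)
        cov = sample_inv_wishart(rng, nu0 + n, psi_n)
        mu_n = (kappa0 * mu0 + n * xbar) / (kappa0 + n)
        mu = rng.multivariate_normal(mu_n, cov / (kappa0 + n))
        # Step 2: draw each row's missing cells given its observed cells
        for i in range(n):
            u = missing[i]
            if not u.any():
                continue
            o = ~u
            if o.any():
                gain = cov[np.ix_(u, o)] @ np.linalg.inv(cov[np.ix_(o, o)])
                cmean = mu[u] + gain @ (x[i, o] - mu[o])
                ccov = cov[np.ix_(u, u)] - gain @ cov[np.ix_(o, u)]
            else:
                cmean, ccov = mu[u], cov[np.ix_(u, u)]
            x[i, u] = rng.multivariate_normal(cmean, (ccov + ccov.T) / 2)
        if t >= burn_in:
            mus.append(mu)
            covs.append(cov)
    return np.array(mus), np.array(covs)


def mvn_pdf(x, mu, cov):
    """Multivariate normal density, as in Equation (1)."""
    diff = x - mu
    quad = diff @ np.linalg.solve(cov, diff)
    _, logdet = np.linalg.slogdet(cov)
    return np.exp(-0.5 * (len(mu) * np.log(2 * np.pi) + logdet + quad))


def predictive_pdf(x_new, mus, covs):
    """Mixture of normals over post burn-in samples, as in Equation (2)."""
    return float(np.mean([mvn_pdf(x_new, mu, c) for mu, c in zip(mus, covs)]))
```

A call such as `mus, covs = gibbs_mvn_missing(table)` with `table` an n × m array containing `np.nan` for absent measurements returns the post burn-in samples, and `predictive_pdf(x_new, mus, covs)` evaluates the mixture at a new point. Every sampled covariance matrix is positive definite by construction, which is one practical reason for sampling rather than estimating Cs directly from an incomplete table.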
If these values were ascertained to be overly conservative, more measurements could be taken and the favourite question “how many measurements are enough” can be addressed quantitatively by the reduction in the scatter of the solid grey markers that will improve the lower bounds. This MUSIC or MUSIC-X PDF is basically a quantification of site “uniqueness” from a data-informed perspective and there is clear value to do this. Once site uniqueness can be captured numerically, it opens all kinds of interesting research avenues to combine site-specific data with generic data from other sites in the entire world, not merely in the restricted region that an engineer practices in. The similarity index approach outlined below is one such example.

Figure 2. Transformation models based on generic data (CLAY/10/7490) and site-specific data (MUSIC and MUSIC-X simulations) for a Taipei site (Table 4) (J. Ching, personal communications, 2019). Note: solid and dashed lines are the median and 95% confidence interval for the generic data.

4.2. Similarity index approach

The next natural question is how to combine a site-specific PDF with a generic PDF in a more discriminate way that accounts for site “uniqueness”. Ching and Phoon (2019d) developed a similarity index (S) based on f(xnew|X^o) to identify records from a generic database that are “similar” to those from a specific site. A second example of a MUSIC database from Onsøy, Norway is given in Table 5. This set of site-specific data is shown as red solid triangles in Figure 3 against a background of generic data from CLAY/10/7490 shown as grey solid circles. Figure 3 also presents data from another site in Norway (Drammen) from Lacasse and Lunne (1982). The Drammen and Onsøy sites are roughly 50 km apart with comparable geologic origins (Lacasse et al. 1981; Lacasse and Lunne 1982). The data are identified as “similar” (S > 1) (black solid circles) or “dissimilar” (S < 1) (black open circles) using this similarity index.

Table 5. Site investigation data for a marine clay layer in Onsøy (Norway) (Lacasse and Lunne 1982).
Site-specific data Y Index 1 2 3 4 5 6 7 8 9 Depth (m) 1 1.9 3.5 5.2 7.6 9.5 10.8 13.4 16.3 LL 56.2 50.2 59.9 56.8 66.3 65.1 74.4 71.4 72.7 PI LI s′ v /Pa s′ p /Pa su (mob)/s′ v St Bq qt1 qtu OCR 20 18.1 30.5 22.9 31.5 29.6 36.1 35.8 34.7 1.54 1.82 0.93 1.07 0.87 0.97 0.81 0.87 0.76 0.06 0.12 0.22 0.32 0.47 0.58 0.65 0.81 0.99 0.85 0.6 0.48 0.45 0.54 2.03 0.91 0.48 0.37 0.24 0.25 0.25 0.24 0.24 6 14 15 7 14 12 9 0.16 0.24 0.3 0.35 0.47 0.41 0.46 0.47 0.55 29.11 17.69 10.52 7.7 5.89 6.19 5.93 5.95 6.13 25.57 14.58 8.41 6.11 4.25 4.74 4.31 4.24 3.88 13.99 5.2 2.26 1.42 1.17 0.84 1.05 0.99 1.28 1.29 1
Notes: LL = liquid limit; PI = plasticity index; LI = liquidity index; σ′v = vertical effective stress; σ′p = preconsolidation stress; Pa = atmospheric pressure = 101.3 kPa; qt = (corrected) cone tip resistance; qt1 = (qt − σv)/σ′v; su = undrained shear strength; su(mob) = mobilised su values (Mesri and Huvaj 2007); St = sensitivity; u2 = pore pressure behind cone; Bq = pore pressure ratio = (u2 − u0)/(qt − σv); u0 = hydrostatic pore pressure; qtu = (qt − u2)/σ′v.

Figure 3. Automatic detection of records from a generic database CLAY/10/7490 that are “similar” to those from a specific site in Onsøy, Norway (Ching and Phoon 2019d).

The practical benefit of doing this is that we can replace a generic transformation model such as Figure 1 by a quasi-site-specific version that is based on the site-specific data and appropriately weighted generic data. Ching and Phoon (2019d) recommended the following procedure to do this: (1) assume the weight of a site-specific data point is 1, (2) assign weights to records in a generic database as a function of their “similarity” to the data at one site, and (3) perform a weighted regression on a combined dataset containing the site-specific data and the generic database. The equivalent generic sample size (Neq) is the sum of the weights produced by the generic records. Figure 4 presents the construction of quasi-site-specific transformation models using different numbers of records from Table 5: (a) all 9 records; (b) 6 records (depths = 1.9, 3.5, 5.2, 9.5, 10.8, and 13.4 m); (c) 2 records (depths = 1.9 and 13.4 m); and (d) no site-specific data is available. Note that σ′p/Pa is not measured at a depth = 9.5 m. Hence, the number of site records for this specific transformation model is Ns = 8 and 5 for Figure 4(a and b), respectively. It can be seen that the quasi-site-specific transformation model is more customised to Onsøy when there are 8 site records but reverts to the generic form when there are only 2 site records. Note that the black solid circles are meant for reference only; the full set of generic records (grey solid circles) with appropriate weights is used for regression.
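The three-step weighting procedure can be sketched as a weighted least-squares fit. The sketch assumes a straight-line transformation model in whatever (possibly transformed) space the data are expressed in, and it takes the generic weights as given inputs, because the mapping from the similarity index S to a weight is not described here. The function name and arguments are hypothetical.

```python
import numpy as np


def quasi_site_specific_fit(x_site, y_site, x_gen, y_gen, w_gen):
    """Weighted straight-line fit on combined site and generic records.

    Site records carry weight 1 (step 1); generic records carry the
    supplied similarity-based weights (step 2); the fit is a weighted
    least-squares regression on the combined data (step 3).
    Returns (intercept, slope) and the equivalent generic sample size.
    """
    x_site, y_site = np.asarray(x_site, float), np.asarray(y_site, float)
    x_gen, y_gen = np.asarray(x_gen, float), np.asarray(y_gen, float)
    w_gen = np.asarray(w_gen, float)

    x = np.concatenate([x_site, x_gen])
    y = np.concatenate([y_site, y_gen])
    w = np.concatenate([np.ones(len(x_site)), w_gen])

    # Scaling rows by sqrt(weight) turns weighted into ordinary least squares
    design = np.column_stack([np.ones_like(x), x])
    root_w = np.sqrt(w)
    coef, *_ = np.linalg.lstsq(design * root_w[:, None], y * root_w, rcond=None)

    n_eq = float(w_gen.sum())  # equivalent generic sample size, Neq
    return coef, n_eq
```

With few site records and many well-weighted generic records, the fit is dominated by the generic data; as site records accumulate or the generic weights shrink, the fit moves towards the site-specific trend, which is the qualitative behaviour described for Figure 4.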
This similarity index approach treats the records in a generic database as independent, although a natural grouping based on site exists. Data records measured within a site tend to be more similar to each other than to those measured at other sites. Ching, Wu, and Phoon (2020) proposed a hierarchical Bayesian model to capture this additional site information commonly made available in generic databases. No quantitative information related to the site location such as GPS location, nearest city, region, country and others is needed. Only qualitative knowledge that a group of data records are measured within the same project site is needed. In the author’s opinion, the above research is preliminary, because many structural elements of a generic database such as groups have not been studied and the algorithms are not in a full learning mode, including learning from expert judgment. However, they are founded on Bayesian theory and they will set the scene for more powerful algorithms to emerge in the near future. It is quite likely that some aspects of our engineering experience would be “digitized” in the future in the sense of being captured by algorithms that can learn from both data and their interactions with the engineers.

5. Let data speak for themselves

Hand (2014) said:

“In general, when building statistical models, we must not forget that the aim is to understand something about the real world. Or predict, choose an action, make a decision, summarize evidence, and so on, but always about the real world, not an abstract mathematical world: our models are not the reality.”

Let’s take a step back and ask ourselves why we need a model in the first place. One answer is that we do not have sufficient data to make a decision without mediation by a model. The simplest statistical model is to assume data are independent and identically distributed (i.i.d.).
Limited data are needed to characterise this model, but it clearly deviates from a reality that exhibits spatial variability. The classical random field model tries to do better, but it requires more data for statistical characterisation. More recent multiple point methods in geostatistics can consider more than two-point autocorrelation (or second-order) information (Mariethoz and Caers 2015), but they require richer data in the form of training images. Phoon, Ching, and Wang (2019) opined that a “taxonomy of methods based on the type/amount of data available could help guide future development in data-driven algorithms and strengthen a virtuous cycle of data collection hardware developing hand in hand with algorithms”. Wang and Zhao (2016, 2017) explored a new sampling paradigm in digital signal processing called compressive sampling (or sensing, CS) that can reconstruct a near replica of the original signal from a small number of measurements. Wang, Zhao, and Phoon (2018) subsequently developed a Bayesian Compressive Sampling-Karhunen-Loève (BCS-KL) expansion version that can generate random field samples (RFSs) directly from sparse measurements. This BCS-KL generator is shown to be capable of dealing with much more general non-Gaussian and non-stationary RFSs, including RFSs with unknown non-stationary auto-covariance structure without explicit estimation of the autocorrelation function (Montoya-Noguera et al. 2019) and RFSs with unknown trend function without de-trending (Wang Y et al. 2019). In addition, the BCS-KL generator may be readily extended to simulate cross-correlated bivariate RFSs (Zhao and Wang 2018).

Figure 4. Construction of quasi-site-specific transformation models by combining different amount of Onsøy data with appropriately weighted records in CLAY/10/7490 (Ching and Phoon 2019b). Note: median (solid line) and 95% confidence interval (dashed lines).
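The gap between the i.i.d. model and a classical random field can be made concrete with a small simulation. The sketch below is not the BCS-KL generator. It assumes a stationary Gaussian field with a single exponential autocorrelation, ρ(τ) = exp(−2|τ|/δ), where δ is the scale of fluctuation, sampled on a regular depth grid; under those assumptions the field reduces to a first-order autoregressive recursion. All function names and numerical values are illustrative assumptions.

```python
import math
import random


def lag_correlation(dz, delta):
    """Single exponential autocorrelation at lag dz for scale of fluctuation delta."""
    return math.exp(-2.0 * abs(dz) / delta)


def sample_iid(n, mean, sd, rng):
    """n independent, identically distributed Gaussian values."""
    return [rng.gauss(mean, sd) for _ in range(n)]


def sample_random_field(n, dz, mean, sd, delta, rng):
    """n values of a stationary Gaussian field on a regular grid of spacing dz.

    With a single exponential autocorrelation, consecutive grid values follow
    a first-order autoregressive recursion with coefficient rho.
    """
    rho = lag_correlation(dz, delta)
    innovation_sd = math.sqrt(1.0 - rho * rho)
    z = rng.gauss(0.0, 1.0)  # start from the stationary distribution
    values = [mean + sd * z]
    for _ in range(n - 1):
        z = rho * z + innovation_sd * rng.gauss(0.0, 1.0)
        values.append(mean + sd * z)
    return values


def lag1_sample_correlation(values):
    """Sample correlation between neighbouring grid values."""
    n = len(values)
    m = sum(values) / n
    num = sum((values[i] - m) * (values[i + 1] - m) for i in range(n - 1))
    den = sum((v - m) ** 2 for v in values)
    return num / den


# Illustrative values only: 0.5 m spacing, 2 m scale of fluctuation.
rng = random.Random(0)
field = sample_random_field(20000, 0.5, 0.0, 1.0, 2.0, rng)
iid = sample_iid(20000, 0.0, 1.0, rng)
```

Neighbouring values of `field` are correlated at roughly `lag_correlation(0.5, 2.0)`, while neighbouring values of `iid` are essentially uncorrelated. Estimating δ reliably from a handful of measurements is the data demand that the classical random field model adds over the i.i.d. model.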
An important open question is whether the basis functions in BCS (which are prescribed) can reproduce the finer details of some sample paths such as those produced by the Whittle-Matérn autocorrelation function correctly when there are sufficient measurements (convergence). A second related question is whether it can retain its key practical advantage of representing such “rough” sample paths using sparse measurements (rate of convergence). It will be interesting to further explore the possibility of decoupling BCS from the KL expansion, which contains a fairly strong multivariate Gaussianity assumption.

Notwithstanding the above, it is clear that more recent non-classical methods can handle much more realistic data up to bivariate vector fields that are potentially non-stationary and non-Gaussian, without making strong demands on data such as sample sizes and completeness that cannot be fulfilled in practice. Classical models have reigned supreme as we have always assumed one can collect sufficient and appropriate data for characterisation. Many models were developed without a characterisation method in mind and in fact, many models exist without a satisfactory characterisation method in place even years after they were introduced. One cannot help but ask if this supposedly minor footnote pertaining to data collection and empirical characterisation thereof is really minor. It will be fruitful to ask ourselves the inverse question: what data do we have and what models can we develop to make full use of the data at hand, warts and all? In some sense, these non-classical models that can function under very general conditions are allowing our data to speak for themselves.

6. Concluding thoughts

The moral of this story is that the value of geotechnical data is significantly under-appreciated and not fully exploited for decision making. Our data is “dark” in the sense that it is stored primarily for compliance purposes, rather than shared and actively mined for insights that can inform future decision making. The world is being revolutionised by new and powerful ways of collecting, sharing, analysing, and monetising data. Clearly, there is a pressing need for the geotechnical engineering community to engage in this digital transformation.

There is no doubt that reasoned judgment is further enhanced when it is guided by relevant data and analytical tools that make the most sensible use of data, be it using physics, statistics, machine learning, or some combinations thereof. One Geo 4.0 approach could be to develop a clever physics-informed and data-driven “grey box” algorithm to shortlist “similar” sites from BID for the engineer to further refine based on his/her experience and for the algorithm to “learn” and thus become even more discriminating in the future. In this way, our collective experiences could be partially “digitized” as well.

For the first time, we are starting to realise we can combine experience and data in even more clever ways. This paper describes some preliminary research on Bayesian learning that grew out of a simple idea studied by pioneers such as Professor Peter Lumb since the sixties. The idea is that uncertainties and even limited knowledge of these uncertainties can be quantified numerically using statistics, thus allowing design decisions to account for uncertainties explicitly. Geo 4.0 will do a lot more, but we (the entire geotechnical engineering community) need to engage in re-imagining the future of our profession with greater boldness.

So, where did engineering judgment go? To quote Professor Ralph Peck, who is widely recognised for his practical engineering wisdom: “Theory and calculation are not substitute for judgement, but are the basis for sounder judgments” (NGI 2019). Professor Peck may not have imagined digitalisation, but he would agree that better calculations grounded on actual data will make our decisions even better. In the much longer term, no one really knows the role of humans in an immersive cyber-physical reality. In fact, why don’t we initiate an “AlphaGeo” project to see how far we can monetise our data and to sharpen the role of engineering judgment? One suspects (with good reasons) that the story of statistics will unfold in exciting and unexpected ways in the near future. It is not far-fetched to wonder if data may be the only reality.

Acknowledgements

This paper is an update of the 10th Lumb Lecture, delivered at the University of Hong Kong, 6 December 2018. The author would like to thank Professor Limin Zhang, Editor in Chief of Georisk, for his encouragement to prepare this paper. The author is also grateful to the Department of Civil Engineering, The University of Hong Kong and the Geotechnical Division, The Hong Kong Institution of Engineers for their kind invitation to deliver this lecture. In particular, the generous hospitality extended by Professor Zhongqi Quentin Yue, Honorary Professor Chack Fan Lee, and Dr Victor Li is deeply appreciated. The author also thanks Dr Victor Li for sharing the article: “Excerpts from interview with Professor Peter Lumb”, Hong Kong Statistical Society Newsletter, Vol. 9, Issue 1, 1986. This paper was drafted during the author’s sabbatical at the Institute for Risk and Reliability, Leibniz University, which was funded by the Alexander von Humboldt Foundation. Last but not least, the author is deeply indebted to Prof Jianye Ching, National Taiwan University, for sharing his many deep insights and research in Bayesian learning and for preparing all the figures, to Dr Chong Tang for his extensive editorial assistance, and to the following colleagues for their invaluable comments: Zijun Cao, Marco D’Ignazio (for updating CLAY/9/249), Sina Javankhoshdel, C.
Hsein Juang, Leena Korkiala-Tanttu, Tim Länsivaara, Stefan Larsson (for updating SE-CLAY/4/499), Andy Yat-fai Leung, Monica Löfman, Sukumar Pathmanandavel, Anders Prästings, Mengfen Shen (for updating liquefaction databases), Yu Wang (for discussions on Bayesian compressive sampling), and Dongming Zhang (for updating SH-CLAY/11/4051).

Disclosure statement

No potential conflict of interest was reported by the author.

ORCID

Kok-Kwang Phoon http://orcid.org/0000-0003-2577-8639

References

AbdelSalam, S., F. Baligh, and H. M. El-Naggar. 2015. “A Database to Ensure Reliability of Bored Pile Design in Egypt.” Proceedings of the Institution of Civil Engineers – Geotechnical Engineering 168 (2): 131–143. Abu-Hejleh, N., M. Abu-Farsakh, M. Suleiman, and C. Tsai. 2015. “Development and Use of High-Quality Databases of Deep Foundation Load Tests.” Transportation Research Record: Journal of the Transportation Research Board 2511: 27–36. Adhikari, P., Y. Gebreslasie, K. Ng, T. Sullivan, and S. Wulff. 2018. “Static and Dynamic Analysis of Driven Piles in Soft Rocks Considering LRFD Using a Recently Developed Electronic Database.” In Installation, Testing, and Analysis of Deep Foundations (GSP 294), 83–92. Reston, VA: ASCE. Akbas, S. 2007. “Deterministic and Probabilistic Assessment of Settlements of Shallow Foundations in Cohesionless Soils.” Ph.D. thesis, Cornell University. Allen, T., and R. Bathurst. 2018. “Application of the Simplified Stiffness Method to Design of Reinforced Soil Walls.” Journal of Geotechnical and Geoenvironmental Engineering 144 (5): 04018024. Almeida, A. P. R. P., and J. Liu. 2018. “Statistical Evaluation of Design Methods for Micropiles in Ontario Soils.” DFI Journal - The Journal of the Deep Foundation Institute 12 (3): 133–146. Andrus, R. D., K. H. Stokoe, and R. M. Chung. 1999.
Draft Guidelines for Evaluating Liquefaction Resistance Using Shear Wave Velocity Measurements and Simplified Procedures. Gaithersburg, MD: US Department of Commerce, Technology Administration, National Institute of Standards and Technology. Asem, P., J. Long, and P. Gardoni. 2018. “Probabilistic Model and LRFD Resistance Factors for the Tip Resistance of Drilled Shafts in Soft Sedimentary Rock Based on Axial Load Tests.” In Innovations in Geotechnical Engineering: Honoring Jean-Louis Briaud (GSP 299), 1–46. Reston, VA: ASCE. Augustesen, A. 2006. “The Effects of Time on Soil Behaviour and Pile Capacity.” Ph.D. thesis, Aalborg University. Baecher, G. B. 1987. Statistical Analysis of Geotechnical Data. Report No. GL-87-1. Vicksburg: U.S. Army Engineer Waterways Experiment Station. Baecher, G. B., and J. T. Christian. 2003. Reliability and Statistics in Geotechnical Engineering. London and New York: John Wiley and Sons. Baecher, G. B., and J. T. Christian. 2019. “TH Wu and the Origins of Geotechnical Reliability.” Georisk 13 (4): 242–246. Bahsan, E., H. J. Liao, J. Y. Ching, and S. W. Lee. 2014. “Statistics for the Calculated Safety Factors of Undrained Failure Slopes.” Engineering Geology 172: 85–94. Bathurst, R. J., S. Javankhoshdel, and T. M. Allen. 2017. “LRFD Calibration of Simple Soil-structure Limit States Considering Method Bias and Design Parameter Variability.” Journal of Geotechnical and Geoenvironmental Engineering 143 (9): 04017053. Beer, M., Y. Zhang, S. T. Quek, and K. K. Phoon. 2013. “Reliability Analysis with Scarce Information: Comparing Alternative Approaches in a Geotechnical Engineering Context.” Structural Safety 41: 1–10. Beesley, M. E., and P. J. Vardanega. 2019. “Parameter Variability of Undrained Shear Strength and Strain Using a Database of Reconstituted Soil Tests.” Canadian Geotechnical Journal, in press. BS8004. 1986. Code of Practice for Foundations. London: British Standards Institution. Burland, J. B. 1987.
“The Teaching of Soil Mechanics – a Personal View.” Proceedings, 9th European Conference on Soil Mechanics and Foundation Engineering, Dublin, 3 vols., 1427–1447. Burlon, S., R. Frank, F. Baguelin, J. Habert, and S. Legrand. 2014. “Model Factor for the Bearing Capacity of Piles from Pressuremeter Test Results: Eurocode 7 Approach.” Géotechnique 64 (7): 513–525. Cami, B., S. Javankhoshdel, and K. K. Phoon. 2020. “A Review of Scale of Fluctuation for Spatially Varying Soils: Estimation Methods and Values.” ASCE-ASME Journal of Risk and Uncertainty in Engineering Systems, Part A: Civil Engineering, under review. Cao, Z., S. Zheng, D. Q. Li, and K. K. Phoon. 2019. “Bayesian Identification of Soil Stratigraphy Based on Soil Behavior Type Index.” Canadian Geotechnical Journal 56 (4): 570–586. Casagrande, A. 1965. “Role of the ‘Calculated Risk’ in Earthwork and Foundation Engineering.” Journal of Soil Mechanics and Foundations Division 91 (SM4): 1–40. Cetin, K. O., R. B. Seed, A. Der Kiureghian, K. Tokimatsu, L. F. Harder, R. E. Kayen, and R. E. S. Moss. 2004. “Standard Penetration Test-based Probabilistic and Deterministic Assessment of Seismic Soil Liquefaction Potential.” Journal of Geotechnical and Geoenvironmental Engineering 130 (12): 1314–1340. Chahbaz, R., S. Sadek, and S. Najjar. 2019. “Uncertainty Quantification of the Bond Stress – Displacement Relationship of Shoring Anchors in Different Geologic Units.” Georisk: Assessment and Management of Risk for Engineered Systems and Geohazards 13 (4): 276–283. Chen, Y. J., and F. H. Kulhawy. 1994. Case History Evaluation of Behavior of Drilled Shafts under Axial and Lateral Loading. Report EPRI TR-104601. Palo Alto, CA: Electric Power Research Institute (EPRI). Chen, Y. J., and Y. H. Lee. 2010. “Evaluation of Lateral Interpretation Criteria for Drilled Shaft Capacity.” Journal of Geotechnical and Geoenvironmental Engineering 136 (8): 1124–1136. Chen, Y. J., M. R. Liao, S. S. Lin, J. K. Huang, and M. C. M. Marcos. 2014.
“Development of an Integrated Web-Based System with a Pile Load Test Database and pre-Analyzed Data.” Geomechanics and Engineering 7 (1): 37–53. Chen, Y. J., S. W. Lin, and F. H. Kulhawy. 2011. “Evaluation of Lateral Interpretation Criteria for Rigid Drilled Shafts.” Canadian Geotechnical Journal 48 (4): 634–643. Cheung, R. W. M., and K. W. Shum. 2012. Review of the Approach for Estimation of Pullout Resistance of Soil Nails. GEO Report No. 264. Geotechnical Engineering Office, Civil Engineering and Development Department, Hong Kong. Ching, J., D. Q. Li, and K. K. Phoon. 2016. “Statistical Characterization of Multivariate Geotechnical Data.” In Reliability of Geotechnical Structures in ISO2394, edited by K. K. Phoon and J. V. Retief, 89–126. Balkema: CRC Press. Ching, J., K. H. Li, K. K. Phoon, and M. C. Weng. 2018. “Generic transformation models for some intact rock properties.” Canadian Geotechnical Journal 55 (12): 1702–1741. Ching, J., and K. K. Phoon. 2012. “Modeling Parameters of Structured Clays as a Multivariate Normal Distribution.” Canadian Geotechnical Journal 49 (5): 522–545. Ching, J., and K. K. Phoon. 2013. “Multivariate Distribution for Undrained Shear Strengths Under Various Test Procedures.” Canadian Geotechnical Journal 50 (9): 907–923. Ching, J., and K. K. Phoon. 2014. “Correlations Among Some Clay Parameters – the Multivariate Distribution.” Canadian Geotechnical Journal 51 (6): 686–704. Ching, J., and K. K. Phoon. 2015. “Constructing Multivariate Distribution for Soil Parameters.” Chap. 1 in Risk and Reliability in Geotechnical Engineering, 3–76. Balkema: CRC Press. Ching, J., and K. K. Phoon. 2017. “Characterizing Uncertain Site-specific Trend Function by Sparse Bayesian Learning.” Journal of Engineering Mechanics 143 (7): 04017028. Ching, J., and K. K. Phoon. 2019a. “Impact of Auto-correlation Function Model on the Probability of Failure.” Journal of Engineering Mechanics 145 (1): 04018123. Ching, J., and K. K. Phoon. 2019b.
“Constructing a Site-specific Multivariate Probability Distribution Using Sparse, Incomplete, and Spatially Variable (MUSIC-X) Data.” Journal of Engineering Mechanics, under review. Ching, J., and K. K. Phoon. 2019c. “Constructing Site-specific Multivariate Probability Distribution Model Using Bayesian Machine Learning.” Journal of Engineering Mechanics 145 (1): 04018126. Ching, J., and K. K. Phoon. 2019d. “Measuring Similarity between Site-specific Data and Records from Other Sites.” ASCE-ASME Journal of Risk and Uncertainty in Engineering Systems, Part A: Civil Engineering, in press. Ching, J., K. K. Phoon, J. L. Beck, and Y. Huang. 2017. “Identifiability of Geotechnical Site-specific Trend Functions.” ASCE-ASME Journal of Risk and Uncertainty in Engineering Systems, Part A: Civil Engineering 3 (4): 04017021. Ching, J., K. K. Phoon, and D. Q. Li. 2016. “Robust Estimation of Correlation Coefficients Among Soil Parameters Under the Multivariate Normal Framework.” Structural Safety 63: 21–32. Ching, J., K. K. Phoon, and J. W. Yu. 2014. “Linking Site Investigation Efforts to Final Design Savings with Simplified Reliability-based Design Methods.” Journal of Geotechnical and Geoenvironmental Engineering 140 (3): 04013032. Ching, J., J. S. Wang, C. H. Juang, and C. S. Ku. 2015. “Cone Penetration Test (CPT)-based Stratigraphic Profiling Using the Wavelet Transform Modulus Maxima Method.” Canadian Geotechnical Journal 52 (12): 1993–2007. Ching, J., S. S. Wu, and K. K. Phoon. 2016. “Statistical Characterization of Random Field Parameters Using Frequentist and Bayesian Approaches.” Canadian Geotechnical Journal 53 (2): 285–298. Ching, J., S. Wu, and K. K. Phoon. 2020. “Constructing Quasi-site-specific Multivariate Probability Distribution Using Hierarchical Bayesian Model.” Journal of Engineering Mechanics, ASCE, under review. DeGroot, D. J. 1996.
“Analyzing Spatial Variability of in Situ Soil Properties.” In Uncertainty in the Geologic Environment: From Theory to Practice (GSP 58), edited by C. D. Shackelford, P. P. Nelson, and M. J. S. Roth, 210–238. Reston, VA: ASCE. DeGroot, D. J., and G. B. Baecher. 1993. “Estimating Autocovariance of In-situ Soil Properties.” Journal of Geotechnical Engineering 119 (1): 147–166. D’Ignazio, M., T. Lunne, K. H. Andersen, S. L. Yang, B. Di Buò, and T. T. Länsivaara. 2019. “Estimation of Preconsolidation Stress of Clays from Piezocone by Means of High-quality Calibration Data.” AIMS Geosciences 5 (2): 104–116. D’Ignazio, M., K. K. Phoon, S. A. Tan, and T. T. Länsivaara. 2016. “Correlations for Undrained Shear Strength of Finnish Soft Clays.” Canadian Geotechnical Journal 53 (10): 1628–1645. Dithinde, M., K. K. Phoon, M. Wet, and J. Retief. 2011. “Characterization of Model Uncertainty in the Static Pile Design Formula.” Journal of Geotechnical and Geoenvironmental Engineering 137 (1): 70–85. DNV. 2012. “Statistical Representation of Soil Data.” Recommended Practice DNVGL-RP-C207, Det Norske Veritas (DNV), Oslo, Norway. Einstein, H. H., and G. B. Baecher. 1982. “Probabilistic and Statistical Methods in Engineering Geology I. Problem Statement and Introduction to Solution.” In Ingenieurgeologie und Geomechanik als Grundlagen des Felsbaues/Engineering Geology and Geomechanics as Fundamentals of Rock Engineering, edited by L. Muller, 47–61. Vienna: Springer. Einstein, H. H., and G. B. Baecher. 1983. “Probabilistic and Statistical Methods in Engineering Geology. Specific Methods and Examples Part I: Exploration.” Rock Mechanics and Rock Engineering 16 (1): 39–72. Einstein, H. H., V. B. Halabe, J.-P. Dudt, and F. Descoeudres. 1996. “Geologic Uncertainties in Tunnelling.” In Uncertainty in the Geologic Environment: From Theory to Practice (GSP 58), edited by C. D. Shackelford, P. P. Nelson, and M. J. S. Roth, 239–253. Reston, VA: ASCE. EN 1997−1:2004.
Eurocode 7: Geotechnical Design — Part 1: General Rules. Brussels, Belgium: European Committee for Standardization (CEN). Feng, S., and P. J. Vardanega. 2019. “Correlation of the Hydraulic Conductivity of Fine-grained Soils with Water Content Ratio Using a Database.” Environmental Geotechnics 6 (5): 253–268. Fenton, G. A., F. Naghibi, D. Dundas, R. J. Bathurst, and D. V. Griffiths. 2016. “Reliability-based Geotechnical Design in the 2014 Canadian Highway Bridge Design Code.” Canadian Geotechnical Journal 53 (2): 236–251. Filippas, O. B., F. H. Kulhawy, and M. D. Grigoriu. 1988. Reliability-based Foundation Design for Transmission Line Structures: Uncertainties in Soil Property Measurement. Report EL-5507(3). Palo Alto: Electric Power Research Institute. Flynn, K. 2014. “Experimental Investigations of Driven Cast in-Situ Piles.” Ph.D. thesis, National University of Ireland, Galway. Galbraith, A., E. Farrell, and J. Byrne. 2014. “Uncertainty in Pile Resistance from Static Load Tests Database.” Proceedings of the Institution of Civil Engineers – Geotechnical Engineering 167 (5): 431–446. Garder, J., K. Ng, S. Sritharan, and M. Roling. 2012. An Electronic Database for Drilled SHAft Foundation Testing (DSHAFT). Report No. InTrans Project 10-366, Iowa Department of Transportation. Gartner. 2019. “IT Glossary.” Gartner. Accessed October 3, 2019. https://www.gartner.com/it-glossary/dark-data. Geoengineer. 2019. “Professor Ralph Peck’s Legacy Website.” Geoengineer.org. Accessed October 3, 2019. https://peck.geoengineer.org/resources/words-of-wisdom. Hand, D. J. 2014. “Wonderful Examples, but Let’s not Close Our Eyes.” Statistical Science 29: 98–100. Harr, M. E. 1987. Reliability-based Design in Civil Engineering. New York: McGraw-Hill. Hov, S., A. Prästings, E. Persson, and S. Larsson. 2019.
“On Empirical Correlations for Normalised Shear Strengths from Fall Cone and Direct Simple Shear Tests in Soft Swedish Clays.” Geotechnical and Geological Engineering, under review. Huang, B. Q., and R. J. Bathurst. 2009. “Evaluation of Soil-Geogrid Pullout Models Using a Statistical Approach.” Geotechnical Testing Journal 32 (6): 489–504. Idriss, I. M., and R. W. Boulanger. 2010. SPT-based Liquefaction Triggering Procedures. Davis, CA: Center for Geotechnical Modeling, Department of Civil and Environmental Engineering, University of California at Davis. Report No. UCD/CGM-10/02. Ismail, S., S. S. Najjar, and S. Sadek. 2018. “Reliability Analysis of Buried Offshore Pipelines in Sand Subjected to Upheaval Buckling.” In Proceedings of the Offshore Technology Conference (OTC). Houston, TX: American Petroleum Institute. OTC-28882-MS. ISO2394:1973/1986/1998/2015. General Principles on Reliability for Structures. Geneva, Switzerland: International Organization for Standardization. Jaksa, M. B. 1995. “The Influence of Spatial Variability on the Geotechnical Design Properties of a Stiff, Overconsolidated Clay.” Ph.D. Dissertation, University of Adelaide. Jaksa, M. B., H. R. Maier, and M. A. Shahin. 2008. “Future Challenges for Artificial Neural Network Modelling in Geotechnical Engineering.” Proceedings, 12th International Conference of International Association for Computer Methods and Advances in Geomechanics (IACMAG), 1710–1719, Goa, India, October 1–6. Joint TC205/TC304 Working Group Report. 2017. “Discussion of Statistical/Reliability Methods for Eurocodes. International Society for Soil Mechanics and Geotechnical Engineering.” http://140.112.12.21/issmge/tc304.htm. Juang, C. H., J. Ching, and Z. Luo. 2013. “Assessing SPT-based Probabilistic Models for Liquefaction Potential Evaluation: A 10-Year Update.” Georisk: Assessment and Management of Risk for Engineered Systems and Geohazards 7 (3): 137–150. Kayen, R., R. E. S. Moss, E. M. Thompson, R. B. Seed, K. O.
Cetin, A. D. Kiureghian, Y. Tanaka, and K. Tokimatsu. 2013. “Shear-wave Velocity–Based Probabilistic and Deterministic Assessment of Seismic Soil Liquefaction Potential.” Journal of Geotechnical and Geoenvironmental Engineering 139 (3): 407–419. Ku, C. S., C. H. Juang, C. W. Chang, and J. Ching. 2012. “Probabilistic Version of the Robertson and Wride Method for Liquefaction Evaluation: Development and Application.” Canadian Geotechnical Journal 49 (1): 27–44. Kulhawy, F. H., B. Birgisson, and M. D. Grigoriu. 1992. Reliability-based Foundation Design for Transmission Line Structures: Transformation Models for In-Situ Tests. Report EL-5507(4). Palo Alto: Electric Power Research Institute. Kulhawy, F. H., and P. W. Mayne. 1990. Manual on Estimating Soil Properties for Foundation Design. Report EL-6800. Palo Alto, CA: Electric Power Research Institute. Kulhawy, F. H., T. D. O’Rourke, J. P. Stewart, and J. F. Beech. 1983. Transmission Line Structure Foundations for Uplift-Compression Loading: Load Test Summaries. Appendix to EPRI Final Report EL-2870. Report No. EL-3160-LD. Palo Alto, CA: Electric Power Research Institute (EPRI). Lacasse, S., M. Jamiolkowski, R. Lancellotta, and T. Lunne. 1981. “In Situ Characteristics of Two Norwegian Clays.” Proceedings, 10th International Conference on Soil Mechanics and Foundation Engineering, Stockholm, 2 vols., 507–511. Lacasse, S., Z. Liu, and F. Nadim. 2017. “Probabilistic Characterization of Soil Properties—Recognition of Wilson Tang’s Contribution to Geotechnical Practice.” In Geotechnical Safety and Reliability: Honoring Wilson H. Tang (GSP 286), edited by C. Hsein Juang, R. B. Gilbert, L. Zhang, J. Zhang, and L. Zhang, 2–26. Reston, VA: ASCE. Lacasse, S., and T. Lunne. 1982. “Penetration Tests in Two Norwegian Clays.” Proceedings, Second European Symposium on Penetration Testing, Amsterdam, 661–670. Lacasse, S., and F. Nadim. 1996.
“Uncertainties in Characterising Soil Properties.” In Uncertainty in the Geologic Environment: From Theory to Practice (GSP 58), edited by C. D. Shackelford, P. P. Nelson, and M. J. S. Roth, 49–75. Reston, VA: ASCE. Lam, K., and W. K. Li. 1986. “Excerpts from Interview with Professor Peter Lumb.” Hong Kong Statistical Society Newsletter 9 (1): 2–5. Lambe, T. W. 1973. “Predictions in Soil Engineering.” Géotechnique 23 (2): 151–202. Lazarte, C. A. 2011. Proposed Specifications for LRFD Soil-Nailing Design and Construction. NCHRP Report 701. Washington, DC: Transportation Research Board. Lehane, B. M., J. K. Kim, P. Carotenuto, F. Nadim, S. Lacasse, R. J. Jardine, and B. F. J. Van Dijk. 2017. “Characteristics of Unified Databases for Driven Piles.” In 8th International Conference of Offshore Site Investigation and Geomechanics. Vol. 1, 162–191. London: Society for Underwater Technology. Li, Z., X. Wang, H. Wang, and R. Y. Liang. 2016. “Quantifying Stratigraphic Uncertainties by Stochastic Simulation Techniques Based on Markov Random Field.” Engineering Geology 201: 106–122. Lin, P. Y., R. Bathurst, and J. Y. Liu. 2017. “Statistical Evaluation of the FHWA Simplified Method and Modifications for Predicting Soil Nail Loads.” Journal of Geotechnical and Geoenvironmental Engineering 143 (3): 04016107. Liu, H. F., L. S. Tang, P. Y. Lin, and G. X. Mei. 2018. “Accuracy Assessment of Default and Modified Federal Highway Administration (FHWA) Simplified Models for Estimation of Facing Tensile Forces of Soil Nail Walls.” Canadian Geotechnical Journal 55 (8): 1104–1115. Liu, S., H. Zou, G. Cai, B. V. Bheemasetti, A. J. Puppala, and J. Lin. 2016. “Multivariate Correlation among Resilient Modulus and Cone Penetration Test Parameters of Cohesive Subgrade Soils.” Engineering Geology 209: 128–142. Long, M. 2001. “Database for Retaining Wall and Ground Movements due to Deep Excavations.” Journal of Geotechnical and Geoenvironmental Engineering 127 (3): 203–224. Long, J.
2013. Improving Agreement between Static Method and Dynamic Formula for Driven Cast-in-Place Piles in Wisconsin. Report No. 0092-10-09, Wisconsin Department of Transportation. Long, J. 2016. Static Pile Load Tests on Driven Piles into Intermediate-geo Materials. Report No. WHRP 0092-1208, Wisconsin Department of Transportation. Long, J., and A. Anderson. 2014. Improved Design for Driven Piles Based on a Pile Load Test Program in Illinois: phase 2. Report No. FHWA-ICT-14-019, Illinois Department of Transportation. Long, J., J. Hendrix, and D. Jaromin. 2009. Comparison of Five Different Methods for Determining Pile Bearing Capacities. Report No. WisDOT 0092-07-04, Wisconsin Department of Transportation. Lumb, P. 1966. “Variability of Natural Soils.” Canadian Geotechnical Journal 3 (2): 74–97. Marcos, M. C. M., and Y. J. Chen. 2013. “Evaluation of Lateral Interpretation Criteria for Drilled Shaft Capacity in Gravels.” Geotechnical and Geological Engineering 31 (5): 1411–1420. Mariethoz, G., and J. Caers. 2015. Multiple-point Geostatistics Stochastic Modeling with Training Images. West Sussex: John Wiley & Sons. Marsland, A. 1953. “Model Experiments to Study the Influence of Seepage on the Stability of a Sheeted Excavation in Sand.” Géotechnique 3 (6): 223–241. McVay, M., S. Wasman, L. Huang, and S. Crawford. 2016. Load and Resistance Factor Design (LRFD) Resistance Factors for Auger Cast in Place Piles. Tallahassee, FL: Florida Department of Transportation. Mesri, G., and N. Huvaj. 2007. “Shear Strength Mobilized in Undrained Failure of Soft Clay and Silt Deposits.” In Advances in Measurement and Modeling of Soil Behavior (GSP 173), 1–22. Reston, VA: ASCE. Miyata, Y., and R. J. Bathurst. 2012a. “Measured and Predicted Loads in Steel Strip Reinforced c-Φ Soil Walls in Japan.” Soils and Foundations 52 (1): 1–17. Miyata, Y., and R. J. Bathurst. 2012b. “Analysis and Calibration of Default Steel Strip Pullout Models Used in Japan.” Soils and Foundations 52 (3): 481–497.
Miyata, Y., and R. J. Bathurst. 2015. “Reliability Analysis of Geogrid Installation Damage Test Data in Japan.” Soils and Foundations 55 (2): 393–403. Miyata, Y., and R. J. Bathurst. 2019. “Statistical Assessment of Load Model Accuracy for Steel Grid-Reinforced Soil Walls.” Acta Geotechnica 14 (1): 57–70. Miyata, Y., R. J. Bathurst, and T. M. Allen. 2014. “Reliability Analysis of Geogrid Creep Data in Japan.” Soils and Foundations 54 (4): 608–620. Miyata, Y., R. J. Bathurst, and T. Konami. 2011. “Evaluation of Two Anchor Plate Capacity Models for MAW Systems.” Soils and Foundations 51 (5): 885–895. Miyata, Y., Y. Yu, and R. J. Bathurst. 2018. “Calibration of Soil-Steel Grid Pullout Models Using a Statistical Approach.” Journal of Geotechnical and Geoenvironmental Engineering 144 (2): 04017106. Moghaddam, R. B., P. W. Jayawickrama, W. D. Lawson, J. G. Surles, and H. Seo. 2018. “Texas Cone Penetrometer Foundation Design Method: Qualitative and Quantitative Assessment.” DFI Journal – The Journal of the Deep Foundations Institute 12 (2): 69–80. Montoya-Noguera, S., T. Zhao, Y. Hu, Y. Wang, and K. K. Phoon. 2019. “Simulation of Non-stationary Non-Gaussian Random Fields from Sparse Measurements Using Bayesian Compressive Sampling and Karhunen-Loève Expansion.” Structural Safety 79: 66–79. Moormann, C. 2004. “Analysis of Wall and Ground Movements due to Deep Excavations in Soft Soil Based on a new Worldwide Database.” Soils and Foundations 44 (1): 87–98. Moshfeghi, S., and A. Eslami. 2018. “Study on Pile Ultimate Capacity Criteria and CPT-Based Direct Methods.” International Journal of Geotechnical Engineering 12 (1): 28–39. Moss, R. E. S., R. B. Seed, R. E. Kayen, J. P. Stewart, A. Der Kiureghian, and K. O. Cetin. 2006. “CPT-based Probabilistic and Deterministic Assessment of in Situ Seismic Soil Liquefaction Potential.” Journal of Geotechnical and Geoenvironmental Engineering 132 (8): 1032–1051. Motamed, R., S. Elfass, and K. Stanton. 2016.
LRFD Resistance Factor Calibration for Axially Loaded Drilled Shafts in the Las Vegas Valley. Report No. 515-13-803, Nevada Department of Transportation. Nanazawa, T., T. Kouno, G. Sakashita, and K. Oshiro. 2019. “Development of Partial Factor Design Method on Bearing Capacity of Pile Foundations for Japanese Specifications for Highway Bridges.” Georisk: Assessment and Management of Risk for Engineered Systems and Geohazards 13 (3): 166–175. National Research Council. 1995. Probabilistic Methods in Geotechnical Engineering. Washington, DC: National Academies Press. Ng, C. W. W., T. L. Y. Yau, J. H. M. Li, and Tang. 2001. “New Failure Load Criterion for Large Diameter Bored Piles in Weathered Geomaterials.” Journal of Geotechnical and Geoenvironmental Engineering 127 (6): 488–498. NGI. 2019. “Peck Library.” Norwegian Geotechnical Institute. Accessed October 3, 2019. https://www.ngi.no/eng/Publications-and-library/NGI-s-Historical-Libraries/PeckLibrary. Niazi, F. S. 2014. “Static Axial Pile Foundation Response Using Seismic Piezocone Data.” Ph.D. thesis, Georgia Institute of Technology. Orchant, C. J., F. H. Kulhawy, and C. H. Trautmann. 1988. Reliability-based Foundation Design for Transmission Line Structures: Critical Evaluation of In-situ Test Methods. Report EL-5507(2). Palo Alto: Electric Power Research Institute. Ou, C. Y., and J. T. Liao. 1987. Geotechnical Engineering Research Report. GT96008. Taipei: National Taiwan University of Science and Technology. Paikowsky, S. G., B. Birgisson, M. McVay, T. Nguyen, C. Kuo, G. B. Baecher, B. Ayyub, et al. 2004. Load and Resistance Factors Design for Deep Foundations. NCHRP Report 507. Washington, DC: Transportation Research Board of the National Academies. Paikowsky, S., M. Canniff, K. Lesny, A. Kisse, S. Amatya, and R. Muganga. 2010. “LRFD Design and Construction of Shallow Foundations for Highway Bridge Structures.” NCHRP Report 651.
Washington, DC: Transportation Research Board. Peck, R. B. 1980. “Where has All the Judgment Gone?” Canadian Geotechnical Journal 17 (4): 584–590. Phoon, K. K. 2017. “Role of Reliability Calculations in Geotechnical Design.” Georisk 11 (1): 4–21. Phoon, K. K. 2018. “Editorial for Special Collection on Probabilistic Site Characterization.” ASCE-ASME Journal of Risk and Uncertainty in Engineering Systems, Part A: Civil Engineering 4 (4): 02018002. Phoon, K. K. 2020. “The Goldilocks Dilemma - too Little or too Much Data.” GeoStrata 24 (1), in press. Phoon, K. K., and J. Ching. 2017. “Better Correlations for Geotechnical Engineering.” A Decade of Geotechnical Advances, Geotechnical Society of Singapore (GeoSS), 73–102. Phoon, K. K., J. Ching, and Y. Wang. 2019. “Managing Risk in Geotechnical Engineering – from Data to Digitalization.” Proceedings, 7th International Symposium on Geotechnical Safety and Risk (ISGSR 2019), Taipei, Taiwan, in press. Phoon, K. K., and F. H. Kulhawy. 1999a. “Characterization of Geotechnical Variability.” Canadian Geotechnical Journal 36 (4): 612–624. Phoon, K. K., and F. H. Kulhawy. 1999b. “Evaluation of Geotechnical Property Variability.” Canadian Geotechnical Journal 36 (4): 625–639. Phoon, K. K., and F. H. Kulhawy. 2008. “Serviceability Limit State Reliability-based Design.” In Reliability-based Design in Geotechnical Engineering: Computations and Applications, 344–383. New York: Taylor & Francis. Phoon, K. K., F. H. Kulhawy, and M. D. Grigoriu. 2003. “Multiple Resistance Factor Design for Shallow Transmission Line Structure Foundations.” Journal of Geotechnical and Geoenvironmental Engineering 129 (9): 807–818. Phoon, K. K., S. L. Liu, and Y. K. Chow. 2009. “Characterization of Model Uncertainties for Cantilever Walls in Sand.” Journal of GeoEngineering 4 (3): 75–85. Phoon, K. K., W. A. Prakoso, Y. Wang, and J. Ching. 2016. “Uncertainty Representation of Geotechnical Design Parameters.” Chap. 
3 in Reliability of Geotechnical Structures in ISO2394, 49–87. Balkema: CRC Press. Phoon, K. K., J. V. Retief, J. Ching, M. Dithinde, T. Schweckendiek, Y. Wang, and L. M. Zhang. 2016. “Some Observations on ISO2394:2015 Annex D (Reliability of Geotechnical Structures).” Structural Safety 62: 24–33. Phoon, K. K., and C. Tang. 2019. “Characterization of Geotechnical Model Uncertainty.” Georisk 13 (2): 101–130. Qi, X. H., D. Q. Li, K. K. Phoon, Z. Cao, and X. S. Tang. 2016. “Simulation of Geologic Uncertainty Using Coupled Markov Chain.” Engineering Geology 207: 129–140. Rauser, J., and C. Tsai. 2016. “Beneficial use of the Louisiana Foundation Load Test Database.” In Transportation Research Board 95th Annual Meeting, 1–17. Washington, DC: Transportation Research Board. Reddy, S., and A. Stuedlein. 2017. “Ultimate Limit State Reliability-Based Design of Augered Cast-in-Place Piles Considering Lower-Bound Capacities.” Canadian Geotechnical Journal 54 (12): 1693–1703. Roling, M., S. Sritharan, and M. Suleiman. 2011. Development of LRFD Procedures for Bridge Pile Foundations in Iowa. Vol. 1: An electronic Database for Pile Load Tests (PILOT). Report No. IHRB Project TR-573, Iowa Department of Transportation. Samtani, N. C., and T. C. Allen. 2018. “Implementation Report – Expanded Database for Service Limit State Calibration of Immediate Settlement of Bridge Foundation on Soil.” Report No. FHWA-HIF-18-008. Washington, DC: Federal Highway Administration (FHWA). Shahin, M. A., M. B. Jaksa, and H. R. Maier. 2001. “Artificial Neural Network Applications in Geotechnical Engineering.” Australian Geomechanics 36 (1): 49–62. Shields, M. D., and H. Kim. 2017. “Simulation of Higher-order Stochastic Processes by Spectral Representation.” Probabilistic Engineering Mechanics 47: 1–15. Smith, T., A. Banas, M. Gummer, and J. Jin. 2011. Recalibration of the GRLWEAP LRFD Resistance Factor for Oregon DOT. Report No. FHWA-OR-RD-11-08, Oregon Department of Transportation. Spry, M. J., F. H.
Kulhawy, and M. D. Grigoriu. 1988. Reliability-based Foundation Design for Transmission Line Structures: Geotechnical Site Characterization Strategy. Report EL-5507(1). Palo Alto: Electric Power Research Institute. Stark, T., J. Long, A. Baghdady, and A. Osouli. 2017. Modified Standard Penetration Test-based Drilled Shaft Design Method for Weak Rocks (phase 2 study). Report No. FHWA-ICT-17-018, Illinois Department of Transportation. State of the Nation Report. 2017. “Digital Transformation.” Institution of Civil Engineers. https://www.ice.org.uk/news-and-insight/policy/state-of-the-nation-2017-digitaltransformation. Stuyts, B., D. Cathie, and T. Powell. 2016. “Model Uncertainty in Uplift Resistance Calculations for Sandy Backfills.” Canadian Geotechnical Journal 53 (11): 1831–1840. Tang, W. H. 1984. “Principles of Probabilistic Characterization of Soil Properties.” In Probabilistic Characterization of Soil Properties: Bridge Between Theory and Practice, edited by D. S. Bowles and H.-Y. Ko, 74–89. Atlanta, GA: ASCE. Tang, C., and K. K. Phoon. 2016. “Model Uncertainty of Cylindrical Shear Method for Calculating the Uplift Capacity of Helical Anchors in Clay.” Engineering Geology 207: 14–23. Tang, C., and K. K. Phoon. 2018a. “Evaluation of Model Uncertainties in Reliability-Based Design of Steel H-Piles in Axial Compression.” Canadian Geotechnical Journal 55 (11): 1513–1532. Tang, C., and K. K. Phoon. 2018b. “Statistics of Model Factors in Reliability-Based Design of Axially Loaded Driven Piles in Sand.” Canadian Geotechnical Journal 55 (11): 1592–1610. Tang, C., and K. K. Phoon. 2018c. “Statistics of Model Factors and Consideration in Reliability-Based Design of Axially Loaded Helical Piles.” Journal of Geotechnical and Geoenvironmental Engineering 144 (8): 04018050.
Tang, C., and K. K. Phoon. 2019a. “Evaluation of Stress Dependent Methods for the Punch-Through Capacity of Foundations in Clay with Sand.” ASCE-ASME Journal of Risk and Uncertainty in Engineering Systems, Part A: Civil Engineering 5 (3): 04019008. Tang, C., and K. K. Phoon. 2019b. “Characterization of Model Uncertainty in Predicting Axial Resistance of Piles Driven Into Clay.” Canadian Geotechnical Journal 56 (8): 1098–1118. Tang, C., and K. K. Phoon. 2019c. “Statistical Evaluation of Model Factors in Reliability Calibration of High Displacement Helical Piles Under Axial Loading.” Canadian Geotechnical Journal. doi:10.1139/cgj-2018-0754. Tang, C., K. K. Phoon, and Y. J. Chen. 2019. “Statistical Analyses of Model Factors in Reliability-Based Limit State Design of Drilled Shafts under Axial Loading.” Journal of Geotechnical and Geoenvironmental Engineering 145 (9): 04019042. Tang, C., K. K. Phoon, D. Q. Li, and S. O. Akbas. 2019. “Expanded Database Assessment of Design Methods for Spread Foundations Under Axial Uplift and Compression.” under preparation. TC304. 2019. “304dB – TC304 Databases.” TC304 (Engineering Practice of Risk Assessment and Management), International Society for Soil Mechanics and Geotechnical Engineering. Accessed October 3, 2019. http://140.112.12.21/issmge/tc304.htm. Terzaghi, K., and R. B. Peck. 1967. Soil Mechanics in Engineering Practice. New York: John Wiley & Sons Inc. Tian, M., D. Q. Li, Z. Cao, K. K. Phoon, and Y. Wang. 2016. “Bayesian Identification of Random Field Model Using Indirect Test Data.” Engineering Geology 210: 197–211. Travis, Q., M. Schmeeckle, and D. Sebert. 2011. “Meta-analysis of 301 Slope Failure Calculations. I: Database Description.” Journal of Geotechnical and Geoenvironmental Engineering 137 (5): 453–470. Vanmarcke, E. H. 1977.
“Probabilistic Modeling of Soil Profiles.” Journal of the Geotechnical Engineering Division 103 (11): 1227–1246. Vanmarcke, E. H., and G. A. Fenton. 2003. Probabilistic Site Characterization at the National Geotechnical Experimentation Sites (GSP 121). Reston, VA: ASCE. Vick, S. G. 2002. Degrees of Belief: Subjective Probability and Engineering Judgment. Reston, VA: ASCE. Wang, Y., Y. Hu, and T. Zhao. 2019. “CPT-based Subsurface Soil Classification and Zonation in a 2D Vertical Cross-section Using Bayesian Compressive Sampling.” Canadian Geotechnical Journal. doi:10.1139/cgj-2019-0131. Wang, Y., K. Huang, and Z. Cao. 2013. “Probabilistic Identification of Underground Soil Stratification Using Cone Penetration Tests.” Canadian Geotechnical Journal 50 (7): 766–776. Wang, X., Z. Li, H. Wang, Q. Rong, and R. Y. Liang. 2016. “Probabilistic Analysis of Shield-driven Tunnel in Multiple Strata Considering Stratigraphic Uncertainty.” Structural Safety 62: 88–100. Wang, X., H. Wang, R. Y. Liang, and Y. Liu. 2019. “A Semi-supervised Clustering-based Approach for Stratification Identification Using Borehole and Cone Penetration Test Data.” Engineering Geology 248: 102–116. Wang, X., H. Wang, R. Y. Liang, H. Zhu, and H. Di. 2018. “A Hidden Markov Random Field Model Based Approach for Probabilistic Site Characterization Using Multiple Cone Penetration Test Data.” Structural Safety 70: 128–138. Wang, H., X. Wang, J. F. Wellmann, and R. Y. Liang. 2018. “Bayesian Stochastic Soil Modeling Framework Using Gaussian Markov Random Fields.” ASCE-ASME Journal of Risk and Uncertainty in Engineering Systems, Part A: Civil Engineering 4 (2): 04018014. Wang, H., X. Wang, J. F. Wellmann, and R. Y. Liang. 2019. “A Bayesian Unsupervised Learning Approach for Identifying Soil Stratification Using Cone Penetration Data.” Canadian Geotechnical Journal 56 (8): 1184–1205. Wang, H., J. F. Wellmann, Z. Li, X. Wang, and R. Y. Liang. 2017.
“A Segmentation Approach for Stochastic Geological Modeling Using Hidden Markov Random Fields.” Mathematical Geosciences 49 (2): 145–177. Wang, J., Z. Xu, and W. Wang. 2010. “Wall and Ground Movements due to Deep Excavations in Shanghai Soft Soils.” Journal of Geotechnical and Geoenvironmental Engineering 136 (7): 985–994. Wang, Y., and T. Zhao. 2016. “Interpretation of Soil Property Profile from Limited Measurement Data: A Compressive Sampling Perspective.” Canadian Geotechnical Journal 53 (9): 1547–1559. Wang, Y., and T. Zhao. 2017. “Statistical Interpretation of Soil Property Profiles from Sparse Data Using Bayesian Compressive Sampling.” Géotechnique 67 (6): 523–536. Wang, Y., T. Zhao, Y. Hu, and K. K. Phoon. 2019. “Simulation of Random Fields with Trend from Sparse Measurements without De-trending.” Journal of Engineering Mechanics 145 (2): 04018130. Wang, Y., T. Zhao, and K. K. Phoon. 2018. “Direct Simulation of Random Field Samples from Sparsely Measured Geotechnical Data with Consideration of Uncertainty in Interpretation.” Canadian Geotechnical Journal 55 (6): 862–880. White, D., C. Cheuk, and M. Bolton. 2008. “The Uplift Resistance of Pipes and Plate Anchors Buried in Sand.” Géotechnique 58 (10): 771–779. Wood, T., P. Jayawickrama, J. Surles, and W. Lawson. 2012a. “Pullout Resistance of MSE Reinforcements in Backfills Typically Used in Texas: Vol. 2, Test Reports for MSE Reinforcements in Type B (sandy) Backfill.” Research Report: FHWA/TX-13/0-6493-3, Vol. 2, Texas Department of Transportation. Wood, T., P. Jayawickrama, J. Surles, and W. Lawson. 2012b. “Pullout Resistance of MSE Reinforcements in Backfills Typically Used in Texas: Vol. 3, Test Reports for MSE Reinforcements in Type A (gravelly) Backfill.” Research Report: FHWA/TX-13/0-6493-3, Vol. 3, Texas Department of Transportation. Wu, T. H., M. A. Abdel-Latif, M. A. Nuhfer, and B. B. Curry. 1996. Uncertainty in the Geologic Environment: from Theory to Practice (GSP 58), 76–90. Reston, VA: ASCE. Wu, S.
H., J. Y. Ching, and C. Y. Ou. 2013. “Predicting Wall Displacements for Excavations with Cross Walls in Soft Clay.” Journal of Geotechnical and Geoenvironmental Engineering 139 (6): 914–924. Wu, S. H., C. Y. Ou, and J. Y. Ching. 2014. “Calibration of Model Uncertainties in Base Heave Stability for Wide Excavations in Clay.” Soils and Foundations 54 (6): 1159–1174. Wu, T. H., W. H. Tang, D. A. Sangrey, and G. B. Baecher. 1989. “Reliability of Offshore Foundations-state of the Art.” Journal of Geotechnical Engineering 115 (2): 157–178. Xiao, T., D. Q. Li, Z. Cao, and L. M. Zhang. 2018. “CPT-based Probabilistic Characterization of Three-dimensional Spatial Variability Using MLE.” Journal of Geotechnical and Geoenvironmental Engineering 144 (5): 04018023. Yang, Z., R. Jardine, W. Guo, and F. Chow. 2016. A Comprehensive Database of Tests on Axially Loaded Piles Driven in Sand. London: Academic Press. Yuan, J., P. Y. Lin, R. Huang, and Y. Que. 2019. “Statistical Evaluation and Calibration of Two Methods for Predicting Nail Loads of Soil Nail Walls in China.” Computers and Geotechnics 108: 269–279. Yuen, K. V., and G. A. Ortiz. 2016. “Bayesian Nonparametric General Regression.” International Journal for Uncertainty Quantification 6 (3): 195–213. Yuen, K. V., and G. A. Ortiz. 2018. “Multi-resolution Bayesian Nonparametric General Regression for Structural Model Updating.” Structural Control and Health Monitoring 25 (2): e2077. Yuen, K. V., G. A. Ortiz, and K. Huang. 2016. “Novel Nonparametric Modeling of Seismic Attenuation and Directivity Relationship.” Computer Methods in Applied Mechanics and Engineering 311: 537–555. Zhang, L. M., L. M. P. Shek, H. W. Pang, and C. F. Pang. 2006. “Knowledge-based Design and Construction of Driven Piles.” Proceedings of the Institution of Civil Engineers – Geotechnical Engineering 159 (3): 177–185. Zhang, D. M., Y. Zhou, K. K. Phoon, and H. W. Huang. 2019.
“Multivariate Probability Distribution of Shanghai Clay Properties.” Engineering Geology, under review. Zhao, T., and Y. Wang. 2018. “Simulation of Cross-correlated Random Field Samples from Sparse Measurements Using Bayesian Compressive Sensing.” Mechanical Systems and Signal Processing 112: 384–400.