Acta Materialia 213 (2021) 116930

Full length article

Optimizing the cellular automata finite element model for additive manufacturing to simulate large microstructures

Kirubel Teferra∗, David J. Rowenhorst
US Naval Research Laboratory, Washington, DC, USA

Article info
Article history: Received 11 December 2020; Revised 25 February 2021; Accepted 22 April 2021; Available online 26 April 2021
Keywords: Solidification microstructure; Computer simulations; Additive manufacturing; Dendritic solidification

Abstract

This study proposes an implementation of the cellular automata finite element (CAFE) model that optimizes its performance for simulating the solidification of additively manufactured (AM) materials. The translating melt pool in AM leads to a highly localized region of activity at any given time. The time scale associated with the evolving temperature field is much larger than the time scale associated with grain solidification. A separation of temporal scales is proposed such that solidification analysis is treated independently as sub-cycles within each discrete time step of the scanning laser. At each laser time step, all of the domain is idle except a small portion, which is the region where the material is undercooled but has yet to be solidified. This region is identified, and a second partition of the computational resources is established such that the solidification computations are evenly balanced. Numerical studies demonstrate that the proposed implementation achieves dramatically better parallel scalability than previously reported and thus advances the state-of-the-art of CAFE modeling for AM in terms of computational efficiency.
This improved computational performance is necessary to simulate the large 3D polycrystalline microstructures required to evaluate texture and grain morphology, and two such simulations are presented. The model is validated for laser powder bed fusion 316L stainless steel, where the polycrystalline grain morphology and crystallographic texture of the simulated microstructure closely match experimental characterization data. A second simulation is performed for a different scan pattern in order to highlight and evaluate its effect on microstructural features.

Published by Elsevier Ltd on behalf of Acta Materialia Inc.

∗ Corresponding author. E-mail address: kirubel.teferra@nrl.navy.mil (K. Teferra).
https://doi.org/10.1016/j.actamat.2021.116930

1. Introduction

Additively manufactured (AM) materials processing, where parts are fabricated through a layer-by-layer additive process according to geometry prescribed digitally through a 3D model, promises to revolutionize materials design by enabling topology optimization of component geometry and reducing the costs associated with fabricating small-batch, customized parts for numerous industrial applications [1–4]. In order to accredit AM processed materials for engineering design, the effect of processing on microstructure and mechanical properties must be well characterized experimentally and reliably predicted through theoretical and computational models. The high cooling rates and thermal gradients, as well as the rapid thermal cycles resulting from the highly localized, moving melt pool, lead to microstructure evolution and solidification behavior that is vastly different from that of traditional, wrought processed materials. The distinctly different microstructures produced from AM compared to conventional processing, coupled with the potential impact of AM to greatly improve engineering design, explain the high activity of research conducted on this topic.
There are numerous processing variables within AM, each affecting the thermal profile and solidified microstructural features. The two most common AM processes, directed energy deposition (DED) and powder bed fusion (PBF), have distinctly different melt pool geometries and fusion mechanisms. The melt pool volume in PBF is smaller and translates with higher speed than in DED, leading to a melt pool geometry that is proportionally longer and shallower. As grain growth is strongly influenced by the direction of maximum temperature gradient, grains are more columnar in the build direction for PBF, whereas grain growth tends to curve toward the scanning direction in DED [2, Section 3.2]. Further, the differences in energy density input and thermal gradient magnitude affect phase transformation during evolution. For example, laser PBF (L-PBF) 316L stainless steel fully solidifies as austenite, while in DED secondary ferrite phases exist due to the slower cooling rates [1, Section 2.3].

Within a given AM process (e.g., either DED or PBF), parameters such as power, velocity, hatch spacing, layer thickness, and scan pattern also impart variations on thermal history, microstructural features, and ensuing mechanical properties. A poor selection of build parameters will lead to a large number of defects, most notably lack of fusion and keyhole pores. Keyhole pores are gas entrapped pores left behind in the depression below the melt pool caused by metal vaporization, and power density (power/velocity) has been shown to be a good metric to delineate conduction and keyhole modes [5,6]. Lack of fusion pores, primarily occurring at the boundaries of melt pool overlap regions and build layers [7], exist in incompletely melted regions, which can occur due to too little power density, excessive hatch spacing, or poor particle packing density [8].
Three-dimensional serial sectioning is currently the most robust means to characterize small scale features such as grain boundary character and pore distribution, although it is a time consuming and destructive technique [9]. Because the processing parameter space is large, the sensitivity of the microstructure to these parameters cannot be explored solely by costly experimental data collection. It is necessary to develop computational models validated against experiments that are able to predict thermal field history, component microstructure, and mechanical properties from the processing parameters [10].

Microstructure solidification modeling of AM is built upon solidification theory developed for conventional processing, including casting and welding [11,12]. There are three main categories of methods that have been proposed: stochastic models, phase-field (PF) models, and the cellular automata finite element (CAFE) model. The majority of stochastic models involve variants of random tessellations that can accurately represent the microstructures of conventionally processed materials [13–20]. These models are limited in that they model only the final microstructure and not the actual solidification process, and they are too idealized to accurately capture the complexity of AM morphology. The only exception to this is the Kinetic Monte Carlo (KMC) model, which has shown promise in capturing AM microstructure morphology [21–23]. However, its drawback is that the physical representation of the model is limited, so its parameters must be determined through calibration. Among the three methods, the PF method contains the most physical fidelity, as it is a rather general formulation to solve coupled evolution equations of conserved and nonconserved (order parameter) variables governed by a free energy functional defined by user-specified thermodynamic driving forces [24–27].
The primary drawbacks of the PF method are the difficulty in determining mobility coefficients and its intensive computational demand. Compared to the other methods, it is especially useful for understanding detailed physics at the powder and melt pool scale, particularly subgranular dendritic growth incorporating the effects of phase composition variations [28–30].

The CAFE model is a compromise in terms of the fidelity of physics between the stochastic models and the PF method, and it is arguably the most popular approach for modeling solidification at the polycrystalline length scale, which is necessary to evaluate texture for AM materials. It was originally developed for modeling solidification for casting in a series of papers by Gandin and Rappaz [31–36]. It has since been used extensively for various applications of microstructure evolution including recrystallization [37], welding [38,39], and dendritic solidification [40–42]. In recent years, the CAFE model has been used to better understand the role of processing parameters, primarily laser power and raster pattern, on the solidification process and microstructure of AM materials [43–54]. While the vast majority of studies focus on validating the CAFE model against electron backscatter diffraction (EBSD) data, its potential use as a tool for predicting how to control microstructure to facilitate material design is also recognized. For example, Shi et al. [51] used the CAFE model to predict that laser beams having wide aspect ratios promote nucleation and thus counteract the dominance of columnar grains.

The studies listed above exhibit a range of microstructure characteristics, as well as model validity, despite using essentially the same formulation. This at least partially reflects the dependence on thermal history, nucleation rate, resolution of the computational domain, and implementation differences.
Different heat models have been used to compute the thermal field, including the Lattice Boltzmann method [49,50], the Navier–Stokes equations [46,52], solutions to the heat conduction equation [44,45,53,54], and analytical approximations [43]. The studies employ different rates for grain nucleation, which has a dominant effect on solidified grain structure as evaluated by Li and Tan [45]. Furthermore, the majority of the simulations in these studies use computational domains that are too small to accurately compute texture and morphology statistics. Either the spatial resolution is too coarse or the domain size is too small to simulate numerous passes of the melt pool. There is a need to modify the original CAFE implementation to make it more efficient for AM solidification, as it was originally developed for casting problems characterized by spatially distributed solidification fronts.

The present work details an implementation of the CAFE model which permits simulating large domain sizes that can be used as representative volume elements. The improved computational efficiency is attained by leveraging understanding of the particular physics involved in the AM process. The primary observations are that solidification occurs sequentially in highly localized regions and that the laser time step can be much larger than the time step associated with the solidification front. This permits a sub-cycling time stepping algorithm that enables the solidification computations to be efficiently parallelized within a laser time step. The parallel scalability of the proposed approach is evaluated through numerical examples, where it is common to achieve optimality on the order of a few hundred processors, depending on the resolution of the computations. In contrast, most CAFE codes achieve optimality around a dozen processors.
It is this lack of parallel scalability that precludes simulating the large 3D microstructures that can be validated against characterization data for polycrystalline grain morphology and crystallographic texture. The implementation introduced in this work overcomes these computational limitations. To the authors' knowledge, this work presents the first CAFE simulation results of large 3D microstructures that include multiple scan paths per layer such that scan, transverse, and build direction behavior can be analyzed. The accuracy of the proposed implementation is demonstrated through a validation study showing that the simulated microstructure's grain morphology and crystallographic texture closely match experimental characterization data for laser powder bed fusion 316L stainless steel (L-PBF 316L). The model's validity is demonstrated through cross sectional inverse pole figure (IPF) maps as well as pole figure plots.

Stainless steel has widespread industrial use because of its resistance to corrosion and oxidation, having numerous applications in the aerospace, medical, and energy sectors. As it is often used in small components and joints, L-PBF 316L may achieve tremendous performance gains for these application areas by optimizing component geometry while minimizing mass. As a result, there are a number of studies investigating the effects of microstructure features on mechanical performance under quasistatic and cyclic loading [55–62] as well as corrosion resistance as a function of heat treatment post-processing [63]. Under certain build conditions, L-PBF 316L achieves greater strength and ductility than conventionally manufactured 316L because of the multiscale deformation mechanisms arising from its hierarchical microstructural features [59,62]. Wang et al.
[62] attribute the increased strength to the high density of dislocations organized into cellular networks, while the increase in ductility is due to deformation twinning and its interaction with cellular walls. Interestingly, the microstructure features and mechanical test performance varied greatly between samples built using different machines (Concept Laser and Fraunhofer machines) due to the different laser beam sizes, layer thicknesses, and scanning speeds in the study by Wang et al. [62]. Relatedly, Andreau et al. [55] showed that the direction of chamber gas flow with respect to scanning direction produces differences in grain size and texture in samples that are otherwise built identically. Therefore, it is vital to have reliable models that predict polycrystalline-scale microstructures given AM processing parameters as a cost effective means to assess the quality of a build strategy. The implementation introduced in this work aims to advance the state-of-the-art towards achieving this goal.

The paper is organized as follows. Section 2 gives a brief summary of the CAFE model, with Sections 2.1 and 2.2 describing the temperature field model and nucleation model utilized in the present study, respectively. In Section 3, the implementation introduced in this work is described and its parallel scalability is demonstrated. Validation examples are given in Section 4, followed by a discussion of the results in Section 5. Section 6 concludes the study and provides a link to the source code of the proposed CAFE implementation.

2. Cellular automata finite element model

The cellular automata finite element (CAFE) model is a mesoscale model to simulate solidification at the polycrystalline length scale. The CAFE model does not spatially resolve the individual branches of the growing dendrites (i.e., primary, secondary, and tertiary branches) but instead simulates the envelopes circumscribing the primary dendritic arms as homogeneous material equipped with a local reference frame corresponding to the crystallographic orientation of its grain identification. For cubic crystal structures, this envelope is a regular octahedron with its diagonals aligned to the primary dendrite branches, which correspond to the set of $\langle 001 \rangle$ directions in the local crystallographic coordinate system [64].

A domain $B$ is discretized into cubic voxels and overlaid with field variables $S(x,t)$, $g(x,t)$, $T(x,t)$, $x \in B$, denoting the state, grain identification, and temperature of the voxel at location $x$ and time $t$, respectively. The state variable $S \in \{0, 1, 2, 3\}$ represents the possible discrete, finite states for each voxel, given as unassigned (0), liquid (1), mushy (2), and solid (3). The unassigned state ($S = 0$) reflects the layer-by-layer deposition process and is prescribed to voxels at layers that the laser has not yet reached. The mushy state ($S = 2$) refers to voxels that lie on the solid-liquid interface. The simulation concludes when all the voxel states are solid. The states evolve according to the transition rules given in Table 2.

Table 2. Cellular automata transition rules among states.

| Transition | Transition description | State before | State after |
| --- | --- | --- | --- |
| New layer | New layer of powder added | $S = 0$ | $S = 3$ |
| Melting | Temperature > liquidus temperature | $S \neq 1$ | $S = 1$ |
| Nucleation | Grain nucleated | $S = 1$ | $S = 2$ |
| Grain growth | Liquid cell captured by mushy grain | $S = 1$ | $S = 2$ |
| Solidification | All neighbors are mushy | $S = 2$ | $S = 3$ |
| Return to growth | At least one neighbor is liquid | $S = 3$ | $S = 2$ |

The concept of a cell neighborhood is necessary to realize the state transitions, particularly growth and capture. The simulations in this work utilize the first-order Moore neighborhood [65], meaning each cell has 26 neighboring cells since simulations are performed in 3D. A solidifying cell grows if at least one of its neighbors is assigned $S = 1$ and the temperatures of all of its neighbors are below the liquidus temperature $t_L$. The most widely used solidification models relating primary dendrite tip velocity to undercooling are the LGK [66] and KGT [67] models. For ease of implementation, the present work uses a commonly used analytical approximation to the LGK model, given by

$$v(\Delta T) = \frac{D_L}{5.51\,\pi^2\,\left(-m_L(1-k)\right)^{1.5}\,\Gamma}\;\frac{\Delta T^{2.5}}{C_0^{1.5}} \qquad (1)$$

The parameters of the model are defined in Table 1. The key feature of this model is the power law relationship between the primary dendrite tip velocity and the undercooling $\Delta T = t_L - T$, where $t_L$ is the liquidus temperature and the temperature field $T$ is computed as described in Section 2.1.

Table 1. Solidification parameters for the dendrite growth model used in Eq. (1).

| Parameter | Description | Value |
| --- | --- | --- |
| $D_L$ | Diffusion coefficient | 3e−9 m²/s |
| $\Gamma$ | Gibbs-Thomson coefficient | 1e−7 K m |
| $m_L$ | Liquidus slope | −10.9 K/wt.% |
| $k$ | Partition coefficient | 0.48 |
| $C_0$ | Initial concentration | 4.85 wt.% |

A growing cell captures one of its neighbors when the growing octahedron encompasses a designated point within the neighboring cell. Traditionally, this point is the centroid of the cell, although this leads to mesh-induced spurious anisotropy in the crystallographic texture. Appendix B describes this effect and the modification developed and implemented in this work for determining the designated point representing the neighboring cell. When captured, this neighboring cell is assigned the grain identification and orientation of its capturing cell, and an octahedron for this cell commences to grow to represent the solidification front at this location. If the center of the octahedron is set equal to the designated point representing the captured cell, then spurious anisotropy is introduced in the solidified grain morphology as a result of the voxel-based mesh discretization. The Decentered Octahedral Algorithm developed by Gandin and Rappaz [33] is selected for this work as it is the most widely accepted approach to eliminate this source of spurious anisotropy. The following sections describe the calculation of the temperature field and how grain nucleation is modeled. Further details of the CAFE model can be found in the key developmental papers [31–36] as well as recent papers focused on AM [43–47,49,50,52–54].

2.1. Temperature field

The temperature field in this work is determined by the heat conduction equation in a semi-infinite domain, representing the bulk material response where boundary effects are negligible. Under this assumption, along with a perfectly insulating top boundary and temperature-independent material properties, Schwalbach et al. [68] developed an analytical series solution to the heat conduction equation with a Gaussian heat source representing the directed energy of the laser. The solution strategy begins by discretizing the continuously moving laser source into $N$ discrete laser sources with their centers denoted by spatial-temporal coordinates $(\xi_j, \tau_j)$, $j = 1, \ldots, N$. The solution to the temperature field is then given as

$$T(x,t) = T_0 + \sum_{j=1}^{N} \Delta T^{(j)}$$

$$\Delta T^{(j)} = \frac{\eta P_j \,\Delta t\, H(t-\tau_j)}{\rho c_p \sqrt{2}\,\pi^{3/2} \sqrt{\prod_{i=1}^{3} \lambda_{ii}^{(j)}}}\; \exp\left(-\frac{1}{2}\,(x-\xi_j)^{T} \left(\lambda^{(j)}\right)^{-1} (x-\xi_j)\right) \qquad (2)$$

$$\lambda_{ii}^{(j)} = \sigma_{ij}^{2} + 2\alpha\,(t-\tau_j)$$

where $T_0$, $\eta$, $P_j$, $\rho$, $c_p$, $\alpha$, $\Delta t$, and $H(\cdot)$ denote the ambient temperature, laser efficiency, laser power, material density, specific heat, thermal diffusivity, time resolution, and Heaviside function, respectively. The parameter $\sigma_{ij}$ denotes the laser beam radius for coordinate $x_i$ at time $\tau_j = j\Delta t$. The time resolution $\Delta t$ is selected to be a very small value such that the temperature field $T(x,t)$ is smooth. It is not to be confused with the laser time step $\Delta t_l$ defined below in Section 3. Table 3 provides the units of these parameters and the values used in the examples below.

Table 3. Simulation parameters for examples.

| Parameter | Description | Value |
| --- | --- | --- |
| $P$ | Laser power | 175 W |
| $\eta$ | Laser efficiency | 0.5 |
| $\rho$ | Material density | 8000 kg/m³ |
| $\kappa$ | Thermal conductivity | 18 W/(m K) |
| $c_p$ | Heat capacity | 500 J/(kg K) |
| $\sigma$ | Laser radius | (55, 55, 55) μm |
| $t_L$ | Liquidus temperature | 1609 K |
| $t_S$ | Solidus temperature | 1520.5 K |
| $t_0$ | Ambient temperature | 300 K |
| $a$ | Melt pool dimension | 75 μm |
| $b$ | Melt pool dimension | 200 μm |
| $c$ | Melt pool dimension | 75 μm |
| $d$ | Melt pool dimension | 75 μm |
| $b_h$ | Hatch spacing | 114.75 μm |
| $l_t$ | Layer thickness | 30 μm |
| $v$ | Laser velocity | 500 mm/s |

In the current implementation, the only temperature field information required to simulate the microstructure solidification is the temporally evolving melt pool geometry and the temperature at the melt pool boundary (i.e., the solid-liquid interface). Since the high cooling rate and thermal gradient for AM lead to a very localized solid-liquid interface, it is assumed that all solidification occurs at a constant temperature. The melt pool geometry is determined by the region enclosed by the isosurface of the solidification temperature, and its shape is approximated as the lower half of a double ellipsoid (i.e., two semi-ellipsoids conjoined at their planar boundary) as depicted in Fig. 1. Fig. 1a shows the SD-TD cross section, and Fig. 1b shows the SD-BD cross section, where SD, TD, and BD refer to the scanning, transverse, and build directions, respectively. The melt pool geometry is simplified to a double ellipsoid for the purpose of focusing the computational resources on the solidification modeling; it is not a necessary feature of this CAFE implementation but is used in this study to circumvent recalculating the temperature field in each time step. Numerical studies have shown that the double ellipsoid is a very accurate representation of the melt pool geometry for the temperature field model in Eq. (2).

Fig. 1. Double ellipsoid model for the melt pool geometry.

The solidification temperature (i.e., the temperature at the melt pool boundary) is calculated as $\hat{t}_S = \min(t_S, T_v(v_l))$, where $T_v(v_l)$ is the temperature obtained by inverting the dendrite velocity equation in Eq. (1), $v_l$ is the prescribed laser velocity, and $t_S$ is the solidus temperature. Essentially, as the laser speed increases beyond the solidification front speed associated with the solidus temperature $t_S$, the material further supercools and the melt pool size increases. In this situation the melt pool size is determined by finding the isosurface of the temperature field for the temperature at which the dendrite velocity $v(\Delta T)$ in Eq. (1) equals the laser scan velocity $v_l$. Therefore, prior to executing a simulation, the steady-state temperature field is computed using Eq. (2), the isosurface of the solidification temperature $\hat{t}_S$ is identified, and a double ellipsoid as described in Fig. 1 is fit to this geometry. The information provided for the CA simulation is the solidification temperature $\hat{t}_S$ and the geometric parameters $a$, $b$, $c$, $d$ of the melt pool. This steady-state solution translates through the domain according to the laser scan pattern and velocity. Enforcing solidification to occur at a constant temperature (or narrow temperature range) significantly improves the stability of the solidification computations of the CA algorithm.

2.2. Nucleation model

Grain solidification in AM processing is overwhelmingly dominated by epitaxial growth from the grains at the melt pool boundary as well as from the powder particles in the fusion layer, which themselves are polycrystalline. Nonetheless, nucleation can arise due to various events associated with system stochasticity, including splatter of partially melted material because of denudation, porosity and other surface discontinuities, or the presence of impurities.
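Before turning to the nucleation rate, the dendrite growth law of Eq. (1) and the solidification temperature of Section 2.1 can be illustrated with a minimal sketch. It assumes the power-law form of Eq. (1) with the Table 1 values and the liquidus and solidus temperatures of Table 3. The function and constant names are hypothetical and are not taken from the authors' source code.

```python
import math

# Table 1 values (SI units; concentration in wt.%)
D_L = 3e-9          # diffusion coefficient, m^2/s
GAMMA = 1e-7        # Gibbs-Thomson coefficient, K m
M_L = -10.9         # liquidus slope, K/wt.%
K_PART = 0.48       # partition coefficient
C_0 = 4.85          # initial concentration, wt.%

# Table 3 values
T_LIQUIDUS = 1609.0  # K
T_SOLIDUS = 1520.5   # K


def growth_coefficient():
    """Prefactor A in v = A * dT**2.5, collected from Eq. (1)."""
    denom = 5.51 * math.pi ** 2 * (-M_L * (1.0 - K_PART)) ** 1.5 * GAMMA
    return D_L / (denom * C_0 ** 1.5)


def dendrite_velocity(undercooling):
    """Dendrite tip velocity (m/s) for an undercooling dT = t_L - T (K)."""
    if undercooling <= 0.0:
        return 0.0  # no growth at or above the liquidus temperature
    return growth_coefficient() * undercooling ** 2.5


def undercooling_for_velocity(velocity):
    """Invert Eq. (1): undercooling (K) at which the tip grows at `velocity`."""
    return (velocity / growth_coefficient()) ** 0.4


def solidification_temperature(laser_velocity):
    """min(t_S, T_v(v_l)), with T_v obtained by inverting Eq. (1)."""
    t_v = T_LIQUIDUS - undercooling_for_velocity(laser_velocity)
    return min(T_SOLIDUS, t_v)


if __name__ == "__main__":
    v_l = 0.5  # laser velocity of Table 3 (500 mm/s) in m/s
    print(undercooling_for_velocity(v_l))
    print(solidification_temperature(v_l))
```

Under these assumptions, the Table 3 laser velocity of 500 mm/s requires an undercooling of roughly 111 K, so the inverted temperature (about 1498 K) falls below the solidus and is the value selected by the minimum. For a slow scan the solidus temperature is returned instead.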
The rate of nucleation is modeled as a function of undercooling, and most studies use the Gaussian model of Thevoz et al. [69] or that of Nastac [70]. Since solidification occurs very fast in AM and within a narrow region trailing the moving melt pool boundary, solidification happens within a small temperature range. Therefore, in this work nucleation is modeled with a constant rate, $r$, and can only occur in voxels with state $S = 1$ that have a neighboring voxel with state $S = 2$ (i.e., at the melt pool boundary). During the growth process, each undercooled liquid voxel that is about to be captured by a neighbor is first evaluated for nucleation. Let the voxel edge length be denoted as $\Delta x$. If a generated sample of a random variable uniformly distributed in $[0, 1]$ is less than $r \cdot (\Delta x)^3$, then a new grain is nucleated at the center of this voxel with a crystallographic orientation that is uniformly randomly generated according to the approach by Roşca et al. [71]. Otherwise, the voxel is captured by its neighbor, and it commences to grow as a regular octahedron determined by the Decentered Octahedron Algorithm [33].

2.3. Key modeling assumptions

The following simplifications are made in the modeling of nucleation, temperature field, and solidification. Nucleation is considered to occur only on the melt pool interface, meaning that nucleation potentially occurring elsewhere, which could arise from splattering of partially melted particles, is not considered. Solidification of subgranular phenomena, such as cellular solidification and phase segregation, is neither modeled nor resolved at the length scale of the simulations. The current implementation of the model is limited to planar solidification of a single phase, which is a generally accepted approximation for polycrystalline-scale simulations in the cooling rate and thermal gradient regimes found in AM processing.

Modeling the temperature field distribution during the build process has received considerable attention. Within the melt pool, heat flow is primarily driven by convection due to Marangoni forces [2], and a number of models have been proposed based on solving the Navier–Stokes equations [72–76]. These models focus on the scale of the melt pool and often resolve individual powder particles. They are very useful for understanding the effect of laser processing parameters on porosity, wetting of powders due to random packing, extent and effect of denudation, and material spattering. On the other hand, beyond the melt pool, heat transfer into the base material is primarily driven by conduction. For this reason, solving the heat conduction equation to determine the temperature field for mesoscale simulation is often taken to be a sufficient approximation. In fact, the overall melt pool shape and geometry are not too different from the Navier–Stokes solution at the polycrystalline length scale of a homogeneous medium [75,76]. Additionally, it is a common presumption that the effects of the latent heat of fusion on the temperature field can be ignored when modeling the AM process, as its impact has been argued to be secondary [77]. This approximation permits a one-way coupling between the CA and the temperature field, drastically reducing the computational costs. In order to further minimize computational costs, an analytical solution to the heat conduction equation for a semi-infinite domain is used in this study. The effects of boundary conditions, such as corners or thin-walled structures, as well as temperature dependent material properties can be evaluated in future studies, for example by incorporating the temperature field model developed by Steuben et al. [78].

The melt pool geometry shown in Fig. 1 is a highly idealized shape. Simulations of the temperature field at the powder scale (e.g., Khairallah et al. [72]) show deviations of the melt pool interface from an idealized shape due to the stochastic nature of particle packing. This causes local variations in the melt pool boundary not captured with an idealized geometry. Stochasticity can be incorporated in the current formulation by specifying a random field model for the thermal conductivity and specific heat, although this is beyond the scope of the present study.

3. Numerical implementation

The key observation for the proposed CAFE algorithm is that the time scale associated with resolving the temperature field is much larger than that for the solidification computations. The advancement of the laser and solidification front are staggered in a multi-time-step approach such that for each time step of the moving laser, the solidification front evolves via sub-cycle time steps. The CAFE algorithm consists of the following procedures in a given time step for the moving laser: (1) Update laser location and compute temperature field $T(x,t)$, (2) Determine new liquid voxels ($S(x,t) = 1$), (3) Determine new mushy voxels ($S(x,t) = 2$), (4) Simulate grain solidification and nucleation in undercooled liquid region, (5) Determine new solid voxels ($S(x,t) = 3$), (6) Increment time. Procedure 4 is where the sub-cycle time stepping occurs to evolve the solidification front. Pseudocode of these procedures is given by Algorithms 1–6 in Appendix A. Following Schwalbach et al. [68], the time step for simulating the moving laser is taken to be $\Delta t_l = \frac{3}{4}\frac{\sigma}{v_l}$, where $\sigma$ and $v_l$ are the laser beam radius and velocity of the laser, respectively.

As a result of the localized melting in AM, the cellular automata simulation to advance the solidification front within laser time step $\Delta t_l$ occurs in a very small region with respect to the overall volume of the domain. This region is denoted the active region and is defined for each laser time step as the region where the liquid voxels are undercooled but have yet to be solidified (i.e., $T(x,t) < t_L$ and $S(x,t) = 1$). By definition, adjacent to this region is a region comprised of neighboring mushy voxels (i.e., $S(x,t) = 2$ and there exists $x' \in N_x$ where $T(x',t) < t_L$ and $S(x',t) = 1$, with $N_x$ denoting the neighborhood of $x$), which grows into the active region within the current laser time step to execute the solidification. Recall that a voxel with $S(x,t) = 1$ can only be captured when it is undercooled ($T(x,t) < t_L$) and has a neighbor $x' \in N_x$ with $S(x',t) = 2$. An example of this region is highlighted in gray in Fig. 2, which shows the typical geometry at the moment when the laser has advanced and the solidification commences (i.e., the beginning of Procedure 4). The region in black is not active during the time step because the temperature in the region is above the liquidus temperature, and thus solidification is not yet possible.

Fig. 2. Isometric view of a CAFE simulation clipped at the laser scan center line at the time snapshot immediately before solidification commences. The active region for this time step is depicted in gray ($S(x,t) = 1$, $T(x,t) < t_L$). This is the only region in the domain where solidification and nucleation occur for the time step. The region in black is not active because the temperature is above the liquidus temperature and cannot be solidified yet ($S(x,t) = 1$, $T(x,t) \geq t_L$). The rest of the simulation domain is either solid voxels colored by inverse pole figures with respect to the build direction ($S(x,t) = 3$), formerly solid voxels converted to mushy voxels because they are in the neighborhood of the active region ($S(x,t) = 2$), or unassigned voxels for layers not yet reached by the laser ($S(x,t) = 0$), which are not shown in the figure.

A recent parallelization strategy has been proposed by Lian et al. [47] that partitions the entire simulation domain among the processors. This approach is only effective when the entire domain is active throughout the simulation period, such as in solidification during casting. If this strategy were adopted for AM solidification, many processors would be idle while just a few perform the CA simulation over the active region during each laser time step. A new parallelization strategy must be developed for solidification of AM processing to account for the small active region. In the proposed approach, the entire domain is initially partitioned as in Lian et al. [47], and each processor performs all procedures listed above except Procedure 4 under this partition. As these procedures must be performed over the entire domain, this partition efficiently balances these computations. However, the vast majority of the simulation time is spent executing Procedure 4. For the examples below in Section 4, Procedure 4 consumes greater than 98% of the total simulation time, in part because the temperature field model is analytical. In order to efficiently balance the computations, a new partition is created within each laser time step prior to executing Procedure 4. At the beginning of Procedure 4, each processor identifies the voxels in its partition that are in the active region and broadcasts them to all the processors. Once the necessary arrays within the active region are broadcast, they are assembled and evenly partitioned among all processors. Under this new partition, Procedure 4 is executed with the computations evenly balanced among all the processors.

This proposed parallelization strategy conceptually follows that proposed in Carozzani et al. [40] but for a different problem setting. The parallelization scheme developed in Carozzani et al. [40] is for a fully coupled CAFE model for ingot casting. At each time step, the scheme identifies the active region, interpolates the temperature field onto the active region of the CA grid, and performs the solidification simulation for the duration of the time step. While conceptually similar, the active region identification is less straightforward for casting problems as it is more distributed throughout the domain compared to the localized melt pool in AM processing. Also, because of the one-way coupling approximation between the temperature field and phase transformation often made for AM modeling, the dramatic time scale difference between the solidification process and temperature field evolution can be exploited by the subcycling scheme proposed in this work. In this subcycling approach, tens of thousands of solidification time steps are executed per laser time step. The larger the time scale difference, the larger the active region, which improves the parallel efficiency. These factors contribute to the parallel scaling achieved below. Whereas Carozzani et al. [40] report that the CA code only achieves a 2.3 times speedup between using 8 and 256 processors, the proposed approach can achieve an order of magnitude speedup for this range of processors.

For the execution of Procedure 4, the sub-cycle time stepping entails advancing the solidification front sequentially voxel-by-voxel. Most CAFE implementations recommend calculating the maximum velocity among all the growing cells and setting the solidification time step such that the fastest growing voxel grows by less than a scale factor of the voxel edge length. Too small a scale factor leads to too many time steps, whereas too large a scale factor leads to discretization errors.
In order to avoid having to calibrate the scale factor, the proposed implementation sets the solidification time step by calculating the time required for the first voxel to be captured. At each step, every mushy voxel’s velocity in the preferred growth direction is calculated according to Eq. (1), and the time it takes for every mushy voxel’s octahedron to have one of its faces intersect with each of its neighbors in the active region is computed. It must be noted that every mushy voxel’s temperature is enforced to be the solidification temperature tˆS as that is the true temperature present at the melt pool boundary. Each voxel neighbor is represented by a point denoted as the designated point as described in Appendix B (i.e., it is not necessarily its centroid). The smallest time δtmin is taken as the solidification time step. Let the designated points of the capturing and the captured voxel pair associated with δtmin be denoted as xgrow and xcapt, respectively. Then nucleation is determined at xcapt as described in Section 2.2. If a new grain is nucleated, its octahedron’s centroid is set to be the voxel’s centroid with diagonal length of zero. If nucleation does not occur, then the Decentered Octahedron Algorithm is used to determine the new octahedron for the voxel associated with xcapt. The diagonal lengths of all other voxel octahedra are extended by the amount v(x)δtmin. This sequence continues until none of the voxels in the active region have state S = 1. Calculating the time it takes for each mushy voxel to capture one of its neighbors is the primary time consuming component of the CAFE code. This computation is evenly distributed among processors, making full use of the available computational resources.
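The first-capture time step described above can be sketched in a few lines. This is a serial illustration only, not the authors' implementation: the dictionary container, the function names, and the example values are hypothetical, and the capture distance follows Eq. (B.1) of Appendix B, with the optional factor `xi_d` of Eq. (B.2) left at 1.

```python
import numpy as np

def capture_distance(x, R, a, y, xi_d=1.0):
    """Distance an octahedral envelope centered at x must grow to reach point y.

    Follows d = xi_d * |R (y - x)|_1 - a; xi_d = 1 recovers Eq. (B.1).
    """
    return xi_d * np.abs(R @ (y - x)).sum() - a

def first_capture(envelopes, neighbors):
    """Return (dt_min, j, k): the smallest capture time and its (grower, neighbor) pair.

    envelopes: list of dicts with keys x, R, a, v (hypothetical container)
    neighbors: neighbors[j] lists the designated points y of envelope j's neighbors
    """
    best = (np.inf, -1, -1)
    for j, env in enumerate(envelopes):
        for k, y in enumerate(neighbors[j]):
            dt = capture_distance(env["x"], env["R"], env["a"], y) / env["v"]
            if dt < best[0]:
                best = (dt, j, k)
    return best

# Two growing envelopes with illustrative (made-up) positions, sizes, and velocities.
envelopes = [
    {"x": np.zeros(3), "R": np.eye(3), "a": 0.5, "v": 2.0},
    {"x": np.array([3.0, 0.0, 0.0]), "R": np.eye(3), "a": 0.2, "v": 1.0},
]
neighbors = [
    [np.array([1.0, 0.0, 0.0]), np.array([1.0, 1.0, 0.0])],
    [np.array([2.0, 0.0, 0.0])],
]
dt_min, j_grow, k_capt = first_capture(envelopes, neighbors)
```

Because the step is set by the earliest capture event, no scale factor needs to be tuned; the cost is that every step requires the minimum over all grower and neighbor pairs.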
For each solidification time step, MPI communication is minimized: there is one MPI_Allreduce command to find the smallest time increment δtmin, and two MPI_Bcast commands performed on scalar integers to share the indices of the growing and captured voxel pair associated with δtmin. The parallel scalability of this implementation is demonstrated in Fig. 3, which plots the computational time to simulate one laser time step versus the number of processors, Nproc. The optimal number of processors depends on the size of the active region, and it is observed that the optimal number of processors increases as the number of voxels within the active region increases. Three simulations are conducted for one time step with varying active regions in order to elucidate this trend. The active region volume is varied by varying the melt pool sizes. The melt pool parameters (a, b, c, d) defined in Fig. 1 are (56.25, 135.0, 56.25, 56.25) μm for simulation A, (75.0, 180.0, 75.0, 75.0) μm for simulation B, and (93.75, 225.0, 93.75, 93.75) μm for simulation C. The laser time steps tl for the 3 simulations A, B, C are [37.5, 50, 62.5] μs, respectively. The laser scan velocity is set to be vl = 1 m/s. Thus, the spatial increment of the laser Xl = vl tl gives increments of [37.5, 50, 62.5] μm, respectively. Given a double ellipsoid melt pool with parameterized dimensions given in Fig. 1, the volume of the active region illustrated in Fig. 2 is given by

$$V_a = \frac{\pi c d}{6 b^2}\left(3 b^2 X_l - X_l^3\right) \qquad (3)$$

Let Na = int(Va/Δx³) be the number of voxels in the active region. In all 3 simulations the voxel edge length is Δx = 1 μm and the domain voxel size is (225, 124, 62), given in the scanning direction (SD), transverse direction (TD), and build direction (BD), respectively. Fig. 3a shows the number of processors versus computational time for the three simulations.
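As a worked illustration of Eq. (3), the sketch below evaluates the active region volume for the three parameter sets quoted above, taking Eq. (3) in the form V_a = πcd/(6b²)·(3b²X_l − X_l³). One reading of this expression, stated here as an assumption because Fig. 1 is not reproduced in this section, is the volume of a slab of thickness X_l cut from a half-ellipsoid whose cross-section has semi-axes c and d and whose length along the scan direction is b. The function names are illustrative, and the numerical integral is included only to check that reading.

```python
import math

def active_region_volume(b, c, d, X_l):
    """Closed form of Eq. (3); all lengths in micrometers."""
    return math.pi * c * d / (6.0 * b**2) * (3.0 * b**2 * X_l - X_l**3)

def slab_volume_numeric(b, c, d, X_l, n=20000):
    """Midpoint-rule integral of the half-ellipse area (pi/2) c d (1 - x^2/b^2) over [0, X_l]."""
    h = X_l / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        total += 0.5 * math.pi * c * d * (1.0 - (x / b) ** 2) * h
    return total

# (b, c, d, X_l) in micrometers for simulations A, B, and C as listed in the text.
cases = {
    "A": (135.0, 56.25, 56.25, 37.5),
    "B": (180.0, 75.0, 75.0, 50.0),
    "C": (225.0, 93.75, 93.75, 62.5),
}
dx = 1.0  # voxel edge length in micrometers

volumes = {name: active_region_volume(*p) for name, p in cases.items()}
voxel_counts = {name: int(v / dx**3) for name, v in volumes.items()}
for name in cases:
    print(name, round(volumes[name]), voxel_counts[name])
```

Since every length in simulations B and C is a uniform rescaling of simulation A (by 4/3 and 5/3), the active region volumes scale with the cube of those factors.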
Fig. 3. Parallel scaling study results: Computational time per laser time step versus (a) number of processors, and (b) number of processors divided by number of voxels in the active region.

In order to elucidate the trend, the data for each simulation are fit to a curve of the form y(x) = ξ1/x^ξ2 + ξ3, where ξ1, ξ2, ξ3 are parameters fit to the data using the function curve_fit in the SciPy Python package [79]. For all 3 simulations, there is a rapid drop in computational time until a critical number of processors is reached, after which the curve flattens because of the increasing demand associated with message passing. As the number of processors becomes very large it is expected that the computational time will increase because the message passing will dominate the computations. This is not explored because the number of processors where this occurs is too large. The rapid decrease in time is not explicitly apparent in Simulation C because the computational time when using a small number of processors is impractically large. As the number of voxels in the active region Na increases, the computational time per laser time step increases. Also, the curves are shifted to the right with increasing Na, meaning more processors should be used as Na increases. An increase in Na can result from either increasing the resolution or increasing the laser time step. If the active region volume is increased because of increasing the laser time step, then the overall simulation time is not drastically affected because the increase in computational time per time step is offset by the reduction in the total number of laser time steps. The following investigates if a trend can be found that extracts the dependence of the computational time on the active region volume. Fig.
3b plots the computational time for one laser time step versus the ratio of the number of processors to the number of voxels in the active region, that is Nproc/Na. It can be observed that the 3 curves get closer to each other by normalizing the number of processors with the number of voxels in the active region. The curves are more vertically aligned, although there remains an increase in computational time for larger active regions (i.e., the Simulation C curve is above the curves of Simulations A and B). The computational time increases because of the increased load of message passing associated with requiring more processors. This study demonstrates that the proposed implementation modification of the CAFE model, where a second partition is constructed for each laser time step over the active region to distribute the solidification computations, achieves a desirable parallel scalability up to a certain number of processors. Further, since the computational time does not vary much among time steps, this study can be quickly performed as a prerequisite to determine the optimal number of processors for conducting a large scale simulation. Although not shown, similar studies were performed without using the modification introduced in this work, where the only partition is over the entire simulation domain. In this case, there is no computational gain by increasing the number of processors beyond a very small number. In fact, beyond 8 processors, the computational time increases because there is an increase in time spent on message passing with minimal decrease in time for simulating the solidification. It is worth noting that memory demands are not an issue compared to the demand on computational resources. This is because the primary partitioning scheme (i.e., Partition 1) evenly partitions the entire domain among the processors.
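The curve fit used in the scaling study, y(x) = ξ1/x^ξ2 + ξ3 fitted with SciPy's `curve_fit`, can be sketched as follows. The timings below are synthetic and noise-free, generated from the model itself with made-up parameters purely to show the call; they are not the measurements plotted in Fig. 3, and the starting guess `p0` is an arbitrary choice.

```python
import numpy as np
from scipy.optimize import curve_fit

def scaling_model(n_proc, xi1, xi2, xi3):
    """Time per laser step as a decaying power law in processor count plus a plateau."""
    return xi1 / n_proc**xi2 + xi3

# Hypothetical processor counts and synthetic timings (NOT data from the study).
n_proc = np.array([2, 4, 8, 16, 32, 64, 128, 256], dtype=float)
assumed_params = (400.0, 1.0, 5.0)
timings = scaling_model(n_proc, *assumed_params)

# Fit the three parameters; p0 is an arbitrary initial guess.
fitted, covariance = curve_fit(scaling_model, n_proc, timings, p0=(100.0, 1.0, 1.0))
xi1, xi2, xi3 = fitted
```

In this form ξ3 plays the role of the plateau that remains once adding processors no longer reduces the time, so comparing ξ1/x^ξ2 with ξ3 gives a rough sense of where additional processors stop paying off.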
Field variable arrays are created to only contain the elements that are local to each processor’s partition along with ghost nodes to communicate information to partition neighbors. For solidification computations, all processors contain all the arrays defined over the active region. However, since the active region is small relative to the entire domain, the sizes of these arrays impart a low demand on memory resources.

4. Model validation

In this section, microstructures are simulated and evaluated using two different laser scan strategies, while holding all other parameters identical. The first pattern is a unidirectional, back-and-forth scanning pattern (denoted as $SD_X^{+-}$). In the second pattern, the laser scanning is back-and-forth but the scanning direction is rotated by 90° from one layer to the next (denoted as $SD_X^{+-}{}_Y^{+-}$). The scan paths are directly overlaid in every layer for the $SD_X^{+-}$ case and every other layer in the $SD_X^{+-}{}_Y^{+-}$ case. Fig. 4 illustrates these patterns. The simulation parameters, including material properties, laser parameters, melt pool geometry, and simulation geometry, are given in Table 3. As a result of the need to balance accuracy with computational cost, the simulations in the examples below have voxel resolution Δx = 1.875 μm. It is worth noting that this resolution is similar to, if not finer than, that of recently published studies. For example, Rolchigo et al. [80] determine that grain volume and aspect ratio converge at Δx = 1.67 μm. Akram et al. [43] use a voxel size of 1 μm but only perform 2D simulations. Chen et al. [38] use a voxel size of 80 μm for a simulation focused on arc welding. Koepf et al. [44] use a voxel size of 10 μm for L-PBF of Inconel 718. Lian et al. [46] use a voxel size of 2.5 μm for DED simulation of Inconel 718. Zinovieva et al. [54] use a voxel size of 5 μm to simulate selective laser melted Ti-6Al-4V.
Therefore, it is clear that it is common practice to use voxel dimension between 1–10 μm, depending on the application. Baseplate microstructures are constructed using a Voronoi tessellation to represent each grain with average grain diameter of 5 μm. The crystallographic orientations are formed by generating uniformly distributed random rotations following the approach by Roşca et al. [71]. The nucleation rate is one of the most difficult 7 K. Teferra and D.J. Rowenhorst Acta Materialia 213 (2021) 116930 Fig. 4. Illustration of the two scan patterns studied. Fig. 5. Example SD cross sections for the three simulations cases. The colors represent inverse pole figure colors in the build direction. The box in the upper right corner shows the location where the cross sections are taken. model parameters to determine since inhomogeneous nucleation, which can be conceived from defects or partially melted particles, is driven by system stochasticity. The effect on nucleation density on grain morphology and texture is examined by studying 3 nucleation rates with values r = 3e14 m−3 , 6e14 m−3 , and 1.2e15 m−3 , three cases are shown in Fig. 10. All pole figures in this work are generated using MTEX [81]. The pole figures have similar features but with decreasing magnitude as nucleation increases. This is because higher nucleation leads to greater density of grains with random orientation, which reduces the amount of texture. The sample and these 3 simulations are performed for the SD X case denoted with the least nucleation, case SD X A , has the 001 pole figure most closely resembling the data from the experimentally characterized sample, which can be found in Andreau et al. [55], and re- +− +− +− +− +− as SD X A , SD X B , and SD X C , respectively. The simulation domain size is (375, 375, 562.5 ) μm3 giving its voxel dimension to be (20 0, 20 0, 30 0 ). 
These simulations take about 6.5 hours distributed over 132 2.8 GHz Intel E5-2699v4 cores using the intel C++ compiler version 19.0.1.144 optimized for a Cray X40/50 High Performance Computing (HPC) system. Typical SD and BD cross sections are shown in Figs. 5 and 6 for all three cases, respectively. Example TD cross sections along the scan center lines and at melt pool over+− +− +− produced below in Fig. 15. A bigger sample of case SD X A is then simulated in order to verify that the pole figures achieve convergence. This simulation’s domain size is (750, 750, 562.5 ) μm3 giving its voxel dimension to be (40 0, 40 0, 30 0 ). The simulation takes approximately 65 h distributed over 144 2.7 GHz Intel Xeon Platinum 8168 cores using the intel C++ compiler version 18.1.163 for a SGI 8600 High Performance Computing (HPC) system. The results of this simulation are illustrated in Figs. 11 and 12. Fig. 11 shows +− lap regions are shown for cases in SD X A , SD X B , and SD X C in Figs. 7–9, respectively. In addition, the 001 pole figures for the 8 K. Teferra and D.J. Rowenhorst Acta Materialia 213 (2021) 116930 Fig. 6. Example BD cross sections for the three simulation cases. The colors represent inverse pole figure colors in the build direction. The box in the upper right corner shows the location where the cross sections are taken. +− Fig. 7. Example TD cross sections at the scan center lines, (a) and (b), and at a melt pool overlap region, (c), for the SD X A case. The colors represent inverse pole figure colors in the build direction. The box in the upper right corner shows the locations where the cross sections are taken. +− Fig. 8. Example TD cross sections at the scan center lines, (a) and (b), and at a melt pool overlap region, (c), for the SD X B case. The colors represent inverse pole figure colors in the build direction. The box in the upper right corner shows the location where the cross sections are taken. 9 K. Teferra and D.J. 
Rowenhorst Acta Materialia 213 (2021) 116930 +− Fig. 9. Example TD cross sections at the scan center lines, (a) and (b), and at a melt pool overlap region, (c), for the SD X C case. The colors represent inverse pole figure colors in the build direction. The box in the upper right corner shows the location where the cross sections are taken. Fig. 10. Pole figures of the 001 crystallographic direction for the three cases simulated. an isometric viewpoint of the sample along with example SD, TD, and BD cross sections, while Fig. 12 shows the 001, 101, and 111 pole figures. The 001 pole figure appears to have converged at this simulation size because it is quite similar to the pole figure generated from the smaller simulation but smoother. The following example presents simulation results of the Andreau et al. [55] built 3 samples of L-PBF 316L: Samples A and B had laser power of P = 175 W and scan speed vl = 500 mm/s, whereas in Sample C the power and scan speed were P = 400 W and vl = 1100 mm/s, respectively. The power density is the same for all 3 cases and the only difference between the Samples A and B is the SD with respect to the chamber gas flow. In Sample A, the SD is parallel and anti-parallel with the gas flow, while in the Sample B the SD is perpendicular to the gas flow. The EBSD micrographs of the SD cross sections of the 3 samples (i.e., Andreau et al. [55, Fig. 6]) show that the experimentally characterized Sample B most closely resembles the simulated results of +− +− SD X Y case described above. The material and laser parameters are identical to the previous example Simulation A. Thus the impact of using a different scan pattern on the solidified microstructural features is highlighted. This simulation’s domain size is (750, 750, 562.5 ) μm3 giving its voxel dimension to be (40 0, 40 0, 30 0 ). 
The simulation takes approximately 65 hours distributed over 144 2.7 GHz Intel Xeon Platinum 8168 cores using the intel C++ compiler version 18.1.163 for a SGI 8600 High Performance Computing (HPC) system. Fig. 13 illustrates the microstructure morphology of this simulation via a 3D isometry view as well as three cross sections along the reference frame cartesian directions. Fig. 14 shows the 001, 101, and 111 pole figures for the +− SD X . While mechanisms explaining the different microstructures A between Sample A and B are not elaborated by Andreau et al. [55], it is likely that the melt pool size in Sample A contracts and expands in the SD direction depending on if the SD is parallel or anti-parallel to the chamber gas flow. The melt pool geometry is more likely to be constant in Sample B, as it is in the simulations in this study. For completeness of the following presentation, the EBSD micrograph and pole figure data of Sample B in Andreau et al. [55] are reprinted in Fig. 15. The dominance of a particular orientation in the competitive growth process is controlled by the alignment of the preferred crystallographic growth direction with the direction of the local maximum temperature gradient, which is anti-parallel to the (outward) normal direction of the melt pool boundary. At the center of the melt pool, the temperature gradient is parallel with the BD, so grains with aligned in the BD tend to dominate growth in this +− +− SD X Y case. 5. Discussion +− The simulation parameters for the SD X case are very similar to the build conditions given in Andreau et al. [55], and the results exhibit many similarities in grain morphology and crystallography with the experimentally characterized data. The study in 10 K. Teferra and D.J. Rowenhorst Acta Materialia 213 (2021) 116930 +− Fig. 11. Isometric view along with example SD, TD, and BD cross sections of the large simulation of the SD X A case. +− Fig. 12. 
Pole figures of the , , and crystallographic directions for the large simulation of the SD X A case. region. This is shown through experimental characterization and is captured through this simulation, highlighted in the SD cross section in Fig. 5(a), the BD cross sections in Fig. 6, and the TD cross sections in Figs. 7–9. These grains with 001 || BD grow epitaxially through multiple build layers in these regions: this is most clear in Figs. 7–9, which show these grains primarily growing vertically and slight curved toward the laser SD. Between the scan center lines, the melt pool boundary normal is more widely distributed. At the widest part of the melt pool, the thermal gradient points ≈ 45◦ from the BD, and orthogonal to the SD. It is this feature that introduces the 101 || BD; 100 || SD texture, as one of the [001] will point along the high thermal gradient, which can be determined from the pole figures in Fig. 12. Grains with this orientation will be the first that start to solidify out of the melt pool, selected against any other orientations. Thus, as the widest part of the melt pool passes, and the thermal gradient starts to increasingly have a component pointing in the the SD direction, grain orientations that might have had a more favorable orientation of the pointing along the SD, have already been preferentially eliminated from the mushy zone thus leaving the dominance of the 101|| BD; 100 || SD texture. Of course, the amount of nucleation and the particulars of the local grain neighborhood make this only a general trend, and there are other orientations that grow that do not meet these perfect ideals. For example, as seen in Fig. 7(c) there are also contributions from grains with 111 || BD (i.e., blue grains), yellow (i.e., ≈ 214|| BD), and purple (i.e., ≈ 112|| BD) grains spanning multiple build layers and 11 K. Teferra and D.J. Rowenhorst Acta Materialia 213 (2021) 116930 +− +− Fig. 13. 
Isometric view along with three cross sections illustrating the grain morphology of the SD X Y case. +− +− Fig. 14. Pole figures of the , , and crystallographic directions for the simulation of the SD X Y case. oriented vertically in the BD. It is also seen that as the amount of random nucleation is increased, there is a decrease in these multilayer structures, as the nucleation disrupts the competitive growth process by creating new equiaxed grains of random orientation. For example, the observed columnar structures are less apparent in Figs. 8 and 9, and Fig. 6 shows a gradual erosion of structure from Fig. 6(a)–(c). The 101 || BD; 100 || SD texture is further reinforced by the back-and-forth rastering. Numerous researchers have observed an asymmetrical chevron pattern of the grain morphology between scan center lines in the SD cross sections. Scan strategies are designed such that the melt pool in subsequent scans along the TD overlap with the previous scan’s melt pool with the aim of elimi- nating lack of fusion pores. The chevron pattern forms because the melt pool boundary normal between the two scans point in the opposite direction with respect to the TD and SD. The asymmetry is because the melt pool overlap region extends well beyond the midpoint between the center lines. Furthermore, the preferred grain orientations for growth will be well oriented having 100 aligned with the thermal gradients from both melt pools in this overlapping region, which is the 101||BD 100 ||SD texture. The asymmetrical chevron pattern is captured in the simulations and is highlighted in Fig. 16. +− The pole figure shapes between the small and large SD X simA ulation cases (i.e, Figs. 10(a) and 12(a), respectively) are similar, 12 K. Teferra and D.J. Rowenhorst Acta Materialia 213 (2021) 116930 +− Fig. 15. Validation data used to evaluate simulation SD X . 
Images are reprinted from the Journal of Materials Processing Technology, 264, Olivier Andreau, Imade Koutiri, Patrice Peyre, Jean-Daniel Penot, Nicolas Saintier, Etienne Pessard, Thibaut De Terris, Corinne Dupuy, Thierry Baudin, Texture control of 316L parts by modulation of the melt pool morphology in selective laser melting, 21-31 (2019), with permission from ELSEVIER. +− center lines is slightly less pronounced than in the SD X case, because the alternating direction of the laser scan also alternates whether a location is within the scan center line region or the melt pool overlap region. The changing scan direction has the effect of convolving the structure of the morphology as it further distributes the variation of melt pool boundary normal direction throughout the build. The BD cross section in Fig. 13(d) exemplifies this. A close inspection reveals locations having a grid-like structure following the laser scan pattern, however much of the cross section shows a morphology without obvious structure. The pole figures for the , , and crystallographic directions are shown in Fig. 14. The peak magnitudes are lower (approximately +− 70%) than in the SD X case. Again, increasing the variation in the scan pattern spreads the distribution of the melt pool boundary normals during the build, which reduces the preference of a particular group of crystallographic orientations. Although there is only a slight texture present, the magnitudes can again be explained by examining the melt pool boundary geometry at the last remelting scan for a given region at each layer. Let the build direction be denoted by coordinate z, and then x and y denote coordinate directions in the plane perpendicular to the BD. Recall that +− for the SD X case the crystallographic direction predominantly points in the (±0 SD, 1 T D, 1 BD ) direction, following the melt pool boundary normal along the widest region of the melt pool (i.e., Fig. 12(a)). 
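The geometric argument recalled above can be checked numerically. The sketch below builds a crystal whose [001] axis lies along (0, 1, 1) in the (SD, TD, BD) sample frame, with [100] along the SD, and expresses the build direction in crystal coordinates. The choice of right-handed frame and the variable names are illustrative assumptions.

```python
import numpy as np

# Sample frame unit vectors: scan, transverse, and build directions.
SD, TD, BD = np.eye(3)

# Crystal axes written in the sample frame.
c3 = (TD + BD) / np.sqrt(2.0)   # [001] along (0 SD, 1 TD, 1 BD)
c1 = SD                         # [100] along the scan direction
c2 = np.cross(c3, c1)           # [010] completes a right-handed frame

# Rows are crystal axes, so R maps sample-frame vectors into crystal coordinates.
R = np.vstack([c1, c2, c3])

bd_in_crystal = R @ BD
sd_in_crystal = R @ SD
angle_001_to_bd = np.degrees(np.arccos(c3 @ BD))

print(bd_in_crystal, sd_in_crystal, angle_001_to_bd)
```

In crystal coordinates the build direction has two components of equal magnitude and one zero component, which by cubic symmetry is a ⟨101⟩ direction, while the scan direction is a ⟨100⟩ direction; this is the 101 || BD; 100 || SD texture discussed in the text.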
Since now the SD alternates layer by layer in the x and y directions, the crystallographic direction on average points in the (±1, ±1, 1 )xyz direction following similar arguments. The peaks of the and crystallographic directions in Fig. 14(b,c) can be explained by considering cubic symmetry. +− Fig. 16. Scan direction cross section of the large simulation of the SD X case highA lighting the asymmetrical chevron pattern of the grain morphology. The superimposed vertical black lines show scan center lines while the thicker black lines identify locations where the asymmetrical chevron patterns are observed. despite the peak magnitude dropping from 4.2 to 3 MRD (multiples of random). The similar shape between these cases and the smoothness of the pole figure in 12(a) suggests that the pole figures appear to be converging and that the large simulation is approximately the domain size necessary to predict crystallographic texture. 6. Conclusions In this work, implementation modifications are made to the CAFE model, originally developed for casting solidification, to optimize solidification modeling for AM. The parallelization algorithm proposed accommodates the localized solidification process arising from the translating melt pool in AM. The small active region at each time step is identified and a second partition is created to evenly distribute the solidification computations. Parallel scalability studies demonstrate the computational gain using this approach, which enables simulation of the large polycrystalline-scale microstructures necessary to identify texture and grain morphology. Two scan patterns are simulated to validate the model and to evaluate the effect of scan pattern on polycrystalline features. The simulated microstructures for L-PBF +− +− Fig. 13 shows the grain morphology of the SD X Y scan pattern, where the effect of the laser scanning in both directions perpendicular to the BD can be observed. 
Both cross sections perpendicular to the BD have the same structure of grain morphology at the melt pool centers, and is similar to the SD cross section for +− the SD X case: grains are vertically oriented at the scan center lines with crystallographic directions 001|| BD. Additionally the chevron patterns are again present in the overlap region. The dominance of the crystallographic directions in the BD along the scan 13 K. Teferra and D.J. Rowenhorst Acta Materialia 213 (2021) 116930 316L show very good agreement with experimental characterization data in terms of grain morphology and pole figure plots. The modularity of the model permits swapping temperature field models, nucleation models, or dendrite growth models to accommodate different physical properties or fidelity. It is anticipated that this model will be useful for numerous subsequent studies aimed to ascertain the effect of build parameters of microstructural features. The source code of the implementation described in this study as well as scripts to run the examples are publicly available at https://github.com/USNavalResearchLaboratory/AMCAFE. Declaration of Competing Interest The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper. Acknowledgments Funding for this project was provided by the Office of Naval Research through the Naval Research Laboratory’s Basic Research Program (No. N0 0 01418WX0 0 093) and through the ONR Agile ICME Tool Kit project (No. N0 0 01420WX0 0405 ). Appendix A. Pseudocode of the implementation of the CAFE model This section provides pseudocode describing the implementation of the CAFE model developed in this work. 
Initialize: Model parameters in Table 3, nucleation rate r, voxel discretization of domain B, time t = 0
Output: Solidified microstructure over domain B
Create partition H_B^p of B such that ∪_{p=1}^{N_p} H_B^p = B;
for i = 0, i < number of time steps N_t, ++i do
    Compute temperature field via Algorithm 2;
    Set liquid voxels via Algorithm 3;
    Set mushy voxels via Algorithm 4;
    Solidify active region via Algorithm 5;
    Set solid voxels via Algorithm 6;
    t += t_l;
end
Algorithm 1: Overview of CAFE model.

Input: time t, scan pattern S
Output: Temperature field T(x, t)
Determine laser location x_L = (x_L, y_L, z_L) based on scan pattern S and time t;
Melt pool region M is given by Fig. 1 with origin x_L;
N = number of local voxels N_p + number of ghost voxels N_g in partition H_B^p;
/* Loop through voxels in domain x = (x, y, z) */
for i = 0, i < N, ++i do
    if voxel z_i > z_L then
        continue
    end
    if x_i ∈ M then
        T(x_i, t) = t_L
    else
        T(x_i, t) = t̂_S
    end
end
Algorithm 2: Computation of temperature field.

Input: Temperature field T(x, t)
Output: Updated voxel state S(x, t)
N = number of local voxels N_p + number of ghost voxels N_g in partition H_B^p;
/* Loop through voxels x = (x, y, z) in partition H_B^p */
for i = 0, i < N, ++i do
    if T(x_i, t) ≥ t_L then
        S(x_i, t) = 1
    end
end
Algorithm 3: Set liquid voxels.

Input: Voxel state S(x, t)
Output: Updated voxel state S(x, t)
N = number of local voxels N_p in partition H_B^p;
/* Loop through voxels x in partition H_B^p */
for i = 0, i < N, ++i do
    Determine neighborhood of x_i as N_{x_i};
    if S(x_i) = 3 & any S(N_{x_i}) = 1 then
        S(x_i) = 2
    end
end
Use MPI_Sendrecv(S) MPI communication to share voxel state S(x, t) among neighboring processors
Algorithm 4: Set mushy voxels.
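Algorithms 2 to 4 can be sketched as array operations on a single partition. This is a serial illustration with several assumptions that are not taken from the paper: the laser travels along +x, the melt pool is a double ellipsoid with front length a, rear length b, half-width c, and depth d, the neighborhood is the 26-voxel neighborhood, and the temperatures are placeholder values in arbitrary units.

```python
import numpy as np
from itertools import product

UNASSIGNED, LIQUID, MUSHY, SOLID = 0, 1, 2, 3

def in_melt_pool(X, Y, Z, laser, a, b, c, d):
    """Double-ellipsoid membership test (axis roles are assumptions)."""
    x, y, z = X - laser[0], Y - laser[1], Z - laser[2]
    length = np.where(x >= 0.0, a, b)  # front semi-axis ahead of the laser, rear behind
    return ((x / length) ** 2 + (y / c) ** 2 + (z / d) ** 2 <= 1.0) & (z <= 0.0)

def temperature_field(X, Y, Z, laser, pool, t_liq, t_sol, T_prev):
    """Algorithm 2: t_liq inside the pool, t_sol outside; voxels above the laser are skipped."""
    inside = in_melt_pool(X, Y, Z, laser, *pool)
    return np.where(Z <= laser[2], np.where(inside, t_liq, t_sol), T_prev)

def any_neighbor(mask):
    """True where at least one of the 26 neighbors of a voxel is True in mask."""
    padded = np.pad(mask, 1, mode="constant", constant_values=False)
    out = np.zeros_like(mask)
    nx, ny, nz = mask.shape
    for dx, dy, dz in product((-1, 0, 1), repeat=3):
        if dx == dy == dz == 0:
            continue
        out |= padded[1 + dx:1 + dx + nx, 1 + dy:1 + dy + ny, 1 + dz:1 + dz + nz]
    return out

def set_liquid(S, T, t_liq):
    """Algorithm 3: voxels at or above the liquidus become liquid."""
    S = S.copy()
    S[T >= t_liq] = LIQUID
    return S

def set_mushy(S):
    """Algorithm 4: solid voxels with any liquid neighbor become mushy."""
    S = S.copy()
    S[(S == SOLID) & any_neighbor(S == LIQUID)] = MUSHY
    return S

# Small fully solid block with the laser on the top surface (illustrative values only).
X, Y, Z = np.meshgrid(np.arange(7.0), np.arange(5.0), np.arange(3.0), indexing="ij")
laser, pool = (3.0, 2.0, 2.0), (1.5, 2.5, 1.5, 1.5)
t_liq, t_sol = 1.0, 0.5
T = temperature_field(X, Y, Z, laser, pool, t_liq, t_sol, np.full(X.shape, t_sol))
S = set_mushy(set_liquid(np.full(X.shape, SOLID), T, t_liq))
```

In a distributed run the same updates would act on each processor's local and ghost voxels, followed by the neighbor exchange noted at the end of Algorithm 4.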
Input: temperature T(x), voxel state S(x), grain ID G(x), and voxel octahedral diagonal length a(x)
Output: updated S(x), G(x), a(x) at time t + t_l
Identify active region A as where S(x, t) = 1 and T(x, t) < t_L;
Assemble arrays T_A^p(x) = T(x), S_A^p(x) = S(x), G_A^p(x) = G(x), a_A^p(x) = a(x) for x ∈ A;
Use MPI_Bcast to collect arrays to all processors: T_A = ∪_{p=1}^{N_p} T_A^p, S_A = ∪_{p=1}^{N_p} S_A^p, G_A = ∪_{p=1}^{N_p} G_A^p, a_A = ∪_{p=1}^{N_p} a_A^p;
Evenly distribute A into partition H_A^p such that ∪_{p=1}^{N_p} H_A^p = A and assign local arrays T̂_A^p(x) = T_A(x), Ŝ_A^p(x) = S_A(x), Ĝ_A^p(x) = G_A(x), â_A^p(x) = a_A(x) for x ∈ H_A^p;
while #(S_A = 1) > 0 do
    for j = 1,…,#(H_A^p) do
        if S_A^p(x_j) ≠ 2 then
            continue;
        end
        if ∃ S(x′, t) = 1 for x′ ∈ N_x & T(x′, t) < t_L ∀ x′ ∈ N_x then
            Calculate velocity v(x_j^p) according to Eq. (1);
            for k = 1,…,#(N_{x_j^p}) do
                Calculate distance to capture neighbor d_jk via Eq. (B.2);
                Calculate time to capture δt^p = min(δt^p, d_jk / v(x_j^p))
            end
        end
    end
    (δt, p*) = (min δt^p, argmin δt^p) // involves one MPI_Allreduce;
    x_grow, x_capt = argmin_{j^{p*}, k^{p*}} δt^{p*};
    Share j^{p*}, k^{p*} to all processors // involves two MPI_Bcast;
    if generated sample r̂ ∼ U[0, 1] < r_nuc then
        Nucleate new grain at x_capt following description in Section 2.2
    else
        Initiate octahedron for cell at x_capt using Decentered Octahedron Algorithm [33];
        Share octahedron diagonal, a for x_grow to all processors // involves one MPI_Bcast;
    end
    if x ∉ {x_grow, x_capt} then
        Increase cell octahedron diagonal, a as a += v(x)δt
    end
end
Algorithm 5: Solidification of active region.
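The second partition and the global minimum search in Algorithm 5 can be emulated serially to show the data flow. This sketch does not use MPI: `even_partition` stands in for the redistribution of the active region, and `global_min_with_rank` stands in for a reduction that returns both the minimum and its owner (in MPI this could be done, for example, with the predefined MPI_MINLOC operation). The per-voxel capture times are a placeholder function, not the physical calculation.

```python
import numpy as np

LIQUID = 1

def active_indices(S, T, t_liq):
    """Flat indices of undercooled liquid voxels, i.e. the active region A."""
    return np.flatnonzero((S == LIQUID) & (T < t_liq))

def even_partition(indices, n_proc):
    """Split the active voxels into n_proc chunks whose sizes differ by at most one."""
    return np.array_split(indices, n_proc)

def global_min_with_rank(local_minima):
    """Serial stand-in for an all-reduce returning the minimum and the rank that owns it."""
    rank = int(np.argmin(local_minima))
    return float(local_minima[rank]), rank

def placeholder_capture_time(idx):
    """Hypothetical capture time per voxel index, used only to exercise the reduction."""
    return 1.0 + np.cos(idx) ** 2

# Illustrative state: a fully liquid block with a temperature ramp (arbitrary units).
S = np.full((6, 6, 6), LIQUID)
T = np.arange(216.0).reshape(6, 6, 6)
t_liq = 100.0
n_proc = 8

active = active_indices(S, T, t_liq)
chunks = even_partition(active, n_proc)
local_minima = np.array([placeholder_capture_time(chunk).min() for chunk in chunks])
dt_min, owner = global_min_with_rank(local_minima)
sizes = [len(chunk) for chunk in chunks]
```

Each emulated rank handles nearly the same number of active voxels regardless of where the melt pool sits in the original domain partition, which is the load-balancing idea behind the second partition.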
Input: Voxel state $S(x, t)$
Output: Updated voxel state $S(x, t)$
$N$ = number of local voxels $N_p$ in partition $H_B^p$;
/* Loop through voxels $x$ in partition $H_B^p$ */
for i = 0, i < N, ++i do
    Determine neighborhood of $x_i$ as $N_{x_i}$;
    if $S(x_i) = 2$ & all $S(N_{x_i}) > 1$ then
        $S(x_i) = 3$
    end
end
Use MPI_Sendrecv(S) MPI communication to share voxel state $S(x, t)$ among neighboring processors
Algorithm 6: Set solid voxels.

Appendix B. Voxel capture details

Fig. B.1 illustrates a dendrite envelope in 2D centered at point $x$ and the growth necessary to capture neighbors $i$ and $j$, characterized by points $y_i$ and $y_j$. The time necessary for a growing dendrite envelope nucleated at $x$ to capture a neighboring voxel represented by a point $y$ is computed as follows. The distance the growing dendrite envelope needs to travel such that it first intersects a point $y$ is

$$d = |R \cdot (y - x)|_1 - a \qquad \text{(B.1)}$$

where $R$ is the rotation matrix that maps the global reference frame into the local crystal reference frame of the grain associated with $x$, $|\cdot|_1$ is the 1-norm, and $a$ is the diagonal length of the dendrite envelope at the current time. Thus, the time to capture can be computed as $\delta t = d/v$, where $v$ is the velocity of the envelope. It can be seen in Fig. B.1 that if $y$ is set to be the centroid of each voxel, then there is a bias toward capturing voxels along the primary directions of the voxel grid because the distance $y - x$ is smaller for these neighbors than for neighboring voxels along diagonals (i.e., neighbors that do not share a face). In order to offset this bias, Eq. (B.1) is modified to be

$$d = \xi_d\, |R \cdot (y - x)|_1 - a \qquad \text{(B.2)}$$

where $\xi_d = \cos(\theta)\sin(\phi)$ and $(\theta, \phi)$ are the spherical coordinate components indicating the direction of $y - x$ in the global reference frame. By taking symmetry into account, $(\theta, \phi)$ are mapped to $\theta \in [0, \pi/4]$ and $\phi \in [\pi/4, \pi/2]$ prior to computing $\xi_d$. It was found that setting $\xi_d = \cos(\theta)\sin(\phi)$ overcompensated for the bias and the chosen modification strikes a balance between overcompensated ($\xi_d = \cos(\theta)\sin(\phi)$) and undercompensated ($\xi_d = 1$) bias. When a voxel is captured, it is initiated with a length as described in Gandin and Rappaz [33]. To be consistent with this modification, this length is multiplied by $\cos(\theta)\sin(\phi)$. Fig. B.2 shows the pole figure of the example $\mathrm{SD}\,X_A^{+-}$ in Section 4 without compensating for the mesh-induced anisotropy (i.e., as compared to Fig. 10(a)). The crystallographic directions are predominantly concentrated along the voxel grid directions: this is deemed to be a spurious artifact of the mesh-induced anisotropy, which is overcome by using Eq. (B.2).

Fig. B.1. Illustration in 2D of dendrite envelope centered at $x$ and growth necessary to capture its neighboring voxels represented by points $y_i$ and $y_j$.

Fig. B.2. Pole figure of the crystallographic direction for Simulation $\mathrm{SD}\,X_A^{+-}$ defined in Section 4 without including the correction factor for the mesh-induced anisotropy. The corresponding pole figure with correction factor $\xi_d = \cos(\theta)\sin(\phi)$ included in the voxel capture algorithm is shown in Fig. 10(a).

References

[1] P. Bajaj, A. Hariharan, A. Kini, P. Kürnsteiner, D. Raabe, E.A. Jägle, Steels in additive manufacturing: a review of their microstructure and properties, Mater. Sci. Eng. 772 (2020) 138633.
[2] T. DebRoy, H. Wei, J. Zuback, T. Mukherjee, J. Elmer, J. Milewski, A.M. Beese, A. Wilson-Heid, A. De, W. Zhang, Additive manufacturing of metallic components–process, structure and properties, Prog. Mater. Sci. 92 (2018) 112–224.
[3] J.J. Lewandowski, M. Seifi, Metal additive manufacturing: a review of mechanical properties, Annu. Rev. Mater. Res. 46 (2016) 151–186.
[4] J. Liu, A.T. Gaynor, S. Chen, Z. Kang, K. Suresh, A. Takezawa, L. Li, J. Kato, J. Tang, C.C. Wang, et al., Current and future trends in topology optimization for additive manufacturing, Struct. Multidiscip. Optim. 57 (6) (2018) 2457–2483.
[5] R. Cunningham, C. Zhao, N. Parab, C. Kantzos, J. Pauza, K. Fezzaa, T. Sun, A.D. Rollett, Keyhole threshold and morphology in laser melting revealed by ultrahigh-speed x-ray imaging, Science 363 (6429) (2019) 849–852.
[6] J. Yang, J. Han, H. Yu, J. Yin, M. Gao, Z. Wang, X. Zeng, Role of molten pool mode on formability, microstructure and mechanical properties of selective laser melted Ti-6Al-4V alloy, Mater. Des. 110 (2016) 558–570.
[7] A.E. Wilson-Heid, S. Qin, A.M. Beese, Multiaxial plasticity and fracture behavior of stainless steel 316L by laser powder bed fusion: experiments and computational modeling, Acta Mater. (2020) 578–592.
[8] H. Choo, K.-L. Sham, J. Bohling, A. Ngo, X. Xiao, Y. Ren, P.J. Depond, M.J. Matthews, E. Garlea, Effect of laser power on defect, texture, and microstructure of a laser powder bed fusion processed 316L stainless steel, Mater. Des. 164 (2019) 107534.
[9] D.J. Rowenhorst, L. Nguyen, A.D. Murphy-Leonard, R.W. Fonda, Characterization of microstructure in additively manufactured 316L using automated serial sectioning, Curr. Opin. Solid State Mater. Sci. 24 (3) (2020) 100819.
[10] M.P. Echlin, W.C. Lenthe, T.M. Pollock, Three-dimensional sampling of material structure for property modeling and design, Integr. Mater. Manuf. Innov. 3 (1) (2014) 278–291.
[11] S. David, S. Babu, J. Vitek, Welding: Solidification and microstructure, JOM 55 (6) (2003) 14–20.
[12] D.A. Porter, K.E. Easterling, Phase Transformations in Metals and Alloys (Revised Reprint), CRC Press, 2009.
[13] A. Alpers, A. Brieden, P. Gritzmann, A. Lyckegaard, H.F. Poulsen, Generalized balanced power diagrams for 3D representations of polycrystals, Philos. Mag. 95 (9) (2015) 1016–1028.
[14] M. Groeber, S. Ghosh, M.D. Uchic, D.M. Dimiduk, A framework for automated analysis and simulation of 3D polycrystalline microstructures. Part 2: synthetic structure generation, Acta Mater. 56 (6) (2008) 1274–1287.
[15] M.A. Groeber, M.A. Jackson, DREAM.3D: a digital representation environment for the analysis of microstructure in 3D, Integr. Mater. Manuf. Innov. 3 (1) (2014) 5.
[16] S. Mandal, J. Lao, S. Donegan, A.D. Rollett, Generation of statistically representative synthetic three-dimensional microstructures, Scr. Mater. 146 (2018) 128–132.
[17] O. Šedivý, T. Brereton, D. Westhoff, L. Polívka, V. Beneš, V. Schmidt, A. Jäger, 3D reconstruction of grains in polycrystalline materials using a tessellation model with curved grain boundaries, Philos. Mag. 96 (18) (2016) 1926–1949.
[18] A. Spettl, T. Brereton, Q. Duan, T. Werz, C.E. Krill III, D.P. Kroese, V. Schmidt, Fitting Laguerre tessellation approximations to tomographic image data, Philos. Mag. 96 (2) (2016) 166–189.
[19] K. Teferra, L. Graham-Brady, Tessellation growth models for polycrystalline microstructures, Comput. Mater. Sci. 102 (2015) 57–67.
[20] K. Teferra, D.J. Rowenhorst, Direct parameter estimation for generalised balanced power diagrams, Philos. Mag. Lett. 98 (2) (2018) 79–87.
[21] K.L. Johnson, T.M. Rodgers, O.D. Underwood, J.D. Madison, K.R. Ford, S.R. Whetten, D.J. Dagel, J.E. Bishop, Simulation and experimental comparison of the thermo-mechanical history and 3D microstructure evolution of 304L stainless steel tubes manufactured using LENS, Comput. Mech. 61 (5) (2018) 559–574.
[22] T.M. Rodgers, J.D. Madison, V. Tikare, Simulation of metal additive manufacturing microstructures using kinetic Monte Carlo, Comput. Mater. Sci. 135 (2017) 78–89.
[23] H. Wei, G. Knapp, T. Mukherjee, T. DebRoy, Three-dimensional grain growth during multi-layer printing of a nickel-based alloy Inconel 718, Addit. Manuf. 25 (2019) 448–459.
[24] W.J. Boettinger, J.A. Warren, C. Beckermann, A. Karma, Phase-field simulation of solidification, Annu. Rev. Mater. Res. 32 (1) (2002) 163–194.
[25] L.-Q. Chen, Phase-field models for microstructure evolution, Annu. Rev. Mater. Res. 32 (1) (2002) 113–140.
[26] N. Moelans, B. Blanpain, P. Wollants, An introduction to phase-field modeling of microstructure evolution, Calphad 32 (2) (2008) 268–294.
[27] J. Park, J.-H. Kang, C.-S. Oh, Phase-field simulations and microstructural analysis of epitaxial growth during rapid solidification of additively manufactured AlSi10Mg alloy, Mater. Des. 195 (2020) 108985.
[28] L.-X. Lu, N. Sridhar, Y.-W. Zhang, Phase field simulation of powder bed-based additive manufacturing, Acta Mater. 144 (2018) 801–809.
[29] S. Sahoo, K. Chou, Phase-field simulation of microstructure evolution of Ti-6Al-4V in electron beam additive manufacturing process, Addit. Manuf. 9 (2016) 14–24.
[30] X. Zhang, Y. Liao, A phase-field model for solid-state selective laser sintering of metallic materials, Powder Technol. 339 (2018) 677–685.
[31] C.-A. Gandin, J.-L. Desbiolles, M. Rappaz, P. Thevoz, A three-dimensional cellular automaton-finite element model for the prediction of solidification grain structures, Metall. Mater. Trans. A 30 (12) (1999) 3153–3165.
[32] C.-A. Gandin, M. Rappaz, A coupled finite element-cellular automaton model for the prediction of dendritic grain structures in solidification processes, Acta Metall. Mater. 42 (7) (1994) 2233–2246.
[33] C.-A. Gandin, M. Rappaz, A 3D cellular automaton algorithm for the prediction of dendritic grain growth, Acta Mater. 45 (5) (1997) 2187–2195.
[34] C.-A. Gandin, M. Rappaz, R. Tintillier, Three-dimensional probabilistic simulation of solidification grain structures: application to superalloy precision castings, Metall. Trans. A 24 (2) (1993) 467–479.
[35] M. Rappaz, Modelling of microstructure formation in solidification processes, Int. Mater. Rev. 34 (1) (1989) 93–124.
[36] M. Rappaz, C.A. Gandin, J.-L. Desbiolles, P. Thevoz, Prediction of grain structures in various solidification processes, Metall. Mater. Trans. A 27 (3) (1996) 695–705.
[37] D. Raabe, Cellular automata in materials science with particular reference to recrystallization simulation, Annu. Rev. Mater. Res. 32 (1) (2002) 53–76.
[38] S. Chen, G. Guillemot, C.-A. Gandin, Three-dimensional cellular automaton-finite element modeling of solidification grain structures for arc-welding processes, Acta Mater. 115 (2016) 448–467.
[39] R.S. Saluja, R.G. Narayanan, S. Das, Cellular automata finite element (CAFE) model to predict the forming of friction stir welded blanks, Comput. Mater. Sci. 58 (2012) 87–100.
[40] T. Carozzani, C.-A. Gandin, H. Digonnet, Optimized parallel computing for cellular automaton–finite element modeling of solidification grain structures, Model. Simul. Mater. Sci. Eng. 22 (1) (2013) 015012.
[41] H. Dong, P.D. Lee, Simulation of the columnar-to-equiaxed transition in directionally solidified Al–Cu alloys, Acta Mater. 53 (3) (2005) 659–668.
[42] K. Reuther, M. Rettenmayr, Perspectives for cellular automata for the simulation of dendritic solidification–A review, Comput. Mater. Sci. 95 (2014) 213–220.
[43] J. Akram, P. Chalavadi, D. Pal, B. Stucker, Understanding grain evolution in additive manufacturing through modeling, Addit. Manuf. 21 (2018) 255–268.
[44] J.A. Koepf, M.R. Gotterbarm, M. Markl, C. Körner, 3D multi-layer grain structure simulation of powder bed fusion additive manufacturing, Acta Mater. 152 (2018) 119–126.
[45] X. Li, W. Tan, Numerical investigation of effects of nucleation mechanisms on grain structure in metal additive manufacturing, Comput. Mater. Sci. 153 (2018) 159–169.
[46] Y. Lian, Z. Gan, C. Yu, D. Kats, W.K. Liu, G.J. Wagner, A cellular automaton finite volume method for microstructure evolution during additive manufacturing, Mater. Des. 169 (2019) 107672.
[47] Y. Lian, S. Lin, W. Yan, W.K. Liu, G.J. Wagner, A parallelized three-dimensional cellular automaton model for grain growth during additive manufacturing, Comput. Mech. 61 (5) (2018) 543–558.
[48] A. Rai, H. Helmer, C. Körner, Simulation of grain structure evolution during powder bed based additive manufacturing, Addit. Manuf. 13 (2017) 124–134.
[49] A. Rai, M. Markl, C. Körner, A coupled cellular automaton–Lattice Boltzmann model for grain structure simulation during additive manufacturing, Comput. Mater. Sci. 124 (2016) 37–48.
[50] M.R. Rolchigo, M.Y. Mendoza, P. Samimi, D.A. Brice, B. Martin, P.C. Collins, R. LeSar, Modeling of Ti-W solidification microstructures under additive manufacturing conditions, Metall. Mater. Trans. A 48 (7) (2017) 3606–3622.
[51] R. Shi, S.A. Khairallah, T.T. Roehling, T.W. Heo, J.T. McKeown, M.J. Matthews, Microstructural control in metal laser powder bed fusion additive manufacturing using laser beam shaping strategy, Acta Mater. 184 (2020) 284–305.
[52] Y. Zhang, J. Zhang, Modeling of solidification microstructure evolution in laser powder bed fusion fabricated 316L stainless steel using combined computational fluid dynamics and cellular automata, Addit. Manuf. 28 (2019) 750–765.
[53] A. Zinoviev, O. Zinovieva, V. Ploshikhin, V. Romanova, R. Balokhonov, Evolution of grain structure during laser additive manufacturing. Simulation by a cellular automata method, Mater. Des. 106 (2016) 321–329.
[54] O. Zinovieva, A. Zinoviev, V. Ploshikhin, Three-dimensional modeling of the microstructure evolution during metal additive manufacturing, Comput. Mater. Sci. 141 (2018) 207–220.
[55] O. Andreau, I. Koutiri, P. Peyre, J.-D. Penot, N. Saintier, E. Pessard, T. De Terris, C. Dupuy, T. Baudin, Texture control of 316L parts by modulation of the melt pool morphology in selective laser melting, J. Mater. Process. Technol. 264 (2019) 21–31.
[56] U.S. Bertoli, B.E. MacDonald, J.M. Schoenung, Stability of cellular microstructure in laser powder bed fusion of 316L stainless steel, Mater. Sci. Eng. 739 (2019) 109–117.
[57] M. Godec, S. Zaefferer, B. Podgornik, M. Šinko, E. Tchernychova, Quantitative multiscale correlative microstructure analysis of additive manufacturing of stainless steel 316L processed by selective laser melting, Mater. Charact. 160 (2020) 110074.
[58] T. Kurzynowski, K. Gruber, W. Stopyra, B. Kuźnicka, E. Chlebus, Correlation between process parameters, microstructure and properties of 316L stainless steel processed by selective laser melting, Mater. Sci. Eng. 718 (2018) 64–73.
[59] A. Riemer, S. Leuders, M. Thöne, H. Richard, T. Tröster, T. Niendorf, On the fatigue crack growth behavior in 316L stainless steel manufactured by selective laser melting, Eng. Fract. Mech. 120 (2014) 15–25.
[60] K. Saeidi, X. Gao, F. Lofaj, L. Kvetková, Z.J. Shen, Transformation of austenite to duplex austenite-ferrite assembly in annealed stainless steel 316L consolidated by laser melting, J. Alloys Compd. 633 (2015) 463–469.
[61] W.M. Tucho, V.H. Lysne, H. Austbø, A. Sjolyst-Kverneland, V. Hansen, Investigation of effects of process parameters on microstructure and hardness of SLM manufactured SS316L, J. Alloys Compd. 740 (2018) 910–925.
[62] Y.M. Wang, T. Voisin, J.T. McKeown, J. Ye, N.P. Calta, Z. Li, Z. Zeng, Y. Zhang, W. Chen, T.T. Roehling, et al., Additively manufactured hierarchical stainless steels with high strength and ductility, Nat. Mater. 17 (1) (2018) 63–71.
[63] R. Fonda, D. Rowenhorst, C. Feng, A. Levinson, K. Knipling, S. Olig, A. Ntiros, B. Stiles, R. Rayne, The effects of post-processing in additively manufactured 316L stainless steels, Metall. Mater. Trans. A 51 (12) (2020) 6560–6573.
[64] M.C. Flemings, Solidification processing, Metall. Trans. 5 (10) (1974) 2121–2134.
[65] L. Nastac, D.M. Stefanescu, Stochastic modelling of microstructure formation in solidification processes, Model. Simul. Mater. Sci. Eng. 5 (4) (1997) 391.
[66] J. Lipton, M. Glicksman, W. Kurz, Dendritic growth into undercooled alloy metals, Mater. Sci. Eng. 65 (1) (1984) 57–63.
[67] W. Kurz, B. Giovanola, R. Trivedi, Theory of microstructural development during rapid solidification, Acta Metall. 34 (5) (1986) 823–830.
[68] E.J. Schwalbach, S.P. Donegan, M.G. Chapman, K.J. Chaput, M.A. Groeber, A discrete source model of powder bed fusion additive manufacturing thermal history, Addit. Manuf. 25 (2019) 485–498.
[69] P. Thevoz, J. Desbiolles, M. Rappaz, Modeling of equiaxed microstructure formation in casting, Metall. Trans. A 20 (2) (1989) 311–322.
[70] L. Nastac, Numerical modeling of solidification morphologies and segregation patterns in cast dendritic alloys, Acta Mater. 47 (17) (1999) 4253–4262.
[71] D. Roşca, A. Morawiec, M. De Graef, A new method of constructing a grid in the space of 3D rotations and its applications to texture analysis, Model. Simul. Mater. Sci. Eng. 22 (7) (2014) 075013.
[72] S.A. Khairallah, A.T. Anderson, A. Rubenchik, W.E. King, Laser powder-bed fusion additive manufacturing: physics of complex melt flow and formation mechanisms of pores, spatter, and denudation zones, Acta Mater. 108 (2016) 36–45.
[73] C. Körner, A. Bauereiß, E. Attar, Fundamental consolidation mechanisms during selective beam melting of powders, Model. Simul. Mater. Sci. Eng. 21 (8) (2013) 085011.
[74] V. Manvatkar, A. De, T. DebRoy, Spatial variation of melt pool geometry, peak temperature and solidification parameters during laser assisted additive manufacturing process, Mater. Sci. Technol. 31 (8) (2015) 924–930.
[75] T. Mukherjee, H. Wei, A. De, T. DebRoy, Heat and fluid flow in additive manufacturing–Part I: modeling of powder bed fusion, Comput. Mater. Sci. 150 (2018) 304–313.
[76] T. Mukherjee, H. Wei, A. De, T. DebRoy, Heat and fluid flow in additive manufacturing–Part II: powder bed fusion of stainless steel, and titanium, nickel and aluminum base alloys, Comput. Mater. Sci. 150 (2018) 369–380.
[77] M. Chiumenti, E. Neiva, E. Salsi, M. Cervera, S. Badia, J. Moya, Z. Chen, C. Lee, C. Davies, Numerical modelling and experimental validation in selective laser melting, Addit. Manuf. 18 (2017) 171–185.
[78] J.C. Steuben, A.J. Birnbaum, J.G. Michopoulos, A.P. Iliopoulos, Enriched analytical solutions for additive manufacturing modeling and simulation, Addit. Manuf. 25 (2019) 437–447.
[79] P. Virtanen, R. Gommers, T.E. Oliphant, M. Haberland, T. Reddy, D. Cournapeau, E. Burovski, P. Peterson, W. Weckesser, J. Bright, et al., SciPy 1.0: fundamental algorithms for scientific computing in Python, Nat. Methods 17 (3) (2020) 261–272.
[80] M. Rolchigo, B. Stump, J.F. Belak, A. Plotkowski, Sparse thermal data for cellular automata modeling of grain structure in additive manufacturing, Model. Simul. Mater. Sci. Eng. 28 (2020) 065003.
[81] F. Bachmann, R. Hielscher, H. Schaeben, Texture analysis with MTEX–free and open source software toolbox, in: Solid State Phenomena, 160, Trans Tech Publ, 2010, pp. 63–68.