AIAA-2006-0117 44th AIAA Aerospace Sciences Meeting & Exhibit January 8-12, 2006 Reno, NV Comparative Analysis of Hybrid Turbulence Closure Models in Unsteady Transonic Separated Flow Simulations D. Basu**, A. Hamed*, K. Das** Department of Aerospace Engineering and Engineering Mechanics & Q. Liu**, K. Tomko+ Department of Electrical and Computer Engineering and Computer Science University of Cincinnati Cincinnati, OH 45220-2515 ABSTRACT This paper presents assessment of DES (Detached Eddy Simulation), Hybrid RANS (Reynolds-Averaged Navier Stokes) / LES (Large Eddy Simulation), and PANS (Partially Averaged Navier-Stokes) turbulence closure models in unsteady separated flow simulations. The two-equation k-ε based models are implemented in a full 3-D NavierStokes solver and numerical results are compared for transonic flow solution over an open cavity using a 3 rd order Roe scheme. Results for the vorticity, pressure fluctuations, SPL (Sound Pressure level) spectra and for modeled and resolved TKE (Turbulent Kinetic Energy) are presented and compared with available experimental data and LES results. All models predicted accurately the dominant frequency at peak SPL. The experimental peak SPL was more closely matched by the DES1 and PANS models predictions compared to the other DES models and the hybrid model results. There are large differences among the models resolved TKE within the cavity. The results indicate that the resolved TKE levels by the DES, hybrid and the PANS models are comparable to LES predictions at (1/6)th the CPU resources. The relative execution times, as well as cache and memory utilization are compared for the models. INTRODUCTION Many CFD analyses of turbulent flows are performed within the framework of Reynolds Averaged Navier-Stokes (RANS) equations because of computational affordability. While RANS models yield prediction of useful accuracy in attached flows; they fail to accurately capture the complex flow structures in separated flow regions. Simulation strategies such as LES are attractive as an alternative for prediction of flow fields where RANS is deficient but carry a prohibitive computational cost for resolving boundary layer turbulence at the high Reynolds numbers. This in turn provides a strong incentive for the merging of these two techniques in DES, hybrid RANS-LES and PANS approaches1. DES1,2,3 was developed for the simulation of turbulent flows with massive separation at realistic Reynolds numbers4,5. It is based on the adoption of a single turbulence model that serves as a sub-grid scale model in the separated flow regions where the grid is nearly isotropic and as a RANS turbulence closure model in attached boundary layers regions6. Spalart et al.4 first proposed the concept of DES based on the original formulation of the Spalart-Allmaras (S-A) one-equation RANS model7. Subsequently, Strelets5, Bush et al.8, Batten et al.9, Nichols et al.10 and Basu et al.11 proposed parallel concepts of DES based on two-equation turbulence models. Applications of the DES models for a wide variety of problems involving separated flow configurations 11-15 have shown a certain degree of success relative to the RANS predictions. Another class of hybrid technique relies on explicitly dividing the computational domain into RANS and LES regions16-21. This approach requires initialization of the LES fluctuating quantities at the interface because the RANS * Professor, Fellow AIAA, + Assistant Professor, ** Graduate Student Copyright © 2006 by Basu et.al. Published by the 1 American Institute of Aeronautics and Astronautics, Inc., with permission region delivers Reynolds averaged flow statistics. Recently, synthetic generation of turbulence and the use of forcing functions have been used by Schluter et al.17,18 and Keating et al.19,20 at the interface between RANS and LES regions. Baurle et al.22 proposed a hybrid technique based on a k-ω RANS and a one equation sub-grid scale turbulent kinetic energy (SGS TKE) model23. This approach requires modification of the RANS TKE equations to a form that is consistent with the SGS TKE equation and a blending function. Basu et al.11,24 developed a hybrid model based on the combination of the two-equation k-ε turbulence model25 and a one-equation sub-grid-scale (SGS) model23. Recently, Girimaji et al.26 have developed PANS (Partially Averaged Navier Stokes) model based on the unresolved-to-total ratios of kinetic energy and dissipation. They found the predictions to be quite encouraging for flow over a square cylinder, a circular cylinder and nozzle flows 26. Recently, Hamed et al.27 analyzed the performance of the parallel solver FDL3DI28, with the standard RANS k- turbulence model in a commodity cluster. The relative speed up, execution time and cache analysis were compared for different numerical schemes and different sub-domain sizes. Different type of Roe schemes, a 2nd order central schemes and a 4th order compact scheme with 6th order filtering (C4F6) was considered in the analysis of turbulent flow over flat plate. The investigation indicated that the number of sub-domains as well as the type of spatial discretization scheme used significantly affected the execution time and cache miss rate. The relative and the normalized execution times were similar for the 2 nd order central scheme and the three Roe schemes but significantly greater for the high order compact scheme. The cache analysis showed the compact scheme results in higher cache accesses compared to the Roe and the central schemes but still achieves a low miss rate. Both the cache access number and the cache miss number decreased with increasing number of subdomains. In the present work, three DES models, hybrid model as well as a modified PANS formulation are developed and implemented into FDL3DI28, then evaluated in the predictions of unsteady separated transonic flows. Transonic cavity flow is dominated by shear layer instability, acoustically generated flow oscillations and shock interactions. It encompasses complex coupling between turbulence and acoustics and its SPL spectra includes discrete resonance in addition to broadband small scale fluctuations typical of turbulent shear layers29. The present analysis goes beyond prior simulations11,24,29 using WIND code30 as well as FDL3DI. In addition to the comparison of the SPL spectra with both experimental and LES results, the present investigation compares the resolved TKE from the simulations using the various closure models. The effect of computational grid on the resolved turbulent quantities is also explored in the current work. TURBULENCE MODELS Three DES models, one hybrid model and one PANS model are used in the current investigation. DES Models The DES formulas are based on the two-equation k- turbulence model33 with modifications of k, to reduce the eddy viscosity (μt) in proportion to the local grid resolution. In the first DES model (DES1), the turbulent kinetic energy dissipation rate is expressed as k 3 / 2 DES FDES 1 FDES * max , C b In the second DES model (DES2), the turbulent kinetic energy k is expressed as k DES FDES k RANS 1 FDES * min k RANS , C b 2 / 3 (1) (2) In the third DES model (DES3), the turbulent Reynolds number Ret is expressed as R et FDES * l k k 1 FDES min C b * k , l k k (3) Copyright © 2006 by Basu et.al. Published by the 2 American Institute of Aeronautics and Astronautics, Inc., with permission C In all the above equations 1-3, FDES AINT min b ,1.0 l k The local grid length is defined as max max , t u i2 where, (4) (5) u i2 u 2 v 2 w 2 (6) max max x, y, z (7) l k k3/ 2 (8) lk- is the turbulent length scale. In equation 4, AINT is a FORTRAN90 function that truncates the fractional portion of the argument. x, y, z are the dimensions for the local grid cell and lk- is the turbulent length scale.. Cb is a model constant for the refinement of the DES model that has a significant effect on resolved scales and energy cascading 8,29 . The switch / transfer from the RANS model to LES model is based on C b as well as on the ratio of the grid-scale to the turbulent length scale. In regions where, C b << l k , where the grid is fine enough to resolve the turbulent scales present, the eddy viscosity is reduced to the LES level. Otherwise, in regions where C b >> l k , where the grid is too coarse for resolving the turbulent scales present, the eddy viscosity remains as predicted by the baseline RANS model. A value of 0.61 was suggested4,34 based on the calibration of homogeneous turbulence. A value of Cb between 0.1 and 0.5 gives the best prediction of time-averaged lift coefficient for subsonic flow over airfoil 8. Mani et al.35 used both S-A based7 and SST based8 DES turbulence models in the simulation of jet flows and investigated the effect of Cb on the flow field. They compared the centerline mean velocity with experimental results and concluded that lower values of Cb, resulted in better agreement with experimental results. Previous investigations 29 by the first three authors indicated that a value of 0.1 for the Cb give the best results in terms of the SPL spectra agreement with experimental data. Hybrid Model In this formulation, the RANS TKE equation is essentially reformulated to resemble the SGS TKE equation in the regions where it is desired to resolve the TKE. The present investigation combines a RANS two-equation k- model25,33 and a SGS one -equation model of Yoshizawa and Horiuti.23 using a blending function. The blending function is applied to the turbulent kinetic energy equations and also the eddy viscosities in the RANS k equation and the SGS equation. The k-equation for the two-equation model is given by k RANS k RANSU j t RANS k RANS Pk c L K t x j x j k1 x j The SGS k-equation is given by t sgs k sgs (k sgs U j ) t x j x j k2 2 k sgs k Pk C d sgs x j (9) 3 (10) Using the blending function F the combined equation is given as Copyright © 2006 by Basu et.al. Published by the 3 American Institute of Aeronautics and Astronautics, Inc., with permission k t x j x j t k k Pk 1 M 2t x j 3 F * 1 F * C k 2 d F*L k (11) The hybrid eddy viscosity is given by t F * C 1 * f * R et * 1 F * C 2 * * k SGS * (12) The blending function is given by 1 tanh2(f d 0.5) F 2.0 (13) where, C f d AINT min b ,1.0 l k (14) Here, max x, y, z, t u i2 (15) AINT and Cb have the same significance as explained in the previous DES models. For the present simulations, C b is kept at 0.5. Other model constants are given by: σk1 = 1.0, σk2 = 1.0, Cμ1 = 0.09 and Cμ2 = 0.01 and Cd = 1. 5 It can be observed that the model reverts to equation 10 when F = 0.0. This is encountered where C b l k . On the other hand, where C b l k , fd = 1, F = 1 and the model reverts to the RANS equations. PANS Model The PANS model resolves the large unsteady fluctuations through a partial averaging concept, whose extent is determined by pre-specified fraction of unresolved turbulent energy (fk) and unresolved dissipation rate (fε). The PANS model was proposed by Girimaji et al.26 who suggested a single-stage and a two-stage PANS36. The single stage approach (PANS-v1), is used in the current investigation. As shown by Girimaji et al. 26, the extent of PANS averaging – relative to RANS – is best quantified using the unresolved-to-total ratios of kinetic energy (fk) and dissipation (f). The model equations used for k and are given by k kU j t k Pk c L K t x j x j k x j U j t x j x j t 2 2 t C f P C f 1 1 k 2 2 x k k j (16) 2u i x 2 j (17) The coefficient C 2 differentiates the PANS model from the baseline RANS model. fk C 2 C 1 f The value of f is taken as 1.0 and the coefficients C1, C2, k, are same as the parent RANS model. C1 = 1.44, C2 = 1.92, k = 1.0, = 1.3 C* 2 C 1 (18) Girimaji et al.26 used constant values of fk in their PANS computation of flow around a circular cylinder. However, the constant value of fk was found to be unsuitable for the current investigations with the large unsteady flow dynamics. Therefore, a spatially and temporally varying fk that is dependent on the local grid and the turbulent length scale was developed for the present computations. Copyright © 2006 by Basu et.al. Published by the 4 American Institute of Aeronautics and Astronautics, Inc., with permission 1 ,1 f k min (19) 2/3 l k 1 C PANS This was derived based on the analysis provided by Girimaji et al. 26,37 and also on the analysis provided by Schiestel and Dejoan38 in their partially integrated transport model (PITM). In this expression, and CPANS are constants taken as 0.3 and 0.8. lk- is the turbulent length scale, and is the local grid length. COMPUTATIONAL METHODOLOGY The numerical solutions to the unsteady, three-dimensional compressible Navier-Stokes equations are written in strong conservation-law form were obtained using the FDL3DI solver developed at AFRL. The solver was validated for a wide range of high speed and low speed; steady and unsteady problems 32,39-44. The time integration employs the implicit, approximate-factorization, Beam-Warming algorithm45 along with the diagonal form of Pullinam and Chaussee46. Newton sub iterations are used to improve temporal accuracy and stability of the algorithm. Subiterations also permit the use of the more efficient diagonal form46 of the implicit algorithm while retaining the time accuracy. The two-equation k-ε based DES models; hybrid RANS-LES model and a PANS model are implemented in the FDL3DI code. The Chimera based parallelization strategy40 was used. The computational domain was decomposed into a number of overlapped sub-domains with five points overlapped as schematically shown in figure 1. An automated pre-processor PEGSUS47 is used to determine the domain connectivity and interpolation function between the decomposed zones. The single Program Multiple Data (SPMD) parallel programming was used for parallelization strategy48. Each sub-domain or zone is assigned to an individual processor. The processor identification number was used to determine the subdomains assigned to each processor. The flow and turbulent equations for each sub-domain were solved independently in parallel, and the interpolated boundary data were updated. The MPI message-passing library was used for inter-processor communication, with point-to-point send and receive calls to exchange the Chimera boundary data between processors. The 3rd order Roe scheme with MUSCL approach is used for the spatial discretization of both the flow and the turbulent equations. The choice was based on the authors27 analysis of the execution time for the different schemes in FDL3DI for flow over a flat plate whose results are shown in figure 2. The assessed schemes include the 2 nd order central scheme, the 1st, 2nd and 3rd order Roe schemes and the 4th order compact scheme with 6th order filtering (C4F6). It can be seen from figure 2 that, among the four schemes, the 2nd order central scheme has the lowest execution time. The execution time of the C4F6 compact scheme is approximately 45% greater because of filtering. The higher spatial accuracy of the 3rd order Roe scheme optimizes the computational time. To counter the nonphysical oscillations, in regions of discontinuity for this least dissipative scheme, Van Leer’s harmonic limiter49 is used in FDL3DI. In addition, the switch of Swanson and Turkel 50, which is based upon the second difference of pressure, is used as a sensor to determine where these regions occur. The computations were performed on two commodity cluster systems, the Itanium 2 cluster at the Ohio Supercomputer Center and an AMD Athlon home cluster. The performance analysis of the code’s execution time, memory usage and performance are reported for the UC Athlon cluster. The system consists of 1.667 GHz dual processor Athlon MP nodes, each with 1 GB of memory. The Athlon has two levels of on-chip memory cache to bridge the performance gap between the processor and the memory. They are referred as level 1 (L1) and level 2 (L2) cache. A detailed look at the, the L1 and L2 on-chip cache effectiveness for the turbulence models is characterized using the PAPI performance monitoring API51. The memory utilization for the turbulence models is carried out using PERFSUITE52, which is a collection of tools, utilities, and libraries for software performance analysis developed at NCSA for performance analysis of user applications on Linux-based systems. Details of the Perfsuite Building Blocks and Architecture as well as the description of the libraries and tools are provided in reference 52 and 53. The cache-miss data and memory requirements for the DES, Hybrid and PANS methods as implemented in FDL3DI are presented after the computational results for the unsteady flow field. Copyright © 2006 by Basu et.al. Published by the 5 American Institute of Aeronautics and Astronautics, Inc., with permission COMPUTATIONAL GRID The cavity geometry has a L/D (length-to-depth) ratio of 5.0 and a W/D (width-to-depth) ratio of 0.5. The solution domain for the cavity, shown in figure 3 extended 4.5D upstream of the cavity in order to maintain the incoming boundary layer thickness δ at 10% of the cavity depth D, for a Reynolds number of 0.60×10 6/ft. Free stream supersonic conditions were set for inflow and first order extrapolation was applied at the upper boundary, at 9D above the cavity opening. First order extrapolation was also applied at the downstream boundary, 4.5D behind the rear bulkhead. Periodic boundary conditions were applied in the span-wise direction. The first three authors previously29 investigated the effect of computational grid on the cavity unsteady SPL spectra, but not on the resolved turbulent kinetic energy. However, the effect of grid on resolved turbulent quantities was not explored. Two grid levels are considered in the present investigation to assess the effect of grid resolution on resolved turbulent quantities for the various turbulence closure models. The baseline computational grid consisted of 300×120×80 grid points in the stream-wise, wall normal and span-wise direction with 160×60×80 grids within the cavity. The grids which was generated using GRIDGEN54 was clustered in the wall normal direction using hyperbolic tangent stretching function with 20 grid points within the boundary layer upstream of the cavity. The minimum wall normal grid spacing (Δy) was 1×10-4D, which corresponds to y+ of 1.0 for the first grid point. This was based on prior assessment 29 regarding the effect of the grid resolution on the SPL spectra and the TKE cascading. Within the cavity, the minimum Δy was y + of 10, and the minimum Δx led to x+ of 125. Constant grid spacing is used in the span-wise direction, with a z+ of 63. The solution domain was decomposed into twelve overlapping zones for parallel computation with a five-point overlap between the zones. The zones were constructed in such a way that the number of grids in the sub-domain ranged from 15,0000 to 35,0000. The sub-domains within the cavity had nearly the same number of grid points. The fine grid consisted of 39519590 grid points in the stream wise, wall normal and span wise direction, with 32010090 grid points within the cavity. The maximum Δx within the cavity corresponds to an x+ of 63 and the constant grid spacing in the span-wise direction results in a z+ of 55. Specific region within the cavity showing the nearly isotropic grid distribution within the cavity can be observed in the 3-D view shown in figure 4. The computational domain was divided into 24 sub-domains and the fine grid computations were performed only for DES1 model. The simulations were initiated in the unsteady mode and continued over 300,000 constant time-steps of 2.5×10-7 seconds. This corresponds to 8000 time steps per cycle for the primary spectral mode of the cavity at 500 Hz. It took 60,000 time steps to purge out the transient flow and establish resonance for the baseline grid and nearly 80,000 time steps for the fine grid. The remaining time-steps were used to capture sufficient cycles in order to have unsteady data for statistical analysis. The sound pressure level (SPL) and the turbulent kinetic energy (TKE) spectra for the cavity simulations are computed for all cases based the remaining unsteady data. RESULTS AND DISCUSSIONS Computational results are presented for Mach 1.19 flow over open cavity. The computed results are compared to the experimental data of DERA31, which were obtained at a Reynolds number of 4.336×106/ft and a transonic Mach number of 1.19. To optimize the use of available computational resources while maintaining a fully turbulent boundary layer in front of the cavity, the present simulations were performed at a Reynolds number of 0.60×10 6/ft, which is (1/7)th the value of the experimental Reynolds number. Details of the computational grids used in the study are summarized in table 1 and 2. The total number of grid points with the overlapping grid points considered are 2.75×106 for the baseline grid, and 6.81×106 for the fine grid. Rizzetta et al.32 carried out LES analysis for the same cavity configuration at a Reynolds number of 0.12×10 6/ft, using the dynamic SGS model with a 4th order compact pade-type scheme. The LES simulations were carried out using 25×106 grid points in a massive parallel computational platform with 254 processors and required pulsating flow to accomplish transition upstream of the cavity front bulkhead. Figures 5 through 9 presents the computational results obtained using the baseline grid for the different twoequations based turbulence closure models, while figures 10 through 12 present the computational results for the DES1 model using the fine grid. The sound pressure level (SPL) spectra for the cavity are compared with the experimental data31 and the LES results32. Grid resolved turbulent kinetic energy (TKE) profiles are also compared Copyright © 2006 by Basu et.al. Published by the 6 American Institute of Aeronautics and Astronautics, Inc., with permission with the LES simulations. Q iso-surfaces and vorticity contours are presented to show the fine scale structures and three-dimensionality of the cavity flowfield. Figure 5 presents the computed sound pressure level (SPL) spectra on the cavity floor (Y/D = 0.0) at a streamwise location close to the front bulkhead (X/L = 0.2) for the baseline grid. The SPL spectra were obtained by transforming the pressure-time signal into the frequency domain using fast Fourier transform (FFT). The experimental data31 as well as Rizzetta’s LES results32 are also shown in figure 5. It can be seen that all the investigated models predict the second dominant modes frequency at 500 Hz in close agreement with the experimental data and slightly underpredict the first modal frequency at 220 Hz. In general, the predicted peak amplitude of the SPL at the dominant frequency predicted is closer to the experimental results than the LES predictions. Compare to the present simulations, the LES at the lower Reynolds number 32 underpredicted the dominant mode frequency. The DES results of Hamed et al. 55 indicates an increase in frequency and peak SPL with increasing Reynolds number. One can see in figure 5 that DES1 and PANS are closest in predicting the experimental peak SPL. The DES2, hybrid and DES3 models predict another peak at around 800 Hz that is not present in the experimental results. According to the results of figure 5, the predictions using DES1 gives the best overall agreement with the experimental SPL. The grid resolved turbulent kinetic energy (TKE) was obtained from the velocity fluctuations (u , v, w) in the three directions. Figure 6 presents the time-mean spanwise averaged grid resolved TKE for all the models. All cases show high level of resolved TKE in the separated regions within the cavity. Figure 7 present the time-mean spanwise averaged grid resolved turbulent kinetic energy (TKE) profiles across the cavity shear layer at three streamwise locations for all models and for predictions from the LES 32. It can be seen that all the models predict higher grid resolved TKE than LES in the shear layer near the front bulkhead (X/L = 0.2). At X/L = 0.2, the PANS and the DES3 model predict the resolved TKE closer to the LES results. The DES1 model predict the peak at Y/D = 0.8 while all other models predict the peak at Y/D = 1.0 in agreement with the LES results. The DES2, DES3 and PANS predicted peak resolved TKE are closer to the LES results at X/L = 0.5 while DES1 is higher and hybrid is lower. At X/L = 0.8, the differences are greater among the various models. In general, all the models predicted lower resolved TKE compared to the LES within the cavity. It is hard to draw conclusions from the comparisons however because of the relative coarseness of the baseline grid in the current investigation where 60 grids were employed within the cavity in the direction normal to the cavity opening while 121 grids were used for the LES simulations. Figure 8 shows iso-surfaces of the span-wise component of Q at the cavity mid-span for the DES1 and the PANS models for the baseline grid. The Q criterion was proposed by Hunt et al 56 and is used here to show the coherent vortices downstream of the cavity forward bulkhead. It can be observed the vortex sheet is essentially twodimensional in nature upstream of the cavity. The roll up of the vortex and the impingement of the shear layer at the rear bulkhead can also be seen in figure 8. The surfaces show the formation of the Kelvin-Helmholtz instabilities as the shear layer passes over the cavity lip, the gradual roll-up/lifting of the shear layer, the breakdown of the vortex as it is convected downstream and the associated formation of separated eddies. The figures also indicate the formation of eddies that are smaller than the shed vortex within the cavity. The PANS model predicts a distinctively wider range of fine scale structures in a larger region within the cavity. Figure 9 shows the iso-surfaces of the axial component of the quantity Q (Q x) for the DES1 and the PANS model. The axial component shows the three-dimensionality of the flow field in more details. Towards the rear bulkhead, the three-dimensionality of the flowfield becomes more prominent and the breakup of the fine scale structures increases. Figure 10 through 12 present the predictions from the baseline and the fine grids for the DES1 turbulence model. Figure 10 compares the SPL spectra for the fine grid. The experimental data as well as the results from the LES simulations are also shown. Great improvement in the SPL predictions in terms of the broadband spectra frequencies and also the first two dominant modes are obtained with the increased grid resolution. The predicted SPL peak is much closer to the experimental results than the LES results. Figure 11 presents the resolved turbulent kinetic energy for the baseline and fine grids. The results from LES are also presented at three streamwise locations along the cavity opening. It is very clear that grid size refinement dramatically affects the predicted level of grid resolved turbulent kinetic energy. A noticeable increase takes place in the levels of resolved turbulent kinetic energy within the cavity especially in the region below Y/D = 0.5 with grid refinement, but the resolved values are greater than the LES results. This is consistent with the Reynolds number is five times higher in the present analysis compared to the Copyright © 2006 by Basu et.al. Published by the 7 American Institute of Aeronautics and Astronautics, Inc., with permission LES. The fine grid results exhibit the same general trends as the LES. Figure 12 shows the computed instantaneous contours of the axial vorticity (ωx) for the baseline and the fine grid. The fine grid simulations clearly predicted a much greater range of fine scale structures. Figure 13 through 16 present the results for the execution time, cache and memory analysis for the different turbulent models for the baseline grid where the computational domain is divided into 12 sub domains. Figure 13 presents a comparison of the execution time (in seconds) based on 1000 iterations for each model. The execution time increases compared to RANS because of required computations to modify the eddy viscosity in the separated regions within the cavity and is highest for the PANS model. Figure 13 indicates that the resolution of the flow fine scales is achieved with the presented DES, hybrid and PANS turbulence closure models with only 20% increase in CPU time (compared to RANS). Figure 14 illustrate the performance of the different models with regard to the L1 and L2 data cache. The level 1 (L1) data cache maintains a copy of the most recently accessed data. A larger, 256 KB, second level (L2) cache holds a copy of everything in the L1 cache and additional working data. The entire data set including and remaining data is maintained in main memory, and groups of data items are copied into the caches when any data in the group is read or written. The miss rates of the L1 and L2 caches are measured to determine what percentage of reads and writes are for data not contained within the given level of cache. This provides an indication of the amount of time the processor must spend waiting for data from the L2 cache and main memory (low cache miss rates are desirable). The data for all the models are plotted by recording cache miss statistics for the primary subroutine (the subroutine that deals with the bulk of the computation) in the solver using the PAPI API calls51,57. The miss rates of L1 & L2 data cache given on the top of corresponding bars shows that L1 data cache miss rate is lower than 5% for all the models. This means that the L1 cache can satisfy most of the data lookup. The L1 data cache misses are taken care of by the L2 cache, which has a hit rate of 64%. There is no obvious difference among the cache performance of various models at the same grid resolution. Figure 15 show the cache behaviors with regard to different sub domain sizes. The domain was unevenly partitioned into 12 zones with varying grid sizes. According to figure, 15 both the cache access number and cache miss number increase linearly with sub-domain grid sizes. Besides cache performance, we also retrieved information about memory consumption utilizing a PerfSuite52 subroutine PSF_memusage (pid,vsize,rss,ierr), which returns the virtual size and the current resident set size of the calling process in megabytes. The main memory on the Linux cluster is a paged virtual memory. When an application executes, memory space is allocated in main memory as needed, i.e., the entire data set is not necessarily contained in the memory all at once. Tools for determining the virtual memory size of an application (its total data size) and the average amount of storage needed by the application at any given time, referred to as its resident set size, help determine whether the amount of memory per node is appropriate for the problem size. The virtual memory size reflects the overall memory requirement of a process, while the resident set size denotes the memory utilized by the process at any given time. The information about memory usage is gathered by calling the memory usage routine from the primary subroutine of the flow solver at some critical points. Figure 15 illustrates the average virtual memory size (VS) and resident set size (RSS) in megabytes for all 12 subdomains. There is little difference in memory consumption among the different models since the overall data structure of the code is same for all the models. This indicates that the data locality is implemented effectively in the development of the parallel code. CONCLUSIONS This paper presents a comparative study on DES, RANS/LES and PANS two-equation based turbulence closure models in the simulation of transonic separated flows. Five models are developed and used to simulate the unsteady transonic flow over a cavity. All models reduce the TKE and the eddy viscosity in proportion to the local grid resolution and turbulent length scale to resolve the fine scale structures. The results show that all the models captured the three-dimensionality of the flowfield, the fine scale structures and the unsteady vortex shedding. The predicted SPL spectra compared better with the experimental data than the LES results. The overall agreement with the experimental SPL spectra in terms of dominant frequency and amplitude were best for the DES1 and PANS models and higher levels of grid resolved TKE were predicted with increased grid resolution. The CPU time was only 20% above that for RANS solution. The cache and memory performance are nearly same for all the models. Copyright © 2006 by Basu et.al. Published by the 8 American Institute of Aeronautics and Astronautics, Inc., with permission ACKNOWLEDGMENTS The authors would like to thank Dr. Donald Rizzetta at WPAFB, AFRL for providing the results of the LES simulations. The computations were carried out in the Itanium 2 Cluster at the Ohio Supercomputer Center (OSC) and in the AMD Athlon Cluster at UC. The authors would like to acknowledge Mr. Robert Ogden of the Aerospace Engineering Department at UC for setting up the cluster and for installing the profiling softwares PAPI and PERFSUITE in the cluster. REFERENCES 1. 2. 3. 4. 5. 6. 7. 8. 9. 10. 11. 12. 13. 14. 15. 16. 17. 18. 19. 20. Spalart, P. R., “Strategies for Turbulence Modeling and Simulations,” 2000, International Journal of Heat and Fluid Flow, Vo. 21, pp. 252-263. Hamed, A., Basu, D., and Das, K., "Detached Eddy Simulations of Supersonic Flow over Cavity," 2003, 41st AIAA Aerospace Sciences Meeting and Exhibit, Reno, Nevada, AIAA -2003-0549. Sinha, N., Dash, S. M., Chidambaram, N. and Findlay, D., “A Perspective on the Simulation of Cavity Aeroacosutics”, 1998, AIAA-98-0286. Spalart, P. R., Jou, W. H., Strelets, M., and Allmaras, S. R., “Comments on the Feasibility of LES for Wings, and on a Hybrid RANS/LES Approach,” 2001, First AFOSR International Conference on DNS/LES, Ruston, Louisiana, USA. Strelets, M., “Detached Eddy Simulation of Massively Separated Flows”, 2001, 39th AIAA Aerospace Sciences Meeting and Exhibit, AIAA-2001-0879. Krishnan, V., Squires, K. D., Forsythe, J. R., “Prediction of separated flow characteristics over a hump using RANS and DES'', 2004, AIAA-2004-2224. Spalart, P. R., and Allmaras, S. R., “ A one-equation turbulence model for aerodynamic flows”, La Rech. A’reospatiale, 1994, Vol. 1, pp. 5-21. Bush, R. H., and Mani, Mori, “A two-equation large eddy stress model for high sub-grid shear”, 2001, 31st AIAA Computational Fluid Dynamics Conference, AIAA-2001-2561. Batten, P., Goldberg, U., and Chakravarthy, S., “LNS – An approach towards embedded LES”, 2002, 40th AIAA Aerospace Sciences Meeting and Exhibit, AIAA-2002-0427. Nichols, R. H., and Nelson, C. C., “Application of Hybrid RANS/LES Turbulence models”, 2003, 41st AIAA Aerospace Sciences Meeting and Exhibit, Reno, Nevada, AIAA 2003-0083. Basu, D., Hamed, A., and Das, K., “DES and Hybrid RANS/LES models for unsteady separated turbulent flow predictions”, 2005, AIAA-2005-0503. Travin, A., Shur, M., Strelets, M. and Spalart, P. R., “Detached-Eddy Simulations Past a Circular Cylinder,” 1999, Flow Turbulence and Combustion, Vol. 63, pp. 293-313. Constantinescu, G., Chapelet, M., Squires, K., “Turbulence Modeling Applied to Flow over a Sphere,” 2003, AIAA Journal, Vol. 41, No. 9, pp. 1733-1742. Hedges, L. S., Travin, A. K. and Spalart, P. R., “Detached-Eddy Simulations Over a Simplified Landing Gear,” 2002, Journal of Fluids Engineering, Transaction of ASME, Vol. 124, No. 2, pp. 413-423. Hamed, A., Basu, D., and Das, K., “Assessment of Hybrid Turbulence Models for Unsteady High Speed Separated Flow Predictions”, 2004, 42nd AIAA Aerospace Sciences Meeting and Exhibit, Reno, Nevada, AIAA -2004-0684. Georgiadis, N. J., Alexander, J. I. D., and Roshotko, E., “Hybrid Reynolds-Averaged Navier-Stokes/LargeEddy Simulations of Supersonic Turbulent Mixing”, 2003, AIAA Journal, Vol. 41, No. 2, pp. 218-229. Schluter, J. U., “Consistent Boundary Conditions for Integrated RANS/LES Simulations: LES Inflow Conditions”, 2003, AIAA-2003-3971. Schluter, J. U., Wu, X., Alonso, J., Pitsch, H., “Coupled RANS-LES Computation of a Compressor and Combustor in a Gas Turbine Engine”, 2004, AIAA-2004-3417, 40th AIAA/ASME/SAE/ASEE Joint Propulsion Conference and Exhibit. A. Keating, G. De Prisco, U. Piomelli and E. Balaras, "Interface conditions for hybrid RANS/LES calculations," 2005, ERCOFTAC Int. Symposium on Engineering Turbulence Modelling and Measurements (ETMM6), Sardinia, Italy. A. Keating and U. Piomelli, "Synthetic generation of inflow velocities for large-eddy simulation," 2004, AIAA2004-2547, 34th AIAA Fluid Dynamics Conference and Exhibit, Portland, OR. Copyright © 2006 by Basu et.al. Published by the 9 American Institute of Aeronautics and Astronautics, Inc., with permission 21. Arunajatesan, S. and Sinha, N., “ Hybrid RANS-LES modeling for cavity aeroacoustic predictions”, 2003, International Journal of Aeroacoustics, Vol. 2, No. 1, pp. 65-93. 22. Baurle, R. A., Tam, C. J., Edwards, J. R., and Hassan, H. A., “Hybrid Simulation Approach for Cavity Flows: Blending, Algorithm, and Boundary Treatment Issues”, 2003, AIAA Journal, Vol. 41, No. 8, pp. 1463-1480. 23. Yoshizawa, A., and Horiuti, K., “A Statistically Derived Subgrid Scale Kinetic Energy Model for the LargeEddy Simulation of Turbulent Flows”, 1985, Journal of the Physical Society of Japan, Vol. 54, No. 8, pp. 28342839. 24. Basu, D., Hamed, A., and Das, K., “DES, Hybrid RANS/LES and PANS models for unsteady separated turbulent flow simulations”, 2005, Proceedings of FEDSM’05, Houston, 19-23 June, 2005, FEDSM-200577421. 25. Jones, W. P., and Launder, B. E., “The prediction of laminarization with a two-equation model of turbulence”, 1972, International Journal of Heat and Mass Transfer, Vol. 15, pp. 301-313. 26. Girimaji, Sharath S., Abdol-Hamid, Khaled S., “Partially –averaged Navier Stokes model for Turbulence: Implementation and Validation”, 2005, AIAA-2005-0502. 27. Hamed, A., Basu, D., Tomko, K., and Liu, Q., “Performance Characterization and Scalability Analysis of a Chimera Based Parallel Navier-Stokes Solver on Commodity Clusters,” Proc. of the International Conference on Parallel Computational Fluid Dynamics, College Park, MD, May 24-27, 2005. 28. Gaitonde, D., and Visbal, M. R., “High-Order Schemes for Navier-Stokes Equations: Algorithm and Implementation into FL3DI”, 1998, AFRL-VA-TR-1998-3060. 29. Hamed, A., Basu, D., and Das, K., “Assessment of Hybrid Turbulence model for predictions of unsteady transonic high-speed flow”, submitted and accepted for publications, Computers and Fluids, 2006. 30. Bush, R. H., Power, G., Towne, C. E “WIND: The Production Flow Solver of the NPARC Alliance,” 1998, AIAA-98-0935. 31. Ross, J. et al., “DERA Bedford Internal Report,” 1998, MSSA CR980744/1.0. 32. Rizzetta, D. P. and Visbal, M. R., “Large-Eddy Simulation of Supersonic Cavity Flow Fields Including Flow Control,” 2003, AIAA Journal, Vol. 41, No. 8, pp. 1452-1462. 33. Gerolymos, G. A., “Implicit Multiple grid solution of the compressible Navier-Stokes equations using k-ε turbulence closure”, 1990, AIAA Journal, Vol. 28, No. 10, pp. 1707-1717. 34. Shur, M., Spalart, P.R., Strelets, M., and Travin, A., “Detached Eddy Simulation of Airfoil at High Angle of Attack”, 1999, 4th International Symposium on Engineering Turbulence Modeling and Measurements, Corsica, May 24-26. 35. Mani, M., “Hybrid Turbulence Models for Unsteady Simulation of Jet Flows,” 2004, Journal of Aircraft, Vol. 41, No. 1, pp. 110-118. 36. Elmiligui, A., Abdol-Hamid, Khaled S., Massey, S. and Pao, P., “Numerical Study of Flow Past a Circular Cylinder using RANS, Hybrid URANS/LES and PANS formulations”, 2004, AIAA-2004-4959. 37. Girimaji S., Sreenivasan, R., Jeong E., “PANS Turbulence Model For Seamless Transition Between RANS, LES: Fixed-Point Analysis and Preliminary Results.” FEDSM2003-45336, Proceedings of ASMEFEDSM’03 2003 4th ASME-JSME Joint Fluids Engineering Conferences, July 13-16, Honolulu, Hawaii, USA, 2003. 38. Schiestel, R., and Dejoan, A., “Toward a new partially integrated transport model for coarse grid and unsteady turbulent flow simulations”, 2005, Theoretical and Computational Fluid Dynamics, Vol. 18, pp. 443-468. 39. Morgan, P., Visbal, M., and Rizzetta, D., “A Parallel High-Order Flow Solver for LES and DNS”, 2002, 32nd AIAA Fluid Dynamics Conference, AIAA-2002-3123. 40. Morgan, P. E., Visbal, M. R., and Tomko, K., “Chimera-Based Parallelization of an Implicit Navier-Stokes Solver with Applications”, 2001, 39th Aerospace Sciences Meeting & Exhibit, Reno, NV, January 2001, AIAA Paper 2001-1088. 41. Visbal, M. R. and Gaitonde, D., “Direct Numerical Simulation of a Forced Transitional Plane Wall Jet”, 1998, AIAA 98-2643. 42. Visbal, M. and Rizzetta, D., “Large-Eddy Simulation on Curvilinear Grids Using Compact Differencing and Filtering Schemes”, 2002, ASME Journal of Fluids Engineering, Vol. 124, No. 4, pp. 836-847. 43. Rizzetta, D. P., and Visbal, M. R., 2001, “Large Eddy Simulation of Supersonic Compression-Ramp Flows”, AIAA-2001-2858. 44. Morgan, P., Rizzetta, D., and Visbal, M., “Large-Eddy Simulation of Flow Over a Wall-Mounted Hump,” 2005, AIAA-2005-0484. 45. Beam, R., and Warming, R., “An Implicit Factored Scheme for the Compressible Navier-Stokes Equations,” 1978, AIAA Journal, Vol. 16, No. 4, pp. 393-402. Copyright © 2006 by Basu et.al. Published by the 10 American Institute of Aeronautics and Astronautics, Inc., with permission 46. Pullinam, T., and Chaussee, D., “A Diagonal Form of an Implicit Approximate-Factorization Algorithm,” 1981, Journal of Computational Physics, Vol. 39, No. 2, pp. 347-363. 47. Suhs, N. E., Rogers, S. E., and Dietz, W. E., “PEGASUS 5: An Automated Pre-processor for Overset-Grid CFD'', June 2002, AIAA Paper 2002-3186, AIAA Fluid Dynamics Conference, St. Louis, MO. 48. Tomko, K. A., “Grid Level Parallelization of an Implicit Solution of the 3D Navier-Stokes Equations,” Final report, Air Force Office of Scientific Research (AFOSR) Summer Faculty Research program, September 1996. 49. Van Leer, B., “Towards the ultimate conservative difference scheme, V, A second order sequel to Godunov's method”, 1979, Journal of Computational Physics, Vol. 32, pp.101-136. 50. Swanson, R. C., and Turkel, E., “On Central-Differencing and Upwind Schemes”, 1992, Journal of Computational Physics, Vol. 101, No. 2, pp. 292-306. 51. London, K, Moore, S., Mucci, P., Seymour, K., and Luczak, R., “The PAPI Cross-Platform Interface to Hardware Performance Counters,” Department of Defence Users’ Group Conference Proceedings, June 18-21, 2001. 52. Kufrin, R., “PerfSuite: An Accessible, Open Source Performance Analysis Environment for Linux”, Proc. Of the 6th International Conference on Linux Clusters: The HPC Revolution 2005 (LCI-05), April 2005. 53. http://perfsuite.ncsa.uiuc.edu/ 54. Pointwise, I., Gridgen User Manual, Version 15, 2003. 55. Hamed, A., Basu, D., and Das, K., "Effect of Reynolds Number on the Unsteady Flow and Acoustic Fields of a Supersonic Cavity," 2003, Proceedings of FEDSM '03, 4th ASME-JSME Joint Fluids Engineering Conference, Honolulu, HI, Jul 6-11, FEDSM2003-45473. 56. Hunt, J. C. R., Wray, A. A., and Moin, P., “Eddies, stream, and convergence zones in turbulent flows”, 1988, Proceedings of the 1988 Summer Program, Report CTR-S88, Center for Turbulence Research, pp. 193-208. 57. http://icl.cs.utk.edu/papi/ Table 1: Overview of Computational grids Grid Baseline Inflow Dimensions Mid Section and Cavity Outflow Dimensions Total Number of Points (with overlapping) (with overlapping) (with overlapping) 115×68×80 180×130×80 45×68×80 2,742,400 35×105×90 350×195×90 35×105×90 6,804,000 (300×120×80) Fine (395×195×90) Table 2: Details of Computational grids Grid y+ upstream of the Minimum y+ inside the Minimum x+ in side cavity cavity the cavity 300×120×80 1.0 10 125 395×195×90 1.0 4.5 63 z+ Grid Points Grids within within δ cavity 63 20 1,024,000 55 25 3,168,000 Copyright © 2006 by Basu et.al. Published by the 11 American Institute of Aeronautics and Astronautics, Inc., with permission Figure 1 Schematic of the domain connectivity for the parallel solver (41) Figure 2 Comparison of different numerical schemes in FDL3DI for execution time (27) Figure 3 Schematic of the cavity configuration Figure 4 Three-Dimensional view of the computational fine grid Copyright © 2006 by Basu et.al. Published by the 12 American Institute of Aeronautics and Astronautics, Inc., with permission DES1 DES2 DES3 Hybrid PANS Figure 5 Span-wise averaged fluctuating pressure frequency spectra on the cavity floor (Y/D = 0.0, X/L = 0.2) Copyright © 2006 by Basu et.al. Published by the 13 American Institute of Aeronautics and Astronautics, Inc., with permission DES1 DES2 Hybrid DES3 PANS Figure 6 Spanwise averaged time-mean grid resolved turbulent kinetic energy contours X/L = 0.2 X/L = 0.5 X/L = 0.8 Figure 7 Spanwise averaged time-mean grid resolved turbulent kinetic energy profiles etic energy profiles Copyright © 2006 by Basu et.al. Published by the 14 American Institute of Aeronautics and Astronautics, Inc., with permission DES1 PANS Figure 8 Iso-surfaces of the spanwise component of Q (Qz) DES1 PANS Figure 9 Iso-surfaces of the axial component of Q (Qx) Figure 10 Span-wise averaged fluctuating pressure frequency spectra on the cavity floor (Y/D = 0.0, X/L = 0.2) for fine grid Copyright © 2006 by Basu et.al. Published by the 15 American Institute of Aeronautics and Astronautics, Inc., with permission X/L = 0.2 X/L = 0.5 X/L = 0.8 Figure 11 Spanwise averaged time-mean grid resolved turbulent kinetic energy profiles for fine and baseline grid Baseline Grid Fine Grid Figure 12 Cavity mid-span axial vorticity contours Copyright © 2006 by Basu et.al. Published by the 16 American Institute of Aeronautics and Astronautics, Inc., with permission 1.25 1.20 1.2 1.15 1.13 1.14 DES1 DES2 1.15 1.12 1.1 1.05 1.00 1 0.95 0.9 DES3 RANS PANS Hybrid Schemes Figure 13 Execution Time for different turbulence models for the flow over cavity Billions Cache Performance # of Cache Access and Miss Time Ratio Relative Execution Time L1 Miss 1.6 L1 Access L2 Miss 1.4 1.2 1 0.8 0.6 0.4 0.2 4.2% 36.3% 4.2% 36.3% 4.2% 36.4% 4.2% 36.3% 4.3% 36.5% 4.6% 36.5% 0 RANS DES3 DES1 DES2 Hybrid PANS Schemes Figure 14 Number of Cache miss and Cache access for different turbulence models for cavity flow Copyright © 2006 by Basu et.al. Published by the 17 American Institute of Aeronautics and Astronautics, Inc., with permission 0.15 3.5 3 2.5 L1 Miss L2 Miss 0.12 L1 Miss L1 Access L2 Miss 0.09 2 0.06 1.5 1 0.03 0.5 0 0 0 100000 200000 300000 400000 500000 0 100000 200000 300000 400000 Subgrid Size Figure 15 Representative Cache Performance with sub-grid sizes for cavity flow Average Memory Usage MB Billions # of Cache Access and Miss Cache Performance (DES1) VSize RSS 1000 900 800 700 600 500 400 300 200 100 0 RANS DES3 DES1 DES2 Hybrid PANS VSize 817.16 853.012 850.2 844.695 861.238 861.244 232.343 259.487 257.328 253.194 265.058 263.634 RSS Figure 16 Average Memory Usage for different schemes for cavity flow Copyright © 2006 by Basu et.al. Published by the 18 American Institute of Aeronautics and Astronautics, Inc., with permission 500000