Numerical Simulation of Incompressible Viscous Fluids using Finite Differences

Manopulo N., Popiv D.
Technische Universität München, Institut für Informatik
Boltzmannstraße 3, 85748 Garching bei München
popiv@in.tum.de, manopulo@in.tum.de

Ferienakademie 2005, Sarntal (IT)
August 24, 2005

Abstract. Computational Fluid Dynamics (CFD) is one of the most popular topics in scientific computing. Although finite differences are not "the best" approach for implementing a CFD code, they provide a simple and effective means of getting acquainted with the topic. This paper provides a walkthrough of the Practical Course at the TUM on implementing a CFD code, covering the main principles, the algorithm and finally the parallelization.

Keywords. CFD, fluid dynamics, Navier-Stokes equations, finite differences, marker and cell, MAC, parallelization, domain decomposition

1.0 Introduction and Governing Equations

Understanding the behavior of fluids is a complex task, because fluids are mechanical systems with an infinite number of degrees of freedom. Nevertheless, the theoretical basis for the analysis of fluid behavior is quite complete. The Navier-Stokes equations, formulated by Claude-Louis Navier in the 1820s and by George Gabriel Stokes in the 1840s, constitute a set of partial differential equations which enable the analysis of the great majority of the fluids on earth. The task is therefore to map the continuous (infinite-dimensional) model described by the Navier-Stokes equations onto a discrete (finite-dimensional) model. The latter can be translated into a set of algebraic equations and solved on a computer. Before turning to the discretization problem, it is worth taking a brief look at the theoretical background of the Navier-Stokes equations.
1.1 The Navier-Stokes Equations

Incompressible fluids can be described by two basic conservation laws of physics: conservation of mass and conservation of momentum.

Conservation of mass:

$$\frac{\partial u}{\partial x} + \frac{\partial v}{\partial y} = 0$$

Conservation of momentum:

$$\frac{\partial u}{\partial t} + \frac{\partial p}{\partial x} = \frac{1}{Re}\left(\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2}\right) - \frac{\partial (u^2)}{\partial x} - \frac{\partial (uv)}{\partial y} + g_x$$

$$\frac{\partial v}{\partial t} + \frac{\partial p}{\partial y} = \frac{1}{Re}\left(\frac{\partial^2 v}{\partial x^2} + \frac{\partial^2 v}{\partial y^2}\right) - \frac{\partial (v^2)}{\partial y} - \frac{\partial (uv)}{\partial x} + g_y$$

Here u and v are the horizontal and vertical velocity components, p is the pressure, Re is the Reynolds number, and g_x, g_y are the body forces.

Note that there is no equation for conservation of energy. Such an equation could be obtained by taking the scalar product of the momentum equation with the velocity vector, which means that, in this case, conservation of momentum theoretically implies conservation of energy. Nevertheless, one should not forget that discretization distorts the continuous model, so energy is not always guaranteed to be conserved in numerical simulations. Energy-conserving discretization methods such as Finite Elements help to overcome this source of instability.

1.2 Finite Difference Discretization

1.2.1 The Staggered Grid

The finite difference approach consists in evaluating the unknowns at the grid points. However, experience shows that evaluating the velocities (u and v) and the pressure (p) at the same grid points leads to instabilities in the algorithm. Instead, the velocities and pressures are computed on three different grids arranged around the cells. In the staggered grid, vertical velocities are computed at the midpoints of the horizontal edges, horizontal velocities at the midpoints of the vertical edges, and pressures at the centers of the cells.

1.2.2 Finite Difference Discretization

Based on the staggered grid described above, the Navier-Stokes equations are discretized using finite differences.
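To make the staggered arrangement of Section 1.2.1 concrete, the following C sketch computes where each unknown belonging to cell (i,j) is located. The text only states that u lies on vertical edges, v on horizontal edges and p at cell centers. The choice of the right and the top edge, the 1-based cell numbering, and the names Point, pressure_position, u_position and v_position are assumptions made for illustration.

```c
/* Physical position of one unknown on the staggered grid. */
typedef struct {
    double x;
    double y;
} Point;

/* Assumed convention (not stated in the text): cell (i,j), with i,j >= 1,
 * covers the rectangle [(i-1)*dx, i*dx] x [(j-1)*dy, j*dy]. */

/* The pressure p(i,j) is stored at the center of the cell. */
Point pressure_position(int i, int j, double dx, double dy)
{
    Point pt = { (i - 0.5) * dx, (j - 0.5) * dy };
    return pt;
}

/* The horizontal velocity u(i,j) is stored at the midpoint of a vertical
 * edge; here the right edge of the cell is assumed. */
Point u_position(int i, int j, double dx, double dy)
{
    Point pt = { i * dx, (j - 0.5) * dy };
    return pt;
}

/* The vertical velocity v(i,j) is stored at the midpoint of a horizontal
 * edge; here the top edge of the cell is assumed. */
Point v_position(int i, int j, double dx, double dy)
{
    Point pt = { (i - 0.5) * dx, j * dy };
    return pt;
}
```

With this convention, the three unknowns of one cell are half a mesh width apart, which is why the difference quotients of Section 1.2.2 average neighboring values before multiplying them.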
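The terms of the horizontal momentum equation are discretized at the position of $u_{i,j}$ as

$$\left[\frac{\partial (u^2)}{\partial x}\right]_{i,j} = \frac{1}{\delta x}\left[\left(\frac{u_{i,j}+u_{i+1,j}}{2}\right)^2 - \left(\frac{u_{i-1,j}+u_{i,j}}{2}\right)^2\right] + \frac{\gamma}{\delta x}\left[\frac{|u_{i,j}+u_{i+1,j}|\,(u_{i,j}-u_{i+1,j})}{4} - \frac{|u_{i-1,j}+u_{i,j}|\,(u_{i-1,j}-u_{i,j})}{4}\right]$$

$$\left[\frac{\partial (uv)}{\partial y}\right]_{i,j} = \frac{1}{\delta y}\left[\frac{v_{i,j}+v_{i+1,j}}{2}\,\frac{u_{i,j}+u_{i,j+1}}{2} - \frac{v_{i,j-1}+v_{i+1,j-1}}{2}\,\frac{u_{i,j-1}+u_{i,j}}{2}\right] + \frac{\gamma}{\delta y}\left[\frac{|v_{i,j}+v_{i+1,j}|\,(u_{i,j}-u_{i,j+1})}{4} - \frac{|v_{i,j-1}+v_{i+1,j-1}|\,(u_{i,j-1}-u_{i,j})}{4}\right]$$

$$\left[\frac{\partial^2 u}{\partial x^2}\right]_{i,j} = \frac{u_{i+1,j} - 2u_{i,j} + u_{i-1,j}}{\delta x^2}, \qquad \left[\frac{\partial^2 u}{\partial y^2}\right]_{i,j} = \frac{u_{i,j+1} - 2u_{i,j} + u_{i,j-1}}{\delta y^2}$$

$$\left[\frac{\partial p}{\partial x}\right]_{i,j} = \frac{p_{i+1,j} - p_{i,j}}{\delta x}, \qquad \left[\frac{\partial u}{\partial x}\right]_{i,j} = \frac{u_{i,j} - u_{i-1,j}}{\delta x}$$

The terms of the vertical momentum equation are discretized at the position of $v_{i,j}$ as

$$\left[\frac{\partial (v^2)}{\partial y}\right]_{i,j} = \frac{1}{\delta y}\left[\left(\frac{v_{i,j}+v_{i,j+1}}{2}\right)^2 - \left(\frac{v_{i,j-1}+v_{i,j}}{2}\right)^2\right] + \frac{\gamma}{\delta y}\left[\frac{|v_{i,j}+v_{i,j+1}|\,(v_{i,j}-v_{i,j+1})}{4} - \frac{|v_{i,j-1}+v_{i,j}|\,(v_{i,j-1}-v_{i,j})}{4}\right]$$

$$\left[\frac{\partial (uv)}{\partial x}\right]_{i,j} = \frac{1}{\delta x}\left[\frac{u_{i,j}+u_{i,j+1}}{2}\,\frac{v_{i,j}+v_{i+1,j}}{2} - \frac{u_{i-1,j}+u_{i-1,j+1}}{2}\,\frac{v_{i-1,j}+v_{i,j}}{2}\right] + \frac{\gamma}{\delta x}\left[\frac{|u_{i,j}+u_{i,j+1}|\,(v_{i,j}-v_{i+1,j})}{4} - \frac{|u_{i-1,j}+u_{i-1,j+1}|\,(v_{i-1,j}-v_{i,j})}{4}\right]$$

$$\left[\frac{\partial^2 v}{\partial x^2}\right]_{i,j} = \frac{v_{i+1,j} - 2v_{i,j} + v_{i-1,j}}{\delta x^2}, \qquad \left[\frac{\partial^2 v}{\partial y^2}\right]_{i,j} = \frac{v_{i,j+1} - 2v_{i,j} + v_{i,j-1}}{\delta y^2}$$

$$\left[\frac{\partial p}{\partial y}\right]_{i,j} = \frac{p_{i,j+1} - p_{i,j}}{\delta y}, \qquad \left[\frac{\partial v}{\partial y}\right]_{i,j} = \frac{v_{i,j} - v_{i,j-1}}{\delta y}$$

Here $\gamma \in [0,1]$ weights the upwind (donor-cell) part of the convective terms against the central-difference part.

2.0 The Algorithm

Now that the continuous Navier-Stokes equations have been discretized, we need a way of obtaining a linear system of equations and solving it for the unknowns. Consider again the momentum equation:

$$\frac{\partial u}{\partial t} + \frac{\partial p}{\partial x} = \frac{1}{Re}\left(\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2}\right) - \frac{\partial (u^2)}{\partial x} - \frac{\partial (uv)}{\partial y} + g_x$$

Discretizing it, isolating the time derivative and multiplying by $\delta t$, we obtain

$$u^{(n+1)}_{i,j} = F^{(n)}_{i,j} - \frac{\delta t}{\delta x}\left(p^{(n+1)}_{i+1,j} - p^{(n+1)}_{i,j}\right)$$

with

$$F_{i,j} = u_{i,j} + \delta t\left[\frac{1}{Re}\left(\left[\frac{\partial^2 u}{\partial x^2}\right]_{i,j} + \left[\frac{\partial^2 u}{\partial y^2}\right]_{i,j}\right) - \left[\frac{\partial (u^2)}{\partial x}\right]_{i,j} - \left[\frac{\partial (uv)}{\partial y}\right]_{i,j} + g_x\right]$$

$$i = 1, \dots, i_{max}-1, \quad j = 1, \dots, j_{max}$$

Similarly, for the vertical direction:

$$v^{(n+1)}_{i,j} = G^{(n)}_{i,j} - \frac{\delta t}{\delta y}\left(p^{(n+1)}_{i,j+1} - p^{(n+1)}_{i,j}\right)$$

with

$$G_{i,j} = v_{i,j} + \delta t\left[\frac{1}{Re}\left(\left[\frac{\partial^2 v}{\partial x^2}\right]_{i,j} + \left[\frac{\partial^2 v}{\partial y^2}\right]_{i,j}\right) - \left[\frac{\partial (uv)}{\partial x}\right]_{i,j} - \left[\frac{\partial (v^2)}{\partial y}\right]_{i,j} + g_y\right]$$

$$i = 1, \dots, i_{max}, \quad j = 1, \dots, j_{max}-1$$

The superscript in parentheses denotes the time step and the subscripts denote the spatial indices. F and G collect the velocities and the right-hand side terms of the momentum equations, all evaluated at the current time step (n). The new velocities (n+1) are computed from the current values of F and G and from the new pressure values.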
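The velocity update at the end of Section 2.0 translates almost literally into code. The following C sketch assumes that F, G and the new pressure are already available; the row-major storage with one layer of boundary cells, and the names idx and update_velocities, are assumptions made for illustration and are not taken from the course code.

```c
#include <stddef.h>

/* Index into a field of (imax+2) x (jmax+2) values stored row-major in i,
 * including one layer of boundary cells. This layout is an assumption. */
static size_t idx(int i, int j, int jmax)
{
    return (size_t)i * (size_t)(jmax + 2) + (size_t)j;
}

/* Compute u(n+1) and v(n+1) from F(n), G(n) and the new pressure p(n+1). */
void update_velocities(int imax, int jmax, double dt, double dx, double dy,
                       const double *F, const double *G, const double *p,
                       double *u, double *v)
{
    /* u is updated for i = 1..imax-1, j = 1..jmax */
    for (int i = 1; i <= imax - 1; i++) {
        for (int j = 1; j <= jmax; j++) {
            u[idx(i, j, jmax)] = F[idx(i, j, jmax)]
                - dt / dx * (p[idx(i + 1, j, jmax)] - p[idx(i, j, jmax)]);
        }
    }
    /* v is updated for i = 1..imax, j = 1..jmax-1 */
    for (int i = 1; i <= imax; i++) {
        for (int j = 1; j <= jmax - 1; j++) {
            v[idx(i, j, jmax)] = G[idx(i, j, jmax)]
                - dt / dy * (p[idx(i, j + 1, jmax)] - p[idx(i, j, jmax)]);
        }
    }
}
```

The loop bounds mirror the index ranges given with F and G: the velocities on the outer edges of the domain are not touched here, since they are set by the boundary conditions.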
2.1 Computation of the Pressure

As mentioned above, computing the new velocity values requires the new pressure values. However, there is no separate equation for the pressure. Instead, the continuity equation can be used to obtain a linear system of equations with the pressure as the only unknown. Differentiating the above expressions for the velocities and inserting them into the continuity equation gives

$$\frac{\partial u^{(n+1)}_{i,j}}{\partial x} + \frac{\partial v^{(n+1)}_{i,j}}{\partial y} = \frac{\partial}{\partial x}\left[F^{(n)}_{i,j} - \frac{\delta t}{\delta x}\left(p^{(n+1)}_{i+1,j} - p^{(n+1)}_{i,j}\right)\right] + \frac{\partial}{\partial y}\left[G^{(n)}_{i,j} - \frac{\delta t}{\delta y}\left(p^{(n+1)}_{i,j+1} - p^{(n+1)}_{i,j}\right)\right] = 0$$

Rearranging and discretizing:

$$\frac{p^{(n+1)}_{i+1,j} - 2p^{(n+1)}_{i,j} + p^{(n+1)}_{i-1,j}}{\delta x^2} + \frac{p^{(n+1)}_{i,j+1} - 2p^{(n+1)}_{i,j} + p^{(n+1)}_{i,j-1}}{\delta y^2} = \frac{1}{\delta t}\left(\frac{F^{(n)}_{i,j} - F^{(n)}_{i-1,j}}{\delta x} + \frac{G^{(n)}_{i,j} - G^{(n)}_{i,j-1}}{\delta y}\right)$$

$$i = 1, \dots, i_{max}, \quad j = 1, \dots, j_{max}$$

This is the discrete form of the Poisson equation. The resulting set of equations can be solved using a Successive Over-Relaxation (SOR) method.

2.2 Boundary Conditions

The discrete set of equations obtained above can only be solved together with a set of boundary conditions that completely define the geometrical and physical properties of the domain. In the practical course, four types of boundary conditions are considered.

2.2.1 No-Slip Condition

This condition assumes that the boundary exerts very high friction, so the velocity component parallel to the boundary vanishes. In addition, the velocity component normal to the boundary must also vanish, because the flow cannot cross the boundary.

$$w_N = 0, \quad w_T = 0$$

w_N : normal velocity component
w_T : tangential velocity component

2.2.2 Free-Slip Condition

This condition is used when the boundary is assumed to be frictionless, i.e. the velocity component parallel to the boundary surface does not vary in the direction of the surface normal.
$$w_N = 0, \quad \frac{\partial w_T}{\partial n} = 0$$

2.2.3 Inflow Condition

When fluid enters the domain from outside with a given velocity, this is simulated by the inflow boundary condition.

$$w_N = w_N^0, \quad w_T = w_T^0$$

2.2.4 Outflow Condition

If the fluid is allowed to flow freely out of the domain, outflow boundary conditions are used. They simply consist in setting the normal derivative of both velocity components to 0. In other words, the velocities in the boundary cells are set equal to the velocities in the adjacent interior cells along the normal direction.

$$\frac{\partial w_N}{\partial n} = 0, \quad \frac{\partial w_T}{\partial n} = 0$$

The following algorithm implements the ideas described so far.

Set t := 0, n := 0
Assign u, v, p with initial values
Set the values for u and v along the fixed boundary
While t < tend
    Choose δt
    Determine values of u, v and p at the boundary
    Compute F(n) and G(n) inside the fluid domain
    Compute the right-hand side of the pressure equation
    Set it := 0
    While it < itmax and res > ε
        Perform SOR cycle in the interior cells
        Compute the norm of the residual in the pressure equation
        it := it + 1
    Compute u(n+1) and v(n+1)
    Data output for visualization, if necessary
    t := t + δt
    n := n + 1

3.0 Obstacles and Free Surfaces

So far the treatment of the problem has assumed a rectangular domain completely filled with the fluid of interest. This section discusses the additional mechanisms that are necessary for the treatment of irregular domains and free surface flows.

3.1 Obstacles

The additional effort needed to simulate obstacles, or irregular domains in general, comes from the fact that boundary conditions now need to be imposed on every cell that is defined as an obstacle. As can be seen from the figure, the boundary conditions that need to be imposed depend on the orientation of the boundary cell and on the number of its fluid neighbors. For this purpose the whole domain needs to be covered by an additional Flag matrix, which identifies each cell as fluid or obstacle.
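One possible layout for such a Flag matrix stores one integer per cell, with one bit marking fluid cells and, for obstacle cells, one bit per direction in which a fluid neighbor lies. The C sketch below is an illustration under assumptions: the bit values are chosen so that an obstacle cell with fluid to the north and west gets the binary value 00101, and the function name classify_cells, the input array is_fluid and the row-major storage are hypothetical.

```c
/* Assumed bit assignment: one bit per neighbor direction plus a fluid bit. */
enum {
    C_B = 0x00, /* obstacle cell without any classified fluid neighbor */
    B_N = 0x01, /* obstacle cell with fluid on the north */
    B_S = 0x02, /* obstacle cell with fluid on the south */
    B_W = 0x04, /* obstacle cell with fluid on the west  */
    B_E = 0x08, /* obstacle cell with fluid on the east  */
    C_F = 0x10  /* fluid cell */
};

/* Fill flag[] for a grid of (imax+2) x (jmax+2) cells stored row-major in i.
 * is_fluid[] holds 1 for fluid cells and 0 for obstacle cells. */
void classify_cells(int imax, int jmax, const int *is_fluid, int *flag)
{
    int w = jmax + 2; /* number of cells per row, including boundary layer */

    /* First loop: mark every cell as fluid or obstacle. */
    for (int i = 0; i <= imax + 1; i++) {
        for (int j = 0; j <= jmax + 1; j++) {
            flag[i * w + j] = is_fluid[i * w + j] ? C_F : C_B;
        }
    }

    /* Second loop: for interior obstacle cells, record the fluid neighbors.
     * Only the C_F bit of a neighbor is read, so in-place updates are safe. */
    for (int i = 1; i <= imax; i++) {
        for (int j = 1; j <= jmax; j++) {
            if (flag[i * w + j] & C_F) {
                continue; /* fluid cells keep the plain C_F flag */
            }
            if (flag[i * w + (j + 1)] & C_F) flag[i * w + j] |= B_N;
            if (flag[i * w + (j - 1)] & C_F) flag[i * w + j] |= B_S;
            if (flag[(i - 1) * w + j] & C_F) flag[i * w + j] |= B_W;
            if (flag[(i + 1) * w + j] & C_F) flag[i * w + j] |= B_E;
        }
    }
}
```

Storing the directions as separate bits means that a corner cell is simply the bitwise OR of two edge flags, so the boundary-condition code can test a cell with a single mask comparison.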
Furthermore, each obstacle cell has to be classified according to the orientation and number of its fluid neighbors. The procedure is as follows. First, a loop over all cells marks each cell as fluid or obstacle:

C_F     Fluid cell
C_B     Obstacle cell

Then another loop over all obstacle cells marks them according to the orientation of their neighbors, using the following scheme:

B_N     Boundary cell with fluid on the North
B_S     Boundary cell with fluid on the South
B_W     Boundary cell with fluid on the West
B_E     Boundary cell with fluid on the East
B_NW    Boundary cell with fluid on the North and West
B_NE    Boundary cell with fluid on the North and East
B_SW    Boundary cell with fluid on the South and West
B_SE    Boundary cell with fluid on the South and East

The flags are stored as binary numbers, each digit recording the status of one neighbor. For example, the flag B_NW is represented as 00101. The binary format makes it possible to use the bitwise operations defined in C when marking the different flags.

3.2 Free Surface Flows

The treatment of free surface flows is rather complicated. The main reasons are the following:

- The fluid domain is constantly subject to change.
- In addition to fluid and obstacle cells, empty cells also need to be dealt with.
- The stress tensor has to be taken into account in the computation of the boundary conditions.

The first two points are addressed by a method called Marker and Cell (MAC), which is treated in the next section. The last point is treated separately. In the practical course two free boundary problems were implemented: the "Falling Drop" and the "Breaking Dam".

3.2.1 Marker and Cell

To be able to modify the fluid boundary at each time step according to the fluid movement, a new data structure needs to be implemented.
For this purpose, the region representing the fluid is covered by a set of "particles", which are moved according to the velocities in the cells and are used to mark the cells as fluid or empty. The particles are implemented as structs containing the x and y coordinates and a pointer to the next particle. In this way the particles are organized in a singly linked list.

The figure shows the drop in the "Falling Drop" problem, covered by the particles (small circles). The squares represent the cells. Each cell contains a square number of particles, typically 9 or 16.

With the introduction of empty cells, the structure of the previously defined Flag matrix also has to be changed. The following flags have to be added to the matrix to accurately describe surface cells and their neighbors.

C_F       Inner fluid cell (not on the fluid surface)
F_N       Fluid cell with empty cell on the North
F_S       Fluid cell with empty cell on the South
F_W       Fluid cell with empty cell on the West
F_E       Fluid cell with empty cell on the East
F_NW      Fluid cell with empty cell on the North and West
F_NE      Fluid cell with empty cell on the North and East
F_SN      Fluid cell with empty cell on the South and North
F_SW      Fluid cell with empty cell on the South and West
F_SE      Fluid cell with empty cell on the South and East
F_WE      Fluid cell with empty cell on the West and East
F_ESN     Fluid cell with empty cell on the East, South and North
F_WSN     Fluid cell with empty cell on the West, South and North
F_EWN     Fluid cell with empty cell on the East, West and North
F_EWS     Fluid cell with empty cell on the East, West and South
F_EWSN    Fluid cell with empty cell in all four directions (isolated)

The new algorithm now works as follows:

Set t := 0, n := 0
Set the particles in the initial domain Ω0
Assign u, v, p with initial values
Set the values for u and v along the fixed boundary
While t < tend
    Choose δt
    Mark interior cells and surface cells
    Determine values of u, v and p at the free boundary
    Compute F(n) and G(n) inside the fluid domain
    Compute the right-hand side of the pressure equation for interior cells
    Set it := 0
    While it < itmax and ‖r^it‖ > ε
        Perform SOR cycle in the interior cells
        Compute the norm of the residual in the pressure equation
        it := it + 1
    Compute u(n+1) and v(n+1) in the fluid domain
    Set the values for u(n+1) and v(n+1) at the fixed boundary
    Determine the values for u(n+1) and v(n+1) along the free boundary
    Compute the particle positions at time t(n+1)
    Data output for visualization, if necessary
    t := t + δt
    n := n + 1

3.2.2 Free Boundary Conditions

The presence of a fluid-empty interface makes it necessary to define boundary conditions on the fluid surface. Physically, the condition that defines this boundary is the surface tension. Here we neglect this effect and therefore set the normal and tangential components of the surface stress to zero:

$$p - \frac{2}{Re}\left(n_x n_x \frac{\partial u}{\partial x} + n_x n_y \left(\frac{\partial u}{\partial y} + \frac{\partial v}{\partial x}\right) + n_y n_y \frac{\partial v}{\partial y}\right) = 0$$

$$2 n_x m_x \frac{\partial u}{\partial x} + (n_x m_y + n_y m_x)\left(\frac{\partial u}{\partial y} + \frac{\partial v}{\partial x}\right) + 2 n_y m_y \frac{\partial v}{\partial y} = 0$$

(n_x, n_y) : unit vector normal to the free boundary
(m_x, m_y) : unit vector tangential to the free boundary

The surface normal and tangential vectors are determined by the type of boundary cell defined in the Flag matrix. For example, a cell of type F_E has

$$n = (1, 0), \quad m = (0, 1)$$

whereas a cell of type F_NE has

$$n = \left(\frac{\sqrt{2}}{2}, \frac{\sqrt{2}}{2}\right), \quad m = \left(-\frac{\sqrt{2}}{2}, \frac{\sqrt{2}}{2}\right)$$

The sign of m is immaterial, since the tangential condition is homogeneous in m.

The following figure shows all the different types of boundary conditions that need to be treated.

4.0 Parallelization of the Flow Code

The code described so far has been implemented to run sequentially on a single machine. However, for simulations on larger domains and with better accuracy, the power of a single computer quickly becomes insufficient. The solution is to transform the sequential code into a parallel one, so that it can run on several machines concurrently.
As the popularity of computer clusters increases, given their advantageous power/cost ratio, parallelization shifts more and more towards distributed memory machines. Accordingly, the flow code in the practical course has been parallelized using MPI (Message Passing Interface).

4.1 Domain Decomposition

One of the most natural parallelization strategies is domain decomposition. It consists of dividing the initial domain into a number of subdomains Ω_i, i = 1, 2, 3, ..., and solving each subdomain on a different processor.

As can be seen in the figure, the domain has been divided into a matrix of subdomains with dimensions iproc x jproc, where iproc and jproc stand for the number of processors in the horizontal and vertical directions respectively. The second figure shows the unknowns and the boundary conditions on a typical subdomain. The diamonds show the velocities (u and v) and the circles stand for the pressure (p). The white symbols are the values that have to be computed, whereas the black ones need to be communicated by the neighboring subdomains, or set as boundary conditions if they lie on the boundary of the whole domain.

4.2 Communication of velocities and pressures

As hinted above, for the flow properties to be computed in a subdomain, velocity and pressure values need to be received from the neighboring subdomains. This is accomplished by copying the values in the interior strip of a subdomain into the boundary strip of the corresponding neighbor.

[Figures: exchange of pressure values; exchange of velocity values]

The pressures are communicated within the SOR cycle for the solution of the Poisson equation. The partial residuals computed by each process are sent to the master process to be added together, and the velocities are communicated after this step.
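The strip copy described above can be sketched in plain C for two horizontally adjacent subdomains. This is an illustration under assumptions, not the course code: both subdomains are held in one address space, have the same size and store a cell-centered quantity such as the pressure, and the name exchange_left_right and the row-major layout are hypothetical. In the MPI code, each of the two copies would instead be a send on one process matched by a receive on its neighbor.

```c
/* Each subdomain stores (imax+2) x (jmax+2) values row-major in i.
 * Columns i = 0 and i = imax+1 are the boundary strips in x,
 * columns i = 1..imax are interior. */
void exchange_left_right(int imax, int jmax, double *left, double *right)
{
    int w = jmax + 2; /* values per column of cells, including boundary */

    for (int j = 0; j <= jmax + 1; j++) {
        /* Last interior strip of the left subdomain goes into the
         * left boundary strip of the right subdomain. */
        right[0 * w + j] = left[imax * w + j];

        /* First interior strip of the right subdomain goes into the
         * right boundary strip of the left subdomain. */
        left[(imax + 1) * w + j] = right[1 * w + j];
    }
}
```

Because only interior strips are read and only boundary strips are written, the two copies do not interfere with each other and can be performed in either order.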
The exchange of velocities and pressures should follow a fixed order to prevent the network from being overloaded or even deadlocked. One such ordering is:

send to the left, receive from the right,
send to the right, receive from the left,
send to the top, receive from the bottom,
send to the bottom, receive from the top.

4.3 Parallelization of Free Boundary Flows

The treatment of free surface flows brings along some additional complexity: in addition to the flow properties (u, v and p), the particles also need to be parallelized. For this kind of problem the domain decomposition strategy is not the most efficient one. In free surface flows, parts of the domain are empty and require little computation, while other parts need more intensive computation. As a result, the load is not optimally balanced. A better strategy would be particle tracking, i.e. dividing the particles among the processors so that each processor tracks a fixed number of particles. For reasons of simplicity, however, the parallelization of the free surface flows has also been done using domain decomposition. The main concern in this case is the communication of the particles that leave one subdomain and move into another.

4.3.1 Communication of Particles

As mentioned before, particles are data structures which contain the x and y coordinate values and a pointer to the next particle. The whole set of particles is organized in a list. In the decomposed domain, each subdomain contains its own list of particles. When the coordinates of a particle move from the range of one subdomain into the range of another, the particle has to be moved as well. This is done by deleting the particle from the list of the source subdomain and adding it to the list of the destination subdomain. For example, a particle that lies in subdomain (3,2) can in principle move into any of its eight neighbors.
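The delete-and-add step on the two lists might look as follows in C. This is a minimal sketch under assumptions: both lists live in one address space, only the hand-over to the right-hand neighbor is shown, and the names Particle, push_particle, transfer_right and count_particles are hypothetical. In a distributed-memory code the coordinates of the leaving particles would be packed into a message instead of relinking pointers.

```c
#include <stdlib.h>

/* A particle as described in the text: coordinates plus a next pointer. */
typedef struct Particle {
    double x;
    double y;
    struct Particle *next;
} Particle;

/* Add a particle at the head of a list and return the new head.
 * If the allocation fails, the list is returned unchanged. */
Particle *push_particle(Particle *head, double x, double y)
{
    Particle *p = malloc(sizeof *p);
    if (p == NULL) {
        return head;
    }
    p->x = x;
    p->y = y;
    p->next = head;
    return p;
}

/* Move every particle with x >= x_max from the source list to the
 * destination list, where x_max is the right edge of the source subdomain. */
void transfer_right(Particle **src, Particle **dst, double x_max)
{
    Particle **link = src; /* pointer to the link that leads to the current node */

    while (*link != NULL) {
        Particle *p = *link;
        if (p->x >= x_max) {
            *link = p->next; /* delete from the source list */
            p->next = *dst;  /* add to the destination list */
            *dst = p;
        } else {
            link = &p->next; /* particle stays, advance */
        }
    }
}

/* Count the particles in a list. */
size_t count_particles(const Particle *head)
{
    size_t n = 0;
    for (; head != NULL; head = head->next) {
        n++;
    }
    return n;
}
```

Walking the list through a pointer to the current link avoids a special case for removing the head of the list.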
However, it is unlikely that a particle moves into one of the diagonal neighbors. It is therefore more efficient to implement the communication only with the neighbors sharing a face. Particles moving into a diagonal neighbor can be delivered in two steps. For example, a particle moving from (3,2) into (4,1) is first moved to (3,1) and subsequently into subdomain (4,1).

5.0 References