
A Three-Level Cartesian Geometry Based Implementation of the DSMC Method

Da Gao, Chonglin Zhang, and Thomas E. Schwartzentruber
Department of Aerospace Engineering and Mechanics, University of Minnesota, Minneapolis, MN 55455

The data structures and overall algorithms of a newly developed 3-D direct simulation Monte Carlo (DSMC) program are outlined. The code employs an embedded 3-level Cartesian mesh, accompanied by a cut-cell algorithm to incorporate triangulated surface geometry into the adaptively refined Cartesian mesh. Such an approach enables decoupling of the surface mesh from the flow field mesh, which is desirable for near-continuum flows, flows with large density variation, and also for adaptive mesh refinement (AMR). Two separate data structures are proposed in order to separate geometry data from cell and particle information. The geometry data structure requires little memory so that each partition in a parallel simulation can store the entire mesh, potentially leading to better scalability and efficient AMR for parallel simulations. A simple and efficient AMR algorithm that maintains local cell size and time step consistent with the local mean-free-path and local mean collision time is detailed. The 3-level embedded Cartesian mesh combined with AMR allows increased flexibility for precise control of local mesh size and time-step, both vital for accurate and efficient DSMC simulation. Simulations highlighting the benefits of AMR and variable local time steps will be presented along with DSMC results for 3-D flows with large density variations.

I. Introduction

Direct Simulation Monte Carlo (DSMC) is a particle-based numerical method [1] that simulates the Boltzmann equation. As a result, DSMC is an accurate simulation tool for modeling dilute gas flows ranging from continuum to free-molecular conditions. The DSMC method tracks a representative number of simulation particles through a computational mesh with each simulation particle representing a large number of real gas molecules. A key aspect of the method is that molecular movement and collision processes are decoupled. Specifically, during each simulation time step, all particles are first translated along their molecular velocity vector without interaction. After the movement phase, nearby particles that lie within the same computational cell are collided in a statistical manner. This decoupling is accurate for dilute gas flows as long as the simulation time step remains less than the local mean-collision-time (t 100 million cells) an unstructured tetrahedral mesh must not only be partitioned across parallel processors during a DSMC simulation, but pre/post-processing, grid generation, and AMR routines must also be parallelized due to the large memory requirements. A Cartesian geometry model significantly alleviates these problems and may therefore be more suitable for future large-scale DSMC simulations.

2. A Cartesian geometry model necessitates a cut-cell method in order to handle complex surface geometries. Although this may seem like a drawback, a cut-cell approach is an extremely general and accurate technique for complex geometries [6,8]. As discussed by LeBeau et al. [6], a cut-cell approach allows for complete decoupling between the flow field and surface mesh discretization. Such decoupling is especially important for the DSMC method and may be necessary for large-scale simulations of near-continuum flows where the mean-free-path near a vehicle surface may be orders of magnitude smaller than accurate surface resolution requires. Likewise, decoupling is equally important for low density flows (such as low orbiting satellites) where the local value of $\lambda$ may be much larger than fine surface geometry features.

3. The MGDS code employs a 3-level embedded Cartesian grid for the flow field. Adding the third independent level of Cartesian refinement adds considerable flexibility over previous 2-level implementations for flows with large density gradients. Precise AMR is essential for both simulation accuracy and computational efficiency.

4. Use of a Cartesian geometry model not only results in lower memory storage requirements, but also enables many DSMC calculations to be performed with fewer floating point operations. Algorithms for initial grid generation, successive AMR, particle movement, and particle re-sorting during AMR require fewer operations when using a Cartesian grid compared to a non-Cartesian grid. The efficiency of such operations, and the degree to which they can be automated, will become increasingly important as larger DSMC simulations are performed, especially for time-accurate unsteady problems where frequent AMR is required.


American Institute of Aeronautics and Astronautics

The spatial and temporal discretization required for accurate and efficient DSMC simulation is inherently linked to the solution itself. Thus, the DSMC method is ideal for fully automated grid generation, AMR, and variable time steps, potentially requiring little-to-no user input. As computational resources continue to grow rapidly, such automation would avoid current bottlenecks in total simulation turnover created by high demands on code users for large-scale parallel flow problems. In addition to providing a new DSMC simulation tool for complex flows over 3D geometries, development of the MGDS code will provide independent evaluation of existing DSMC implementation approaches, as well as investigation of new approaches specifically geared towards future large-scale parallel simulations.

The article is organized as follows. In Section II the data structures used to organize geometry, cell, and particle data are described, after which particle movement algorithms are outlined. In Section III the main DSMC algorithm, the AMR and variable time-step algorithm, and the cut-cell algorithm are described. Results for various 3D simulations that demonstrate the benefits of 3-level AMR and variable time steps are presented in Section IV. Section V contains a summary of the major aspects of the MGDS code and conclusions drawn from the preliminary results.

II. Data Structures

A. Geometry Data Structure

Figure 1. Geometry data structure used within the MGDS code.

The Geometry data structure used within the MGDS code is depicted in Fig. 1. It consists of a 3D array of level-one (L1) cell structures. Each L1 cell structure contains the co-ordinates of the bounding Cartesian vertices $(x_0, y_0, z_0, x_E, y_E, z_E)$ and a pointer to a 3D array of level-two (L2) cell structures. This 3D array can have arbitrary dimensions, meaning each L1 cell can contain an arbitrary number of L2 cells. Each L2 cell structure is identical to an L1 cell structure, and therefore contains a pointer to a 3D array of L3 cell structures. In addition to storing the bounding vertex coordinates, each L3 cell structure contains the local values of $\tau_c$, $\lambda$, $\Delta t$, and currently the maximum expected collision rate $\langle \sigma c_r \rangle_{max}$. Also, each L3 cell structure contains a pointer to an array of surface structures. The majority of cells will contain no boundary surfaces and the pointer will be null. However, for cells which do contain boundary surfaces (described in Section III C), each surface structure in the array contains vertex, surface normal, area, surface type, and sampling variables. Given the coordinates of a particle, the Geometry data structure enables efficient cell-indexing of the particle. The most important aspect of this Geometry data structure (Fig. 1) is that it contains all of the information required for the AMR, cut-cell, and variable time step (movement) procedures while requiring relatively little memory storage.

Clearly the largest contribution to memory storage requirements comes from the data stored in each L3 cell. While the current MGDS implementation stores vertex coordinates for each L3 cell, this information could simply be computed from the L2 vertex information and the Cartesian index of the L3 cell, thereby greatly reducing the memory storage requirement. Only a small percentage of L3 cells that intersect a boundary surface will contain arrays of surface structures which clearly involve substantially larger memory requirements. Finally, each L3 cell structure contains both the processor number and local array index (on that processor) where its particle data and additional cell data are stored. It is proposed that the entire Geometry data structure could be stored locally on each processor in a parallel simulation even for large DSMC flow field and surface meshes. For simulations where highly complex surface geometry prohibits this, at a minimum, large portions of the Geometry data structure could be stored on each processor. The more memory-intensive cell and particle data associated with each L3 cell (detailed in the next section) must be partitioned across parallel processors. An important aspect of the MGDS data structures is that any L3 cell data can be partitioned to any processor. That is, domain decomposition is not limited to L1 or L2 cells, which is essential for load balancing of large parallel simulations. The proposed benefits of such a Geometry data structure, including efficient movement of particles with variable time steps and efficient AMR, for parallel simulations will be discussed in upcoming sections.
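The cell-indexing step described above can be illustrated with a minimal Python sketch. It is not the MGDS implementation (which is written in Fortran 90); the `Grid` class, its fields, and the `locate` helper are hypothetical names introduced here, and each level is assumed to be a uniform Cartesian block whose children are stored in a dictionary rather than a pointer array. The sketch also shows how cell bounds can be computed from the parent bounds and the Cartesian index instead of being stored.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple

Index = Tuple[int, int, int]


@dataclass
class Grid:
    """Uniform Cartesian block of cells (hypothetical layout, one per level)."""
    lo: Tuple[float, float, float]   # lower bounding vertex (x0, y0, z0)
    hi: Tuple[float, float, float]   # upper bounding vertex (xE, yE, zE)
    n: Index                         # number of cells along each axis
    children: Dict[Index, "Grid"] = field(default_factory=dict)

    def width(self, axis: int) -> float:
        return (self.hi[axis] - self.lo[axis]) / self.n[axis]

    def index(self, p) -> Index:
        """Cartesian index of the cell containing point p (clamped to the block)."""
        ijk = []
        for a in range(3):
            i = int((p[a] - self.lo[a]) / self.width(a))
            ijk.append(min(max(i, 0), self.n[a] - 1))
        return tuple(ijk)

    def subdivide(self, ijk: Index, n: Index) -> "Grid":
        """Attach a child block to cell ijk; its bounds follow from the index."""
        lo = tuple(self.lo[a] + ijk[a] * self.width(a) for a in range(3))
        hi = tuple(self.lo[a] + (ijk[a] + 1) * self.width(a) for a in range(3))
        child = Grid(lo, hi, n)
        self.children[ijk] = child
        return child


def locate(l1: Grid, p):
    """Descend L1 -> L2 -> L3 and return up to nine integer indices."""
    path = []
    grid = l1
    for _ in range(3):
        ijk = grid.index(p)
        path.extend(ijk)
        grid = grid.children.get(ijk)
        if grid is None:   # this sketch allows unrefined cells
            break
    return tuple(path)


# Example: a 2x2x2 L1 grid, one L1 cell refined 4x4x4, one L2 cell refined 2x2x2.
l1 = Grid((0.0, 0.0, 0.0), (2.0, 2.0, 2.0), (2, 2, 2))
l2 = l1.subdivide((1, 0, 0), (4, 4, 4))
l3 = l2.subdivide((1, 0, 2), (2, 2, 2))
nine = locate(l1, (1.30, 0.10, 0.60))
```

Each level costs one subtraction, one division, and one truncation per axis, which is the sense in which indexing on a Cartesian hierarchy is inexpensive.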

B. Cell/Particle Data Structure

Figure 2. Cell/Particle data structure used within the MGDS code.

A DSMC code requires the storage and book-keeping of a large amount of data. Complex data structures are often employed for data organization and can provide the flexibility required for a general DSMC implementation. Particle data is the dominant contributor to memory; however, the global number of particles, as well as the local number within each cell, can vary greatly within a DSMC simulation. Thus, statically allocated arrays of particles must either be conservatively large (unused memory) or re-sized often (computationally and memory intensive). Likewise, since the mesh resolution in DSMC depends on the solution itself, ideally, the number of cells would change during a simulation and data structures should account for this possibility as well.

The MGDS code adopts the cell and particle data structures approach used in the MONACO code [4] with certain modifications. The specific Cell/Particle data structure used by the MGDS code is depicted in Fig. 2. On each parallel partition, an array of L3 cells is maintained. This array is actually an array of pointers to L3 Cell/Particle structures so that if the array requires resizing (during AMR), resizing an array of pointers is much less memory intensive and more efficient than resizing an array of Cell/Particle structures. Each cell-data structure contains substantial data (currently including duplicated data from the Geometry data structure). Each Cell/Particle structure also contains an array that stores sampled (cell-averaged) data of the particles within the cell. Currently each Cell/Particle structure also contains the nine integer indices of the corresponding L3 cell in the Geometry data structure. Finally, each Cell/Particle structure contains pointers to the head and tail of a doubly-linked list of particle structures. Each particle structure contains all required particle information (as shown in Fig. 2) and, in addition, contains two integers for the processor number and index within the local cell array where the particle is located (the reason for this will be explained in Section III). It should be noted that cell-face data and cell-connectivity data are not required since the grid is Cartesian.

The MGDS code is written in Fortran 90. All data structures are defined as Fortran derived data types which enable general and efficient packaging of data for broadcast among partitions in a parallel simulation.

C. Particle Movement and Tracking

(a) Ray-tracing movement procedure (b) Cartesian movement procedure.

Figure 3. Particle tracking procedures relevant to the DSMC method.

Ray-tracing is a general and efficient method of tracking particle movement through a computational mesh that is widely used in the computer graphics industry and is also used by many DSMC codes. The procedure is depicted schematically in Fig. 3(a) for movement within an unstructured 2D triangular mesh. Essentially, the time-to-hit each face of the current cell ($\Delta t_{hit,f}$ in Fig. 3(a)) is computed using the particle position/velocity and the face vertices/normal-vector, where $\Delta x_f$ is the normal-distance between the particle and cell-face. The particle is then advanced for the minimum time $\Delta t_{hitmin} = \min(\Delta t_{hit,f})$ and the particle is re-located to the neighboring cell. The process is then repeated for the remaining time ($\Delta t_{sim} - \Delta t_{hitmin}$). Of course, if $\Delta t_{hitmin} > \Delta t_{sim}$ then the particle can be moved for the full time step and remains located in the current cell. This procedure is very general: it naturally sorts particles while moving them, and naturally allows for variable time steps to be used in each cell.

Within a Cartesian grid, a particle may be moved for the full time step regardless of how many cells are crossed. The particle's new position can then be re-indexed on the Cartesian grid very efficiently compared to re-indexing on a non-Cartesian grid. After re-indexing, the particle can be re-located directly to the new cell. This procedure, called Cartesian-move, is depicted in Fig. 3(b). However, even on a Cartesian grid, there are three main drawbacks to the Cartesian-move procedure. The first is that variable time steps cannot be used in each cell. This is not a drawback for computing unsteady flows; however, it is a significant drawback when computing steady-state flows as variable time steps increase the simulation efficiency greatly. The second drawback is that re-indexing on a multi-level Cartesian grid is rather computationally expensive, and may approach the expense of ray-tracing. Third, when simulating flow over complex geometries, ray-tracing must be used in cells close to boundary surfaces, thereby requiring the additional complexity of mixed Cartesian-move and ray-trace algorithms.

Although the MGDS code maintains the option of the Cartesian-move procedure, the default is to use the ray-tracing procedure throughout the entire simulation domain. It is important to realize, however, that the ray-tracing algorithm is more efficient on a Cartesian grid than on a non-Cartesian grid, since the dot products (shown in Fig. 3(a)) involve only a single component. Furthermore, a trend in DSMC algorithm research is to enlarge cells above the $\Delta x \approx \lambda$ constraint without loss of accuracy through the use of virtual subcells [9]. The consequence of this trend for 3D simulations will be that a large majority of simulation particles remain within the same cell during a given time step. On a Cartesian grid, the ray-trace procedure to determine whether $\Delta t_{hitmin} > \Delta t_{sim}$ (that is, whether the particle remains in its current cell) is highly efficient. Further quantitative results comparing these two movement procedures can be found in Ref. 10.
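To make the single-component argument concrete, the following Python sketch performs one ray-trace step inside an axis-aligned cell. It is an illustration only: the function names `time_to_exit` and `move_in_cell` are hypothetical, surface collisions are ignored, and the cell is assumed to be a plain box given by its lower and upper vertices.

```python
def time_to_exit(pos, vel, lo, hi):
    """Return (t_hit_min, axis, side) for a particle inside the box [lo, hi].

    On a Cartesian cell each face normal is a coordinate axis, so the
    time-to-hit needs only one position and one velocity component per face.
    """
    best = (float("inf"), None, 0)
    for a in range(3):
        if vel[a] > 0.0:
            t, side = (hi[a] - pos[a]) / vel[a], +1
        elif vel[a] < 0.0:
            t, side = (lo[a] - pos[a]) / vel[a], -1
        else:
            continue  # moving parallel to this pair of faces
        if t < best[0]:
            best = (t, a, side)
    return best


def move_in_cell(pos, vel, lo, hi, dt):
    """Advance a particle by at most dt; stop at the first face it reaches.

    Returns (new_pos, remaining_time, crossed_face), where crossed_face is
    None if the particle stays in the cell for the full time step.
    """
    t_hit, axis, side = time_to_exit(pos, vel, lo, hi)
    if t_hit > dt:
        new_pos = tuple(p + v * dt for p, v in zip(pos, vel))
        return new_pos, 0.0, None
    new_pos = tuple(p + v * t_hit for p, v in zip(pos, vel))
    return new_pos, dt - t_hit, (axis, side)


lo, hi = (0.0, 0.0, 0.0), (1.0, 1.0, 1.0)
stay = move_in_cell((0.5, 0.5, 0.5), (1.0, 0.0, 0.0), lo, hi, 0.25)
cross = move_in_cell((0.5, 0.5, 0.5), (1.0, 0.0, 0.0), lo, hi, 1.0)
```

In a full move routine the second case would be repeated in the neighboring cell with the remaining time, possibly rescaled if that cell uses a different local time step.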



III. Algorithms

A. Core DSMC Algorithm

By separating the Geometry data structure (Fig. 1) from the Cell/Particle data structure (Fig. 2), the main DSMC loop within the MGDS code is made more compact and communication between processors is potentially lowered when particles move across partitions in a parallel simulation. The resulting compact DSMC loop is shown in Fig. 4.

Loop (for each L3 cell)
    Collide Particles (update particle properties)
    Sample Properties
    Move (update particle positions)
        - recursive ray-trace algorithm
        - including surface collisions
        - record new x,y,z, proc#, local cell#
        - particles remain in original linked-list
    IF (INFLOW) Generate/Move Particles
End Loop

Package up communication
Send data to partitions
Global Sort (loop over all cells/particles)

Figure 4. Main DSMC algorithm within the MGDS code.

Each L3 Cell/Particle structure (Fig. 2) contains all necessary information to perform collisions and sample particle properties within the cell, independent of any other cell in the mesh. Next, since the entire Geometry data structure (or large portions of it) is stored on each partition, all particles within a given cell can be moved for their complete time step including variable time steps and surface collisions. Specifically, without accessing any other cell data (which may be located on a different partition), the x, y, and z positions, the processor index, and the local cell array index of each particle can be updated solely using the Geometry data structure. Finally, if the current cell contains inflow surfaces, then particles are generated and fully-moved as just described. The result is a single loop over all L3 cells for each full simulation time step. Each cell within this loop can be processed independently of all other cells in the simulation. It is important to note that upon completion of this loop, all cells still contain linked-lists of their original particles. Only the particle coordinates, destination processor, and local cell index on that processor have been updated; that is, each particle knows exactly where to be sent. At this point the particle data is packaged up and sent to the appropriate partition. As depicted in Fig. 3(b), this approach may have the potential to limit the inter-processor communication for occasional particles that cross multiple partitions in a single time step. The final step is a global sort algorithm that removes particles from current linked-lists and adds them to destination linked-lists. This is the only step that is not independent of other cells in the simulation. Optimization of the above movement and sorting procedures is discussed in Ref. 10 along with a shared-memory parallelization technique which threads both the main DSMC loop and the global sorting algorithm over multi-core processors.
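The deferred global sort can be sketched in a few lines of Python. This is a simplified stand-in rather than the MGDS routine: plain Python lists replace the doubly-linked lists, a single `dest` key replaces the processor number and local cell index pair, inter-partition communication is omitted, and the name `global_sort` is hypothetical.

```python
from collections import defaultdict


def global_sort(cells):
    """Re-bin particles after the move phase.

    cells maps a cell id to a list of particles. Each particle is a dict whose
    'dest' entry was recorded during the move phase; until this function runs,
    every particle is still stored in the list of its original cell.
    """
    incoming = defaultdict(list)
    for cell_id, particles in cells.items():
        staying = []
        for p in particles:
            if p["dest"] == cell_id:
                staying.append(p)
            else:
                incoming[p["dest"]].append(p)
        cells[cell_id] = staying          # particles that did not leave
    for dest, particles in incoming.items():
        cells.setdefault(dest, []).extend(particles)
    return cells


cells = {
    0: [{"id": 1, "dest": 0}, {"id": 2, "dest": 1}],
    1: [{"id": 3, "dest": 0}],
}
global_sort(cells)
ids = {c: [p["id"] for p in plist] for c, plist in cells.items()}
```

Because the per-cell loop only writes each particle's destination, the cells can be processed in any order, and the one step that touches several cells at once is isolated in this sort.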

B. Adaptive Mesh Refinement (AMR)

The AMR algorithm used within the MGDS code inputs the current Geometry data structure, generates a new Geometry data structure adapted precisely to the local mean-free-path, sets local time steps in accordance with the local mean-collision-time, and also interpolates the maximum expected collision frequency ($\langle \sigma c_r \rangle_{max}$) between old and new meshes. The algorithm then updates the Cell/Particle data structure as cells are added or removed and finally, the global sort algorithm re-sorts all particles into the appropriate linked-lists in the revised data structure. In this manner, the MGDS simulation continues in a smooth and accurate manner. The AMR procedure is called approximately 4-10 times during a typical steady-state MGDS simulation. With the current implementation described below, one call to the AMR function requires approximately the same computational time as 10 simulation time steps, regardless of the size of the simulation. While this has a negligible effect on the overall simulation time for a steady-state solution, future efficiency improvements will be important for unsteady simulations where the AMR procedure might be called every few time steps. An overview of the AMR procedure is detailed below along with a 3-level Cartesian grid schematic in Fig. 5.

It is important to note that L1 cell sizes remain fixed during the simulation. Thus, the AMR procedure is applied within each L1 cell independently of all other L1 cells according to the following steps:



Figure 5. Schematic of 3-level Cartesian mesh and AMR procedure.

1. Loop through all L3 cells contained within this L1 cell and determine $\lambda_{max}$.

2. Set the new L2 cell size to $\lambda_{max}$ and generate a new Cartesian L2 cell grid within the L1 cell.

3. Loop through all new L2 cells and determine $\lambda_{min}$ within each. This involves determination of which old L3 cells intersect with each new L2 cell. On a 3D Cartesian grid, computing the volume of intersection between two generic cells is straightforward and efficient.

4. Set the new L3 cell size within each new L2 cell to $\lambda_{min}$.

5. Compute the volume of intersection between new and old L3 cells within each large L1 cell and interpolate (using a volume average) values of $\lambda$, $\tau_c$, and $\langle \sigma c_r \rangle_{max}$ into the new L3 cells. This information can then be used to accurately set an appropriate new local time step and maintain the local collision rate.

6. After processing all L1 cells via the above steps, the complete Geometry data structure is now updated.

7. The new Geometry data structure is now used to update the pointer array of Cell/Particle structures (Fig. 2) as well as modify cell data. Note that in changing the dimensions/vertices of cells, the particles contained within these cells may now be completely un-sorted.

8. Finally, the coordinates of all particles are re-indexed using the new Geometry data structure. The global sort algorithm is then called which adds/removes particle pointers to/from the correct linked lists. The MGDS simulation is now ready to carry on with the general move/collide DSMC algorithm.
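The intersection and volume-average operations used in steps 3 and 5 can be sketched as follows. This is a minimal Python illustration under simple assumptions: cells are axis-aligned boxes given by lower and upper vertices, old cells are passed as `(lo, hi, value)` tuples, and the names `overlap_volume` and `volume_average` are hypothetical rather than taken from the MGDS code.

```python
def overlap_volume(a_lo, a_hi, b_lo, b_hi):
    """Volume of intersection of two axis-aligned boxes (0.0 if disjoint)."""
    vol = 1.0
    for axis in range(3):
        length = min(a_hi[axis], b_hi[axis]) - max(a_lo[axis], b_lo[axis])
        if length <= 0.0:
            return 0.0
        vol *= length
    return vol


def volume_average(new_lo, new_hi, old_cells):
    """Volume-weighted average of a cell quantity over overlapping old cells.

    old_cells is a list of (lo, hi, value) tuples, where value could be the
    local mean-free-path, mean-collision-time, or maximum collision rate.
    """
    total = 0.0
    weighted = 0.0
    for lo, hi, value in old_cells:
        v = overlap_volume(new_lo, new_hi, lo, hi)
        total += v
        weighted += v * value
    if total == 0.0:
        raise ValueError("new cell does not overlap any old cell")
    return weighted / total


old = [
    ((0.0, 0.0, 0.0), (1.0, 1.0, 1.0), 2.0),
    ((1.0, 0.0, 0.0), (2.0, 1.0, 1.0), 4.0),
]
avg = volume_average((0.5, 0.0, 0.0), (1.5, 1.0, 1.0), old)
```

The per-axis min/max form is what makes the intersection inexpensive on a Cartesian grid: no general polyhedron clipping is needed.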

C. Cut Cell Algorithm

Since the MGDS code already uses the ray-tracing technique to determine particle intersections with cell faces, arbitrary triangulated surface meshes can be naturally embedded within the flow field grid without modification to the MGDS data structures or algorithms. The cut-cell method performs two main functions. The first is to read in a list of triangular surface elements (generated by various commercial surface triangulation packages) and sort all surface elements into the appropriate L3 cells within the Geometry data structure. Note that if a single large surface triangle cuts through multiple small L3 cells, the surface element may simply be added to multiple cells and does not need to be trimmed. The second main function is to compute the volume of each cut-cell required to determine the collision rate and various macroscopic properties in the cut-cell. Both functions involve moderately complex geometrical calculations, but leave the basic DSMC data structures and algorithms completely unchanged.

In order to sort a list of triangular surface elements into the list of L3 Cartesian cells, the MGDS code employs a Cut Cell Intersection technique initially detailed in computer graphics literature and more recently adapted for CFD simulations by Aftosmis et al. [8]. The technique is able to use computationally efficient bitwise operations and comparisons to determine intersections between triangular elements and Cartesian cells. Further details on computing the intersection between generally positioned triangles in 3D are contained in Ref. 8.
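One common way bitwise operations enter such a test is through per-vertex "outcodes", as in classical clipping algorithms from computer graphics. The Python sketch below illustrates that general idea only; it is an assumption that the technique referred to above works along these lines, and the functions `outcode` and `classify` are hypothetical names, not routines from the MGDS code or from Ref. 8.

```python
def outcode(p, lo, hi):
    """6-bit code: one bit per cell face that the point lies beyond."""
    code = 0
    for axis in range(3):
        if p[axis] < lo[axis]:
            code |= 1 << (2 * axis)        # below the low face on this axis
        elif p[axis] > hi[axis]:
            code |= 1 << (2 * axis + 1)    # above the high face on this axis
    return code


def classify(tri, lo, hi):
    """Cheap triangle/cell classification using bitwise tests.

    Returns 'reject' if all three vertices lie beyond the same face (the
    triangle cannot touch the cell), 'accept' if any vertex is inside the
    cell, and 'test' if an exact geometric intersection test is still needed.
    """
    c0, c1, c2 = (outcode(v, lo, hi) for v in tri)
    if c0 & c1 & c2:
        return "reject"
    if c0 == 0 or c1 == 0 or c2 == 0:
        return "accept"
    return "test"


lo, hi = (0.0, 0.0, 0.0), (1.0, 1.0, 1.0)
far = ((2.0, 0.1, 0.1), (3.0, 0.5, 0.2), (2.5, 0.9, 0.9))
inside = ((0.5, 0.5, 0.5), (3.0, 0.5, 0.2), (2.5, 0.9, 0.9))
spanning = ((-1.0, 0.5, 0.5), (2.0, 0.5, 0.5), (0.5, 2.0, 0.5))
results = [classify(t, lo, hi) for t in (far, inside, spanning)]
```

Most triangle/cell pairs in a large mesh fall into the 'reject' case, so the expensive exact test is needed only for the small remainder.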

Once all surface elements are sorted into appropriate L3 cells, the volume of each of these cut-cells is computed. The MGDS code uses a simple Monte Carlo technique to compute cut-volumes. Co-ordinates are chosen at random within the un-cut Cartesian cell. The dot product between the chosen coordinates and the normal vector of a given surface element is computed. If the dot products are negative for all surface elements contained within the cell, then the point lies outside of the flow domain. The cut-volume is simply determined from the fraction of points that lie outside the domain, that is, their number divided by the total number of Monte Carlo co-ordinates considered ($N$). The error in such a volume calculation scales directly as $1/\sqrt{N}$. It should be noted that this procedure is suitable only when the surface geometry cutting a given cell is either completely convex or completely concave. More complex collections of surface elements within a single L3 cell are currently not supported by the MGDS code, and advanced methods for such geometries are discussed in Ref. 8.
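A minimal Python sketch of this Monte Carlo estimate is given below. Several details are assumptions made for illustration: each surface element is reduced to one vertex and a normal pointing into the flow, the dot product is taken between the normal and the vector from that vertex to the sample point, and the function name `outside_fraction` and the fixed random seed are hypothetical choices.

```python
import random


def outside_fraction(lo, hi, surfaces, n_samples, seed=0):
    """Estimate the fraction of an un-cut cell that lies outside the flow.

    surfaces is a non-empty list of (vertex, normal) pairs; normals are
    assumed to point into the flow domain. A sample point is counted as
    outside only if its dot product is negative for every surface element.
    """
    if not surfaces:
        raise ValueError("cell contains no surface elements")
    rng = random.Random(seed)
    outside = 0
    for _ in range(n_samples):
        p = [rng.uniform(lo[a], hi[a]) for a in range(3)]
        if all(
            sum((p[a] - vertex[a]) * normal[a] for a in range(3)) < 0.0
            for vertex, normal in surfaces
        ):
            outside += 1
    return outside / n_samples


# Unit cell cut by the plane x = 0.25, with the flow on the x > 0.25 side.
plane = [((0.25, 0.0, 0.0), (1.0, 0.0, 0.0))]
frac = outside_fraction((0.0, 0.0, 0.0), (1.0, 1.0, 1.0), plane, 20000)
```

For this example the exact outside fraction is 0.25, and with 20000 samples the statistical error is of order $1/\sqrt{N} \approx 0.007$ or smaller, consistent with the scaling quoted above.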

The cut-cell algorithm is called at the beginning of each simulation and also immediately following each call to the AMR function. The current cut-cell algorithm requires less computational time than a single AMR call for all simulations presented in this article. Finally, inflow, outflow, symmetry, or wall surfaces may be specified by the user on all sides of the overall Cartesian bounding box of the simulation. These outer surfaces are added to the appropriate L3 cells in addition to the triangulated surface elements sorted by the cut-cell algorithm.

IV. MGDS Simulation Results

A. Effect of variable time steps

Mach 5 flow of argon over a 10 cm flat plate at 30 degrees angle of attack is simulated to steady-state with and without the use of variable time steps. The free-stream density and temperature are $7.5 \times 10^{-5}$ kg/m$^3$ and 200 K, respectively. The temperature of the flat plate is held fixed at $T_{wall} = 2000$ K on the bottom and $T_{wall} = 200$ K on the top. The variable hard-sphere (VHS) collision model is used with a power law value of $\omega = 0.81$, and diffuse reflection and full thermal accommodation are imposed for surface collisions. The simulation is three-dimensional, stretching approximately $4\lambda$ in the z direction. However, since symmetry boundary conditions are imposed on the z boundary planes, the resulting flow is uniform in the z coordinate direction and only the two-dimensional flow field results are presented.

(a) Normalized density field. (b) Number of simulation particles per cell.

Figure 6. MGDS simulation results without the use of variable time steps.

Figure 6(a) shows the resulting density field (normalized by the free-stream density) when a constant simulation time step is used. The 3-level Cartesian grid has been adapted to the local mean-free-path and the constant time step is less than $\tau_c/2$ everywhere, thus producing an accurate solution. It is evident from Fig. 6(a) that the density increases by a maximum of 5 times between the free-stream and the leading edge of the plate. Although the density (and therefore the number density, $n$) increases towards the plate leading edge, the cell size decreases proportionately. Thus AMR should naturally provide some control in maintaining a constant number of particles per cell. However, as evident from Fig. 6(b), even with AMR, the average number of simulation particles per cell varies widely from less than 10 to more than 500. If one considers the number of real gas molecules located within a volume of one cubic mean-free-path ($N_{real}$), the following dependence is realized:

$$N_{real} = n\lambda^3, \quad n \propto \rho, \quad \lambda \propto 1/\rho. \qquad (1)$$

If constant simulation particle weights are used, the number of simulation particles per cell should therefore scale with $1/\rho^2$ in 3D (and $1/\rho$ in 2D). For example, if the density drops by one order of magnitude, although the number of molecules per unit volume drops equally, the volume under consideration rises by three orders of magnitude. Therefore, even when every cell is refined precisely to the local value of $\lambda$, if there were 10 particles per cell in the high density region near the leading edge, one would expect $(\rho/\rho_\infty)^2 = 5^2$ times more particles in the lower density free-stream cells. This is precisely verified in Fig. 6(b) where there are approximately 250 particles per cell in the free-stream region and 10 near the leading edge of the flat plate. Note also how, in the very low density region above the plate, there are an enormous number of particles per cell. As stated earlier, only 20 particles per cell are required for accuracy and using additional particles per cell is inefficient. Thus, even with precise AMR for 3D simulations, large variations in the number of particles per cell will occur naturally in a DSMC simulation, resulting in very inefficient simulations using far more particles than required for statistical accuracy.
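The arithmetic behind the factor of 25 can be written out explicitly. The short derivation below assumes cubic cells of side $\lambda$ and a constant particle weight, denoted here by $W$ (a symbol introduced only for this illustration); the subscripts $LE$ and $\infty$ label the leading-edge and free-stream regions.

```latex
\begin{align*}
N_{sim} &= \frac{n\,\lambda^{3}}{W} \;\propto\; \rho \cdot \rho^{-3} = \rho^{-2}, \\
\frac{N_{sim,\infty}}{N_{sim,LE}} &= \left(\frac{\rho_{LE}}{\rho_{\infty}}\right)^{2} = 5^{2} = 25, \\
N_{sim,\infty} &\approx 25 \times 10 = 250 .
\end{align*}
```

The same steps with $\lambda^2$ in place of $\lambda^3$ give the $1/\rho$ scaling quoted for 2D.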

(a) Local time step ratio in each L3 cell. (b) Number of simulation particles per cell.

Figure 7. MGDS simulation results with the use of variable time steps.

As described by Kannenberg and Boyd [11], the use of a variable time step in each cell significantly alleviates this problem. Compared to a global reference time step, $\Delta t_{ref}$, the time step used to advance particles within a given cell is increased or decreased by a factor that varies from cell to cell ($\Delta t = \Delta t_{ratio}\,\Delta t_{ref}$). This factor is equal to the ratio of the local mean-collision-time to the global reference time step, $\Delta t_{ratio} = S\tau_c/\Delta t_{ref}$, and can be scaled with a globally constant factor $S$. At the same time, the weight for all particles within the cell is also multiplied by $\Delta t_{ratio}$. Essentially, where $\tau_c$ (and therefore $\lambda$) is large, particles move more rapidly through these larger cells. During a given instant, this reduces the number of particles found in that cell and, by adjusting the particle weight by the same factor, the number density in the cell remains correct. Since $\tau_c \propto 1/\rho$, the combination of variable time steps and AMR improves the scaling for the number of particles per cell to $N \approx$ constant in 2D and $N \propto 1/\rho$ in 3D. As detailed in Ref. 11, the new time step, number of particles, and particle weight are used in each cell to compute collision rates and outcomes without any modification to the DSMC algorithm.

The MGDS code is able to set a variable time step in each L3 cell. In addition, this local time step can be slightly adjusted to account for inexact AMR. Even when using a 3-level adaptive grid, locally $\Delta x \neq \lambda$ precisely. In the MGDS code, the local time step factor is set as $\Delta t_{ratio} = S\,(\tau_c/\Delta t_{ref})\,(\Delta x/\lambda)$. Thus if a given cell is slightly too small the time step is lowered such that additional particles will accumulate in this smaller cell. It should be noted that adjusting the local time step in this manner is accurate as long as $\Delta x/\lambda \approx 1$, which is the case when using a 3-level Cartesian grid with AMR. Figure 7(a) shows the time scale factor used in each L3 cell for the flat plate problem. The combination of precise AMR and precise local time step adjustment leads to a much more uniform distribution of particles per cell as seen in Fig. 7(b). The flow field solutions, surface heating rates, and surface shear stress profiles obtained when using the variable time step method are verified to reproduce the results obtained using a constant simulation time step which required substantially more particles.
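The local time step factor can be evaluated with a few lines of Python. The sketch below simply codes the expression given in the text; the function name `local_time_step`, its argument names, and the sample numbers are hypothetical and chosen only for illustration.

```python
import math


def local_time_step(tau_c, lam, dx, dt_ref, weight_ref, S=1.0):
    """Return (dt_ratio, local time step, local particle weight) for one cell.

    dt_ratio = S * (tau_c / dt_ref) * (dx / lam), following the expression in
    the text; the time step and the particle weight are scaled by the same
    factor so that the number density in the cell remains correct.
    """
    dt_ratio = S * (tau_c / dt_ref) * (dx / lam)
    return dt_ratio, dt_ratio * dt_ref, dt_ratio * weight_ref


# Cell refined exactly to the local mean-free-path (dx == lam).
exact = local_time_step(tau_c=2.0e-6, lam=1.0e-3, dx=1.0e-3,
                        dt_ref=1.0e-6, weight_ref=1.0e10, S=0.5)

# Same gas state, but the cell is 20% smaller than the mean-free-path,
# so the time step is lowered and more particles accumulate in the cell.
small = local_time_step(tau_c=2.0e-6, lam=1.0e-3, dx=0.8e-3,
                        dt_ref=1.0e-6, weight_ref=1.0e10, S=0.5)
```

Scaling the weight and the time step together is what keeps the physical flux through the cell unchanged while the number of simulation particles resident in the cell is adjusted.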

Varying the particle weight in each cell independent of the time step is an alternate method of controlling the number of particles per cell that is used in existing DSMC codes and also discussed in Ref. 11. Here, particles are either cloned or deleted in order to obtain the desired number of particles per cell, and their weights are adjusted accordingly. While the deletion of simulation particles is not thought to influence solution accuracy, the cloning of particles may correlate statistics and result in random walk errors [1]. Although techniques exist to minimize such errors, the combination of AMR and local time steps should greatly reduce the degree to which particles are cloned/destroyed.

B. Hypersonic blunt body flows

(a) Temperature field and final 3-level Cartesian grid. (b) Close-up of the plate-tip flow region.

Figure 8. Simulation results for Mach 15 nitrogen flow over a vertical flat plate.

Mach 15 flow of diatomic nitrogen gas over a 16 cm vertical flat plate is simulated with the MGDS code using both AMR and the variable time step method. The free-stream density and temperature are $6.5 \times 10^{-6}$ kg/m$^3$ and 80 K, respectively. The temperature of the flat plate is held fixed at $T_{wall} = 1500$ K on the front surface and $T_{wall} = 200$ K on the back. The variable hard-sphere (VHS) collision model is used with a power law value of $\omega = 0.75$, and diffuse reflection and full thermal accommodation are imposed for surface collisions. The Larsen-Borgnakke [12] model is used for translational-rotational energy exchange together with the variable rotational energy exchange probability model of Boyd [13] using a maximum rotational collision number of 18.1 and a reference temperature for rotational energy exchange of 91.5 K. Vibrational energy is not considered. Again, the simulation is three-dimensional, stretching approximately $8\lambda$ in the z direction with symmetry boundary conditions. Only the two-dimensional flow field results are presented.

The MGDS simulation begins with a uniform Cartesian mesh (L3 = L2 = L1) sized to Δx = 0.5λ and a uniform time step of Δt = 0.2τc. The simulation rapidly reaches steady state and the solution is then sampled for approximately 200 time steps. At this point the AMR function is called, which refines the mesh, re-sorts all particles into the new mesh, resets all local time steps, and then continues the simulation. The translational temperature field after four AMR calls is shown in Fig. 8(a), which clearly shows the bow shock and compression region ahead of the plate and a rarefied region behind the plate. A close-up of the plate tip is shown in Fig. 8(b). Here the cell size is seen to vary by a factor of 30 between the L1 cells behind the plate and the L3 cells in front. Indeed, without the variable time step method, the large cells in the plate wake would naturally contain 30² = 900 times more particles than the refined cells in front of the plate. Also evident in Fig. 8(b) is the increased flexibility a 3-level Cartesian mesh has over a 2-level mesh. A 2-level Cartesian mesh requires uniform cell spacing within each L1 cell, which would result in substantially more (unnecessary) cells in the high-gradient region upstream of the flat plate. Not only would a 2-level Cartesian mesh require more cells and therefore more particles, it would also introduce further variation in the number of particles per cell. In such a 2-level mesh, some of the cells would now be much smaller than the local value of λ. As a result, these cells may no longer contain sufficient particles (approximately 20) and the solution may become inaccurate (not just inefficient). In order to restore accuracy, either the global number of particles would have to be increased (highly inefficient) or particle cloning would become necessary.
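The factor of 900 quoted above can be recovered from a simple scaling argument, sketched here under stated assumptions rather than as the authors' own derivation. Assume every simulation particle carries the same weight, each cell edge is proportional to the local mean free path, and the mean free path is inversely proportional to the number density. The particle count per cell then scales as density times cell volume, which in d dimensions is proportional to the cell edge raised to the power d - 1. The function name below is hypothetical.

```python
def particle_count_ratio(cell_size_ratio, dimensions=3):
    """Ratio of particles per cell between a coarse and a fine cell.

    Assumptions (illustrative, not taken from the MGDS code):
      * equal particle weight everywhere (no variable time step),
      * cell edge proportional to the local mean free path,
      * mean free path inversely proportional to number density.
    Then N ~ n * dx**d ~ dx**(d - 1).
    """
    return cell_size_ratio ** (dimensions - 1)


# Cell edges differing by a factor of 30 in a 3D mesh.
ratio_3d = particle_count_ratio(30)
print(ratio_3d)
```

With the same assumptions in two dimensions the ratio would be only 30, which shows why the imbalance is more severe for 3D simulations.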

Figure 9. Mach 15 nitrogen flow over a 3D vertical flat plate.

In order to highlight the capability of the MGDS geometry model, AMR algorithm, and cut-cell algorithm, two additional solutions are presented in Fig. 9 and Fig. 10. The validation of MGDS simulation predictions against experimental results is not addressed in the current article. Figure 9 depicts 3D flow over a vertical flat plate of finite size in the z direction. The free-stream conditions are identical to those used for the simulation displayed in Fig. 8, except Twall = 2500 K on the front of the plate. This modification results in more moderate density gradients near the plate surface and requires less refinement, thereby enabling this 3D flow to be simulated on a single processor. Figure 9 highlights how the general AMR algorithm outlined in Section III.B is capable of smoothly adapting a 3-level Cartesian mesh for 3D flows involving large density gradients. Contours of translational temperature are plotted for the flow field, and contours of the heat transfer coefficient are plotted on the plate surface. In Fig. 9, the fine grid spacing on the front of the plate and the coarse spacing immediately behind the plate are clearly evident.

Finally, Fig. 10 shows a 3D solution of hypersonic flow of argon gas over a cylinder. The solution is uniform in the z direction and only the resulting 2D flow field is displayed. The free-stream density and temperature are 6.5 × 10⁻⁶ kg/m³ and 80 K, respectively. The temperature of the cylinder surface is held fixed at Twall = 1500 K. Diffuse reflection and full thermal accommodation are imposed for surface collisions. The free-stream velocity is set to correspond to a very high Mach number of 29. The resulting temperatures (seen in Fig. 10) are unrealistically high because no ionization reactions are enabled. However, such a high Mach number was selected in order to test the MGDS AMR and cut-cell algorithms on a challenging case involving large density variations. The cylinder geometry is specified as a triangulated surface and is cut from the initial uniform Cartesian flow field mesh, and also from each new mesh generated by the AMR function. Although the cylinder geometry may seem simple, the cut cells resulting from the surface geometry span a wide range of volumes, from nearly complete cells to very small slices of the original Cartesian cells. A close-up view of the mesh refinement near the cylinder surface at approximately 45 degrees from the stagnation point is shown in Fig. 11.
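The wide range of cut-cell volumes can be illustrated with a Monte Carlo volume estimate, the kind of technique this article uses for cut volumes. The sketch below is a simplification and not the MGDS implementation: it replaces the triangulated surface with an analytic cylinder aligned with the z axis and centered on the origin, so the inside/outside test is a radius check. The function name, sample count, and seed are hypothetical.

```python
import random


def fluid_fraction(cell_min, cell_size, radius, samples=100_000, seed=0):
    """Estimate the fraction of a cubic cell lying outside a cylinder.

    The cylinder is assumed to be aligned with the z axis and centered
    on the origin, so only x and y matter. A real cut-cell code would
    test sample points against triangulated surface elements instead.
    """
    rng = random.Random(seed)  # fixed seed for a repeatable estimate
    outside = 0
    for _ in range(samples):
        x = cell_min[0] + cell_size * rng.random()
        y = cell_min[1] + cell_size * rng.random()
        if x * x + y * y > radius * radius:
            outside += 1
    return outside / samples


# A cell far from the body, a cell buried inside it, and a cut cell.
full_cell = fluid_fraction((2.0, 2.0, 0.0), 1.0, 1.0)
solid_cell = fluid_fraction((0.0, 0.0, 0.0), 0.5, 1.0)
cut_cell = fluid_fraction((0.0, 0.0, 0.0), 1.0, 1.0)

# The cut volume is the fluid fraction times the full cell volume.
cut_volume = cut_cell * 1.0 ** 3
print(full_cell, solid_cell, round(cut_cell, 3))
```

For the third cell the exact fluid fraction is 1 - π/4, so the estimate can be checked against a known value. The statistical error falls only as the inverse square root of the sample count, which matters most for the very small slices mentioned above.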

Figure 10. Mach 29 argon flow over a cylinder.

Figure 11. Close-up view of the cylinder mesh at 45 degrees from stagnation point.

V. Summary and Conclusions

In this article, the data structures and algorithms of a newly developed 3D direct simulation Monte Carlo (DSMC) program, called the Molecular Gas Dynamic Simulator (MGDS) code, are outlined. The code employs an embedded 3-level Cartesian mesh, accompanied by a cut-cell algorithm to incorporate triangulated surface geometry into the adaptively refined Cartesian mesh. This geometry model is selected for its low memory storage requirements, its ability to decouple surface and flow field discretizations, and the increased computational efficiency of DSMC particle movement procedures compared with non-Cartesian meshes. Two separate data structures are proposed in order to separate geometry data from cell and particle information. The geometry data structure contains all information required for particle movement and AMR procedures; however, for a Cartesian mesh this still involves little memory, such that each partition in a parallel simulation could potentially store the entire geometry. A separate cell/particle data structure is maintained that holds all cell and particle data. This data structure requires significant memory storage and must be partitioned in a parallel simulation. Such separate geometry and cell/particle data structures enable particles within each cell to be moved for a full time step without requiring information from other cell data structures, including movement with variable time steps across multiple cells. Within the MGDS code, particles are moved through the 3-level Cartesian mesh using a ray-tracing technique. A simple and efficient AMR algorithm that maintains local cell size and time step consistent with the local mean-free-path and local mean collision time is detailed. Simulations are presented that demonstrate significant efficiency gains for 3D flows with the use of precise AMR and variable time steps. Finally, a cut-cell method is outlined that sorts an arbitrary list of triangulated surface elements into local Cartesian cells and computes the cut volume using a Monte Carlo technique. The ray-trace particle movement algorithm is then able to accurately detect and simulate particle collisions with complex 3D surface geometries.
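The ray-tracing idea can be sketched for the simplest possible case. The example below is a minimal illustration on a single-level uniform Cartesian mesh with no surfaces and a single global time step. It is not the 3-level MGDS algorithm, and the function name and return format are hypothetical. At each step the particle advances to the nearest cell face along its velocity vector, crosses into the neighboring cell, and repeats until the time step is used up.

```python
import math


def trace_particle(pos, vel, dt, cell_size):
    """Move a particle for time dt through a uniform Cartesian mesh.

    Returns the final position and the ordered list of cell indices
    visited. Simplified sketch: one mesh level, no surfaces, and the
    same time step in every cell.
    """
    pos = list(pos)
    cell = [math.floor(p / cell_size) for p in pos]
    visited = [tuple(cell)]
    remaining = dt
    while remaining > 0.0:
        # Find the first face reached within the remaining time, if any.
        t_hit, axis = remaining, None
        for k in range(3):
            if vel[k] > 0.0:
                t = ((cell[k] + 1) * cell_size - pos[k]) / vel[k]
            elif vel[k] < 0.0:
                t = (cell[k] * cell_size - pos[k]) / vel[k]
            else:
                continue
            if t < t_hit:
                t_hit, axis = t, k
        for k in range(3):
            pos[k] += vel[k] * t_hit
        remaining -= t_hit
        if axis is None:
            break  # time step finished inside the current cell
        step = 1 if vel[axis] > 0.0 else -1
        # Snap onto the crossed face to avoid round-off drift.
        pos[axis] = (cell[axis] + (1 if step > 0 else 0)) * cell_size
        cell[axis] += step
        visited.append(tuple(cell))
    return tuple(pos), visited


final_pos, cells = trace_particle((0.5, 0.5, 0.5), (1.0, 0.0, 0.0), 2.0, 1.0)
print(final_pos, cells)
```

Because the faces of a Cartesian cell are axis-aligned, each face intersection costs one subtraction and one division. This is one reason particle movement is cheaper on Cartesian meshes than on non-Cartesian ones.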

Acknowledgments

This work is partially supported by a seed grant from the University of Minnesota Supercomputing Institute (MSI). This work is also supported by the Air Force Office of Scientific Research (AFOSR) under Grant No. FA9550-04-1-0341. The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of the AFOSR or the U.S. Government.

References

1. Bird, G. A., Molecular Gas Dynamics and the Direct Simulation of Gas Flows, Oxford University Press, New York, 1994.
2. Rader, D. J., Gallis, M. A., Torczynski, J. R., and Wagner, W., "DSMC Convergence Behavior of the Hard-Sphere-Gas Thermal Conductivity for Fourier Heat Flow," Physics of Fluids, Vol. 18, No. 7, 2006, pp. 1-16.
3. Gallis, M. A., Torczynski, J. R., and Rader, D. J., "DSMC Convergence Behavior for Transient Flows," AIAA Paper 07-4258, June 2007, presented at the 39th AIAA Thermophysics Conference, Miami, FL.
4. Dietrich, S. and Boyd, I. D., "Scalar and Parallel Optimized Implementation of the Direct Simulation Monte Carlo Method," Journal of Computational Physics, Vol. 126, 1996, pp. 328-342.
5. Otahal, T. J., Gallis, M. A., and Bartel, T. J., "An Investigation of Two-Dimensional CAD Generated Models with Body Decoupled Cartesian Grids for DSMC," AIAA Paper 2000-2361, June 2000, presented at the 39th AIAA Thermophysics Conference, Miami, FL.
6. LeBeau, G. J., "A parallel implementation of the direct simulation Monte Carlo method," Computer Methods in Applied Mechanics and Engineering, Vol. 174, 1999, pp. 319-337.
7. Ivanov, M. S., Markelov, G. N., and Gimelshein, S. F., "Statistical simulation of reactive rarefied flows: numerical approach and applications," AIAA Paper 98-2669, 1998, presented at the 43rd AIAA Aerospace Sciences Meeting and Exhibit, Reno, NV.
8. Aftosmis, M. J., Berger, M. J., and Melton, J. E., "Robust and Efficient Cartesian Mesh Generation for Component-Based Geometry," AIAA Journal, Vol. 36, No. 6, 1998, pp. 952-960.
9. LeBeau, G., Jacikas, K., and Lumpkin, F., "Virtual Sub-Cells for the Direct Simulation Monte Carlo Method," AIAA Paper 03-1031, Jan. 2003, presented at the 39th AIAA Thermophysics Conference, Miami, FL.
10. Gao, D. and Schwartzentruber, T. E., "Parallel Implementation of the Direct Simulation Monte Carlo Method for Shared Memory Architectures," presented at the 48th AIAA Aerospace Sciences Meeting, AIAA Paper 2010-451, Jan. 2010.
11. Kannenberg, K. C. and Boyd, I. D., "Strategies for Efficient Particle Resolution in the Direct Simulation Monte Carlo Method," Journal of Computational Physics, Vol. 157, 2000, pp. 727-745.
12. Larsen, P. S. and Borgnakke, C., "Statistical Collision Model for Monte Carlo Simulation of Polyatomic Gas Mixture," Journal of Computational Physics, Vol. 18, 1975, pp. 405-420.
13. Boyd, I. D., "Rotational-Translational Energy Transfer in Rarefied Nonequilibrium Flows," Physics of Fluids A, Vol. 2, No. 3, 1990, pp. 447-452.
