A Three-Level Cartesian Geometry Based Implementation of the DSMC
Method
Da Gao, Chonglin Zhang, and Thomas E. Schwartzentruber
Department of Aerospace Engineering and Mechanics, University of Minnesota,
Minneapolis, MN 55455
The data structures and overall algorithms of a newly developed
3-D direct simulation Monte Carlo (DSMC) program are outlined. The
code employs an embedded 3-level Cartesian mesh, accompanied by a
cut-cell algorithm to incorporate triangulated surface geometry
into the adaptively refined Cartesian mesh. Such an approach
enables decoupling of the surface mesh from the flow field mesh,
which is desirable for near-continuum flows, flows with large
density variation, and also for adaptive mesh refinement (AMR). Two
separate data structures are proposed in order to separate geometry
data from cell and particle information. The geometry data structure
requires little memory so that each partition in a parallel
simulation can store the entire mesh, potentially leading to better
scalability and efficient AMR for parallel simulations. A simple
and efficient AMR algorithm that maintains local cell size and time
step consistent with the local mean-free-path and local
mean collision time is detailed. The 3-level embedded Cartesian mesh
combined with AMR allows increased flexibility for precise control
of local mesh size and time-step, both vital for accurate and
efficient DSMC simulation. Simulations highlighting the benefits of
AMR and variable local time steps will be presented along with
DSMC results for 3-D flows with large density variations.
I. Introduction
Direct Simulation Monte Carlo (DSMC) is a particle-based
numerical method$^{1}$ that simulates the Boltzmann equation. As a
result, DSMC is an accurate simulation tool for modeling dilute gas
flows ranging from continuum to free-molecular conditions. The DSMC
method tracks a representative number of simulation particles
through a computational mesh with each simulation particle
representing a large number of real gas molecules. A key aspect
of the method is that molecular movement and collision processes are
decoupled. Specifically, during each simulation time step, all
particles are first translated along their molecular velocity
vector without interaction. After the movement phase, nearby
particles that lie within the same computational cell are collided
in a statistical manner. This decoupling is accurate for dilute gas
flows as long as the simulation time step remains less than the
local mean-collision-time ($\Delta t$ [...] 100 million cells) an unstructured
tetrahedral mesh must not only be partitioned across parallel
processors during a DSMC simulation, but pre/post-processing, grid
generation, and AMR routines must also be parallelized due to the
large memory requirements. A Cartesian geometry model significantly
alleviates these problems and may therefore be more suitable for
future large-scale DSMC simulations.
2. A Cartesian geometry model necessitates a cut-cell method in
order to handle complex surface geometries. Although this may seem
like a drawback, a cut-cell approach is an extremely general and
accurate technique for complex geometries.$^{6,8}$ As discussed by LeBeau
et al.,$^{6}$ a cut-cell approach allows for complete decoupling between
the flow field and surface mesh discretization. Such decoupling is
especially important for the DSMC method and may be necessary for
large-scale simulations of near-continuum flows where the
mean-free-path near a vehicle surface may be orders of magnitude
smaller than accurate surface resolution requires. Likewise,
decoupling is equally important for low density flows
(such as low orbiting satellites) where the local value of $\lambda$ may be
much larger than fine surface geometry features.
3. The MGDS code employs a 3-level embedded Cartesian grid for
the flow field. Adding the third independent level of Cartesian
refinement adds considerable flexibility over previous 2-level
implementations for flows with large density gradients. Precise AMR
is essential for both simulation accuracy and computational
efficiency.
4. Use of a Cartesian geometry model not only results in lower
memory storage requirements, but also enables many DSMC calculations
to be performed with fewer floating point operations. Algorithms
for initial grid generation, successive AMR, particle movement, and
particle re-sorting during AMR require fewer operations when using a
Cartesian grid compared to a non-Cartesian grid. The efficiency of
such operations, and the degree to which they can be automated, will
become increasingly important as larger DSMC simulations are
performed, especially for time-accurate unsteady problems where
frequent AMR is required.
2 of 14
American Institute of Aeronautics and Astronautics
The spatial and temporal discretization required for accurate
and efficient DSMC simulation is inherently linked to the solution
itself. Thus, the DSMC method is ideal for fully automated grid
generation, AMR, and variable time steps, potentially requiring
little-to-no user input. As computational resources continue to
grow rapidly, such automation would avoid current bottlenecks in
total simulation turnover created by high demands on code users for
large-scale parallel flow problems. In addition to providing a new
DSMC simulation tool for complex flows over 3D geometries,
development of the MGDS code will provide independent
evaluation of existing DSMC implementation approaches, as well as
investigation of new approaches specifically geared towards future
large-scale parallel simulations.
The article is organized as follows. In Section II the data
structures used to organize geometry, cell, and particle data are
described, after which particle movement algorithms are outlined.
In Section III the main DSMC algorithm, the AMR and variable
time-step algorithm, and the cut-cell algorithm are described.
Results for various 3D simulations that demonstrate the benefits of
3-level AMR and variable time steps are presented in Section IV.
Section V contains a summary of the major aspects of the MGDS code and
conclusions drawn from the preliminary results.
II. Data Structures
A. Geometry Data Structure
Figure 1. Geometry data structure used within the MGDS code.
The Geometry data structure used within the MGDS code is
depicted in Fig. 1. It consists of a 3D array of level-one (L1) cell
structures. Each L1 cell structure contains the coordinates of the
bounding Cartesian vertices $(x_0, y_0, z_0, x_E, y_E, z_E)$ and a pointer to
a 3D array of level-two (L2) cell structures. This 3D array can
have arbitrary dimensions, meaning each L1 cell can contain an
arbitrary number of L2 cells. Each L2 cell structure is identical to
an L1 cell structure, and therefore contains a pointer to a 3D array
of L3 cell structures. In addition to storing the bounding vertex
coordinates, each L3 cell structure contains the local values of $\tau_c$,
$\lambda$, $\Delta t$, and currently the maximum expected collision rate
$\langle \sigma c_r \rangle_{max}$. Also, each L3 cell structure contains a pointer to an
array of surface structures. The majority of cells will contain no
boundary surfaces and the pointer will be null. However, for cells
which do contain boundary surfaces (described in Section III C),
each surface structure in the array contains vertex, surface normal,
area, surface type, and sampling variables. Given the coordinates
of a particle, the Geometry data structure enables efficient
cell-indexing of the particle. The most important aspect of this
Geometry data structure (Fig. 1) is
that it contains all of the information required for the AMR,
cut-cell, and variable time step (movement) procedures while
requiring relatively little memory storage.
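The nested layout described above can be illustrated with a minimal Python sketch (the MGDS code itself is written in Fortran 90). The names `Cell`, `subdivide`, `build_uniform`, and `locate` are hypothetical and chosen only for this illustration. Each cell stores its bounding vertices and a flattened 3D array of children, and a particle position is indexed level by level, yielding nine integer indices (three per level). The sketch assumes the position lies inside the root bounding box and omits surface data and the stored cell properties.

```python
from dataclasses import dataclass, field


@dataclass
class Cell:
    lo: tuple                      # bounding vertex (x0, y0, z0)
    hi: tuple                      # bounding vertex (xE, yE, zE)
    dims: tuple = (0, 0, 0)        # child-grid dimensions; zeros for an L3 cell
    children: list = field(default_factory=list)  # flattened 3D child array


def subdivide(cell, dims):
    """Fill cell.children with a uniform dims[0] x dims[1] x dims[2] grid."""
    cell.dims = dims
    size = [(cell.hi[a] - cell.lo[a]) / dims[a] for a in range(3)]
    cell.children = []
    for i in range(dims[0]):
        for j in range(dims[1]):
            for k in range(dims[2]):
                idx = (i, j, k)
                lo = tuple(cell.lo[a] + idx[a] * size[a] for a in range(3))
                hi = tuple(lo[a] + size[a] for a in range(3))
                cell.children.append(Cell(lo, hi))


def build_uniform(lo, hi, l1_dims, l2_dims, l3_dims):
    """Hypothetical helper: the same L2 and L3 dimensions in every parent."""
    root = Cell(lo, hi)
    subdivide(root, l1_dims)
    for l1 in root.children:
        subdivide(l1, l2_dims)
        for l2 in l1.children:
            subdivide(l2, l3_dims)
    return root


def child_index(cell, pos):
    """Integer (i, j, k) of the child of `cell` that contains `pos`."""
    out = []
    for a in range(3):
        frac = (pos[a] - cell.lo[a]) / (cell.hi[a] - cell.lo[a])
        out.append(min(int(frac * cell.dims[a]), cell.dims[a] - 1))
    return tuple(out)


def locate(root, pos):
    """Return the nine indices (L1, L2, L3) and the L3 cell containing pos."""
    indices = []
    cell = root
    for _ in range(3):
        i, j, k = child_index(cell, pos)
        indices.extend((i, j, k))
        cell = cell.children[(i * cell.dims[1] + j) * cell.dims[2] + k]
    return tuple(indices), cell
```

Because every level is a regular Cartesian array, each step of the lookup is a division and a truncation per coordinate; no neighbor or face connectivity is needed.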
Clearly the largest contribution to memory storage requirements
comes from the data stored in each L3 cell. While the current MGDS
implementation stores vertex coordinates for each L3 cell, this
information could simply be computed from the L2 vertex information
and the Cartesian index of the L3 cell, thereby greatly reducing
the memory storage requirement. Only a small percentage of L3 cells
that intersect a boundary surface will contain arrays of surface
structures which clearly involve substantially larger memory
requirements. Finally, each L3 cell structure contains both the
processor number and local array index (on that processor) where
its particle data and additional cell data is stored. It is proposed
that the entire Geometry data structure could be stored locally on
each processor in a parallel simulation even for large DSMC flow
field and surface meshes. For simulations where highly
complex surface geometry prohibits this, at a minimum, large
portions of the Geometry data structure could be stored on
each processor. The more memory-intensive cell and particle data
associated with each L3 cell (detailed in the next section) must be
partitioned across parallel processors. An important aspect of the
MGDS data structures is that any L3 cell data can be partitioned to
any processor. That is, domain decomposition is not limited to L1
or L2 cells, which is essential for load balancing of large parallel
simulations. The proposed benefits of such a Geometry data
structure, including efficient movement of particles with variable
time steps and efficient AMR, for parallel simulations will
be discussed in upcoming sections.
B. Cell/Particle Data Structure
Figure 2. Cell/Particle data structure used within the MGDS
code.
A DSMC code requires the storage and book-keeping of a large
amount of data. Complex data structures are often employed for data
organization and can provide the flexibility required for a general
DSMC implementation. Particle data is the dominant contributor to
memory; however, the global number of particles, as well as the
local number within each cell, can vary greatly within a DSMC
simulation. Thus, statically allocated arrays of particles must
either be conservatively large (unused memory) or re-sized often
(computationally and memory intensive). Likewise, since the mesh
resolution in DSMC depends on the solution itself, ideally, the
number of cells would change during a simulation and data structures
should account for this possibility as well.
The MGDS code adopts the cell and particle data structures
approach used in the MONACO code$^{4}$ with certain modifications. The
specific Cell/Particle data structure used by the MGDS code is
depicted in Fig. 2. On each parallel partition, an array of L3 cells
is maintained. This array is actually an array of pointers to L3
Cell/Particle structures so that if the array requires resizing
(during AMR), resizing an array of pointers is much less memory
intensive and more efficient than resizing an array of Cell/Particle
structures. Each cell-data structure contains substantial data
(currently including duplicated data from the Geometry data
structure). Each Cell/Particle structure also contains an array
that stores sampled (cell-averaged) data of the particles within the
cell. Currently each Cell/Particle structure also contains the nine
integer indices of the corresponding L3 cell in the Geometry data
structure. Finally, each Cell/Particle structure contains pointers
to the head and tail of a doubly-linked list of particle
structures. Each particle structure contains all required particle
information (as shown in Fig. 2) and in addition, also contains two
integers for the processor number
and index within the local cell array where the particle is
located (the reason for this will be explained in Section III). It
should be noted that cell-face data and cell-connectivity data are
not required since the grid is Cartesian.
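A minimal Python sketch of the Cell/Particle organization described above follows; it is an illustration under stated assumptions, not the MGDS Fortran derived types. The class and attribute names (`Particle`, `CellParticleData`, `dest_proc`, `dest_cell`, `geom_index`) are hypothetical. Each cell holds head and tail references to a doubly-linked particle list, so adding or removing a particle is a constant-time relinking operation, and each particle carries the two integers that record its destination.

```python
class Particle:
    def __init__(self, pos, vel):
        self.pos = pos          # (x, y, z)
        self.vel = vel          # (u, v, w)
        self.dest_proc = 0      # processor number where the particle is located
        self.dest_cell = 0      # index within that processor's local cell array
        self.prev = None        # links of the doubly-linked list
        self.next = None


class CellParticleData:
    def __init__(self, geom_index):
        self.geom_index = geom_index  # nine integers locating the L3 cell
        self.head = None
        self.tail = None
        self.count = 0

    def append(self, p):
        """Link particle p at the tail of this cell's list."""
        p.prev, p.next = self.tail, None
        if self.tail is None:
            self.head = p
        else:
            self.tail.next = p
        self.tail = p
        self.count += 1

    def remove(self, p):
        """Unlink particle p (assumed to belong to this cell) in O(1)."""
        if p.prev is None:
            self.head = p.next
        else:
            p.prev.next = p.next
        if p.next is None:
            self.tail = p.prev
        else:
            p.next.prev = p.prev
        p.prev = p.next = None
        self.count -= 1

    def __iter__(self):
        p = self.head
        while p is not None:
            yield p
            p = p.next
```

A per-partition cell array would then be a plain list of references to `CellParticleData` objects, which mirrors the array-of-pointers choice: resizing the list copies references only, not the cell data.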
The MGDS code is written in Fortran 90. All data structures are
defined as Fortran derived data types which enable general and
efficient packaging of data for broadcast among partitions in a
parallel simulation.
C. Particle Movement and Tracking
(a) Ray-tracing movement procedure (b) Cartesian movement
procedure.
Figure 3. Particle tracking procedures relevant to the DSMC
method.
Ray-tracing is a general and efficient method of tracking
particle movement through a computational mesh that is widely used
in the computer graphics industry and is also used by many DSMC
codes. The procedure is depicted schematically in Fig. 3(a) for
movement within an unstructured 2D triangular mesh. Essentially,
the time-to-hit each face of the current cell ($\Delta t_{hit,f}$ in Fig. 3(a))
is computed using the particle position/velocity and the face
vertices/normal-vector, where $\Delta x_f$ is the normal-distance between
the particle and cell-face. The particle is then advanced for the
minimum time $\Delta t_{hit,min} = \min_f(\Delta t_{hit,f})$ and the particle is
re-located to the neighboring cell. The process is then repeated for the
remaining time ($\Delta t_{sim} - \Delta t_{hit,min}$). Of course if
$\Delta t_{hit,min} > \Delta t_{sim}$ then the particle can
be moved for the full time step and remains located
in the current cell. This procedure is very general: it naturally
sorts particles while moving them, and naturally allows for variable
time steps to be used in each cell.
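The ray-tracing loop can be sketched in Python for the simplest setting: a single-level, uniform Cartesian grid with no domain boundaries or surfaces. The function names `time_to_exit` and `ray_trace_move` and the grid parameters `origin` and `dx` are assumptions made for this illustration. A multi-level grid, surface collisions, and per-cell time steps would add bookkeeping that is omitted here.

```python
import math


def time_to_exit(pos, vel, lo, hi):
    """Minimum time-to-hit over the faces of the axis-aligned cell [lo, hi].

    Returns (t_hit_min, (axis, side)), with side = +1 or -1 for the face hit.
    """
    t_min, face = math.inf, None
    for a in range(3):
        if vel[a] > 0.0:
            t, side = (hi[a] - pos[a]) / vel[a], +1
        elif vel[a] < 0.0:
            t, side = (lo[a] - pos[a]) / vel[a], -1
        else:
            continue  # moving parallel to this pair of faces
        if t < t_min:
            t_min, face = t, (a, side)
    return t_min, face


def ray_trace_move(pos, vel, index, dt, origin, dx):
    """Advance one particle for dt, re-locating it cell by cell."""
    pos, index = list(pos), list(index)
    remaining = dt
    while True:
        lo = [origin[a] + index[a] * dx for a in range(3)]
        hi = [lo[a] + dx for a in range(3)]
        t_hit, face = time_to_exit(pos, vel, lo, hi)
        if t_hit >= remaining:
            # No face is reached: finish the move inside the current cell.
            pos = [pos[a] + vel[a] * remaining for a in range(3)]
            return tuple(pos), tuple(index)
        # Advance to the nearest face and step into the neighboring cell.
        pos = [pos[a] + vel[a] * t_hit for a in range(3)]
        axis, side = face
        pos[axis] = hi[axis] if side > 0 else lo[axis]  # snap onto the face
        index[axis] += side
        remaining -= t_hit
```

For example, a particle at (0.5, 0.5, 0.5) in cell (0, 0, 0) of a unit grid, moving with velocity (1, 0, 0) for a time of 2, crosses two faces and finishes at (2.5, 0.5, 0.5) in cell (2, 0, 0).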
Within a Cartesian grid, a particle may be moved for the full
time step regardless of how many cells are crossed. The particle's
new position can then be re-indexed on the Cartesian grid very
efficiently compared to re-indexing on a non-Cartesian grid. After
re-indexing, the particle can be re-located directly to the new
cell. This procedure, called Cartesian-move, is depicted in Fig.
3(b). However, even on a Cartesian grid, there are three main
drawbacks to the Cartesian-move procedure. The first is that
variable time steps cannot be used in each cell. This is not a
drawback for computing unsteady flows; however, it is a significant
drawback when computing steady-state flows as variable time steps
increase the simulation efficiency greatly. The second drawback is
that re-indexing on a multi-level Cartesian grid is rather
computationally expensive, and may approach the expense of
ray-tracing. Third, when simulating flow over complex geometries,
ray-tracing must be used in cells close to boundary surfaces,
thereby requiring the additional complexity of mixed Cartesian-move
and ray-trace algorithms.
Although the MGDS code maintains the option of the
Cartesian-move procedure, the default is to use the ray-tracing
procedure throughout the entire simulation domain. It is important
to realize, however, that the ray-tracing algorithm is more
efficient on a Cartesian grid than on a non-Cartesian grid, since
the dot products (shown in Fig. 3(a)) involve only a single
component. Furthermore, a trend in DSMC algorithm research is to
enlarge cells above the $\Delta x \approx \lambda$ constraint without loss of accuracy
through the use of virtual subcells.$^{9}$ The consequence of this trend
for 3D simulations will be that a large majority of simulation
particles remain within the same cell during a given time step. On a
Cartesian grid, the ray-trace procedure to determine if
$\Delta t_{hit,min} > \Delta t_{sim}$ is highly efficient. Further quantitative
results comparing these two movement procedures can be found in Ref.
10.
III. Algorithms
A. Core DSMC Algorithm
By separating the Geometry data structure (Fig. 1) from the
Cell/Particle data structure (Fig. 2), the main DSMC loop within the
MGDS code is made more compact and communication between processors
is potentially lowered when particles move across partitions in a
parallel simulation. The resulting compact DSMC loop is shown in
Fig. 4.
Loop (for each L3 cell)
    Collide Particles (update particle properties)
    Sample Properties
    Move (update particle positions)
        - recursive ray-trace algorithm
        - including surface collisions
        - record new x,y,z, proc#, local cell#
        - particles remain in original linked-list
    IF (INFLOW) Generate/Move Particles
End Loop
Package up communication
Send data to partitions
Global Sort (loop over all cells/particles)
Figure 4. Main DSMC algorithm within the MGDS code.
Each L3 Cell/Particle structure (Fig. 2) contains all necessary
information to perform collisions and sample particle properties
within the cell, independent of any other cell in the mesh. Next,
since the entire Geometry data structure (or large portions of it)
is stored on each partition, all particles within a given cell can
be moved for their complete time step including variable time steps
and surface collisions. Specifically, without accessing any other
cell data (which may be located on a different partition), the x, y,
and z positions, the processor index, and the local cell array
index of each particle can be updated solely using the Geometry data
structure. Finally, if the current cell contains inflow surfaces,
then particles are generated and fully-moved as just
described. The result is a single loop over all L3 cells for each
full simulation time step. Each cell within this loop can be
processed independently of all other cells in the simulation. It is
important to note that upon completion of this loop, all cells
still contain linked-lists of their original particles. Only the
particle coordinates, destination processor, and local cell index
on that processor have been updated; that is, each particle knows
exactly where to be sent. At this point the particle data is
packaged up and sent to the appropriate partition. As depicted in
Fig. 3(b) this approach may have the potential to limit the
inter-processor communication for occasional particles that cross
multiple partitions in a single time step. The final step is a global
sort algorithm that removes particles from current linked-lists and
adds them to destination linked-lists. This is the only step that
is not independent of other cells in the simulation. Optimization of
the above movement and sorting procedures is discussed in Ref. 10
along with a shared-memory parallelization technique which threads
both the main DSMC loop and the global sorting algorithm over
multi-core processors.
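The separation between the per-cell move and the global sort can be sketched in Python as two phases. This is a single-partition illustration under simplifying assumptions: plain Python lists stand in for the linked lists, particles are dictionaries, the move is a straight-line advance, and `locate` is a hypothetical callback that plays the role of the Geometry data structure by returning the destination cell index. The processor index and the communication step are omitted.

```python
def move_phase(cells, locate, dt):
    """Phase 1: each cell is processed independently of all others.

    Positions and destinations are updated, but every particle stays in
    the list of its original cell.
    """
    for cell in cells:
        for p in cell["particles"]:
            p["pos"] = tuple(x + v * dt for x, v in zip(p["pos"], p["vel"]))
            p["dest"] = locate(p["pos"])


def global_sort(cells):
    """Phase 2: move each particle to the list of its destination cell."""
    moved = []
    for idx, cell in enumerate(cells):
        stay = []
        for p in cell["particles"]:
            (stay if p["dest"] == idx else moved).append(p)
        cell["particles"] = stay
    for p in moved:
        cells[p["dest"]]["particles"].append(p)
```

Only `global_sort` touches more than one cell at a time, which is consistent with the statement that it is the only step not independent of other cells.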
B. Adaptive Mesh Refinement (AMR)
The AMR algorithm used within the MGDS code inputs the current
Geometry data structure, generates a new Geometry data structure
adapted precisely to the local mean-free-path, sets local time steps
in accordance with the local mean-collision-time, and also
interpolates the maximum expected collision frequency
($\langle \sigma c_r \rangle_{max}$) between old and new
meshes. The algorithm then updates the Cell/Particle data structure
as cells are added or removed and finally, the global sort algorithm
re-sorts all particles into the appropriate linked-lists in the
revised data structure. In this manner, the MGDS simulation
continues in a smooth and accurate manner. The AMR procedure is
called approximately 4-10 times during a typical steady-state MGDS
simulation. With the current implementation described below, one
call to the AMR function requires approximately the same
computational time as 10 simulation time steps, regardless of the
size of the simulation. While this has a negligible effect on the
overall simulation time for a steady-state solution, future
efficiency improvements will be important for unsteady simulations
where the AMR procedure might be called every few time steps. An
overview of the AMR procedure is detailed below along with a
3-level Cartesian grid schematic in Fig. 5.
It is important to note that L1 cell sizes remain fixed during
the simulation. Thus, the AMR procedure is applied within each L1
cell independently of all other L1 cells according to the following
steps:
Figure 5. Schematic of 3-level Cartesian mesh and AMR
procedure.
1. Loop through all L3 cells contained within this L1 cell and
determine $\lambda_{max}$.
2. Set the new L2 cell size to $\lambda_{max}$ and generate a new Cartesian
L2 cell grid within the L1 cell.
3. Loop through all new L2 cells and determine $\lambda_{min}$ within each.
This involves determination of which old L3 cells intersect with
each new L2 cell. On a 3D Cartesian grid, computing the volume of
intersection between two generic cells is straightforward and
efficient.
4. Set the new L3 cell size within each new L2 cell to $\lambda_{min}$.
5. Compute the volume of intersection between new and old L3
cells within each large L1 cell and interpolate (using a volume
average) values of $\lambda$, $\tau_c$, and $\langle \sigma c_r \rangle_{max}$ into the new L3 cells.
This information can then be used to accurately set an appropriate
new local time step and maintain the local collision rate.
6. After processing all L1 cells via the above steps, the
complete Geometry data structure is now updated.
7. The new Geometry data structure is now used to update the
pointer array of Cell/Particle structures (Fig. 2) as well as modify
cell data. Note that in changing the dimensions/vertices of cells,
the particles contained within these cells may now be completely
un-sorted.
8. Finally, the coordinates of all particles are re-indexed
using the new Geometry data structure. The global sort algorithm is
then called which adds/removes particle pointers to/from the
correct linked lists. The MGDS simulation is now ready to carry on
with the general move/collide DSMC algorithm.
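The geometric kernels behind steps 3 and 5 can be sketched in Python as follows. The function names and the dictionary layout of a cell (`"lo"`, `"hi"`, plus a stored property such as `"mfp"`) are hypothetical. The helper `cells_per_side` reflects one possible way of turning a target cell size into an integer number of cells per side by rounding up; the source does not specify how MGDS performs this rounding, so it should be read as an assumption.

```python
import math


def overlap_volume(lo_a, hi_a, lo_b, hi_b):
    """Volume of intersection of two axis-aligned Cartesian cells."""
    vol = 1.0
    for a in range(3):
        length = min(hi_a[a], hi_b[a]) - max(lo_a[a], lo_b[a])
        if length <= 0.0:
            return 0.0  # the cells do not overlap along this axis
        vol *= length
    return vol


def volume_average(new_lo, new_hi, old_cells, key):
    """Interpolate old_cells[...][key] into a new cell by overlap volume.

    Assumes at least one old cell overlaps the new cell.
    """
    total, weight = 0.0, 0.0
    for old in old_cells:
        v = overlap_volume(new_lo, new_hi, old["lo"], old["hi"])
        total += v * old[key]
        weight += v
    return total / weight


def cells_per_side(side_length, target_size):
    """Assumed rounding rule: enough cells that none exceeds target_size."""
    return max(1, math.ceil(side_length / target_size))
```

For instance, a new cell that straddles two equally sized old cells with mean-free-paths of 1 and 3, overlapping each by half of its volume, receives an interpolated value of 2.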
C. Cut Cell Algorithm
Since the MGDS code already uses the ray-tracing technique to
determine particle intersections with cell faces, arbitrary
triangulated surface meshes can be naturally embedded within the
flow field grid without modification to the MGDS data structures or
algorithms. The cut-cell method performs two main functions. The
first is to read in a list of triangular surface elements (generated
by various commercial surface triangulation packages) and sort all
surface elements into the appropriate L3 cells within the Geometry
data structure. Note that if a single large surface triangle cuts
through multiple small L3 cells, the surface element may
simply be added to multiple cells and does not need to be trimmed.
The second main function is to compute the volume of each cut-cell
required to determine the collision rate and various macroscopic
properties in the cut-cell. Both functions involve moderately
complex geometrical calculations, but leave the basic DSMC data
structures and algorithms completely unchanged.
In order to sort a list of triangular surface elements into the
list of L3 Cartesian cells, the MGDS code employs a Cut Cell
Intersection technique initially detailed in computer graphics
literature and more recently adapted for CFD simulations by Aftosmis
et al.$^{8}$ The technique is able to use computationally efficient
bitwise operations and comparisons to determine intersections
between triangular elements and Cartesian cells. Further details on
computing the intersection between generally positioned triangles in
3D are contained in Ref. 8.
Once all surface elements are sorted into appropriate L3 cells,
the volume of each of these cut-cells is computed. The MGDS code
uses a simple Monte Carlo technique to compute cut-volumes.
Co-ordinates are chosen at random
within the un-cut Cartesian cell. The dot product between the
chosen coordinates and the normal vector of a given surface element
is computed. If the dot products are negative for all surface
elements contained within the cell, then the point lies outside of
the flow domain. The cut-volume is simply determined from the
fraction of points lying outside the domain, that is, the number of
such points divided by the total number of Monte Carlo co-ordinates
considered ($N$). The error in such a volume calculation scales
directly as $1/\sqrt{N}$. It should be noted that this procedure is
suitable only when the surface geometry cutting a given cell is
either completely convex or completely concave. More complex collections
of surface elements within a single L3 cell are currently not supported
by the MGDS code and advanced methods for such geometries are
discussed in Ref. 8.
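The Monte Carlo volume estimate can be sketched in Python as below. Several details are assumptions made for this illustration: each surface element is reduced to a point on the element plus a normal vector pointing into the flow, the dot product is taken between that normal and the vector from the element point to the sample, and a seeded generator is used so the estimate is repeatable. The function name `cut_cell_volume` and its parameters are hypothetical.

```python
import random


def cut_cell_volume(lo, hi, surfaces, n_samples=20000, seed=0):
    """Estimate the flow-side volume of the Cartesian cell [lo, hi].

    surfaces: list of (point_on_element, normal_into_flow) pairs; the
    collection is assumed to be completely convex or completely concave.
    """
    cell_volume = 1.0
    for a in range(3):
        cell_volume *= hi[a] - lo[a]
    if not surfaces:
        return cell_volume  # un-cut cell: the whole volume is in the flow

    rng = random.Random(seed)
    outside = 0
    for _ in range(n_samples):
        x = [rng.uniform(lo[a], hi[a]) for a in range(3)]
        # Negative dot product for every element: outside the flow domain.
        if all(
            sum((x[a] - p[a]) * n[a] for a in range(3)) < 0.0
            for p, n in surfaces
        ):
            outside += 1
    return cell_volume * (1.0 - outside / n_samples)
```

For a unit cell cut by the plane x = 0.25 with its normal pointing in the +x direction, the estimate approaches 0.75, with a statistical error that shrinks as $1/\sqrt{N}$.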
The cut-cell algorithm is called at the beginning of each
simulation and also immediately following each call to the AMR
function. The current cut-cell algorithm requires less
computational time than a single AMR call for all simulations
presented in this article. Finally, inflow, outflow, symmetry, or
wall surfaces may be specified by the user on all sides of the
overall Cartesian bounding box of the simulation. These outer
surfaces are added to the appropriate L3 cells in addition to the
triangulated surface elements sorted by the cut-cell algorithm.
IV. MGDS Simulation Results
A. Effect of variable time steps
Mach 5 flow of argon over a 10 cm flat plate at 30 degrees angle
of attack is simulated to steady-state with and without the use of
variable time steps. The free-stream density and temperature are
$7.5 \times 10^{-5}$ kg/m$^3$ and 200 K, respectively. The temperature of the flat
plate is held fixed at $T_{wall}$ = 2000 K on the bottom and $T_{wall}$ = 200
K on the top. The variable hard-sphere (VHS) collision model is used
with a power law value of $\omega$ = 0.81, and diffuse reflection and full
thermal accommodation are imposed for surface collisions. The
simulation is three-dimensional, stretching approximately
$4\lambda$ in the z direction. However, since symmetry boundary conditions
are imposed on the z boundary planes, the resulting flow is uniform in the z
coordinate direction and only the two-dimensional flow field
results are presented.
(a) Normalized density field. (b) Number of simulation particles
per cell.
Figure 6. MGDS simulation results without the use of variable
time steps.
Figure 6(a) shows the resulting density field (normalized by the
free-stream density) when a constant simulation time step is used.
The 3-level Cartesian grid has been adapted to the local
mean-free-path and the constant time step is less than
$\frac{1}{2}\tau_c$ everywhere, thus producing an accurate solution.
It is evident from Fig. 6(a) that the density increases by
a maximum of 5 times between the free-stream and the leading edge
of the plate. Although the density (and therefore
the number density, $n$) increases towards the plate leading edge,
the cell size decreases proportionately. Thus AMR
should naturally provide some control in maintaining a constant
number of particles per cell. However, as evident from Fig. 6(b),
even with AMR, the average number of simulation particles per cell
varies widely from less than 10 to
more than 500. If one considers the number of real gas molecules
located within a volume of one cubic mean-free-path ($N_{real}$), the
following dependence is realized:
$$N_{real} = n\lambda^3, \qquad n \propto \rho, \qquad \lambda \propto 1/\rho. \qquad (1)$$
If constant simulation particle weights are used, the number of
simulation particles per cell should therefore scale
with $1/\rho^2$ in 3D (and $1/\rho$ in 2D). For example, if the
density drops by one order of magnitude, although the number of
molecules per unit volume drops equally, the volume under
consideration rises by three orders of magnitude. Therefore, even
when every cell is refined precisely to the local value of $\lambda$, if
there were 10 particles per cell in the high density region near the
leading edge, one would expect $(\rho_{LE}/\rho_\infty)^2 = 5^2 = 25$ times
more particles in the lower density free-stream cells. This is
precisely verified in Fig. 6(b) where there are approximately 250
particles per cell in the free-stream region and 10 near the leading
edge of the flat plate. Note also how, in the very low density region
above the plate, there are an enormous number of particles per cell. As
stated earlier, only 20 particles per cell are required
for accuracy and using additional particles per cell is inefficient.
Thus, even with precise AMR for 3D simulations,
large variations in the number of particles per cell will occur
naturally in a DSMC simulation, resulting in very
inefficient simulations using far more particles
than required for statistical accuracy.
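The scaling stated after Eq. (1) follows from a one-line substitution, spelled out below as a short worked step. The labels $N_{3D}$ and $N_{2D}$ are introduced here only for illustration, and the cell is assumed to be sized exactly to one local mean-free-path in each direction.

```latex
\begin{align}
N_{3D} &\propto n\,\lambda^{3} \propto \rho \cdot \rho^{-3} = \rho^{-2}, \\
N_{2D} &\propto n\,\lambda^{2} \propto \rho \cdot \rho^{-2} = \rho^{-1}.
\end{align}
```

With the factor-of-five density rise at the leading edge, the 3D relation gives $5^2 = 25$, so 10 particles per cell near the leading edge correspond to roughly 250 per cell in the free-stream, in line with Fig. 6(b).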
(a) Local time step ratio in each L3 cell. (b) Number of
simulation particles per cell.
Figure 7. MGDS simulation results with the use of variable time
steps.
As described by Kannenberg and Boyd,$^{11}$ the use of a variable
time step in each cell significantly alleviates this problem.
Compared to a global reference time step, $\Delta t_{ref}$, the time step
used to advance particles within a given cell is
increased or decreased by a factor that varies from cell to cell
($\Delta t = \Delta t_{ratio}\,\Delta t_{ref}$). This factor is equal to the ratio of
the local mean-collision-time to the global reference time step,
$\Delta t_{ratio} = S\,\tau_c/\Delta t_{ref}$, and can be scaled with a globally
constant factor $S$. At the same time, the weight for all particles
within the cell is also multiplied by $\Delta t_{ratio}$. Essentially,
where $\tau_c$ (and therefore $\lambda$) is large, particles move more
rapidly through these larger cells. During a given instant,
this reduces the number of particles found
in that cell and, by adjusting the particle weight by the same
factor, the number density in the cell remains correct. Since
$\tau_c \propto 1/\rho$, the combination of variable time steps and AMR improves
the scaling for the number of particles per cell to $N \propto$ constant
in 2D and $N \propto 1/\rho$ in 3D. As detailed in Ref. 11, the
new time step, number of particles, and particle weight are used in
each cell to compute collision rates and outcomes
without any modification to the DSMC algorithm.
The MGDS code is able to set a variable time step in each L3
cell. In addition, this local time step can be slightly
adjusted to account for inexact AMR. Even when using a 3-level
adaptive grid, locally $\Delta x \neq \lambda$ precisely. In the MGDS
code, the local time step factor
is set as $\Delta t_{ratio} = S\,(\tau_c/\Delta t_{ref})\,(\Delta x/\lambda)$. Thus if a given cell is
slightly too small, the time step is lowered such that additional
particles will accumulate in this smaller cell. It should be noted
that adjusting the local time step in this manner is accurate as
long as $\Delta x/\lambda \approx 1$, which is the case when using a 3-level
Cartesian grid with AMR. Figure 7(a) shows the time scale factor used
in each L3 cell for the flat plate problem. The
combination of precise AMR and
precise local time step adjustment leads to a much more uniform
distribution of
particles per cell as seen in Fig. 7(b). The flow field
solutions, surface heating rates, and surface shear stress
profiles obtained when using the variable time step method are
verified to reproduce the results obtained using a
constant simulation time step which required substantially more
particles.
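The local time step factor and the matching weight adjustment can be sketched in Python as follows. The function names, the dictionary keys (`"tau_c"`, `"dx"`, `"mfp"`), and the `base_weight` argument are hypothetical and introduced only for this illustration; the formulas are those quoted above.

```python
def time_step_ratio(tau_c, dt_ref, scale=1.0, dx=None, mfp=None):
    """Local factor S * (tau_c / dt_ref), optionally times (dx / mfp)."""
    ratio = scale * tau_c / dt_ref
    if dx is not None and mfp is not None:
        ratio *= dx / mfp  # correction for a cell not sized exactly to mfp
    return ratio


def apply_local_time_step(cell, dt_ref, base_weight, scale=1.0):
    """Set the local time step and particle weight of one L3 cell."""
    ratio = time_step_ratio(cell["tau_c"], dt_ref, scale, cell["dx"], cell["mfp"])
    cell["dt"] = ratio * dt_ref          # time step used to move particles
    cell["weight"] = ratio * base_weight  # weight scaled by the same factor
    return cell
```

A cell whose size is 90% of the local mean-free-path therefore receives a time step and a particle weight that are both 10% lower than the unadjusted values, so slightly more simulation particles accumulate in it while the number density stays correct.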
Varying the particle weight in each cell independent of the time
step is an alternate method of controlling the number of particles
per cell that is used in existing DSMC codes and also discussed in
Ref. 11. Here, particles are either cloned or deleted in order to
obtain the desired number of particles per cell, and their weights
are adjusted accordingly. While the deletion of simulation particles
is not thought to influence solution accuracy, the cloning
of particles may correlate statistics and result in random walk
errors.$^{1}$ Although techniques exist to minimize such errors,
the combination of AMR and local time steps should greatly reduce the
degree to which particles are cloned/destroyed.
B. Hypersonic blunt body flows
(a) Temperature field and final 3-level Cartesian grid. (b)
Close-up of the plate-tip flow region.
Figure 8. Simulation results for Mach 15 nitrogen flow over a
vertical flat plate.
Mach 15 flow of diatomic nitrogen gas over a 16 cm vertical flat
plate is simulated with the MGDS code using both AMR and the
variable time step method. The free-stream density and temperature
are 6.5 × 10^-6 kg/m^3 and 80 K, respectively. The temperature of the flat
plate is held fixed at T_wall = 1500 K on the front surface and T_wall =
200 K on the back. The variable hard-sphere (VHS) collision model is
used with a power-law exponent of ω = 0.75, and diffuse reflection and
full thermal accommodation are imposed for surface collisions. The
Larsen-Borgnakke model (Ref. 12) is used for translational-rotational energy
exchange, together with the variable rotational energy exchange
probability model of Boyd (Ref. 13), using a maximum rotational collision
number of 18.1 and a reference temperature for rotational energy
exchange of 91.5 K. Vibrational energy is not considered. Again, the
simulation is three-dimensional, extending approximately 8λ in the z
direction with symmetry boundary conditions. Only the
two-dimensional flow field results are presented.
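To give a sense of what the two rotational relaxation parameters quoted above imply, the sketch below evaluates the temperature-dependent Parker expression for the rotational collision number, Z_rot(T) = Z_inf / [1 + (π^(3/2)/2)(T*/T)^(1/2) + (π^2/4 + π)(T*/T)], with Z_inf = 18.1 and T* = 91.5 K. Using this continuum-level expression is an assumption made here for illustration: the model of Ref. 13 applies a variable exchange probability collision by collision, and the function name `parker_rotational_collision_number` is hypothetical.

```python
import math

Z_INF = 18.1    # maximum rotational collision number quoted in the text
T_STAR = 91.5   # reference temperature for rotational energy exchange, K


def parker_rotational_collision_number(temperature, z_inf=Z_INF, t_star=T_STAR):
    """Parker's temperature-dependent rotational collision number.

    Illustrative only; not the per-collision probability used in a DSMC code.
    """
    ratio = t_star / temperature
    denominator = (1.0
                   + 0.5 * math.pi ** 1.5 * math.sqrt(ratio)
                   + (0.25 * math.pi ** 2 + math.pi) * ratio)
    return z_inf / denominator


# Collision number at the free-stream temperature, at room temperature,
# and at a hot post-shock temperature (the last two values are assumed).
z_cold = parker_rotational_collision_number(80.0)
z_room = parker_rotational_collision_number(300.0)
z_hot = parker_rotational_collision_number(5000.0)
```

The collision number rises monotonically with temperature toward Z_inf, so rotational relaxation requires more collisions in the hot shock layer than in the cold free stream.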
The MGDS simulation begins with a uniform Cartesian mesh
(L3=L2=L1) that is sized to Δx = 0.5λ and a
uniform time step set to Δt = 0.2τc. The simulation rapidly reaches
steady state and the solution is then sampled for
approximately 200 time steps. At this
point the AMR function is called, which refines the mesh, re-sorts
all particles in the new mesh, resets all local time steps, and then
continues with the simulation. The translational temperature field
after four AMR calls is shown in Fig. 8(a), which clearly shows the
bow shock and compression region ahead of the plate and a rarefied
region behind the plate. A close-up of the plate tip is shown in
Fig. 8(b). Here the level of refinement between the L1 cells (behind the
plate) and the L3 cells in front is seen to differ by a factor of 30.
Indeed, without the variable time step method, the large cells in
the plate wake would naturally contain 30^2 = 900 times more
particles than the refined cells in front of the plate. Also evident
in Fig. 8(b) is the increased flexibility a 3-level Cartesian
mesh has over a 2-level mesh. A 2-level Cartesian mesh requires
uniform cell spacing within each L1 cell, which would
result in substantially more (unnecessary) cells in the high-gradient region
upstream of the flat plate. Not only would a 2-level Cartesian mesh
require more cells and therefore more particles, but it would also
introduce further variation in
the number of particles per cell. In such a 2-level mesh, some
of the cells would now be much smaller than the local value of λ.
As a result, these cells may no longer contain sufficient particles
(about 20) and the solution may become inaccurate (not just
inefficient). In order to restore accuracy, either the global
number of particles would have to be increased (highly inefficient)
or particle cloning would become necessary.
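The sizing rules used for the initial mesh can be made concrete with a short calculation. The sketch below estimates the free-stream mean free path and mean collision time for nitrogen and from them a cell size of half a mean free path and a time step of one fifth of a mean collision time. Several inputs are assumptions rather than values from this paper: the hard-sphere expression λ = 1/(√2 π d^2 n) is used in place of the VHS model, the molecular diameter and mass are typical reference values for N2, and the free-stream density is taken as 6.5e-6 kg/m^3. The function name `hard_sphere_scales` is hypothetical.

```python
import math

K_B = 1.380649e-23   # Boltzmann constant, J/K
MASS_N2 = 4.65e-26   # molecular mass of N2, kg (assumed reference value)
DIAM_N2 = 4.17e-10   # molecular diameter of N2, m (assumed reference value)


def hard_sphere_scales(density, temperature, mass, diameter):
    """Return (mean free path, mean collision time) for a hard-sphere gas."""
    number_density = density / mass
    mean_free_path = 1.0 / (math.sqrt(2.0) * math.pi * diameter ** 2 * number_density)
    mean_speed = math.sqrt(8.0 * K_B * temperature / (math.pi * mass))
    return mean_free_path, mean_free_path / mean_speed


# Free-stream state (density value assumed, temperature from the text).
lam, tau = hard_sphere_scales(6.5e-6, 80.0, MASS_N2, DIAM_N2)

dx_initial = 0.5 * lam   # initial uniform cell size
dt_initial = 0.2 * tau   # initial uniform time step

# With equal particle weights and equal number density, a cell that is
# 30 times larger in each of two directions holds 30**2 times more particles.
size_ratio = 30
particle_ratio = size_ratio ** 2
```

Under these assumptions the free-stream mean free path is of the order of 1 cm, so the 16 cm plate spans only a few tens of initial cells before refinement.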
Figure 9. Mach 15 nitrogen flow over a 3D vertical flat
plate.
In order to highlight the capability of the MGDS geometry model,
AMR algorithm, and cut-cell algorithm, two additional solutions are
presented in Fig. 9 and Fig. 10. The validation of MGDS simulation
predictions against experimental results is not addressed in the
current article. Figure 9 depicts 3D flow over a vertical flat
plate of finite size in the z direction. The free-stream conditions
are identical to those used for the simulation displayed in Fig.
8, except that T_wall = 2500 K on the front of the plate. This
modification results in more moderate density gradients near
the plate surface and requires less refinement, thereby enabling
this 3D flow to be simulated on a single processor. Figure 9
highlights how the general AMR algorithm outlined in Section III.B
is capable of smoothly adapting a 3-level Cartesian mesh for 3D
flows involving large density gradients. Contours of translational
temperature are plotted for the flow field, and contours of the heat
transfer coefficient are plotted on the plate surface. In Fig. 9,
the fine grid spacing on the front of the plate and the coarse
spacing immediately behind the plate are clearly evident.
Finally, Fig. 10 shows a 3D solution of hypersonic flow of argon
gas over a cylinder. The solution is uniform in the z direction and
only the resulting 2D flow field is displayed. The free-stream
density and temperature are 6.5 × 10^-6 kg/m^3 and 80 K, respectively. The
temperature of the cylinder surface is held fixed at T_wall = 1500 K.
Diffuse reflection and full thermal accommodation are imposed for
surface collisions. The free-stream velocity is set to correspond
to a very high Mach number of 29. The resulting temperatures (seen
in Fig. 10) are unrealistically high because no ionization reactions are
enabled. However, such a high Mach number was selected
in order to test the MGDS AMR and cut-cell algorithms with a challenging
case involving large density variations. The cylinder geometry is
specified as a triangulated surface and is cut from the initial
uniform Cartesian flow field mesh, and also cut from each new
mesh generated by the AMR function. Although the cylinder geometry
may seem simple, the cut cells resulting from the surface geometry
span a wide range of volumes, from nearly complete cells to
very small slices of the original Cartesian cells. A close-up view
of the mesh refinement near the cylinder surface at approximately
45 degrees from the stagnation point is shown in Fig. 11.
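The wide range of cut-cell volumes can be illustrated with a Monte Carlo estimate of the kind this paper uses for its cut-cell method: random points are drawn inside a Cartesian cell and the fraction lying outside the body approximates the remaining flow volume fraction. In the sketch below the body is an analytic unit-radius cylinder rather than a triangulated surface, which is a simplification made here; the function names, the sample count, and the fixed seed are likewise assumptions.

```python
import random


def cut_cell_fluid_fraction(cell_min, cell_max, inside_body, n_samples=20000, seed=0):
    """Estimate the fraction of a Cartesian cell that lies outside the body."""
    rng = random.Random(seed)
    outside = 0
    for _ in range(n_samples):
        point = tuple(rng.uniform(lo, hi) for lo, hi in zip(cell_min, cell_max))
        if not inside_body(point):
            outside += 1
    return outside / n_samples


def inside_unit_cylinder(point):
    """Hypothetical body: cylinder of radius 1 whose axis is the z axis."""
    return point[0] ** 2 + point[1] ** 2 < 1.0


# A cell cut by the surface, a cell entirely in the flow,
# and a cell entirely inside the body.
frac_cut = cut_cell_fluid_fraction((0.0, 0.0, 0.0), (1.0, 1.0, 1.0), inside_unit_cylinder)
frac_free = cut_cell_fluid_fraction((2.0, 2.0, 0.0), (3.0, 3.0, 1.0), inside_unit_cylinder)
frac_solid = cut_cell_fluid_fraction((0.0, 0.0, 0.0), (0.5, 0.5, 1.0), inside_unit_cylinder)
```

For the cut cell in this example the exact flow fraction is 1 - π/4, so the estimate can be checked against a known value. For a thin sliver cell the statistical error is a much larger share of the volume, so more samples are needed.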
Figure 10. Mach 29 argon flow over a cylinder.
Figure 11. Close-up view of the cylinder mesh at 45 degrees from
stagnation point.
V. Summary and Conclusions
In this article, the data structures and algorithms of a newly
developed 3D direct simulation Monte Carlo (DSMC) program, called
the Molecular Gas Dynamic Simulator (MGDS) code, are outlined. The
code employs an embedded 3-level Cartesian mesh, accompanied by a
cut-cell algorithm to incorporate triangulated surface geometry
into the adaptively refined Cartesian mesh. This geometry model is
selected for its low memory storage requirements, its ability to
decouple surface and flow field discretizations, and the increased
computational efficiency of DSMC particle movement procedures compared with
non-Cartesian meshes. Two separate data structures are proposed in
order to separate geometry data from cell and particle information.
The geometry data structure contains all information
required for particle movement and AMR procedures;
however, for a Cartesian mesh, this still involves so little memory
that each partition in a parallel simulation could potentially
store the entire geometry. A separate cell/particle data structure
is maintained that holds all cell and particle data. This data
structure requires significant memory storage and must be
partitioned in a parallel simulation. Such separate geometry and
cell/particle data structures enable particles within each cell to
be moved for a full time step without requiring information from
other cell data structures, including movement with variable time
steps across multiple cells. Within the MGDS code, particles are
moved through the 3-level Cartesian mesh using a ray-tracing
technique. A simple and efficient AMR algorithm that maintains
local cell size and time step consistent with the local
mean-free-path and local mean collision time is detailed.
Simulations are presented that demonstrate significant efficiency
gains for 3D flows with the use of precise AMR and variable time
steps. Finally, a cut-cell method is outlined that sorts an
arbitrary list of triangulated surface elements into local Cartesian
cells and computes the cut volume using a Monte Carlo technique.
The ray-trace particle movement algorithm is then able to accurately
detect and simulate particle collisions with complex 3D surface
geometries.
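The ray-tracing movement summarized above can be illustrated on a single-level uniform Cartesian grid. The sketch below advances one particle for a time step by repeatedly finding the next cell face it reaches, moving it to that face, and stepping into the neighboring cell. It is a simplified illustration under stated assumptions, not the MGDS procedure: it omits the 3-level hierarchy, surface intersections, and variable local time steps, and the function name `move_particle` is hypothetical.

```python
import math


def move_particle(pos, vel, dt, cell_size):
    """Ray-trace a particle through a uniform Cartesian grid for time dt.

    Returns the final position, the final cell index, and the list of
    cells visited along the trajectory.
    """
    pos = list(pos)
    cell = [math.floor(x / cell_size) for x in pos]
    visited = [tuple(cell)]
    remaining = dt
    while True:
        # Find the earliest face crossing within the remaining time.
        t_hit, axis_hit = remaining, None
        for axis, (x, v) in enumerate(zip(pos, vel)):
            if v > 0.0:
                t = ((cell[axis] + 1) * cell_size - x) / v
            elif v < 0.0:
                t = (cell[axis] * cell_size - x) / v
            else:
                continue
            if t < t_hit:
                t_hit, axis_hit = t, axis
        pos = [x + v * t_hit for x, v in zip(pos, vel)]
        remaining -= t_hit
        if axis_hit is None:
            return tuple(pos), tuple(cell), visited
        # Step into the neighbor and snap onto the shared face to limit round-off.
        step = 1 if vel[axis_hit] > 0.0 else -1
        cell[axis_hit] += step
        pos[axis_hit] = (cell[axis_hit] + (0 if step > 0 else 1)) * cell_size
        visited.append(tuple(cell))


final_pos, final_cell, path = move_particle((0.5, 0.5), (1.0, 0.0), 2.0, 1.0)
```

Because every face crossing is handled explicitly, the cell containing the particle is known at all times. This is what would allow a surface-intersection test or a change of local time step to be applied at the crossing point in a fuller implementation.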
Acknowledgments
This work is partially supported by a seed grant from the
University of Minnesota Supercomputing Institute (MSI). This work is
also supported by the Air Force Office of Scientific Research
(AFOSR) under Grant No. FA9550-04-1-0341. The views and conclusions
contained herein are those of the authors and should not be
interpreted as necessarily representing the official policies or
endorsements, either expressed or implied, of the AFOSR or the U.S.
Government.
References
1. Bird, G. A., Molecular Gas Dynamics and the Direct Simulation of Gas Flows, Oxford University Press, New York, 1994.
2. Rader, D. J., Gallis, M. A., Torczynski, J. R., and Wagner, W., "DSMC Convergence Behavior of the Hard-Sphere-Gas Thermal Conductivity for Fourier Heat Flow," Physics of Fluids, Vol. 18, No. 7, 2006, pp. 1-16.
3. Gallis, M. A., Torczynski, J. R., and Rader, D. J., "DSMC Convergence Behavior for Transient Flows," AIAA Paper 07-4258, June 2007, presented at the 39th AIAA Thermophysics Conference, Miami, FL.
4. Dietrich, S. and Boyd, I. D., "Scalar and Parallel Optimized Implementation of the Direct Simulation Monte Carlo Method," Journal of Computational Physics, Vol. 126, 1996, pp. 328-342.
5. Otahal, T. J., Gallis, M. A., and Bartel, T. J., "An Investigation of Two-Dimensional CAD Generated Models with Body Decoupled Cartesian Grids for DSMC," AIAA Paper 2000-2361, June 2000, presented at the 39th AIAA Thermophysics Conference, Miami, FL.
6. LeBeau, G. J., "A parallel implementation of the direct simulation Monte Carlo method," Computer Methods in Applied Mechanics and Engineering, Vol. 174, 1999, pp. 319-337.
7. Ivanov, M. S., Markelov, G. N., and Gimelshein, S. F., "Statistical simulation of reactive rarefied flows: numerical approach and applications," AIAA Paper 98-2669, 1998, presented at the 43rd AIAA Aerospace Sciences Meeting and Exhibit, Reno, NV.
8. Aftosmis, M. J., Berger, M. J., and Melton, J. E., "Robust and Efficient Cartesian Mesh Generation for Component-Based Geometry," AIAA Journal, Vol. 36, No. 6, 1998, pp. 952-960.
9. LeBeau, G., Jacikas, K., and Lumpkin, F., "Virtual Sub-Cells for the Direct Simulation Monte Carlo Method," AIAA Paper 03-1031, Jan. 2003, presented at the 39th AIAA Thermophysics Conference, Miami, FL.
10. Gao, D. and Schwartzentruber, T. E., "Parallel Implementation of the Direct Simulation Monte Carlo Method for Shared Memory Architectures," AIAA Paper 2010-451, Jan. 2010, presented at the 48th AIAA Aerospace Sciences Meeting.
11. Kannenberg, K. C. and Boyd, I. D., "Strategies for Efficient Particle Resolution in the Direct Simulation Monte Carlo Method," Journal of Computational Physics, Vol. 157, 2000, pp. 727-745.
12. Larsen, P. S. and Borgnakke, C., "Statistical Collision Model for Monte Carlo Simulation of Polyatomic Gas Mixture," Journal of Computational Physics, Vol. 18, 1975, pp. 405-420.
13. Boyd, I. D., "Rotational-Translational Energy Transfer in Rarefied Nonequilibrium Flows," Physics of Fluids A, Vol. 2, No. 3, 1990, pp. 447-452.