Fluid Flow through Porous Media Simulation Scalability with OpenFOAM and MPI

Erik Jensen¹, Mayank Tyagi²,³, Zhi Shang³
¹ Department of Modeling, Simulation, and Visualization Engineering; Old Dominion University; Norfolk, Virginia 23529
² Craft and Hawkins Department of Petroleum Engineering; Louisiana State University; Baton Rouge, Louisiana 70803
³ Center for Computation and Technology; Louisiana State University; Baton Rouge, Louisiana 70803

Acknowledgements
Dr. Becky Carmichael, CxC Coordinator, LSU
Dr. Feng Chen, IT Consultant, HPC User Services, LSU
Aaron Harrington, Graduate Student, Cain Department of Chemical Engineering, LSU
This material is based upon work supported by the National Science Foundation under award OCI 1263236, with additional support from the Center for Computation & Technology at Louisiana State University.

Procedure
1. Code a Python script to generate a three-dimensional porous-media model. The geometry format is Stereolithography (STL); the file format is binary. Run the script to generate the geometry.
2. Develop a base OpenFOAM simulation for laminar fluid flow.
3. Add the STL file to the base OpenFOAM simulation. Run blockMesh to generate the background mesh. Run snappyHexMesh to embed the porous-media model and refine the mesh.
4. Use the Linux shell to create job-submission scripts that automate the simulations. Each script will:
   1. Allocate one or more nodes on SuperMIC.
   2. Edit the number of subdomains in the decomposeParDict file.
   3. Decompose the domain with decomposePar.
   4. Run icoFoam in parallel using Open MPI over the allocated nodes.
5. Submit the job scripts to the cluster. Visualize the results in paraFoam.

Computational Fluid Dynamics (CFD)
• The velocity and pressure of a Newtonian fluid are governed by the Navier-Stokes equations.
Momentum (for velocity component u_i):
∂(ρu_i)/∂t + ∂(ρu_i u_1)/∂x_1 + ∂(ρu_i u_2)/∂x_2 + ∂(ρu_i u_3)/∂x_3 = −∂p/∂x_i + μ(∂²u_i/∂x_1² + ∂²u_i/∂x_2² + ∂²u_i/∂x_3²) + ρg_i

Discussion
The key to successful scaling is finding a balance between efficiency and performance. Given the computationally demanding nature of CFD simulations, appropriate scaling is necessary to obtain results in a reasonable amount of time.

References
1. Figure 2: Microscopic View of Reservoir Rock. Digital image. Fundamentals of Oil & Gas: Series 2. Malaysian Structural Steel Association, n.d. Web. 29 July 2015.
2. Recreated from: Versteeg, H. K., and W. Malalasekera. "Figure 6.2." An Introduction to Computational Fluid Dynamics: The Finite Volume Method. 2nd ed. Harlow, Essex, England: Pearson Education Limited, 2007. 182. Print.
3. Figure 5.15: Surface Snapping in SnappyHexMesh Meshing Process. Digital image. 5.4 Mesh Generation with the snappyHexMesh Utility. OpenFOAM Foundation, n.d. Web. 29 July 2015.
4. Figure 5.4. Digital image. Distributed-Memory Systems. Institute for Microelectronics, n.d. Web. 29 July 2015.
5. Modified from: Figure 1: 1D Domain Decomposition Example Using 4 Processors. Digital image. Domain Decomposition Strategies. 2DECOMP&FFT, n.d. Web. 29 July 2015.

OpenFOAM and Distributed Simulation
• OpenFOAM (Open-source Field Operation And Manipulation) is a CFD software package for Linux.
• Open MPI, an implementation of the Message Passing Interface, is used to run simulations in parallel.
• Parallelization is running a simulation on more than one processor at a time.
• Scalability is a measure of how well a simulation parallelizes.
• CFD simulations are parallelized by splitting up the physical domain and distributing the pieces to different processors or computers.

Fluid Flow through Porous Media
• Petroleum engineers study the motion of hydrocarbons in porous media (e.g., sandstone).
• Although rocks appear to be solid, they contain a significant amount of pore space that can harbor fluid (e.g., crude oil).
• When oil is extracted, pore space facilitates transport.
• Porosity quantifies the amount of pore space.
• Permeability describes resistance to fluid flow.

Figure 1.¹ Magnification of reservoir rock shows an abundance of fluid-filled pore space.
Figure 2.² The overlapping yellow, blue, and green boxes are the respective x-velocity, y-velocity, and pressure control volumes; e.g., the pressure at P approximates the pressure throughout the pressure control volume.

Results
Flow Visualization
Figure 11. This two-dimensional slice of a three-dimensional mesh visualizes velocity magnitudes.
Figure 12. This slice of the uniform sphere-pack simulation indicates the approaching steady state.

• Solutions must be found computationally.
• The momentum equations are approximated by discretized equations, e.g., for u:
a_{i,J} u_{i,J} = Σ a_{nb} u_{nb} + (p_{I−1,J} − p_{I,J}) A_{i,J} + b_{i,J}
• The domain is discretized into a staggered grid that stores velocity and pressure information separately, at specific points in and between cells.
• Velocities are stored at cell faces.
• Pressure is stored at cell centers.
• In practical applications, the grid contours to the model surface. Cells can assume non-cubic shapes.

Figure 3.³ Adding geometry causes the grid to change. After refining the grid, cells near the surface of the model become smaller and more numerous.
Figure 4.⁴ Running a distributed simulation requires MPI to share the data in memory among the CPUs.
Figure 5.⁵ Parallel computing is achieved through domain decomposition.
Figure 7. Using superfluous nodes wastes resources and does not provide additional computational benefit. Communication time increases with scaling.
Figure 6. Results for 0.1 seconds of simulation time with a time step of 0.001 seconds. Fast cache memory facilitates superlinear speedup.

Model Geometry
Figure 8. Randomly placed spheres mimic natural rock.
Figure 9. Each sphere is made of numerous small triangles.

Experimental results show a maximum speedup of about 800 percent.
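The sphere-pack geometry of Figures 8 and 9 can be sketched in a few lines of Python. This is a minimal illustration, not the script used in this work: each sphere is tessellated with a latitude-longitude grid of triangles and the result is written in the binary STL layout (80-byte header, uint32 facet count, 50 bytes per facet). The function names, the tessellation resolution, and the sphere count are assumptions for the example.

```python
import math
import random
import struct

def sphere_triangles(cx, cy, cz, r, n_lat=8, n_lon=16):
    """Tessellate one sphere into triangles using a latitude-longitude grid."""
    def pt(i, j):
        theta = math.pi * i / n_lat          # polar angle, 0..pi
        phi = 2.0 * math.pi * j / n_lon      # azimuthal angle, 0..2*pi
        return (cx + r * math.sin(theta) * math.cos(phi),
                cy + r * math.sin(theta) * math.sin(phi),
                cz + r * math.cos(theta))
    tris = []
    for i in range(n_lat):
        for j in range(n_lon):
            a, b = pt(i, j), pt(i + 1, j)
            c, d = pt(i + 1, j + 1), pt(i, j + 1)
            if i > 0:                 # skip degenerate triangles at the poles
                tris.append((a, b, d))
            if i < n_lat - 1:
                tris.append((b, c, d))
    return tris

def write_binary_stl(path, triangles):
    """Binary STL: 80-byte header, uint32 count, then 50 bytes per facet."""
    with open(path, "wb") as f:
        f.write(b"porous media model".ljust(80, b"\0"))
        f.write(struct.pack("<I", len(triangles)))
        for v0, v1, v2 in triangles:
            f.write(struct.pack("<3f", 0.0, 0.0, 0.0))  # normal; many readers recompute it
            for v in (v0, v1, v2):
                f.write(struct.pack("<3f", *v))
            f.write(struct.pack("<H", 0))               # attribute byte count

# Randomly placed spheres mimic natural rock (cf. Figure 8).
random.seed(42)
tris = []
for _ in range(20):
    tris += sphere_triangles(random.random(), random.random(), random.random(), 0.1)
write_binary_stl("spheres.stl", tris)
```

Each sphere here yields 2·(n_lat − 1)·n_lon triangles, which is why the spheres in Figure 9 are "made of numerous small triangles"; snappyHexMesh then snaps the background mesh to this triangulated surface.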
The physical implications of decomposition are evidenced by irregularities in the simulation wall-time and total CPU-time trends. The strong scaling efficiency metric quantifies the ability of a problem to scale up efficiently. This information, along with speed requirements, is useful in determining how to scale the simulation.

Figure 10. Uniformly packed spheres are easier to mesh.

Nodes | Wall Time (s) | Strong Scaling Efficiency (%)
  1   |      377      | 100.00
  2   |      166      | 113.55
  3   |      111      | 113.21
  4   |       86      | 109.59
  5   |       69      | 109.28
  6   |       63      |  99.74
  7   |       63      |  85.49
  8   |       57      |  82.68
  9   |       59      |  71.00
 10   |       51      |  73.92
 11   |       52      |  65.91
 12   |       47      |  66.84

Figure 13. The strong scaling efficiency formula relates the wall time on a single processing element to the wall time on multiple processing elements.

[Chart: Simulation Wall Time, 8M Cells — wall time in seconds vs. SuperMIC nodes (20 processors per node); series: Simulation Wall Time, Amdahl's Law with P = 1 (ideal linear speedup).]
[Chart: Total CPU Time, 8M Cells — seconds vs. SuperMIC nodes (20 processors per node); series: Simulation Wall Time, Total CPU Time.]
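The domain decomposition behind these scaling runs can be sketched in its simplest 1D form. This is an illustration in the spirit of Figure 5, not OpenFOAM's actual decomposePar implementation (which partitions the 3D mesh); the function name and the example sizes are assumptions.

```python
def decompose_1d(n_cells, n_procs):
    """Split n_cells contiguous cells into n_procs balanced subdomains.

    Returns (start, end) index pairs: a 1D slab decomposition like the
    4-processor example of Figure 5. Balanced pieces matter because the
    slowest subdomain sets the wall time of each parallel time step.
    """
    base, extra = divmod(n_cells, n_procs)
    pieces, start = [], 0
    for rank in range(n_procs):
        size = base + (1 if rank < extra else 0)  # spread the remainder evenly
        pieces.append((start, start + size))
        start += size
    return pieces

# A tiny example: 10 cells over 4 processors.
parts = decompose_1d(10, 4)
```

As subdomains multiply, each piece shrinks while its boundary (and hence the MPI communication per step) does not shrink as fast, which is the mechanism behind the communication-time growth noted in Figure 7.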
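The efficiency column of the table can be reproduced from the wall times alone. The sketch below assumes the per-node form E(N) = T(1) / (N · T(N)) × 100, which matches the tabulated values (e.g., 377 / (2 · 166) ≈ 113.55%); the function names are illustrative.

```python
# Wall times (seconds) from the table, keyed by SuperMIC node count.
wall_time = {1: 377, 2: 166, 3: 111, 4: 86, 5: 69, 6: 63,
             7: 63, 8: 57, 9: 59, 10: 51, 11: 52, 12: 47}

def strong_scaling_efficiency(n_nodes):
    """E(N) = T(1) / (N * T(N)) * 100; values above 100% are superlinear."""
    return 100.0 * wall_time[1] / (n_nodes * wall_time[n_nodes])

def speedup(n_nodes):
    """S(N) = T(1) / T(N)."""
    return wall_time[1] / wall_time[n_nodes]
```

The 12-node entry gives a speedup of 377 / 47 ≈ 8.0, consistent with the reported maximum speedup of about 800 percent, while its efficiency of roughly 67% shows the efficiency-versus-performance trade-off discussed above.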