Case Study - Computational Fluid Dynamics (CFD) using Graphics Processing Units
Aaron F. Shinn
Mechanical Science and Engineering Dept., UIUC
Summer School 2009: Many-Core Processors for Science and Engineering Applications, 8-13-09
• Red-black Gauss-Seidel kernels consume over 2/3 of GPU time!
• Must optimize red-black Gauss-Seidel kernels
CUDA implementation of Red-Black Gauss-Seidel
• Memory management in red-black kernels
  - Global memory: easiest, but slow.
  - Shared memory: gives only marginally better performance, perhaps due to low data reuse or the handling of boundary halos for each sub-domain in shared memory.
  - Texture memory: fetch device memory through textures instead of expensive global memory loads. Currently working on this. This is an alternative that avoids uncoalesced memory loads.
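The global-memory variant listed first can be sketched as follows. This is a minimal illustration, not the code from the talk: it assumes a 7-point Laplacian on a uniform mesh with spacing h, a one-point boundary layer stored in the same array, and x as the fastest-varying index. The names `idx3`, `rb_sweep` and `rb_solve` are hypothetical, and the 3D launch grid needs compute capability 2.0 or newer, which postdates the hardware available in 2009.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Assumed layout: nx*ny*nz points including a one-point boundary layer,
// stored x-fastest so neighbouring threads in x read neighbouring addresses.
__host__ __device__ inline int idx3(int i, int j, int k, int nx, int ny) {
    return i + nx * (j + ny * k);
}

// One half-sweep over interior points with (i + j + k) % 2 == color.
// All six neighbours of a point have the opposite colour, so points of one
// colour can be updated in parallel without racing on each other.
__global__ void rb_sweep(float* p, const float* rhs,
                         int nx, int ny, int nz, float h2, int color) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    int j = blockIdx.y * blockDim.y + threadIdx.y;
    int k = blockIdx.z * blockDim.z + threadIdx.z;
    if (i < 1 || j < 1 || k < 1 || i > nx - 2 || j > ny - 2 || k > nz - 2) return;
    if (((i + j + k) & 1) != color) return;

    float nb = p[idx3(i - 1, j, k, nx, ny)] + p[idx3(i + 1, j, k, nx, ny)]
             + p[idx3(i, j - 1, k, nx, ny)] + p[idx3(i, j + 1, k, nx, ny)]
             + p[idx3(i, j, k - 1, nx, ny)] + p[idx3(i, j, k + 1, nx, ny)];
    // 7-point discretisation of laplacian(p) = rhs with uniform spacing h.
    p[idx3(i, j, k, nx, ny)] = (nb - h2 * rhs[idx3(i, j, k, nx, ny)]) / 6.0f;
}

// Host driver (assumes p.size() == rhs.size() == nx*ny*nz): copy in, run
// `iters` red+black sweeps straight from global memory, copy the result out.
// Returns false if an allocation or the final copy-back reported an error.
bool rb_solve(std::vector<float>& p, const std::vector<float>& rhs,
              int nx, int ny, int nz, float h, int iters) {
    size_t bytes = p.size() * sizeof(float);
    float *d_p = nullptr, *d_rhs = nullptr;
    if (cudaMalloc(&d_p, bytes) != cudaSuccess) return false;
    if (cudaMalloc(&d_rhs, bytes) != cudaSuccess) { cudaFree(d_p); return false; }
    cudaMemcpy(d_p, p.data(), bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(d_rhs, rhs.data(), bytes, cudaMemcpyHostToDevice);

    dim3 block(32, 1, 8);
    dim3 grid((nx + block.x - 1) / block.x,
              (ny + block.y - 1) / block.y,
              (nz + block.z - 1) / block.z);
    for (int it = 0; it < iters; ++it) {
        rb_sweep<<<grid, block>>>(d_p, d_rhs, nx, ny, nz, h * h, 0);  // red
        rb_sweep<<<grid, block>>>(d_p, d_rhs, nx, ny, nz, h * h, 1);  // black
    }
    cudaError_t err = cudaMemcpy(p.data(), d_p, bytes, cudaMemcpyDeviceToHost);
    cudaFree(d_p);
    cudaFree(d_rhs);
    return err == cudaSuccess;
}
```

Within a half-sweep every other thread exits at the colour test, so each warp touches only every second element in x. That strided access is one plausible source of the uncoalesced loads mentioned above.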
Computational Resources
• GPU version: CUDA; CPU version: Fortran.
• Single precision used for all calculations.
• Dell Precision 690 Workstation (Linux: Red Hat Enterprise
• Performance of GPU versus CPU for the first 100 time-steps of the simulation, with block size bx=by=bz=4 on coarser meshes and bx=32, by=1, bz=8 on finer meshes.
Speedup improved by a factor of 2.4 for the 256x64x64 case.
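The block sizes quoted above translate into launch grids by ceiling division. The sketch below shows that arithmetic. The helper names (`blocks_for`, `grid_for`, `coarse_block`, `fine_block`) are hypothetical, and mapping bx, by, bz onto the x, y, z components of `dim3` is an assumption about the author's convention.

```cuda
#include <cuda_runtime.h>

// Number of blocks needed to cover n points with blocks of size b (ceiling division).
inline unsigned int blocks_for(unsigned int n, unsigned int b) {
    return (n + b - 1) / b;
}

// Hypothetical helper: build the launch grid for an nx x ny x nz mesh.
inline dim3 grid_for(dim3 block, unsigned int nx, unsigned int ny, unsigned int nz) {
    return dim3(blocks_for(nx, block.x), blocks_for(ny, block.y), blocks_for(nz, block.z));
}

// The two configurations quoted in the text.
inline dim3 coarse_block() { return dim3(4, 4, 4); }   // 64 threads per block
inline dim3 fine_block()   { return dim3(32, 1, 8); }  // 256 threads per block
```

For the 256x64x64 mesh, `grid_for(fine_block(), 256, 64, 64)` gives an 8 x 64 x 8 grid, that is 4096 blocks of 256 threads, one thread per mesh point. With bx=32, each row of a block spans 32 consecutive x-indices, which matches the warp size and favours coalesced access if x is the fastest-varying storage index.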
Key Lessons
• Speedup of GPU scaled with the problem size; the largest problem size yielded the maximum speedup.
• Single precision did not appreciably affect the results, even for turbulent flows.
• Global memory is the easiest to use, but has the worst memory latency.
• Need global residuals to observe convergence. This requires a cudaMemcpy between CPU and GPU, which is very expensive, so decide when you really need to see the residuals.
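One way to act on the last point is to reduce the residual on the device and copy back a single scalar only every few iterations. The sketch below illustrates this on a 1D stand-in problem (u'' = f) to stay short. It is not the solver from the talk: `rb_sweep_1d`, `residual_sq`, `SolveStats`, `solve_with_sparse_checks` and the `check_every` parameter are hypothetical names, and error checking is omitted for brevity.

```cuda
#include <cuda_runtime.h>
#include <vector>

// 1D red-black Gauss-Seidel half-sweep for u'' = f (stand-in for the real solver).
__global__ void rb_sweep_1d(float* u, const float* f, int n, float h2, int color) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < 1 || i > n - 2 || (i & 1) != color) return;
    u[i] = 0.5f * (u[i - 1] + u[i + 1] - h2 * f[i]);
}

// Accumulate the squared residual of every interior point into *sum on the device.
__global__ void residual_sq(const float* u, const float* f, int n, float h2, float* sum) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < 1 || i > n - 2) return;
    float r = f[i] - (u[i - 1] - 2.0f * u[i] + u[i + 1]) / h2;
    atomicAdd(sum, r * r);
}

struct SolveStats { int iterations; int residual_copies; float residual_sq; };

// Iterate until the summed squared residual drops below tol_sq, but only
// evaluate and transfer the residual every `check_every` iterations.
SolveStats solve_with_sparse_checks(std::vector<float>& u, const std::vector<float>& f,
                                    float h, float tol_sq, int max_iters, int check_every) {
    int n = static_cast<int>(u.size());
    size_t bytes = n * sizeof(float);
    float *d_u, *d_f, *d_sum;
    cudaMalloc(&d_u, bytes);
    cudaMalloc(&d_f, bytes);
    cudaMalloc(&d_sum, sizeof(float));
    cudaMemcpy(d_u, u.data(), bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(d_f, f.data(), bytes, cudaMemcpyHostToDevice);

    int block = 128, grid = (n + block - 1) / block;
    SolveStats s = {0, 0, 0.0f};
    for (int it = 1; it <= max_iters; ++it) {
        rb_sweep_1d<<<grid, block>>>(d_u, d_f, n, h * h, 1);
        rb_sweep_1d<<<grid, block>>>(d_u, d_f, n, h * h, 0);
        s.iterations = it;
        if (it % check_every != 0) continue;  // skip the device-to-host transfer

        cudaMemset(d_sum, 0, sizeof(float));
        residual_sq<<<grid, block>>>(d_u, d_f, n, h * h, d_sum);
        // Only one float crosses the bus; this copy also synchronises with the GPU.
        cudaMemcpy(&s.residual_sq, d_sum, sizeof(float), cudaMemcpyDeviceToHost);
        ++s.residual_copies;
        if (s.residual_sq < tol_sq) break;
    }
    cudaMemcpy(u.data(), d_u, bytes, cudaMemcpyDeviceToHost);
    cudaFree(d_u);
    cudaFree(d_f);
    cudaFree(d_sum);
    return s;
}
```

The trade-off is that the solver may run up to `check_every - 1` iterations past convergence. A larger interval means fewer transfers and synchronisation points but more wasted sweeps.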
• Optimization can be a time drain. Need to decide when code is “good enough”.
• Two possibilities:
  - Code is complete, just needs porting to CUDA and tuning. Maybe have more time to optimize.
  - Code is not complete, need to add physics features, write in CUDA, and tune. Maybe need to spend more time on the physics algorithm and “get what you can get” out of minimal time coding in CUDA.
Future Work
• Model complex geometries in flow using the Immersed Boundary Method (IBM)
• Multi-GPU capability - collaborating with John Stone, UIUC