New Features in ML. 2004 Trilinos Users Group Meeting, November 2-4, 2004. Jonathan Hu, Ray Tuminaro, Marzio Sala, Michael Gee, Haim Waisman. Sandia is a multiprogram laboratory operated by Sandia Corporation, a Lockheed Martin Company, for the United States Department of Energy under contract DE-AC04-94AL85000.
New Features in ML
2004 Trilinos Users Group Meeting
November 2-4, 2004
Jonathan Hu, Ray Tuminaro, Marzio Sala, Michael Gee, Haim Waisman
Sandia is a multiprogram laboratory operated by Sandia Corporation, a Lockheed Martin Company, for the United States Department of Energy under contract DE-AC04-94AL85000.
• Coarsening rate fixed: h/H ≈ 3^n in n-d problem
• What can go wrong?
• AMG complexity goes up: ∑[nnz(A(j))] / nnz(A(1))
• Result: more time per iteration
• In parallel, each coarse grid has latency penalty
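The complexity ratio above can be evaluated directly once the nonzero count of each level's operator is known. The following is a minimal sketch, not ML code: the function name `operator_complexity` and the nonzero counts are hypothetical and chosen only for illustration.

```python
def operator_complexity(nnz_per_level):
    """Sum of nonzeros over all levels divided by the fine-level nonzeros.

    nnz_per_level[0] is nnz(A(1)), the fine-level operator.
    """
    if not nnz_per_level or nnz_per_level[0] <= 0:
        raise ValueError("need a positive fine-level nonzero count")
    return sum(nnz_per_level) / nnz_per_level[0]


# Hypothetical nonzero counts for a three-level hierarchy, fine level first.
nnz_counts = [1_000_000, 200_000, 50_000]
complexity = operator_complexity(nnz_counts)
print(complexity)
```

A value close to 1 means the coarse levels add little work per iteration; the value grows when coarse operators become denser or when there are many levels.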
Aggressive Coarsening
● Idea: use graph partitioner to make larger aggregates
– METIS / ParMETIS
● Coarsening rate: user-determined
Fewer levels: mitigates coarse grid latency
Smaller + fewer coarse grids → lower complexity
Convergence rate could suffer
--with-ml_metis
--with-ml_parmetis3x
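To illustrate how larger aggregates shrink the coarse grid, here is a rough sketch on a 1D Laplacian. It makes several assumptions that are not from the slides: contiguous blocks stand in for a METIS / ParMETIS partition, the prolongator is the unsmoothed piecewise-constant one (not ML's smoothed aggregation), and all function names are hypothetical.

```python
def laplacian_1d(n):
    """Sparse 1D Laplacian stored as {row: {col: value}}."""
    A = {}
    for i in range(n):
        A[i] = {i: 2.0}
        if i > 0:
            A[i][i - 1] = -1.0
        if i < n - 1:
            A[i][i + 1] = -1.0
    return A


def block_partition(n, size):
    """Stand-in for a graph partitioner: contiguous blocks of `size` nodes."""
    return [i // size for i in range(n)]


def galerkin_coarse(A, agg):
    """Coarse operator P^T A P for the piecewise-constant P defined by `agg`."""
    Ac = {}
    for i, row in A.items():
        for j, v in row.items():
            I, J = agg[i], agg[j]
            Ac.setdefault(I, {})
            Ac[I][J] = Ac[I].get(J, 0.0) + v
    return Ac


def nnz(A):
    """Number of stored entries in a {row: {col: value}} matrix."""
    return sum(len(row) for row in A.values())


A = laplacian_1d(81)
for size in (3, 9):
    Ac = galerkin_coarse(A, block_partition(81, size))
    print("aggregate size", size, "coarse rows", len(Ac), "coarse nnz", nnz(Ac))
```

Larger aggregates give a smaller coarse operator in a single step, so fewer levels are needed to reach a given coarse size. The effect on convergence is not shown by this sketch.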
method       smoothers        coarse DOFs   medium DOFs   avg its   avg time
1-level DD   ilu              -             -             113       150
2-L geom     ilu-gmres/ilu    32336         -             24        255
3-L AMG      gs-ilu-superlu   1292          129444        31        38
3D transient LES (13M DOFs/1K node Cplant)
App: MPSalsa Airport Simulation
Aggressive coarsening
Coarsening with Zoltan
• Main idea
– App provides coordinates on fine level (only)
– Call to Zoltan for coarsening (RCB algorithm)
• ML internally creates coordinates for coarser levels
– Centers of mass
• Status: still in testing phase
--with-ml_zoltan
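The "centers of mass" step can be sketched as follows. This is an illustration only, assuming an unweighted mean over the fine nodes of each aggregate; the function name `coarse_coordinates` and the sample data are hypothetical and do not reflect ML's or Zoltan's internals.

```python
def coarse_coordinates(coords, agg):
    """Center of mass (unweighted mean) of the fine nodes in each aggregate.

    coords: list of coordinate tuples, one per fine node.
    agg:    aggregate id of each fine node.
    """
    dim = len(coords[0])
    sums, counts = {}, {}
    for x, a in zip(coords, agg):
        s = sums.setdefault(a, [0.0] * dim)
        for d in range(dim):
            s[d] += x[d]
        counts[a] = counts.get(a, 0) + 1
    return {a: tuple(v / counts[a] for v in sums[a]) for a in sums}


# Hypothetical fine-level coordinates: a unit square plus two nodes on the x-axis.
fine = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (4.0, 0.0), (6.0, 0.0)]
agg = [0, 0, 0, 0, 1, 1]
coarse = coarse_coordinates(fine, agg)
print(coarse)
```

The resulting coordinates can then be fed to a geometric partitioner on the next level, which is why the application only needs to supply fine-level coordinates.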
Repartitioning to Improve Parallel Performance
• Load balances operators in multigrid hierarchy
• Motivation
– App load balancing may be non-optimal for linear solver
– App may take large % of memory (e.g., multiphysics)
• Linear solver gets remaining memory
• Result: low parallel efficiency
– Coarsening rate may slow as we get to few unknowns / proc
• Main idea
– Determine “good” partitioning with ParMETIS
– Construct permutation matrix P based on partitioning
– Apply to multigrid coarse grid operators
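The permutation step above can be sketched in a few lines. This is a simplified serial illustration, not ML's implementation: the `owner` list stands in for a ParMETIS partition, the matrix is a small dictionary-based example, and every function name is hypothetical.

```python
def permutation_from_partition(owner):
    """Return new_index[i] so that rows are grouped by owning processor."""
    order = sorted(range(len(owner)), key=lambda i: (owner[i], i))
    new_index = [0] * len(owner)
    for new, old in enumerate(order):
        new_index[old] = new
    return new_index


def permute(A, new_index):
    """Symmetric permutation P A P^T of a {row: {col: value}} matrix."""
    return {new_index[i]: {new_index[j]: v for j, v in row.items()}
            for i, row in A.items()}


def rows_per_proc(owner, nprocs):
    """Number of rows assigned to each processor."""
    counts = [0] * nprocs
    for p in owner:
        counts[p] += 1
    return counts


# Hypothetical 4x4 coarse operator and a new owner for each row.
A = {0: {0: 2.0, 1: -1.0},
     1: {0: -1.0, 1: 2.0, 2: -1.0},
     2: {1: -1.0, 2: 2.0, 3: -1.0},
     3: {2: -1.0, 3: 2.0}}
owner = [1, 0, 1, 0]

new_index = permutation_from_partition(owner)
B = permute(A, new_index)
print(new_index, rows_per_proc(owner, 2))
```

In a distributed setting the permutation also implies moving rows between processors; that communication is omitted here.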
Repartitioning applied to Zpinch simulation
Processors          210          450          600          3600
No repartitioning   X            X            X            X
Repartitioning      310 / 492s   284 / 479s   257 / 530s   X*
Before repartitioning on Janus…
210+ processor simulations failed
App-supplied linear system already imbalanced
Find modes not captured by MG
adaptive filter extra coarse grid
MG GGB
GMRES \ QMR
Adaptive AMG
[Figure legend: GMRES(20) + GGB/ML vs. GMRES(150) + ML]
Analysis / Profiling Tools
• Aggregate visualization
– Assess aggregate quality
– User provides fine-level coordinates
– CoM used as coordinates on coarser levels
– Stats calculated on avg size, diameters
– Currently using 3rd party package, OpenDX
• Error visualization
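The aggregate statistics mentioned above (average size, diameters) could be computed along these lines. This is a sketch under stated assumptions: "diameter" is taken to be the largest pairwise Euclidean distance between nodes of an aggregate, which may differ from ML's definition, and the function name `aggregate_stats` is hypothetical.

```python
from itertools import combinations
from math import dist


def aggregate_stats(coords, agg):
    """Return (average aggregate size, {aggregate id: diameter}).

    Diameter here means the largest pairwise distance within an aggregate;
    a single-node aggregate has diameter 0.
    """
    members = {}
    for x, a in zip(coords, agg):
        members.setdefault(a, []).append(x)
    avg_size = sum(len(pts) for pts in members.values()) / len(members)
    diameters = {
        a: max((dist(p, q) for p, q in combinations(pts, 2)), default=0.0)
        for a, pts in members.items()
    }
    return avg_size, diameters


# Hypothetical fine-level coordinates and aggregate ids.
fine_xy = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (4.0, 0.0), (6.0, 0.0)]
agg_ids = [0, 0, 0, 0, 1, 1]
avg_size, diameters = aggregate_stats(fine_xy, agg_ids)
print(avg_size, diameters)
```

Aggregates with a diameter much larger than their neighbors' are candidates for closer inspection in the visualization.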
Analysis/Profiling Tools (cont’d)
• Matrix performance
– Matrix statistics
– Eigen analysis
– Detailed operator profiling
• Apply & communication time
MultilevelPreconditioner::AnalyzeMatrixCheap()
ML_Operator_Profile()
• Internal memory profiling
– Lightweight
– Highwater mark, largest free block
– Postprocessing for plotting
Updated Documentation
• ML User’s Guide, version 3.0
– Configure & build information
– MultilevelPreconditioner() class intro
– Exhaustive options list
• ML Developer’s Guide
– Configuration, building, testing details
– Suggested practices
– Intro to tools on software.sandia.gov
• Updated web pages
– Now built automatically each night
– Incorporates doxygen comments
– http://software.sandia.gov/trilinos/packages/ml