Making Games Sound as Good as They Look: Real-time Geometric Acoustics on the GPU
Sound in Modern Games
• Commercial middleware (FMod, Wwise)
• Hardware or software rendering (DirectSound 3D, OpenAL, EAX)
• Some additional extensions improve immersion, e.g. calculating occlusion per source and filtering the source if it is occluded
• Existing systems are parametric, rather than simulations
Parametric vs. Simulation
• Parametric model: parameters are derived from game state (reverb wet/dry, room size, source distance, direction, etc.)
• Parameters then map to DSP variables which interact with raw source (recorded samples)
• Game state never directly interacts with signal processing
Parametric vs. Simulation
• Simulation model: game state directly generates audio
• Advantages: audio experience is more directly influenced by game world, can be more immersive if done right
• Disadvantages: can easily sound “broken” if the model is used outside its limitations, and this is not simple to fix by ‘fudging’ parameters
S.T.A.L.K.E.R.™ : Call of Pripyat
• Basic positional audio (rendered through OpenAL)
• Almost no environment dependence
• Limited immersion, high performance
• Typical for games where focus is on graphics
ARMA 2: Operation Arrowhead
• Released 2010 (PC only), continuously updated (version shown is from late 2012)
• Simulation focus, HRTF and “software” occlusion
• Rendering through OpenAL (although OpenAL cannot process geometry)
• Focuses on sound as a gameplay mechanic
Quake 3 Implementation
Q3dm1
• Acoustic version shown (3186 triangles)
• 4096 rays to generate reflections
• 3 orders of reflection (specular and diffuse); HRTF applied per reflection (12k)
• IIR based material descriptions
Example: Used in Game Engine (Quake 3 Arena)
For video: https://www.youtube.com/watch?v=TXUTgEmnD6U (please use headphones if possible!!)
Geometry Engine
“Backwards” Ray-tracing
• Start at listener
• Use specular and diffuse reflection approximation at each bounce
• Generate impulse response
• 1 ray per thread
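The "generate impulse response" step can be sketched as follows. This is a minimal serial illustration, not the engine's GPU code: it assumes each traced ray has already produced a total path length and a list of per-bounce reflection coefficients. The name `build_impulse_response`, the 48 kHz sample rate, the 343 m/s speed of sound, and the clamped 1/r spreading law are all assumptions made for the example.

```python
SPEED_OF_SOUND = 343.0  # m/s, assumed (air at roughly 20 °C)

def build_impulse_response(paths, sample_rate=48000, length=4800):
    """Accumulate ray paths into a sampled impulse response.

    paths: iterable of (path_length_m, [reflection_coefficient, ...]).
    """
    ir = [0.0] * length
    for path_length, reflections in paths:
        # Propagation delay of this path, in whole samples.
        delay = int(round(path_length / SPEED_OF_SOUND * sample_rate))
        if delay >= length:
            continue  # arrival falls outside the response window
        # Assumed 1/r spreading loss, clamped to avoid blow-up near zero.
        gain = 1.0 / max(path_length, 1.0)
        for r in reflections:
            gain *= r  # each bounce scales the contribution
        ir[delay] += gain
    return ir

# Hypothetical paths: direct sound, one bounce, two bounces.
paths = [
    (3.43, []),
    (6.86, [0.8]),
    (10.29, [0.8, 0.5]),
]
ir = build_impulse_response(paths)
```

In a one-ray-per-thread layout, each thread would compute one `(delay, gain)` pair; the accumulation into `ir` is the part that needs synchronization.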
Geometry Engine
• Finding exact paths vs. sampling sound field
• Impulse response acts as filter
• Each ray has different frequency characteristics due to HRTF (different incident direction)
• Unique among real-time systems
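"Impulse response acts as filter" means the dry source signal is convolved with the response. The sketch below shows the single-response case in plain Python; the function name `convolve` and the toy signals are assumptions for illustration. The system described here goes further by filtering each ray separately, so that every arrival gets the HRTF for its own incident direction.

```python
def convolve(signal, impulse_response):
    """Direct-form convolution of a dry signal with an impulse response."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for n, x in enumerate(signal):
        if x == 0.0:
            continue  # silent input samples contribute nothing
        for k, h in enumerate(impulse_response):
            out[n + k] += x * h
    return out

dry = [1.0, 0.5, 0.25]
ir = [1.0, 0.0, 0.5]  # direct sound plus one echo two samples later
wet = convolve(dry, ir)
```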
Example: Corner Bass Reinforcement
• In the situation on the right, both listeners are facing the source
• Listener 1 is near the center of the room; frequency response is fairly flat
• Listener 2 is in corner, frequency response has low frequencies boosted
• Why is this?
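One way to build intuition for the corner boost is a toy phasor sum. Near a wall, a reflection travels only slightly farther than the direct sound, so at long wavelengths the two arrive almost in phase and add coherently. The sketch assumes a two-wall corner (three image sources), rigid walls, and unit-amplitude reflections; the function name and the extra path lengths are hypothetical, and this is not the ray-traced result from the slides.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed

def boundary_gain_db(freq_hz, extra_paths_m, reflection=1.0):
    """Level of direct sound plus reflections, relative to direct sound alone."""
    k = 2.0 * math.pi * freq_hz / SPEED_OF_SOUND  # wavenumber in rad/m
    total = 1.0 + sum(reflection * cmath.exp(-1j * k * d) for d in extra_paths_m)
    return 20.0 * math.log10(abs(total))

# Hypothetical extra path lengths for a listener about 15 cm from each wall.
corner_paths = [0.3, 0.3, 0.42]
low = boundary_gain_db(50.0, corner_paths)
sweep = [boundary_gain_db(f, corner_paths) for f in range(100, 4001, 100)]
```

At 50 Hz the result sits just under the +12 dB coherent limit for four equal arrivals. Across the sweep the phases spread out, so the boost varies with frequency and never exceeds that limit.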
Performance
• Computationally expensive (scales with Order × Sources × Triangles × Rays)
• Triangle count can be fairly low (use the collision mesh instead of the displayed mesh)
• A single-level BVH (bounding volume “hierarchy”) is sufficient
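To put the scaling in perspective, the product can be evaluated with the q3dm1 figures quoted earlier (3 orders, 3186 triangles, 4096 rays). The sketch assumes the worst case where every ray is tested against every triangle, so it is an upper bound that the bounding volumes are meant to reduce; the function name is illustrative.

```python
def intersection_tests(orders, sources, triangles, rays):
    """Worst-case ray-triangle tests per update, with no acceleration structure."""
    return orders * sources * triangles * rays

per_source = intersection_tests(orders=3, sources=1, triangles=3186, rays=4096)
print(per_source)  # 39149568
```

Because the count is linear in every factor, doubling the number of sources doubles the work unless the ray budget is shared between them.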
Optimizations (Geometry)
• Linear scaling with the number of sources is not acceptable (reasoning: everything else can be mitigated in design by reducing quality)
• Solution: Dynamically allocate rays to sources
• Psychoacoustic justification: more chaotic sound scene = less ability to discern individual sounds
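A minimal sketch of sharing a fixed ray budget across sources. The slides do not say how rays are weighted, so the per-source `weights` (for example, an audibility estimate) and the largest-remainder rounding are assumptions; `allocate_rays` is a hypothetical name.

```python
def allocate_rays(weights, total_rays=4096):
    """Split a fixed ray budget across sources in proportion to their weights."""
    total_weight = sum(weights)
    if total_weight <= 0:
        return [0] * len(weights)
    exact = [w / total_weight * total_rays for w in weights]
    counts = [int(e) for e in exact]  # round down first
    # Hand the rays lost to rounding to the largest fractional remainders,
    # so the counts always sum to the full budget.
    leftover = total_rays - sum(counts)
    by_remainder = sorted(range(len(weights)),
                          key=lambda i: exact[i] - counts[i],
                          reverse=True)
    for i in by_remainder[:leftover]:
        counts[i] += 1
    return counts

counts = allocate_rays([3.0, 2.0, 1.0])
```

The total cost stays constant as sources are added; each source simply gets a coarser sampling of the sound field.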
Dynamic Ray Allocation
• Dynamic ray allocation introduces two problems:
• First problem: amplitude changes as rays are re-assigned
• Second problem: slower ray-tracing (warp divergence between bounding volumes)
Ray Sorting
• Re-sort rays by assigned source in each block (inspired by Garanzha & Loop, 2010)
• Advantage: Gain spatial coherence when performing occlusion testing
• Disadvantage: Ray source positions and directions are less coherent (remember each source still needs omnidirectional sampling of rays)
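The per-block re-sort can be illustrated serially. This is not the algorithm of Garanzha & Loop, only a sketch of the idea stated above: within each block, rays are reordered so that rays assigned to the same source sit next to each other. The function name and the sample assignment are hypothetical.

```python
def sort_rays_by_source(ray_sources, block_size):
    """Return ray indices reordered so each block groups rays by source id."""
    order = []
    for start in range(0, len(ray_sources), block_size):
        block = range(start, min(start + block_size, len(ray_sources)))
        # sorted() is stable, so rays of one source keep their relative order.
        order.extend(sorted(block, key=lambda i: ray_sources[i]))
    return order

ray_sources = [2, 0, 1, 0, 1, 2, 0, 0]  # source id assigned to each ray
order = sort_rays_by_source(ray_sources, block_size=4)
```

After the sort, neighboring threads in a block test occlusion toward the same source, which is where the coherence gain comes from.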
GPU Audio Processing
• Each ray generates audio stream for mix-down (4096 in example)
• Each stream is delayed and filtered according to the materials it reflected off and its HRTF direction
• Parallel mix-down (optimal performance depends on architecture)
Behavior at Reflection
• Typical commercial software uses per-frequency band simulation
• Reflection coefficient (1.0 − absorption) is a scalar for each band (may use separate sets of coefficients for diffuse reflections)
• Approximate multi-band simulation by representing reflection behavior as bi-quad filter (loss of some degrees of freedom)
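A sketch of a material represented as a single bi-quad. It assumes the material can be approximated by a broadband reflection gain plus a high-frequency roll-off, and it uses the low-pass form from the widely used Audio EQ Cookbook. The function names, cutoff, and gain are illustrative and do not come from the engine.

```python
import math

def lowpass_biquad(cutoff_hz, sample_rate, gain=1.0, q=0.7071):
    """Normalized low-pass bi-quad coefficients, scaled by a reflection gain."""
    w0 = 2.0 * math.pi * cutoff_hz / sample_rate
    alpha = math.sin(w0) / (2.0 * q)
    cos_w0 = math.cos(w0)
    a0 = 1.0 + alpha
    b = [gain * (1.0 - cos_w0) / 2.0 / a0,
         gain * (1.0 - cos_w0) / a0,
         gain * (1.0 - cos_w0) / 2.0 / a0]
    a = [-2.0 * cos_w0 / a0, (1.0 - alpha) / a0]
    return b, a

def biquad_filter(samples, b, a):
    """Direct form I: y[n] = b0 x[n] + b1 x[n-1] + b2 x[n-2] - a1 y[n-1] - a2 y[n-2]."""
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x in samples:
        y = b[0] * x + b[1] * x1 + b[2] * x2 - a[0] * y1 - a[1] * y2
        x2, x1 = x1, x
        y2, y1 = y1, y
        out.append(y)
    return out

# Hypothetical material: reflects 90% at low frequencies, rolls off above 4 kHz.
b, a = lowpass_biquad(cutoff_hz=4000.0, sample_rate=48000.0, gain=0.9)
step = biquad_filter([1.0] * 2000, b, a)
```

The filter has only a handful of free parameters, which is the loss of degrees of freedom relative to an independent coefficient per band.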
GPU Mix-down
• Atomic operations on Kepler GPUs (easy)
• Round-robin method on pre-Kepler GPUs (similar to parallel reduction)
• Only part of the work is done on GPU (CPU needed to mix down shared memory regions)
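For reference, the parallel-reduction pattern that the pre-Kepler approach is compared to looks like this when simulated serially. The sketch is not the round-robin scheme itself: at each level, pairs of streams are summed, and all pair sums within a level are independent, so they could run concurrently. The function name and sample streams are assumptions.

```python
def mix_down(streams):
    """Sum equal-length sample streams using a pairwise (tree) reduction."""
    active = [list(s) for s in streams]
    while len(active) > 1:
        next_level = []
        # Each pair sum below is independent of the others at this level.
        for i in range(0, len(active) - 1, 2):
            next_level.append([x + y for x, y in zip(active[i], active[i + 1])])
        if len(active) % 2:
            next_level.append(active[-1])  # odd stream passes through unchanged
        active = next_level
    return active[0] if active else []

mixed = mix_down([[1.0, 2.0], [0.5, 0.5], [0.25, 0.0]])
```

With 4096 streams the tree has 12 levels, compared with 4095 sequential additions per sample in a naive loop.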
HRTF and Diffraction
• At low frequencies (below 1 kHz) the HRTF is essentially flat
• These are the frequencies which diffract the most around architecture (wavelength at 250 Hz ≈ 1.3 m)
• Precise positioning of diffracted sources is not very important
• HRTF IIR filter coefficients derived from FIR experiments (can use Prony’s method, but easier just to match approximate amplitude and cutoff)
Integrating Artificial Reverb
• Simulation is useful for the first 3 or 4 orders (for reference, on a GTX Titan using the q3dm1 map, a 3-order simulation takes ~30% of real-time performance)
• 3 orders generate about 500 ms of reverb time, but in practice, T60 for a comparable room can be several seconds (proportional to volume and inversely proportional to absorption)
• Additional problem in that simulated reverberation gets grainier as reverb time gets longer
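The stated dependence of T60 on volume and absorption is Sabine's formula, T60 ≈ 0.161 · V / A, with V in m³ and A the total absorption area in m². The room dimensions and absorption coefficients below are made up for illustration.

```python
def sabine_t60(volume_m3, surfaces):
    """Sabine reverberation time.

    surfaces: iterable of (area_m2, absorption_coefficient).
    """
    absorption_area = sum(area * alpha for area, alpha in surfaces)
    return 0.161 * volume_m3 / absorption_area

# Hypothetical 10 m x 8 m x 3 m room.
surfaces = [
    (160.0, 0.10),  # floor and ceiling
    (108.0, 0.05),  # four walls
]
t60 = sabine_t60(240.0, surfaces)
```

For this room the estimate is roughly 1.8 s, well beyond the ~500 ms that three simulated orders provide.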
Integrating Artificial Reverb
• Can estimate artificial reverb length with mean free path
• Intuitive to apply reverb at end of DSP chain, but problematic in some situations
• Resolve by applying reverb at front of DSP chain (before HRTF filtering)
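One possible way to turn a mean free path into a reverb length, sketched under stated assumptions; the talk does not give its exact method. The mean free path of a room is 4V/S. A ray reflects c divided by that distance times per second, and each reflection keeps a fraction (1 − α) of the energy, which leads to Eyring's formula. The function names, room size, and mean absorption are illustrative.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed

def mean_free_path(volume_m3, surface_m2):
    """Average distance travelled between reflections in a room."""
    return 4.0 * volume_m3 / surface_m2

def reverb_time_from_mfp(volume_m3, surface_m2, mean_absorption):
    """Time for energy to decay by 60 dB, counted in reflections (Eyring)."""
    mfp = mean_free_path(volume_m3, surface_m2)
    reflections_per_second = SPEED_OF_SOUND / mfp
    # 60 dB of decay is an energy factor of 1e-6.
    reflections_to_60db = math.log(1e-6) / math.log(1.0 - mean_absorption)
    return reflections_to_60db / reflections_per_second

# Hypothetical 10 m x 8 m x 3 m room: V = 240 m^3, S = 268 m^2.
mfp = mean_free_path(240.0, 268.0)
t60 = reverb_time_from_mfp(240.0, 268.0, mean_absorption=0.08)
```

Here the mean free path is about 3.6 m, and the resulting decay time could be used to set the length of the artificial tail that takes over after the simulated orders.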
Design Considerations…
• On a parametric system, easy to have creative input
• Make the sound more ‘exciting’ or more ‘chaotic’
• We don’t want the sound designer to become level designer
• Simulation may only be suitable for certain types of games (realistic, rather than cinematic)
Thanks for Listening!
• For more information, see the dissertation titled: Design of a Real-Time GPU Accelerated Acoustic Simulation Engine for Interactive Applications (University of Illinois Press, 2014)