DDDDRRaW: A Prototype Toolkit for Distributed Real-Time Rendering on Commodity Clusters
Thu D. Nguyen and Christopher Peery
Department of Computer Science, Rutgers University
John Zahorjan
Department of Computer Science & Engineering, University of Washington
IPDPS 2001
Overview
Improve real-time rendering performance using distributed rendering on commodity clusters
• Improve performance -> Render more complex scenes at interactive rates
Why real-time rendering?
• A critical component of an increasing number of continuous media applications
Virtual reality, data visualization, CAD, flight simulators, etc.
• Rendering performance will continue to be a bottleneck
Model complexity is increasing as fast as (or faster than) hardware performance
Part of the challenge is to leverage increasingly powerful hardware accelerators
Challenges
How to structure the distributed renderer to leverage hardware-assisted rendering
• Information that is useful for work partitioning and assignment may be hidden in the hardware rendering pipeline
How to minimize non-parallelizable overheads (to limit the impact of Amdahl's Law)
How to decouple the bandwidth requirement from the complexity of the scene and the cluster size
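To make the concern about non-parallelizable overheads concrete, here is a minimal sketch of Amdahl's Law. The function name and the 10% serial fraction are illustrative assumptions, not measurements from this work.

```python
def amdahl_speedup(serial_fraction: float, nodes: int) -> float:
    """Upper bound on speed-up when `serial_fraction` of the per-frame
    work cannot be parallelized across `nodes` rendering nodes."""
    if not 0.0 <= serial_fraction <= 1.0:
        raise ValueError("serial_fraction must be between 0 and 1")
    if nodes < 1:
        raise ValueError("nodes must be at least 1")
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / nodes)


if __name__ == "__main__":
    # Hypothetical 10% serial overhead per frame (an assumption for illustration).
    for p in (1, 4, 12):
        print(f"P={p:2d}  bound on speed-up = {amdahl_speedup(0.10, p):.2f}")
```

Under this assumed 10% overhead, the bound on 12 nodes is already well below 12, which is why keeping the serial part of each frame small matters more as the cluster grows.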
Image Layer Decomposition (ILD)
Per-frame rendering load is partitioned using ILD
• Presented at IPDPS 2000
Briefly review ILD because it affects DDDDRRaW’s architecture and performance
Basic idea: assign scene objects such that sets of objects assigned to different nodes are not mutually occlusive
Advantages of using ILD
• Do not need position of polygons in 2D
This information may be hidden inside the graphics pipeline
• Do not need Z-buffer information
This reduces the required bandwidth by at least 50%
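The bandwidth claim can be illustrated with simple arithmetic. The sketch below assumes each node would otherwise ship a color value plus a depth value per pixel; the pixel formats and the 640x480 resolution are assumptions chosen for illustration, not figures from this work.

```python
def frame_bytes(width: int, height: int, color_bytes: int, depth_bytes: int = 0) -> int:
    """Bytes needed to ship one full image, optionally with per-pixel depth."""
    return width * height * (color_bytes + depth_bytes)


def bandwidth_saving(color_bytes: int, depth_bytes: int) -> float:
    """Fraction of per-pixel traffic removed by not sending depth values."""
    return depth_bytes / (color_bytes + depth_bytes)


if __name__ == "__main__":
    # Assumed formats: 4-byte color with 4-byte depth, and 3-byte color with 4-byte depth.
    for color, depth in ((4, 4), (3, 4)):
        with_z = frame_bytes(640, 480, color, depth)
        without_z = frame_bytes(640, 480, color)
        print(f"color={color}B depth={depth}B: {with_z} -> {without_z} bytes, "
              f"saving {bandwidth_saving(color, depth):.0%}")
```

In this model the saving is at least 50% whenever the depth value is at least as wide as the color value.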
[Figure: Spatial partitioning compared with Image Layer Decomposition (ILD); elements labeled 1 to 6]
Non-mutually occlusive assignment -> legal for back-to-front compositing
Use a heuristic-based algorithm to
• Balance load across cluster
• Minimize the screen real-estate covered by each assignment
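The back-to-front compositing that a non-mutually-occlusive assignment permits can be sketched as follows. This is a minimal illustration, not DDDDRRaW's code: the layer representation (a 2D list with `None` where a node drew nothing) and the function name are hypothetical, and the back-to-front order of the layers is assumed to be known already.

```python
from typing import List, Optional

Pixel = Optional[int]        # None means "this node drew nothing at this pixel"
Layer = List[List[Pixel]]    # one partial image per rendering node


def composite_back_to_front(layers: List[Layer], background: int = 0) -> List[List[int]]:
    """Merge per-node images, farthest layer first, without any depth values.

    Because the layers are assumed not to be mutually occlusive, a nearer
    layer can simply overwrite whatever a farther layer drew.
    """
    if not layers:
        raise ValueError("need at least one layer")
    height, width = len(layers[0]), len(layers[0][0])
    frame = [[background] * width for _ in range(height)]
    for layer in layers:                      # ordered farthest to nearest
        for y in range(height):
            for x in range(width):
                if layer[y][x] is not None:   # covered pixel: nearer layer wins
                    frame[y][x] = layer[y][x]
    return frame


if __name__ == "__main__":
    far = [[1, 1], [None, None]]
    near = [[None, 2], [2, None]]
    print(composite_back_to_front([far, near]))
```

No per-pixel depth comparison appears anywhere in the loop, which is the property that lets the nodes skip sending Z-buffer data.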
[Figure: Speed-up; categories: Aztec City, Chamber, Hall, Coronary, Left Lung, CS Building; series: Sequential, P=1, P=4]
Speed-up of Average Frame Rate on PCs
[Figure: Speed-up (y-axis, 0 to 9) versus Number of Rendering Nodes (P) (x-axis, 1 to 12); series: CS Building, Hall, Chamber, Aztec City, Coronary]
Speed-up of Rendering Component on PCs
[Figure: Speed-up (y-axis, 0 to 12) versus Number of Rendering Nodes (P) (x-axis, 1 to 12); series: Aztec City, Coronary]
Conclusions
Can build an ILD-based distributed renderer to significantly improve real-time rendering performance on commodity hardware
DDDDRRaW currently scales to modestly sized clusters
• This limitation is due to non-optimal hardware configurations
• This is NOT because more suitable hardware is not available!
• Expect good scalability to clusters of 16-32 nodes
Overlapping communication with computation increases average frame rate but ONLY at the expense of increasing frame latency
• Problem is CPU contention for rendering & communication
• Either need dedicated hardware or can only optimize after reaching 10-15 fps, the nominal interactive frame rate
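The frame-rate versus latency trade-off can be illustrated with a toy model. Everything here is an assumption made for illustration: the two-stage pipeline, the single `contention` factor that slows both stages when they share a CPU, and the example timings. None of it is measured data from DDDDRRaW.

```python
from typing import Tuple


def frame_stats(render_s: float, comm_s: float, overlap: bool,
                contention: float = 1.0) -> Tuple[float, float]:
    """Return (frames per second, per-frame latency in seconds).

    Without overlap, a frame is rendered and then communicated in sequence.
    With overlap, the two stages are pipelined, but both are slowed by the
    assumed `contention` factor (>= 1.0) because they compete for the CPU.
    """
    if overlap:
        render = render_s * contention
        comm = comm_s * contention
        # Throughput is set by the slower stage; latency covers both stages.
        return 1.0 / max(render, comm), render + comm
    total = render_s + comm_s
    return 1.0 / total, total


if __name__ == "__main__":
    # Hypothetical timings: 60 ms rendering, 40 ms communication, 20% slowdown.
    for overlap in (False, True):
        fps, latency = frame_stats(0.06, 0.04, overlap, contention=1.2)
        print(f"overlap={overlap}: {fps:.1f} fps, latency {latency * 1000:.0f} ms")
```

In this model, overlapping raises the frame rate whenever the slowed-down longer stage is still shorter than the two stages run back to back, while latency rises for any contention factor above 1.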