Compiling Metaprogrammed Shaders to Stream GPUs Michael D. McCool Computer Graphics Lab University of Waterloo Graphics Hardware 2003

Jan 05, 2016

Page 1

Compiling Metaprogrammed Shaders to Stream GPUs

Michael D. McCool

Computer Graphics Lab

University of Waterloo

Graphics Hardware 2003

Page 2

Topics

GPUs are “Stream Processors”… But what does that mean, exactly?
Can general programs be compiled to GPUs?
Can they run efficiently on GPUs?
How can GPUs be evolved to support more powerful programming models without negatively impacting performance?
What abstractions should programming languages for GPUs support?

Page 3

Imagine Stream Processor

SIMD kernel processing on streams containing homogeneous records
Memory hierarchy
  Local registers
  Stream register file
  External memory
Streaming external memory access
Conditional read and write

Page 4

Stream GPU Architecture

[Figure: pipeline diagram with stages Vertex Shader, Rasterizer, Fragment Shader, Compositor, and Display; legend entries: New, Optional.]

Page 5

Stream GPU Architecture

Stream input to vertex unit
Array inputs to fragment unit
At least two stream outputs from fragment unit supporting conditional writes
Array output from fragment unit via compositor

Page 6

Sh Metaprogramming Library

Embedded metaprogramming
Both a library and a high-level programming language
Available from SourceForge: http://libsh.sourceforge.net
Currently semantically “Cg-equivalent”
Adding control constructs, stream algebra in next phase…

Page 7

Julia Set: Sh Example

    ShAttrib1f julia_max_iter = 20.0;
    ShAttrib1f julia_scale = 0.05;
    ShAttrib2f julia_c(1.0, -0.3);
    ShTexture2D<ShColor3f> julia_map(32,32);

    . . .

    // Vertex shader: pass the texture coordinate through and transform the position.
    ShProgram julia0 = SH_BEGIN_VERTEX_SHADER {
        ShInputTexCoord2f ui;
        ShInputPosition3f pm;
        ShOutputTexCoord2f uo(ui);
        ShOutputPosition4f pd;
        pd = (perspective | modelview) | pm;
    } SH_END_SHADER;

    // Fragment shader: iterate u <- u^2 + julia_c and color by the iteration count.
    ShProgram julia1 = SH_BEGIN_FRAGMENT_SHADER {
        ShInputTexCoord2f u;
        ShInputPosition2f pdxy;
        ShOutputColor3f fc;
        ShAttrib1f i = 0.0;
        // Loop while the squared magnitude (u|u) is below 2 and the iteration cap is not reached.
        SH_WHILE(((u|u) < 2.0) * (i < julia_max_iter)) {
            ShTexCoord2f v;
            v(0) = u(0)*u(0) - u(1)*u(1);
            v(1) = 2.0*u(0)*u(1);
            u = v + julia_c;
            i++;
        } SH_ENDWHILE;
        ShTexCoord2f lookup(0.0,0.0);
        lookup(0) = julia_scale * i;
        fc = julia_map(lookup);
    } SH_END_SHADER;

Page 8

Compiler: Control Flow Graph

Page 9

Control Graph

Control flow graph from compiler also describes multipass stream program!
Need conditional write to avoid accumulation of “garbage records”
Iteration and conditionals may scramble order of records, but can always sort by ID later if necessary.
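A minimal CPU-side sketch of this idea, assuming the Julia iteration from the earlier slide as the loop body. It is not the Sh compiler's output: the names `Record`, `iterate_pass`, and `run_julia` are hypothetical, and `std::vector` stands in for a GPU stream. Each pass writes a record to exactly one of two outputs (still looping, or finished), so no null records pile up, and the final sort by ID undoes any reordering.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

// One stream record: per-fragment loop state plus an ID for re-sorting.
struct Record {
    int id;      // original position in the input stream
    float x, y;  // current value of u
    int iter;    // iteration counter i
};

// Loop test: continue while (u|u) < 2 and i < max_iter.
bool keep_iterating(const Record& r, int max_iter) {
    return (r.x * r.x + r.y * r.y) < 2.0f && r.iter < max_iter;
}

// One pass of the loop kernel over a whole stream. Every record is
// conditionally written to exactly one of the two output streams.
void iterate_pass(const std::vector<Record>& in, float cx, float cy,
                  int max_iter, std::vector<Record>& again,
                  std::vector<Record>& done) {
    for (const Record& r : in) {
        if (!keep_iterating(r, max_iter)) {
            done.push_back(r);  // loop exit arc
            continue;
        }
        // Loop body: u <- u^2 + c, i <- i + 1, then feed back.
        Record next{r.id, r.x * r.x - r.y * r.y + cx,
                    2.0f * r.x * r.y + cy, r.iter + 1};
        again.push_back(next);
    }
}

// Run passes until the feedback stream is empty, then restore input order.
std::vector<Record> run_julia(std::vector<Record> work, float cx, float cy,
                              int max_iter) {
    assert(max_iter >= 0);
    std::vector<Record> done;
    while (!work.empty()) {
        std::vector<Record> again;
        iterate_pass(work, cx, cy, max_iter, again, done);
        work = std::move(again);
    }
    std::sort(done.begin(), done.end(),
              [](const Record& a, const Record& b) { return a.id < b.id; });
    return done;
}
```

Records that leave the loop early reach `done` before their neighbours, which is the scrambling the slide mentions; the sort by `id` restores the original order.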

Page 10

Julia Set: Control Graph

[Figure: control graph with nodes Iterator, Render, and Rasterize. Arc labels: 197.48 Kwords, 9450 Kwords, 197.48 Kwords, 197.48 Kwords, 56.25 Kwords (800 tris).]

Page 11

Adaptive Tessellation: Control Graph

[Figure: control graph with labels Oracle, Tess4, Tess3, Tess2, Bump, Split, and Stack arc. Arc labels: 4010 Kwords (42771 tris), 5748 Kwords, 5661 Kwords, 3.46 Kwords, 368.6 Kwords, 1137 Kwords, 86.71 Kwords (800 tris).]

Page 12

Scheduler

Local arcs are system-allocated stream buffers (ideally stream registers)
System picks kernel to run:
  Has enough input data
  Space available in output buffer
  Picks kernel that maximizes throughput
Repeat until no more data in input stream
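The scheduling rule above can be sketched on the CPU as a greedy loop. This is an illustration under stated assumptions, not the scheduler from the talk: the `Kernel` struct, its fixed `batch` size, the `throughput` estimate, and the use of `std::deque<int>` as a bounded stream buffer are all hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <vector>

using Buffer = std::deque<int>;  // stands in for one arc (stream buffer)

struct Kernel {
    Buffer* in;                    // input arc
    Buffer* out;                   // output arc
    std::size_t batch;             // records consumed and produced per launch
    std::size_t out_cap;           // capacity of the output arc
    double throughput;             // assumed cost-model estimate
    std::function<int(int)> body;  // per-record computation
};

// A kernel may run if it has enough input and its output has room.
bool runnable(const Kernel& k) {
    return k.in->size() >= k.batch && k.out->size() + k.batch <= k.out_cap;
}

// Repeatedly launch the runnable kernel with the highest throughput estimate.
// Returns the number of launches; stops when no kernel can run.
std::size_t schedule(std::vector<Kernel>& kernels) {
    for (const Kernel& k : kernels) assert(k.batch > 0);  // guarantees progress
    std::size_t launches = 0;
    for (;;) {
        Kernel* best = nullptr;
        for (Kernel& k : kernels) {
            if (runnable(k) && (!best || k.throughput > best->throughput)) {
                best = &k;
            }
        }
        if (!best) return launches;
        for (std::size_t i = 0; i < best->batch; ++i) {
            best->out->push_back(best->body(best->in->front()));
            best->in->pop_front();
        }
        ++launches;
    }
}
```

With a two-kernel chain whose middle buffer holds only one batch, the loop alternates between producer and consumer, since the producer stalls whenever the middle buffer is full. A real scheduler would also need to flush a final partial batch, which this sketch omits.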

Page 13

Observations:

True conditionals and iteration:
  Implementable with conditional write to stream output
  NEED NULL COMPRESSION!
  Multiple stream outputs also desirable
Fragment scatter:
  Implementable with render-to-vertex-array
  F-buffer feedback also desirable

Page 14

Simulating Null Compression

Want conditional write to stream
No space wasted for nullified records
Can simulate on current GPUs:
  Write to array
  Use occlusion test to count number of non-null records
  Sort array by mark bit (use depth channel to mark)
  Discard null records (now at end of array)
Expensive, perhaps other ways…
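The simulation steps above can be mirrored on the CPU as follows. This is only a sketch of the data movement, not GPU code: `Slot`, `count_valid`, and `compress_nulls` are hypothetical names, `std::count_if` stands in for the occlusion query, and `std::stable_partition` stands in for the sort by mark bit (a GPU sort would not necessarily keep the surviving records in their original order).

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

struct Slot {
    bool valid;   // mark bit (the slide suggests the depth channel)
    float value;  // record payload
};

// Stand-in for the occlusion test: count the non-null records.
std::size_t count_valid(const std::vector<Slot>& a) {
    return static_cast<std::size_t>(std::count_if(
        a.begin(), a.end(), [](const Slot& s) { return s.valid; }));
}

// Move valid records to the front, then drop the nulls left at the end.
std::vector<Slot> compress_nulls(std::vector<Slot> a) {
    const std::size_t live = count_valid(a);
    std::stable_partition(a.begin(), a.end(),
                          [](const Slot& s) { return s.valid; });
    assert(live <= a.size());
    a.erase(a.begin() + static_cast<std::ptrdiff_t>(live), a.end());
    return a;
}
```

The count is taken before the reordering because, on a GPU, the count tells the host how long the compacted stream is once the nulls have been pushed to the end.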

Page 15

HW Stream Null Compression

[Figure: hardware stream null compression diagram showing eight units labeled "fsh" and elements numbered 1 to 34.]

Page 16

Stream Algebra

    ShProgram p;
    (a,b) = p(d,e,f);
    (a,b) = p << (d,e,f);
    (a,b) = p << d << e << f;
    (a,b) = p << q << (d,e,f);
    (a,b,u,v,w) = (p ** q) << (d,e,f,j,k,l);
    fb += p << r << q << (c,n,v)[i];
    ShStream cq = optimize(q << (c,n,v)[i]);
    fb += p << r << cq;
    a += s * t;
    ShCampaign k = . . .
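The slide does not define these operators, so the following is a guess at their meaning based on the input and output counts shown: `<<` is read as "apply to inputs" or "compose with another program", and `**` as "run two programs side by side". The sketch is plain C++, not the Sh API; `Program`, `Tuple`, and `combine` (used in place of `**`, which is not a C++ operator token) are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

using Tuple = std::vector<float>;

struct Program {
    std::size_t n_in;   // number of inputs consumed
    std::size_t n_out;  // number of outputs produced
    std::function<Tuple(const Tuple&)> fn;
};

// Application: p << (d, e, f) runs p on the given inputs.
Tuple operator<<(const Program& p, const Tuple& args) {
    assert(args.size() == p.n_in);
    return p.fn(args);
}

// Composition: p << q feeds q's outputs into p's inputs.
Program operator<<(const Program& p, const Program& q) {
    assert(q.n_out == p.n_in);
    return Program{q.n_in, p.n_out,
                   [p, q](const Tuple& t) { return p.fn(q.fn(t)); }};
}

// Assumed meaning of p ** q: p takes the first inputs, q takes the rest,
// and the outputs are concatenated.
Program combine(const Program& p, const Program& q) {
    return Program{p.n_in + q.n_in, p.n_out + q.n_out,
                   [p, q](const Tuple& t) {
                       auto mid = t.begin() + static_cast<std::ptrdiff_t>(p.n_in);
                       Tuple r = p.fn(Tuple(t.begin(), mid));
                       Tuple s = q.fn(Tuple(mid, t.end()));
                       r.insert(r.end(), s.begin(), s.end());
                       return r;
                   }};
}
```

Because `<<` is left-associative in C++, `p << q << args` first builds the composed program and then applies it, which matches the shape of `(a,b) = p << q << (d,e,f)` on the slide.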

Page 17

Targets

GPUs (via Cg, OGL Slang, etc.)
  SIMD
  Multithreaded
  MIMD
SSE, SSE2 (via Intel compiler)
Cluster computers
Shared-mem computers
PS2, PS3

Page 18

Issues:

Null compression can be simulated with sparse texture compression, but slow. H/W support would be useful.
On-chip stream registers…
Off-chip stream buffer compression…
On-GPU scheduler…
Compilation of recursive algorithms?
Virtualization: registers, stream record size, stream length, textures, array read-write, synchronization, etc.
Abstractions: streams, sequences, sets, indexes, arrays, programs, campaigns, shapes, etc.

Page 19

Material Mapping: Control Graph

[Figure: control graph with nodes HF, Wood, HF + Wood, Rast, and Split.]

Page 20

Control Construct Templates

[Figure: control-graph templates for the WHILE and IF constructs below, each connecting a Predecessor to a Successor. Other node labels: CA, SP, CB, SP, A.]

    WHILE (C) {
        A
    }

    IF (C) {
        A
    } ELSE {
        B
    }