Page 1
© 2014 Apple Inc. All rights reserved. Redistribution or public display not permitted without written permission from Apple.
#WWDC14
Working With Metal—Advanced
Graphics and Games
Session 605
Gokhan Avkarogullari, GPU Software
Aaftab Munshi, GPU Software
Serhat Tekin, GPU Software
Page 2
Agenda
Introduction to Metal
Fundamentals of Metal
• Building a Metal application
• Metal shading language
Advanced Metal
• Deep dive into creating a graphics application with Metal
• Data-Parallel Computing with Metal
• Developer Tools Review
Page 3
Creating Multi-Pass Graphics Applications with Metal
Page 4
Graphics Application with Multiple Passes
Multiple framebuffer configurations
Render to off-screen and on-screen textures
Meshes that are used with multiple shaders
Multiple encoders
Page 6
Deferred Lighting with Shadow Maps
Shadow Map
• Depth-only render from the perspective of the directional light
Deferred Lighting
• Multiple render targets
• Framebuffer fetch for in-place light accumulation
• Stencil buffer for light culling
Don’t worry about the algorithm
• Focus on the details of how the API is used
Page 7
Deferred Lighting with Shadow Maps
[Diagram: a Command Queue holds Command Buffer #1, which contains a Render Command Encoder for the Shadow Pass and a Render Command Encoder for the Deferred Lighting Pass. Each encoder references its own pipeline state (shaders, blend, etc.), depth state, buffers, and textures.]
Page 8
Demo: Deferred Lighting
Page 9
Render Setup
Actions to take once at application start time
Actions that are taken as needed
• Level load time
• Texture streaming
• Mesh streaming
Actions to take every frame
Actions to take every render-to-texture pass
Page 10
Render Setup: Do once
Create device
Create command queue
Page 11
Render Setup: Do as needed
Create framebuffer textures
Create render pass descriptors
Create buffers for meshes
Create render pipeline objects
Create textures
Create state objects
Create uniform buffers
Page 12
Render Setup: Do every frame
Create command buffer
Update frame-based uniform buffers
Submit command buffer
Page 13
Render Setup: Do every render-to-texture pass
Create command encoder
Draw many times
• Update uniform buffers
• Set states
• Make draw calls
Page 15
Render Setup: A word on descriptors
Descriptors are like blueprints
• Once the object is created, the connection to the descriptor is gone
• Changing descriptors will not change the object
The same descriptor can be reused to create another object
A descriptor can be modified and then reused to create a new object
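The blueprint behavior described on this slide can be modeled with a small CPU-side C++ sketch. It is not Metal API: `TextureDescriptor`, `Texture`, and the integer `pixelFormat` codes are hypothetical stand-ins, chosen only to show that an object copies what it needs at creation time and keeps no link back to its descriptor.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a descriptor: a plain, mutable bag of settings.
struct TextureDescriptor {
    std::uint32_t width;
    std::uint32_t height;
    int pixelFormat;  // hypothetical format code, not a real Metal enum
};

// Hypothetical stand-in for an object created from a descriptor.
class Texture {
public:
    // The constructor copies the fields it needs. It keeps no pointer or
    // reference back to the descriptor, so later descriptor edits are invisible.
    explicit Texture(const TextureDescriptor& d)
        : width_(d.width), height_(d.height), pixelFormat_(d.pixelFormat) {}

    std::uint32_t width() const { return width_; }
    std::uint32_t height() const { return height_; }
    int pixelFormat() const { return pixelFormat_; }

private:
    std::uint32_t width_;
    std::uint32_t height_;
    int pixelFormat_;
};
```

Under these assumptions, creating one `Texture`, changing `pixelFormat` on the descriptor, and creating a second `Texture` yields two objects that differ only in format, while the first object is unchanged.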
Page 19
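Render Setup: Creating render pass descriptors
[Diagram: an MTLRenderPassDescriptor holds four Color attachments plus Depth and Stencil attachments. Each attachment is an MTLRenderPassAttachmentDescriptor with a texture (MTLTexture), a resolve texture, a loadAction, a storeAction, and further properties (…).]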
Page 20
Render Setup: Render pass descriptor for shadowMap pass
[Diagram: shadowRenderPassDescriptor with a single Depth attachment]
Page 23
Render Setup: Render pass descriptor for shadowMap pass
// Create texture
id<MTLTexture> shadow_texture;
MTLTextureDescriptor *shadowTextureDesc =
    [MTLTextureDescriptor texture2DDescriptorWithPixelFormat: MTLPixelFormatDepth32Float
                                                       width: 1024
                                                      height: 1024
                                                   mipmapped: NO];
shadow_texture = [device newTextureWithDescriptor: shadowTextureDesc];
Page 29
Render Setup: Render pass descriptor for shadowMap pass
// Create render pass descriptor
MTLRenderPassDescriptor *shadowMapPassDesc = [MTLRenderPassDescriptor renderPassDescriptor];

// Set the texture on the render pass descriptor
shadowMapPassDesc.depthAttachment.texture = shadow_texture;

// Set other properties on the render pass descriptor
shadowMapPassDesc.depthAttachment.clearValue = MTLClearValueMakeDepth(1.0);
shadowMapPassDesc.depthAttachment.loadAction = MTLLoadActionClear;
shadowMapPassDesc.depthAttachment.storeAction = MTLStoreActionStore;
Page 30
Render Setup: Render pass descriptor for the Deferred Lighting pass
[Diagram: deferredPassDesc with four Color attachments, a Depth attachment, and a Stencil attachment]
Page 36
Render Setup: Render pass descriptor for the Deferred Lighting pass
// Create descriptor
MTLTextureDescriptor *attachmentX_texture_desc =
    [MTLTextureDescriptor texture2DDescriptorWithPixelFormat: MTLPixelFormatBGRA8Unorm
                                                       width: desc.width
                                                      height: desc.height
                                                   mipmapped: NO];

// Create textures based on the descriptors
gbuffer_texture1 = [device newTextureWithDescriptor: attachmentX_texture_desc];

// Modify descriptor and create new texture
[attachmentX_texture_desc setPixelFormat: … ];
gbuffer_texture2 = [device newTextureWithDescriptor: attachmentX_texture_desc];
Page 37
Render Setup: Render pass descriptor for the Deferred Lighting pass
// Create render pass descriptor
MTLRenderPassDescriptor *deferredPassDesc = [MTLRenderPassDescriptor renderPassDescriptor];
Page 47
Render Setup: Render pass descriptor for the Deferred Lighting pass
// Describe color attachment 0
deferredPassDesc.colorAttachments[0].texture = nil; // will come from drawable

deferredPassDesc.colorAttachments[0].clearValue = clearColor1;
deferredPassDesc.colorAttachments[0].loadAction = MTLLoadActionClear;
deferredPassDesc.colorAttachments[0].storeAction = MTLStoreActionStore;

// Describe color attachment 1
deferredPassDesc.colorAttachments[1].texture = gbuffer_texture1;

deferredPassDesc.colorAttachments[1].clearValue = clearColor1;
deferredPassDesc.colorAttachments[1].loadAction = MTLLoadActionClear;
deferredPassDesc.colorAttachments[1].storeAction = MTLStoreActionDontCare;
Page 48
Render Setup: Do as needed
Create framebuffer textures
Create render pass descriptors
Create textures
Create buffers for meshes
Create state objects
Create pipeline objects
Create uniform buffers
Page 51
Creating Textures
// Copy texture data to bitmapData
unsigned Npixels = tex_info.width * tex_info.height;
id<MTLTexture> texture = [device newTextureWithDescriptor: …];

[texture replaceRegion: bitmapData …
Page 52
Creating Buffers for Meshes
float4 temple[100];
…
spriteBuffer = [device newBufferWithBytes: &temple
                                   length: sizeof(temple)
                                  options: 0];
Page 55
Creating State Objects
MTLDepthStencilDescriptor *desc = [[MTLDepthStencilDescriptor alloc] init];
desc.depthCompareFunction = MTLCompareFunctionLessEqual;
desc.depthWriteEnabled = YES;

MTLStencilDescriptor *stencilStateDesc = [[MTLStencilDescriptor alloc] init];
stencilStateDesc.stencilCompareFunction = MTLCompareFunctionAlways;
stencilStateDesc.stencilFailureOperation = MTLStencilOperationKeep;
…
desc.frontFaceStencilDescriptor = stencilStateDesc;
desc.backFaceStencilDescriptor = stencilStateDesc;

id <MTLDepthStencilState> shadowDepthStencilState =
    [device newDepthStencilStateWithDescriptor: desc];
Page 56
GPU Render Pipeline
[Diagram: the pipeline stages Vertex Fetch, Vertex Shader, Primitive Setup, Rasterizer, Fragment Shader, and Framebuffer, with the buffers, textures, and samplers attached to them]
Page 57
Render Pipeline State Object
[Diagram: the same pipeline, highlighting what the render pipeline state object covers]
• Shaders
• Framebuffer configuration: number of render targets, pixel format, sample count
• Depth and stencil state
Page 62
Render Pipeline State Object: Not Included
[Diagram: the same pipeline, highlighting what is set outside the render pipeline state object]
• Inputs
• Outputs
• Primitive setup: cull mode, facing orientation, depth clipping, polygon mode
• Viewport and scissor, depth bias/slope/clamp, occlusion queries
Page 66
Creating Render Pipeline State Object
Every draw call requires a render pipeline state object to be set
The same mesh usually has multiple pipeline state objects
• The temple object in our demo is rendered twice
- The shadow pass: depth-only render with a simple vertex shader
- The deferred pass: rendered to generate g-buffer attributes
- Each one requires a different render pipeline state object to be created
Page 73
Creating Render Pipeline State Object: shadowMap pass
// Create the descriptor
MTLRenderPipelineDescriptor *desc = [MTLRenderPipelineDescriptor new];

// Get the shaders from the library
id <MTLFunction> zOnlyVert = [zOnlyLibrary newFunctionWithName:@"ZOnly"];

// Set the states
desc.label = @"Shadow Render";
desc.vertexFunction = zOnlyVert;
desc.stencilWriteEnabled = false;
desc.depthWriteEnabled = true;
desc.fragmentFunction = nil; // depth write only
desc.depthAttachmentPixelFormat = pixelFormat;

// Create the render pipeline state object
id<MTLRenderPipelineState> pipeline =
    [device newRenderPipelineStateWithDescriptor: desc error: &err];
Page 74
Creating Render Pipeline State Object: Deferred Lighting pass
desc.vertexFunction = gBufferVert;
desc.fragmentFunction = gBufferFrag;

desc.colorAttachments[0].pixelFormat = gbuffer_texture0.pixelFormat;
desc.colorAttachments[1].pixelFormat = gbuffer_texture1.pixelFormat;

…

desc.depthAttachmentPixelFormat = depth_texture.pixelFormat;
desc.stencilAttachmentPixelFormat = stencil_texture.pixelFormat;
Page 78
Render Setup: Create and submit command buffer
// BeginFrame
commandBuffer = [commandQueue commandBuffer];

// EndFrame
[commandBuffer addPresent: drawable];
[commandBuffer commit];
commandBuffer = nil;
Page 86
Render Setup: shadowMap pass render encoding
// Create encoder
id<MTLRenderCommandEncoder> encoder =
    [commandBuffer renderCommandEncoderWithDescriptor: shadowMapPassDesc];

// Set states and draw
[encoder setRenderPipelineState: shadow_render_pipeline];
[encoder setDepthStencilState: shadowDepthStencilState];
…
[encoder setVertexBuffer: structureVertexBuffer offset:0 atIndex: 0];
[encoder drawIndexedPrimitives: …

// End encoding
[encoder endEncoding];
Page 89
Render Setup: Deferred Lighting pass render encoding
deferredPassDesc.colorAttachments[0].texture = texture_from_drawable;

// Create encoder
id<MTLRenderCommandEncoder> encoder =
    [commandBuffer renderCommandEncoderWithDescriptor: deferredPassDesc];

…

// End encoding
[encoder endEncoding];
Page 91
Data-Parallel Computing with Metal
Aaftab Munshi GPU Software
Page 96
Data-Parallel Computing with Metal: What you’ll learn
What is data-parallel computing?
Data-parallel computing in Metal
Writing data-parallel kernels in Metal
Executing kernels in Metal
Page 99
Data-Parallel Computing: A brief introduction
Similar and independent computations on multiple data elements
Example: blurring an image
• Same computation for each input
• All results are independent
Page 103
Data-Parallelism in Metal
Code that describes computation is called a kernel
Independent computation instance
• Work-item
Work-items that execute together
• Work-group
• Cooperate by sharing data
• Can synchronize execution
Page 107
Data-Parallelism in Metal: Computation domain
Number of dimensions
• 1D, 2D, or 3D
For each dimension specify
• Number of work-items in a work-group, also known as the work-group size
• Number of work-groups
Choose the dimensions that are best for your algorithm
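The arithmetic behind a computation domain can be sketched on the CPU in C++. This is a minimal illustration, not Metal API: `Dim2`, `globalSize`, `globalId`, and `groupsToCover` are hypothetical names. It assumes the common convention that a work-item's global position is its work-group index times the work-group size plus its local position.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical 2D size/index type used only in this sketch.
struct Dim2 {
    std::uint32_t x;
    std::uint32_t y;
};

// Total work-items per dimension = work-group size * number of work-groups.
inline Dim2 globalSize(Dim2 workGroupSize, Dim2 numWorkGroups) {
    return {workGroupSize.x * numWorkGroups.x, workGroupSize.y * numWorkGroups.y};
}

// Global position of a work-item, assuming the convention
// global = group index * group size + local index.
inline Dim2 globalId(Dim2 groupId, Dim2 workGroupSize, Dim2 localId) {
    return {groupId.x * workGroupSize.x + localId.x,
            groupId.y * workGroupSize.y + localId.y};
}

// Number of work-groups needed to cover `extent` items, rounding up.
inline std::uint32_t groupsToCover(std::uint32_t extent, std::uint32_t groupSize) {
    return (extent + groupSize - 1) / groupSize;
}
```

For example, a 1024 x 768 image with a 16 x 8 work-group size needs 64 x 96 work-groups under this scheme. When the extent is not a multiple of the work-group size, `groupsToCover` rounds up, so a kernel would need a bounds check for the surplus work-items.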
Page 108
Pseudo Code for a Data-Parallel Kernel
void square(const float* input, float* output, uint id)
{
    output[id] = input[id] * input[id];
}
Page 112
Metal Kernel
#include <metal_stdlib>
using namespace metal;

kernel void square(const global float* input [[ buffer(0) ]],
                   global float* output [[ buffer(1) ]],
                   uint id [[ global_id ]])
{
    output[id] = input[id] * input[id];
}
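To see what running this kernel over a 1D domain amounts to, the sketch below emulates it serially in plain C++. It is a CPU stand-in under stated assumptions, not how Metal schedules work: `runSquare` is a hypothetical helper that calls the kernel body once per work-item, where a GPU would run those invocations in parallel.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Same body as the kernel above, written as a plain C++ function.
inline void square(const float* input, float* output, std::size_t id) {
    output[id] = input[id] * input[id];
}

// Hypothetical serial "dispatch": one call per work-item.
// The order of calls does not matter because each work-item reads and
// writes only its own element, which is what makes the work data-parallel.
inline std::vector<float> runSquare(const std::vector<float>& input) {
    std::vector<float> output(input.size());
    for (std::size_t id = 0; id < input.size(); ++id) {
        square(input.data(), output.data(), id);
    }
    return output;
}
```

Calling `runSquare` on a vector of N floats corresponds to a 1D domain of N work-items, with `id` playing the role of the global ID.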
Page 119
Another Kernel Example in MetalUsing images in a kernel
kernel void horizontal_reflect(texture2d<float> src [[ texture(0) ]], texture2d<float, access::write> dst [[ texture(1) ]], uint2 id [[ global_id ]]){ float4 c = src.read(uint2(src.get_width()-1-id.x, id.y)); dst.write(c, id); }
Page 120
Built-in Kernel Variables
Attributes for kernel arguments
kernel void
my_kernel(texture2d<float> img [[ texture(0) ]],
          ushort2 gid [[ global_id ]],
          uint glinear [[ global_linear_id ]],
          ushort2 lid [[ local_id ]],
          ushort linear [[ local_linear_id ]],
          uint wgid [[ work_group_id ]],
          …)
{
    …
}
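How these built-in values relate to one another can be sketched with a little arithmetic. The following plain C++ (not Metal) derives each ID for one work-item in a 2D grid. The struct and function names are hypothetical, the work-group ID is treated as 2D here for symmetry, and the row-major layout of the linear IDs is an assumption based on the usual convention rather than something the slide states.

```cpp
#include <cassert>
#include <cstdint>

struct Vec2 { std::uint32_t x, y; };

// Hypothetical container for the values the attributes would supply.
struct WorkItemIds {
    Vec2 global_id;
    std::uint32_t global_linear_id;
    Vec2 local_id;
    std::uint32_t local_linear_id;
    Vec2 work_group_id;
};

// Derives every ID from the work-group index and the position inside it,
// for a 2D grid of `groups` work-groups of `local_size` work-items each.
inline WorkItemIds make_ids(Vec2 work_group, Vec2 local, Vec2 local_size, Vec2 groups)
{
    assert(local.x < local_size.x && local.y < local_size.y);
    assert(work_group.x < groups.x && work_group.y < groups.y);
    WorkItemIds ids;
    ids.work_group_id = work_group;
    ids.local_id = local;
    // Row-major index inside the work-group (assumed convention).
    ids.local_linear_id = local.y * local_size.x + local.x;
    // Position in the whole grid: group origin plus local offset.
    ids.global_id = Vec2{work_group.x * local_size.x + local.x,
                         work_group.y * local_size.y + local.y};
    // Row-major index in the whole grid (assumed convention).
    const std::uint32_t global_width = groups.x * local_size.x;
    ids.global_linear_id = ids.global_id.y * global_width + ids.global_id.x;
    return ids;
}
```

For example, the work-item at local position (3, 4) in work-group (1, 2) of a 4 x 4 grid of 16 x 16 work-groups has global ID (19, 36) and local linear ID 67 under these assumptions.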
Page 127
Executing Kernels in Metal
Post-processing example
Page 128
Post-Processing Kernel
Kernel source
kernel void
postprocess_filter(texture2d<float> inImage [[ texture(0) ]],
                   texture2d<float, access::write> outImage [[ texture(1) ]],
                   texture2d<float> curveImage [[ texture(2) ]],
                   constant Parameters& param [[ buffer(0) ]],
                   uint2 gid [[ global_id ]])
{
    // Transform global ID using param.transformMatrix
    float4 color = inImage.sample(s, transformedCoord);
    // Apply post-processing effect
    outImage.write(color, gid);
}
Page 133
Post-Processing Kernel
Processing multiple pixels/work-item
constexpr constant int num_pixels_work_item = 4;

kernel void
postprocess_filter(…,
                   uint2 gid [[ global_id ]],
                   uint2 lsize [[ local_size ]])
{
    for (int i = 0; i < num_pixels_work_item; i++) {
        uint2 gid_new = uint2(gid.x + i * lsize.x, gid.y);
        // Transform gid_new using param.transformMatrix
        // Read from input image
        float4 color = inImage.sample(s, transformedCoord);
        // Apply post-processing effect
        // Write to output image
        outImage.write(color, gid_new);
    }
}
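To make the access pattern of the loop concrete, here is a small plain C++ sketch (not Metal) that lists the x coordinates a single work-item visits using the slide's `gid.x + i * lsize.x` indexing. The function name is hypothetical. How the launch grid is sized so that work-groups cover the image without overlapping is not shown on the slide and is left out here.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

constexpr int num_pixels_work_item = 4;  // same constant as on the slide

// Returns the x coordinates one work-item visits, using the slide's
// indexing: gid.x + i * lsize.x for i in [0, num_pixels_work_item).
inline std::vector<std::uint32_t> columns_for_work_item(std::uint32_t gid_x,
                                                        std::uint32_t lsize_x)
{
    assert(lsize_x > 0);
    std::vector<std::uint32_t> xs;
    for (int i = 0; i < num_pixels_work_item; ++i) {
        xs.push_back(gid_x + static_cast<std::uint32_t>(i) * lsize_x);
    }
    return xs;
}
```

Because the stride is the work-group width, neighbouring work-items in a group touch neighbouring pixels on every iteration of the loop.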
Page 136
Executing a Kernel
Compute command encoder
// Load library and kernel function
id <MTLLibrary> library = [device newLibraryWithFile:libname error:&err];
id <MTLFunction> filterFunc = [library newFunctionWithName:@"postprocess_filter"];

// Create compute state
id <MTLComputePipelineState> filterKernel =
    [device newComputePipelineStateWithFunction:filterFunc error:&err];

// Create compute command encoder
id <MTLComputeCommandEncoder> computeEncoder = [commandBuffer computeCommandEncoder];
Page 141
Executing a Kernel
Encode compute commands
// Set compute state
[computeEncoder setComputePipelineState:filterKernel];

// Set resources used by kernel
[computeEncoder setTexture:inputImage atIndex:0];
[computeEncoder setTexture:outputImage atIndex:1];
[computeEncoder setTexture:curveImage atIndex:2];
[computeEncoder setBuffer:params offset:0 atIndex:0];
Page 144
Executing a Kernel
Encode compute commands
// Calculate the work-group size and number of work-groups
MTLSize wgSize = { 16, 16, 1 };
MTLSize numWorkGroups = { (outputImage.width + wgSize.x - 1) / wgSize.x,
                          (outputImage.height + wgSize.y - 1) / wgSize.y,
                          1 };

// Execute kernel
[computeEncoder executeKernelWithWorkGroupSize:wgSize
                                workGroupCount:numWorkGroups];

// Finish encoding
[computeEncoder endEncoding];
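The `numWorkGroups` computation is a rounding-up integer division, so that an image whose size is not a multiple of the work-group size still gets a partial work-group at its edge. A minimal plain C++ sketch of that arithmetic follows; `Size3`, `ceil_div`, and `work_group_count` are hypothetical names used only for illustration and are not part of the Metal API.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a three-component size.
struct Size3 { std::size_t width, height, depth; };

// Rounds up so that a partial work-group at the edge is still launched.
inline std::size_t ceil_div(std::size_t n, std::size_t d)
{
    assert(d > 0);
    return (n + d - 1) / d;
}

// Number of work-groups needed to cover a 2D image.
inline Size3 work_group_count(std::size_t image_width, std::size_t image_height,
                              Size3 wg_size)
{
    return Size3{ceil_div(image_width, wg_size.width),
                 ceil_div(image_height, wg_size.height),
                 1};
}
```

For a 1920 x 1080 image and 16 x 16 work-groups this gives 120 x 68 work-groups. The last row of groups extends past the bottom of the image (68 * 16 = 1088), so a kernel launched this way would typically need to check its ID against the image bounds before writing.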
Page 150
Executing a Kernel
Submit commands to the GPU
// Commit the command buffer
[commandBuffer commit];
Page 151
Demo
Post-processing kernels
Page 152
Agenda
Introduction to Metal
Fundamentals of Metal • Building a Metal application
• Metal shading language
Advanced Metal • Deep dive into creating a graphics application with Metal
• Data-Parallel Computing with Metal
• Developer Tools Review
Page 153
Tools
Debugging and profiling Metal applications in Xcode
Page 157
Summary
A deeper dive into Metal • Structuring your application for Metal
• Using descriptors and state objects for rendering
• Multi-pass encoding in Metal
Data-parallel computing in Metal • How data-parallelism works in Metal
• Write and execute kernels in Metal
Tools • How to create and compile Metal Shaders in Xcode
• Debug and profile a Metal application
Page 158
More Information
Filip Iliescu Graphics and Games Technologies Evangelist
Allan Schaffer Graphics and Games Technologies Evangelist
Documentation http://developer.apple.com
Apple Developer Forums http://devforums.apple.com
Page 159
Related Sessions
• Working with Metal—Overview Pacific Heights Wednesday 9:00AM
• Working with Metal—Fundamentals Pacific Heights Wednesday 10:15AM
Page 160
Labs
• Metal Lab Graphics and Games Lab A Wednesday 2:00PM
• Metal Lab Graphics and Games Lab B Thursday 10:15AM