Multi-threading basics: the main process spawns additional processing threads to take advantage of multiple processors, or of CPU dead time while waiting for data. Synchronization: when one thread needs the results of processing happening in another thread, one thread has to wait. Locks: multiple threads might need to access the same data, so each has to lock it, manipulate it, and unlock it as quickly as possible.

Dec 30, 2015

Susan Robinson
Transcript
Page 1: Multi-threading basics

Main process spawns additional processing threads

Takes advantage of multiple processors, or CPU dead times while waiting for data

Synchronization: when one thread needs the results of processing happening in another thread (i.e. one thread will wait)

Locks: multiple threads might need to access the same data. They have to lock it, manipulate it, and unlock it (as quickly as possible)

Page 2: Why multi-threading/multi-core?

Clock rates are stagnant

Future CPUs will be predominantly multi-thread/multi-core

Xbox 360 has 3 cores

PS3 has a stream architecture with eight cores

Almost all new PCs are dual or quad core

Two performance possibilities:

Single-threaded? Minimal performance growth

Multi-threaded? Exponential performance growth

Page 3: Design for Multithreading

Good design is critical

Bad multithreading can be worse than no multithreading

Deadlocks, synchronization bugs, poor performance, etc.

Comments can help a lot!

Page 4: Bad Multithreading

[Diagram labels: Thread 1 to Thread 5]

Page 5: Good Multithreading

[Diagram labels: Main Thread, Game Thread, Rendering Thread, Physics, Animation/Skinning, Particle Systems, Networking, File I/O]

Page 6: Another Paradigm: Cascades

[Diagram labels: Thread 1 to Thread 5; stages Input, Physics, AI, Rendering, Present; Frame 1 to Frame 4]

Advantages:

Synchronization points are few and well-defined

Disadvantages:

Increases latency (for constant frame rate)

Needs simple (one-way) data flow
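
The cascade idea can be sketched as one thread per stage, with a queue between neighbouring stages as the only synchronization point. This is a minimal illustration in standard C++ rather than the Win32 API used elsewhere in these slides; `Frame`, `FrameQueue`, `Stage`, and `RunCascade` are hypothetical names invented for the example, and the queues are unbounded for brevity.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

// Hypothetical per-frame payload; a real game would carry input, world,
// and render state here.
struct Frame {
    int id;
    int value;
};

// Minimal blocking queue: the only place two neighbouring stages synchronize.
class FrameQueue {
public:
    void Push(const Frame& frame) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            frames_.push(frame);
        }
        ready_.notify_one();
    }
    // Signals that no more frames will arrive.
    void Close() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            closed_ = true;
        }
        ready_.notify_all();
    }
    // Blocks until a frame is available; returns false once closed and drained.
    bool Pop(Frame& out) {
        std::unique_lock<std::mutex> lock(mutex_);
        ready_.wait(lock, [this] { return !frames_.empty() || closed_; });
        if (frames_.empty()) return false;
        out = frames_.front();
        frames_.pop();
        return true;
    }

private:
    std::mutex mutex_;
    std::condition_variable ready_;
    std::queue<Frame> frames_;
    bool closed_ = false;
};

using Stage = std::function<void(Frame&)>;

// Pushes frame_count frames through one thread per stage and returns the
// frames that reach the end of the cascade, in arrival order.
std::vector<Frame> RunCascade(int frame_count, const std::vector<Stage>& stages) {
    assert(frame_count >= 0);
    std::vector<FrameQueue> queues(stages.size() + 1);
    std::vector<std::thread> threads;
    for (std::size_t i = 0; i < stages.size(); ++i) {
        threads.emplace_back([&queues, &stages, i] {
            Frame frame;
            while (queues[i].Pop(frame)) {
                stages[i](frame);           // data flows one way only
                queues[i + 1].Push(frame);
            }
            queues[i + 1].Close();          // propagate shutdown downstream
        });
    }
    for (int id = 0; id < frame_count; ++id) queues[0].Push(Frame{id, 0});
    queues[0].Close();

    std::vector<Frame> finished;
    Frame frame;
    while (queues.back().Pop(frame)) finished.push_back(frame);
    for (std::thread& t : threads) t.join();
    return finished;
}
```

While one stage works on frame N, the stage before it can already work on frame N+1, which is where both the throughput gain and the added latency come from.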

Page 7: Available Synchronization Objects

Events

Semaphores

Mutexes

Critical Sections

Don't use SuspendThread()

Some titles have used this for synchronization

Can easily lead to deadlocks

Interacts badly with Visual Studio debugger

Page 8: Exclusive Access: Mutex

#include <windows.h>

// Initialize: unnamed mutex, not initially owned by the creating thread
HANDLE mutex = CreateMutex(NULL, FALSE, NULL);

// Use
void ManipulateSharedData() {
    WaitForSingleObject(mutex, INFINITE);  // blocks until the mutex is acquired
    // Manipulate stuff...
    ReleaseMutex(mutex);
}

// Destroy (at shutdown)
CloseHandle(mutex);

Page 9: Exclusive Access: CRITICAL_SECTION

#include <windows.h>

// Initialize
CRITICAL_SECTION cs;
InitializeCriticalSection(&cs);

// Use
void ManipulateSharedData() {
    EnterCriticalSection(&cs);  // blocks until the critical section is free
    // Manipulate stuff...
    LeaveCriticalSection(&cs);
}

// Destroy (at shutdown)
DeleteCriticalSection(&cs);
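
The same lock/manipulate/unlock pattern can be written so that the unlock cannot be forgotten on an early return or exception. The sketch below is an assumption-laden illustration, not the slides' own code: it swaps in the portable `std::mutex` and `std::lock_guard` from standard C++ in place of the Win32 objects, and `SharedCounter` and `CountInParallel` are hypothetical names.

```cpp
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical shared data guarded by a mutex.
class SharedCounter {
public:
    void Increment() {
        std::lock_guard<std::mutex> lock(mutex_);  // locked here
        ++value_;
    }                                              // unlocked automatically here
    long Get() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return value_;
    }

private:
    mutable std::mutex mutex_;
    long value_ = 0;
};

// Runs thread_count threads that each increment the counter
// increments_per_thread times, then returns the final value.
long CountInParallel(int thread_count, int increments_per_thread) {
    assert(thread_count > 0 && increments_per_thread >= 0);
    SharedCounter counter;
    std::vector<std::thread> threads;
    for (int t = 0; t < thread_count; ++t) {
        threads.emplace_back([&counter, increments_per_thread] {
            for (int i = 0; i < increments_per_thread; ++i) counter.Increment();
        });
    }
    for (std::thread& t : threads) t.join();
    return counter.Get();
}
```

Without the lock in `Increment`, concurrent increments could be lost and the result would usually fall short of `thread_count * increments_per_thread`.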

Page 10: How Many Threads?

No more than one CPU-intensive software thread per core

3-6 on Xbox 360

1-? on PC (1-4 for now, need to query)

Too many busy threads add complexity and lower performance

Context switches are not free

Can have many non-CPU intensive threads

I/O threads that block, or intermittent tasks
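
On PC the core count has to be queried at run time. One portable way, shown here as a sketch, is `std::thread::hardware_concurrency()` from standard C++, which may return 0 when the count is unknown. The policy of reserving cores for the main and render threads, and the names `PickWorkerCount` and `DefaultWorkerCount`, are assumptions made for illustration.

```cpp
#include <cassert>
#include <thread>

// reported: the value from std::thread::hardware_concurrency(), 0 if unknown.
// reserved: hardware threads kept free for other busy threads (hypothetical policy).
// Always returns at least one worker.
unsigned PickWorkerCount(unsigned reported, unsigned reserved) {
    if (reported == 0) return 1;         // count unknown: stay conservative
    if (reported <= reserved) return 1;  // small machine: one worker only
    return reported - reserved;
}

// Worker count for this machine, leaving one hardware thread for the main thread.
unsigned DefaultWorkerCount() {
    const unsigned workers = PickWorkerCount(std::thread::hardware_concurrency(), 1);
    assert(workers >= 1);
    return workers;
}
```

Threads that mostly block on I/O do not need to be counted against this budget.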

Page 11: Typical Threaded Tasks

File Decompression

Rendering

Graphics Fluff

Physics

Page 12: File Decompression

Most common CPU heavy thread on the Xbox 360

Easy to multithread

Allows use of aggressive compression to improve load times

Don’t throw a thread at a problem better solved by offline processing

Texture compression, file packing, etc.

Page 13: Rendering

Separate update and render threads

Rendering on multiple threads (D3DCREATE_MULTITHREADED) works poorly

Exception: Xbox 360 command buffers

Special case of cascades paradigm

Pass render state from update to render

With constant workload gives same latency, better frame rate

With increased workload gives same frame rate, worse latency

Page 14: Graphics Fluff

Extra graphics that don't affect play

Procedurally generated animating cloud textures

Cloth simulations

Dynamic ambient occlusion

Procedurally generated vegetation, etc.

Extra particles, better particle physics, etc.

Easy to synchronize

Potentially expensive, but if the core is otherwise idle...?

Page 15: Physics?

Could cascade from update to physics to rendering

Makes use of three threads

May be too much latency

Could run physics on many threads

Uses many threads while doing physics

May leave threads mostly idle elsewhere
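
The "physics on many threads" option can be sketched as a fork-join step: split the bodies into non-overlapping slices, integrate each slice on its own thread, and join before the frame continues. `Body`, `IntegrateRange`, and `IntegrateParallel` are hypothetical names, the integration is a deliberately trivial Euler step, and a real engine would more likely reuse persistent worker threads than create new ones every step.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical one-dimensional rigid body.
struct Body {
    float position;
    float velocity;
};

// Integrates bodies in [begin, end). Slices never overlap, so no locks are needed.
void IntegrateRange(std::vector<Body>& bodies, std::size_t begin, std::size_t end, float dt) {
    for (std::size_t i = begin; i < end; ++i) {
        bodies[i].position += bodies[i].velocity * dt;
    }
}

// Splits the bodies across up to thread_count threads and waits for all of them.
void IntegrateParallel(std::vector<Body>& bodies, float dt, unsigned thread_count) {
    assert(thread_count > 0);
    const std::size_t n = bodies.size();
    const std::size_t chunk = (n + thread_count - 1) / thread_count;  // ceiling division
    std::vector<std::thread> workers;
    for (unsigned t = 0; t < thread_count; ++t) {
        const std::size_t begin = std::min(n, t * chunk);
        const std::size_t end = std::min(n, begin + chunk);
        if (begin == end) break;  // fewer bodies than threads
        workers.emplace_back(IntegrateRange, std::ref(bodies), begin, end, dt);
    }
    for (std::thread& w : workers) w.join();  // the join is the single sync point
}
```

The join at the end is also where the drawback shows up: the worker threads have nothing to do until the next physics step unless other work is scheduled onto them.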

Page 16: Overcommitted Multithreading?

[Diagram labels: Game Thread, Rendering Thread, Physics, Animation/Skinning, Particle Systems]

Page 17: Synchronization tips/costs:

Synchronization is moderately expensive when there is no contention

Hundreds to thousands of cycles

Synchronization can be arbitrarily expensive when there is contention!

Goals:

Synchronize rarely

Hold locks briefly

Minimize shared data
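
One way to meet all three goals at once is to let each thread accumulate into a private local variable and touch the shared result only once, at the end. The sketch below assumes standard C++ (`std::mutex`, `std::thread`) rather than the Win32 objects from the earlier slides, and `SumSquares` is a hypothetical example workload.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Sums the squares of values using thread_count threads.
// Each thread takes the lock exactly once, regardless of how many values it handles.
long SumSquares(const std::vector<int>& values, unsigned thread_count) {
    assert(thread_count > 0);
    long total = 0;          // the only shared data
    std::mutex total_mutex;  // guards total
    const std::size_t n = values.size();
    const std::size_t chunk = (n + thread_count - 1) / thread_count;
    std::vector<std::thread> workers;
    for (unsigned t = 0; t < thread_count; ++t) {
        const std::size_t begin = std::min(n, t * chunk);
        const std::size_t end = std::min(n, begin + chunk);
        if (begin == end) break;
        workers.emplace_back([&values, &total, &total_mutex, begin, end] {
            long local = 0;  // private to this thread: no synchronization
            for (std::size_t i = begin; i < end; ++i) {
                local += static_cast<long>(values[i]) * values[i];
            }
            std::lock_guard<std::mutex> lock(total_mutex);  // held for one addition
            total += local;
        });
    }
    for (std::thread& w : workers) w.join();
    return total;
}
```

Locking around every single addition would give the same answer but would make the threads contend on `total_mutex` once per element.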

Page 18: Threading File I/O & Decompression

First: use large reads and asynchronous I/O

Then: consider compression to accelerate loading

Don't do format conversions etc. that are better done at build time!

Have resource proxies to allow rendering to continue

Page 19: File I/O Implementation Details

vector<Resource*> g_resources;

Worst design: decompressor locks g_resources while decompressing

Better design: decompressor adds resources to vector after decompressing

Still requires renderer to sync on every resource access

Best design: two Resource* vectors

Renderer has private vector, no locking required

Decompressor uses shared vector, syncs when adding new Resource*

Renderer moves Resource* from shared to private vector once per frame
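
The two-vector design might look like the sketch below. It is an illustration under stated assumptions: `Resource` is a stand-in struct, `ResourceHandoff` and its method names are hypothetical, and `std::mutex` from standard C++ takes the place of a Win32 critical section.

```cpp
#include <cassert>
#include <mutex>
#include <vector>

// Stand-in for the Resource type on the slide.
struct Resource {
    int id;
};

class ResourceHandoff {
public:
    // Decompressor thread: called after decompression has finished,
    // so the lock is held only for the push.
    void Publish(Resource* resource) {
        assert(resource != nullptr);
        std::lock_guard<std::mutex> lock(mutex_);
        shared_.push_back(resource);
    }

    // Render thread, once per frame: move newly published resources
    // from the shared vector into the private one.
    void BeginFrame() {
        std::vector<Resource*> incoming;
        {
            std::lock_guard<std::mutex> lock(mutex_);
            incoming.swap(shared_);  // constant-time swap under the lock
        }
        private_.insert(private_.end(), incoming.begin(), incoming.end());
    }

    // Render thread only: reads the private vector, no locking required.
    const std::vector<Resource*>& Ready() const { return private_; }

private:
    std::mutex mutex_;
    std::vector<Resource*> shared_;   // touched by both threads, always under mutex_
    std::vector<Resource*> private_;  // touched by the render thread only
};
```

The renderer pays for one lock per frame instead of one per resource access, and a resource published mid-frame simply becomes visible at the next `BeginFrame`.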

Page 20: Profiling multi-threaded apps

Need thread-aware profilers

Profiling may hide many synchronization stalls

Home-grown spin locks make profiling harder

Consider instrumenting calls to synchronization functions

Don't use locks in instrumentation

Windows: Intel VTune, AMD CodeAnalyst, and the Visual Studio Team System Profiler

Xbox 360: PIX, XbPerfView, etc.
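
Instrumenting synchronization calls without adding locks can be done with atomic counters. The following is a rough sketch in standard C++, not a feature of any of the profilers named above; `LockStats` and `InstrumentedLock` are hypothetical names. Note that `try_lock` on a `std::mutex` is allowed to fail spuriously, so the contended count is an upper estimate.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <cstdint>
#include <mutex>

// Lock-free counters: recording a sample never takes a lock itself.
struct LockStats {
    std::atomic<std::uint64_t> acquisitions{0};
    std::atomic<std::uint64_t> contended{0};
    std::atomic<std::uint64_t> wait_ns{0};
};

// Scoped lock that records how often, and for how long, callers had to wait.
class InstrumentedLock {
public:
    InstrumentedLock(std::mutex& mutex, LockStats& stats)
        : lock_(mutex, std::defer_lock) {
        if (!lock_.try_lock()) {
            // Slow path: someone else holds the mutex, so time the wait.
            const auto start = std::chrono::steady_clock::now();
            lock_.lock();
            const auto waited = std::chrono::duration_cast<std::chrono::nanoseconds>(
                std::chrono::steady_clock::now() - start).count();
            assert(waited >= 0);
            stats.contended.fetch_add(1, std::memory_order_relaxed);
            stats.wait_ns.fetch_add(static_cast<std::uint64_t>(waited),
                                    std::memory_order_relaxed);
        }
        stats.acquisitions.fetch_add(1, std::memory_order_relaxed);
    }

private:
    std::unique_lock<std::mutex> lock_;  // releases the mutex on scope exit
};
```

Comparing `contended` against `acquisitions` for each lock gives a first hint about which locks are worth restructuring.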

Page 21: Windows tips

Avoid using wglMakeCurrent or this.Invoke()

Best to do all rendering calls from a single thread

Test on multiple machines and configurations

Single-core, SMT (i.e. Hyper-Threading), Dual-core, Intel and AMD chips, Multi-socket multicore (4+ cores)

Page 22: Ogre-specific

Ogre has a class, ResourceBackgroundQueue, to load resources in a background thread

http://www.ogre3d.org/docs/api/html/classOgre_1_1ResourceBackgroundQueue.html