Post on 26-Feb-2016

Technical Architect Ubisoft Montreal

Michael LAVAIRE Technical Lead Ubisoft Montreal

Remi QUENIN

Squeeze the Juice out of CPUs: Post Mortem of a Data-Driven Scheduler

Table of Contents

1. Issues with common multithreading architectures

2. “Shears” Solution

3. Tips and Tricks

1. COMMON ARCHITECTURAL DESIGNS

Usual multithreading patterns

Folded loop

[Diagram labels: Engine Loop (Gameplay, Graphic); Smaller Engine Loop]

[Diagram labels: Thread 1, Thread 2, Thread 3 (Background Loading); Frame N, N+1, N+2; Sync point]

[Diagram labels: Gameplay, Graphic, Background Loading; Sync; Applying modifications]

Tasks Scheduling

[Diagram labels: Stage 1, Stage 2, Stage 3; Gate; Waiting; Task A, Task B, Task C, Task D, Task E, Task F]
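The staged pattern named on this slide (tasks grouped into stages, with a gate between stages) can be sketched as follows. This is only an illustration under stated assumptions: `Task`, `Stage`, `RunStages` and `StagedDemo` are hypothetical names, one thread is spawned per task for brevity, and joining the threads plays the role of the gate. The talk does not show how any particular engine implements it.

```cpp
#include <atomic>
#include <cassert>
#include <functional>
#include <thread>
#include <vector>

using Task  = std::function<void()>;
using Stage = std::vector<Task>;

// Runs the stages in order. Tasks of one stage run concurrently;
// joining every worker is the gate that the next stage waits behind.
inline void RunStages( const std::vector<Stage>& stages )
{
    for ( const Stage& stage : stages )
    {
        std::vector<std::thread> workers;
        for ( const Task& task : stage )
            workers.emplace_back( task );
        for ( std::thread& worker : workers )
            worker.join();              // gate: idle time accumulates here
    }
}

// Usage: three stage-1 tasks, then one stage-2 task that observes them.
// The split of tasks between stages is an assumption for the example.
inline int StagedDemo()
{
    std::atomic<int> finishedInStage1{ 0 };
    int seenByStage2 = -1;

    Stage stage1;
    for ( int i = 0; i < 3; ++i )
        stage1.push_back( [&finishedInStage1] { ++finishedInStage1; } );

    Stage stage2;
    stage2.push_back( [&] { seenByStage2 = finishedInStage1.load(); } );

    std::vector<Stage> stages;
    stages.push_back( stage1 );
    stages.push_back( stage2 );
    RunStages( stages );
    return seenByStage2;
}
```

Every worker that finishes early waits at the gate for the slowest task of its stage, which is the "Waiting" cost the slide points at.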

2. « SHEARS » SOLUTION

60 FPS for everyone

Objectives

2.1 DATA DRIVEN SCHEDULING

Focus on data, not on the code

Data driven scheduling

[Diagram sequence. Labels: Task A (many instances), Task B, Task C; Data D0, Data D1]
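One way to read "focus on data, not on the code" is that each task declares the data it reads and writes, and the execution order is derived from those declarations rather than hard-coded. The sketch below illustrates that reading only. `TaskDesc`, `ScheduleByData` and the assignment of D0 and D1 to tasks A, B and C are assumptions, not the scheduler described in the talk, and the sketch expects each datum to have a single writer that does not also read it.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical task description: ids of the data consumed and produced.
struct TaskDesc
{
    std::string      name;
    std::vector<int> reads;
    std::vector<int> writes;
};

// Returns task names in an order where every datum is written before it is read.
inline std::vector<std::string> ScheduleByData( const std::vector<TaskDesc>& tasks, int dataCount )
{
    // A datum produced by some task is not ready until that task has run.
    std::vector<bool> ready( dataCount, true );
    for ( const TaskDesc& task : tasks )
        for ( int d : task.writes )
            ready[d] = false;

    std::vector<bool> done( tasks.size(), false );
    std::vector<std::string> order;
    bool progress = true;
    while ( progress )
    {
        progress = false;
        for ( std::size_t i = 0; i < tasks.size(); ++i )
        {
            if ( done[i] )
                continue;
            bool inputsReady = true;
            for ( int d : tasks[i].reads )
                inputsReady = inputsReady && ready[d];
            if ( !inputsReady )
                continue;
            done[i] = true;
            for ( int d : tasks[i].writes )
                ready[d] = true;
            order.push_back( tasks[i].name );
            progress = true;
        }
    }
    return order;   // tasks whose inputs never become ready are left out
}

// Usage: A writes D0, B reads D0 and writes D1, C reads D1 (assumed wiring).
inline std::vector<std::string> DataDrivenDemo()
{
    const int D0 = 0, D1 = 1;
    std::vector<TaskDesc> tasks = {
        { "C", { D1 }, {} },
        { "B", { D0 }, { D1 } },
        { "A", {}, { D0 } },
    };
    return ScheduleByData( tasks, 2 );
}
```

The tasks are deliberately listed out of order: the resulting order comes from the data declarations alone.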

2.2 WORKLOADS

No locks, be scalable

Workload

Lock-free: Internal

[Diagram: Thread 1 calls Container::Add() and Thread 2 calls Container::Remove() on the same Container State. Updates are committed with Test & Set; the outcomes shown are SUCCESS, FAILED, SUCCESS.]
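The retry loop behind a Test & Set commit can be sketched with `std::atomic`. This is not the container from the talk: `SlotContainer` is a hypothetical fixed-capacity container whose whole state fits in one 32-bit word, and `compare_exchange_strong` stands in for the engine's Test & Set. A failed exchange is counted as a race, in the spirit of the "Race count" rows in the comparison tables.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <thread>

class SlotContainer
{
public:
    // Claims a free slot; returns its index, or -1 when all 32 slots are taken.
    int Add()
    {
        std::uint32_t state = m_state.load();
        for ( ;; )
        {
            if ( state == 0xFFFFFFFFu )
                return -1;
            int slot = 0;
            while ( state & ( 1u << slot ) )
                ++slot;
            const std::uint32_t desired = state | ( 1u << slot );
            // Commit only if no other thread changed the state in the meantime.
            if ( m_state.compare_exchange_strong( state, desired ) )
                return slot;
            ++m_races;   // FAILED: `state` now holds the fresh value, retry
        }
    }

    // Releases a slot; returns false if it was not occupied.
    bool Remove( int slot )
    {
        std::uint32_t state = m_state.load();
        for ( ;; )
        {
            if ( !( state & ( 1u << slot ) ) )
                return false;
            const std::uint32_t desired = state & ~( 1u << slot );
            if ( m_state.compare_exchange_strong( state, desired ) )
                return true;
            ++m_races;
        }
    }

    std::uint32_t State() const { return m_state.load(); }
    unsigned      Races() const { return m_races.load(); }

private:
    std::atomic<std::uint32_t> m_state{ 0 };
    std::atomic<unsigned>      m_races{ 0 };
};

// Usage: two threads add and remove concurrently; the container ends up empty.
inline std::uint32_t LockFreeDemo()
{
    SlotContainer container;
    auto worker = [&container] {
        for ( int i = 0; i < 1000; ++i )
        {
            const int slot = container.Add();
            if ( slot >= 0 )
                container.Remove( slot );
        }
    };
    std::thread t1( worker ), t2( worker );
    t1.join();
    t2.join();
    return container.State();
}
```

No thread ever blocks: a thread that loses the exchange just recomputes from the fresh state and tries again.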

Lock-free: Comparison Q6600

[Chart: x axis 1 to 4, y axis 0 to 12000; series: Lock-free - Op/msec, Crit.Sec. - Op/msec]

Thread Count                        1        2        3        4
Lock-free Mean Op Time (cycles)     233      678      2145     3278
Lock-free Op/msec                   10300    3540     1119     732
Lock-free Race count                0        9571     91205    161371
Crit.Sec. Mean Op Time (cycles)     771      24862    36835    49079
Crit.Sec. Op/msec                   3113     97       65       49
Crit.Sec. Idle %                    57%      74%      83%      87%
Lock-free / Crit.Sec. ratio         3.31     36.67    17.17    14.97

Lock-free: Comparison X360

Thread Count                        1        2        3        4        5        6
Lock-free Mean Op Time (usec)       0.29     0.5      0.6      0.83     0.91     1.2
Lock-free Op/msec                   3448     2000     1667     1205     1099     833
Lock-free Race count                0        10447    72716    152283   218379   329422
Crit.Sec. Mean Op Time (usec)       0.96     10.06    13.76    19.16    24.66    29.88
Crit.Sec. Op/msec                   1042     99       73       52       41       33
Crit.Sec. Idle %                    23%      75%      85%      89%      92%      93%
Lock-free / Crit.Sec. ratio         3.31     20.12    22.93    23.08    27.10    24.90

[Chart: x axis 1 to 6, y axis 0 to 4000; series: Lock-free - Op/msec, Crit.Sec. - Op/msec]

Lock-free: Comparison PS3

[Chart: x axis 1 to 2, y axis 0 to 4000; series: Lock-free - Op/msec, Crit.Sec. - Op/msec]

Thread Count                        1        2
Lock-free Mean Op Time (usec)       0.27     0.46
Lock-free Op/msec                   3704     2174
Lock-free Race count                0        253
Crit.Sec. Mean Op Time (usec)       1.15     5.15
Crit.Sec. Op/msec                   870      194
Crit.Sec. Idle %                    25%      60%
Lock-free / Crit.Sec. ratio         4.26     11.20

2.3 WORKING WITH SPU

Easy cross-platforming, easy debugging

Working with SPU: Cross Platform API

[Diagram labels: SatelliteTask; Memory Access Interface; Main Memory Impl, DMA Impl]
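The layering named on this slide (a SatelliteTask that only talks to a Memory Access Interface, with a Main Memory implementation and a DMA implementation behind it) could look roughly like the sketch below. Every identifier here is hypothetical: `IMemoryAccess`, `MainMemoryAccess`, `FakeDmaAccess` and `ScaleTask` are illustration names, and the "DMA" class only copies memory and counts transfers so that the example runs on a PC. A real SPU implementation would issue DMA transfers instead.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical interface; the slide only names a "Memory Access Interface".
class IMemoryAccess
{
public:
    virtual ~IMemoryAccess() {}
    virtual void Read( void* local, const void* remote, std::size_t size ) = 0;
    virtual void Write( void* remote, const void* local, std::size_t size ) = 0;
};

// Stand-in for the "Main Memory Impl": plain copies.
class MainMemoryAccess : public IMemoryAccess
{
public:
    void Read( void* local, const void* remote, std::size_t size ) override  { std::memcpy( local, remote, size ); }
    void Write( void* remote, const void* local, std::size_t size ) override { std::memcpy( remote, local, size ); }
};

// Stand-in for the "DMA Impl": same copies, plus a transfer counter.
class FakeDmaAccess : public IMemoryAccess
{
public:
    int transfers = 0;
    void Read( void* local, const void* remote, std::size_t size ) override  { std::memcpy( local, remote, size ); ++transfers; }
    void Write( void* remote, const void* local, std::size_t size ) override { std::memcpy( remote, local, size ); ++transfers; }
};

// A satellite task written once: fetch, work on a local copy, write back.
inline void ScaleTask( IMemoryAccess& memory, float* data, std::size_t count, float factor )
{
    std::vector<float> local( count );
    memory.Read( local.data(), data, count * sizeof( float ) );
    for ( float& value : local )
        value *= factor;
    memory.Write( data, local.data(), count * sizeof( float ) );
}
```

Because the task never touches `data` directly, the same task code can be stepped through on the main-memory implementation and then run unchanged on the other one.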

Working with SPU: Easy Debugging

[Diagram labels: SatelliteTask; Memory Access Interface; Main Memory Impl, DMA Impl; Named Pipe]

3. TIPS & TRICKS

Multithreading & peace of mind

Tip 1: Clever profiling

Tip 2: Watchdog

Tip 3: Unit Tests

Trick 1: Perturbation

[Diagram labels: Thread A, Thread B; Test A, Test B, Test C, Test D; Loop n, Loop n+1; New thread synchronization]

Trick 2: State validation

[Diagram labels: State 1, State 2, State 3, State X; Process A, Process B, Process C, Process X; Assert !]

class StateChecker
{
public:
    enum State { State1, State2, State3 };

    StateChecker() { m_state = State1; }

    // Moves from oldState to newState in one atomic step.
    // Returns false if the object was not in oldState, so callers can assert on the result.
    bool SetState( State oldState, State newState )
    {
        return Atomic::TestAndSet( &m_state, oldState, newState ) == oldState;
    }

private:
    volatile State m_state;
};
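For readers without the engine's `Atomic::TestAndSet`, the same check can be tried out with `std::atomic`. This is a stand-in sketch, not the original class: `compare_exchange_strong` replaces the engine call, and `ProcessA` and `ProcessB` are hypothetical processes that each expect to find the object in a given state.

```cpp
#include <atomic>
#include <cassert>

class StateChecker
{
public:
    enum State { State1, State2, State3 };

    StateChecker() : m_state( State1 ) {}

    // True only if the object really was in oldState.
    bool SetState( State oldState, State newState )
    {
        // On failure compare_exchange_strong overwrites its first argument;
        // oldState is a by-value copy, so that is harmless here.
        return m_state.compare_exchange_strong( oldState, newState );
    }

private:
    std::atomic<State> m_state;
};

// Hypothetical processes: each one states the transition it expects to make.
inline bool ProcessA( StateChecker& checker )
{
    return checker.SetState( StateChecker::State1, StateChecker::State2 );
}

inline bool ProcessB( StateChecker& checker )
{
    return checker.SetState( StateChecker::State2, StateChecker::State3 );
}
```

In practice each call would be wrapped in the project's assert macro, so a process that runs out of order (or twice) stops at the exact point where the unexpected state is seen.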

Trick 3: Access verification

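class AccessChecker
{
public:
    AccessChecker() { m_access = 0; }

    // m_access is positive while readers are active and -1 while a writer is active.
    // The comparisons read as if Atomic::Inc and Atomic::Dec return the updated value.
    // Each call returns false when the access pattern is invalid.
    bool StartReadAccess() { return Atomic::Inc( &m_access ) > 0; }
    bool EndReadAccess()   { return Atomic::Dec( &m_access ) >= 0; }

    bool StartWriteAccess() { return Atomic::Dec( &m_access ) == -1; }
    bool EndWriteAccess()   { return Atomic::Inc( &m_access ) == 0; }

private:
    volatile int m_access;
};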
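A self-contained version of the same counter, using `std::atomic<int>` in place of the engine's `Atomic::Inc` and `Atomic::Dec`, shows how the checker flags a write that overlaps a read. The `Inventory` struct and the two helper functions are hypothetical and exist only for the example.

```cpp
#include <atomic>
#include <cassert>

class AccessChecker
{
public:
    // fetch_add and fetch_sub return the previous value, hence the +1 and -1.
    bool StartReadAccess()  { return m_access.fetch_add( 1 ) + 1 > 0; }
    bool EndReadAccess()    { return m_access.fetch_sub( 1 ) - 1 >= 0; }

    bool StartWriteAccess() { return m_access.fetch_sub( 1 ) - 1 == -1; }
    bool EndWriteAccess()   { return m_access.fetch_add( 1 ) + 1 == 0; }

private:
    std::atomic<int> m_access{ 0 };
};

// Hypothetical object guarded by the checker.
struct Inventory
{
    AccessChecker checker;
    int           itemCount = 0;
};

// Reads the count; accessOk reports whether the read overlapped a write.
inline int ReadItemCount( Inventory& inventory, bool& accessOk )
{
    accessOk = inventory.checker.StartReadAccess();
    const int count = inventory.itemCount;
    accessOk = inventory.checker.EndReadAccess() && accessOk;
    return count;
}

// Adds one item; returns false if the write overlapped any other access.
inline bool AddItem( Inventory& inventory )
{
    bool accessOk = inventory.checker.StartWriteAccess();
    ++inventory.itemCount;
    accessOk = inventory.checker.EndWriteAccess() && accessOk;
    return accessOk;
}
```

The checker does not protect the data. It only detects that two accesses overlapped, which is usually what an assert needs.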

Trick 4: Multithreaded Assert

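// Defined once in a source file; declare it as
// "extern volatile bool g_waitOnAssert;" in the shared header.
volatile bool g_waitOnAssert = false;

// While one thread is reporting an assert, the other threads wait here instead of running on.
#define ASSERT( condition ) \
    do { \
        while( g_waitOnAssert ) {} \
        if( !(condition) ) \
        { \
            g_waitOnAssert = true; \
            DoAssert(); \
            g_waitOnAssert = false; \
        } \
    } while( 0 )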
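A variant of the same idea using `std::atomic<bool>` is sketched below. It is an assumption-laden illustration rather than the macro from the talk: `MT_ASSERT` is a hypothetical name, `DoAssert` is replaced by a stub that only counts reports, and the flag is taken with a compare-exchange so that two threads failing at the same moment report one after the other.

```cpp
#include <atomic>
#include <cassert>

std::atomic<bool> g_waitOnAssert{ false };
std::atomic<int>  g_reportedAsserts{ 0 };

// Stub for the example; a real handler would break into the debugger or log.
inline void DoAssert() { ++g_reportedAsserts; }

#define MT_ASSERT( condition ) \
    do { \
        /* Park here while another thread is reporting an assert. */ \
        while ( g_waitOnAssert.load() ) {} \
        if ( !( condition ) ) \
        { \
            /* Take the flag; spin if another thread took it first. */ \
            bool expected = false; \
            while ( !g_waitOnAssert.compare_exchange_weak( expected, true ) ) \
                expected = false; \
            DoAssert(); \
            g_waitOnAssert.store( false ); \
        } \
    } while ( 0 )

// Usage: returns how many asserts were reported by the two checks.
inline int AssertDemo()
{
    const int before = g_reportedAsserts.load();
    MT_ASSERT( 1 + 1 == 2 );   // passes, nothing reported
    MT_ASSERT( 1 + 1 == 3 );   // fails, reported once
    return g_reportedAsserts.load() - before;
}
```

Threads only stop when they reach their next `MT_ASSERT`, so the state seen in the debugger is close to, but not exactly, the state at the moment of the failure.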

Squeeze the Juice !

Inspiration

• Game Programming Gems 6: Lock-free Algorithms, by Toby Jones

• Design and Implementation of Multi-Threaded Games, by Bruce Dawson

• Floodgate: Maximizing SPU parallelism without sacrificing cross platform development, by David Asbell & Michael Noland

• SPU Shaders, by Mike Acton

michael.lavaire@ubisoft.com

remi.quenin@ubisoft.com

Ubisoft is recruiting!

Come see us at the Ubisoft Booth in the Career Pavilion (CP 2308, South Hall)

You can also check out: www.creatorsofemotions.com
