C++ on Next-Gen Consoles: Effective Code for New Architectures Pete Isensee Development Manager Microsoft Game Technology Group


Dec 17, 2015


Arnold Reynolds
Page 1:

C++ on Next-Gen Consoles: Effective Code for New Architectures
Pete Isensee
Development Manager
Microsoft Game Technology Group

Page 2:

Last Year at GDC

Chris Hecker ranted
What did he say?
  Programmers: danger ahead
  Out-of-order execution: good
  In-order execution: bad
  Microsoft and Sony are going to screw you
  You are so hosed. Game over, man.
  “There’s absolutely nothing you can do about this”

Page 3:

Console Hardware Architectures

Optimized to do floating-point math
Optimized for multithreaded tasks
Optimized to run games
Not optimized to run general-purpose code
Not optimized to do branch prediction, code reordering, instruction pipelining or other out-of-order magic
Large L2 caches
Large latencies

Page 4:

We’re Game Programmers. We Love Challenges.

We will make games on these consoles
The solution is not assembly language
The solution is to tailor our C/C++ engines, inner loops and bottleneck functions to the realities of the hardware
Remember: C++ code can make or break your game’s performance

Page 5:

Not Covering

Profiling (do it)
Multithreading (do it)
Memory allocation (avoid in game loop)
Compiler settings (experiment)
Exception handling (avoid it)

Page 6:

Topics for Today

Thinking about L2
  Optimize memory access
  Use CPU caches effectively
Thinking about in-order processing
  Avoid function call overhead
  Tips for efficient math
  Avoid hidden C++ inefficiencies

Page 7:

Optimize Memory Access

Proverb: thou shalt treat memory as if it were thy hard drive
You will be memory-bound on new consoles
Recommendations
  Never read from the same place twice in a frame
  Read data sequentially
  Write data sequentially
  Use everything you read

Page 8:

Minimize Data Passes

Game frame loops often access data twice
  Or three times
  Or more
Optimize for a single pass
Consider less frequent operations
  AI
  Physics, collision
  Networking
  Particle systems

[Figure: Multiple Pass Architecture]
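
The difference between a multi-pass and a single-pass frame loop can be sketched as follows. This is only an illustration: the Particle struct and both update functions are hypothetical and do not come from the talk. The first function walks the array twice per frame; the second fuses the two steps so each element is read once, in order.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical game object; the fields are illustrative only.
struct Particle {
    float pos;
    float vel;
    float age;
};

// Two passes: every Particle is pulled through the cache twice per frame.
void updateTwoPasses(std::vector<Particle>& ps, float dt) {
    for (std::size_t i = 0; i < ps.size(); ++i) ps[i].pos += ps[i].vel * dt;
    for (std::size_t i = 0; i < ps.size(); ++i) ps[i].age += dt;
}

// One pass: same result, but each Particle is touched once, sequentially.
void updateSinglePass(std::vector<Particle>& ps, float dt) {
    for (std::size_t i = 0; i < ps.size(); ++i) {
        Particle& p = ps[i];
        p.pos += p.vel * dt;
        p.age += dt;
    }
}
```

Both functions produce the same values; only the number of trips through memory differs.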

Page 9:

Pointer Aliasing Explained

void init( float *a, const float *b ) {
    a[0] = 1.0f - *b;
    a[1] = 1.0f - *b;
}

Nominal case: b points to separate storage holding 0.0
  a before: 0.0 0.0
  a after: 1.0 1.0

Worst case: b aliases a[0]

float a[2]={0.0f};
init( a, &a[0] );

  a before: 0.0 0.0
  a after: 1.0 0.0
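
The two cases can be reproduced with a small sketch. The init function is the one from the slide; nominalCase and aliasedCase are hypothetical wrappers added here for illustration. Because the compiler cannot rule out the aliased call, it has to reload *b after the first store.

```cpp
// Same function as on the slide: without restrict, the compiler must
// assume that writing a[0] may change *b, so *b is read again.
void init(float* a, const float* b) {
    a[0] = 1.0f - *b;
    a[1] = 1.0f - *b;
}

// Nominal case: b points at separate storage.
void nominalCase(float out[2]) {
    float b = 0.0f;
    out[0] = out[1] = 0.0f;
    init(out, &b);            // out becomes {1.0f, 1.0f}
}

// Worst case: b aliases a[0], so the first store changes the second read.
void aliasedCase(float out[2]) {
    out[0] = out[1] = 0.0f;
    init(out, &out[0]);       // out becomes {1.0f, 0.0f}
}
```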

Page 10:

A Solution: Restrict

Restrict keyword tells the compiler there’s no aliasing
Restrict permits the compiler to generate much more efficient code

void init( float* __restrict a,
           const float* __restrict b ) {
    a[0] = 1.0f - *b; // compiler can do
    a[1] = 1.0f - *b; // the right thing
}
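
The same promise pays off most in loops. The sketch below uses a hypothetical helper, scaleAndOffset, that is not part of the talk. It assumes a compiler that accepts the __restrict extension (MSVC, GCC and Clang do); the keyword is not standard C++, and passing overlapping buffers would be undefined behavior.

```cpp
#include <cstddef>

// Hypothetical helper: dst[i] = src[i] * (*scale) + (*offset).
// __restrict is a compiler extension, not standard C++. It is a promise
// from the caller that dst, src, scale and offset do not overlap.
void scaleAndOffset(float* __restrict dst,
                    const float* __restrict src,
                    const float* __restrict scale,
                    const float* __restrict offset,
                    std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
        // With the no-alias promise the compiler may keep *scale and *offset
        // in registers instead of reloading them after every store to dst.
        dst[i] = src[i] * (*scale) + (*offset);
    }
}
```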

Page 11:

What to Restrict

Use restrict widely
  Function pointer parameters
  Local pointers
  Pointers in structs/classes
But not:
  Function return types
  Casts
  Global pointers (maybe)
  References (maybe)

Page 12:

Use the CPU Caches Effectively

The L2 cache is your best friend
Using the cache well is an art
Ensure you have a good profiler by your side

Page 13:

Keep the Working Set Small

Pack commonly used data together
Frequently used data might deserve its own struct/class
Keep rarely used data separate
  Example: texture file names
Consider bitfields
  Bitfields are extremely efficient on PowerPC
Consider other forms of lossless compression
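
One way to picture the hot/cold split is the sketch below. All three structs are hypothetical; the field names and bit widths are assumptions chosen for illustration (15-bit dimensions limit textures to 32767 pixels per side). The rarely used file name moves to its own struct, and the flags share a word with the dimensions.

```cpp
#include <cstdint>

// Hypothetical "before": hot and cold fields mixed in one object.
struct FatTexture {
    char          fileName[64];   // cold: only needed when (re)loading
    std::uint16_t width;          // hot
    std::uint16_t height;         // hot
    bool          mipmapped;      // hot
    bool          compressed;     // hot
};

// Cold data lives in its own array, indexed by the same texture id.
struct TextureColdData {
    char fileName[64];
};

// Hot data only; the two flags share storage with the dimensions.
struct TextureHotData {
    std::uint32_t width      : 15;
    std::uint32_t height     : 15;
    std::uint32_t mipmapped  : 1;
    std::uint32_t compressed : 1;
};

static_assert(sizeof(TextureHotData) < sizeof(FatTexture),
              "hot data should be smaller than the combined struct");
```

An inner loop that only needs dimensions and flags now touches far fewer cache lines per texture.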

Page 14:

Inefficient Structs Are Bad Mojo

struct InefficientCar {
    bool manual;       // padding here
    wheel wheels[8];   // 8 wheels?
    bool convertible;  // more pad
    char engine;       // 4 bits used
    char file[32];     // rarely used
    double maxAccel;   // double?
};

sizeof(InefficientCar) = 80

Page 15:

Carefully Design Structures

struct EfficientCar {
    wheel wheels[4];         // 4 wheels
    wheel *moreWheels;
    char *file;              // stored elsewhere
    float maxAccel;          // float
    unsigned engine:4;       // bitfields
    unsigned manual:1;
    unsigned convertible:1;
};

sizeof(EfficientCar) = 32

Page 16:

Choose the Right Container

Prefer contiguous containers
  Or at least mostly contiguous
  Examples: array, vector, deque
Avoid node-based containers
  List, set/map, binary trees, hash tables
  If you must use a tree, consider a custom allocator for memory locality
Vector + std::sort is often faster (and smaller) than set or map or hash tables, by an order of magnitude
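
A minimal sketch of the vector-plus-sort idea, assuming a set of integer ids that is built once and then queried many times. The SortedIdSet class is hypothetical and not from the talk; it only uses standard algorithms.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical id set backed by a sorted, contiguous std::vector.
class SortedIdSet {
public:
    // Build once: sort, then drop duplicates.
    explicit SortedIdSet(std::vector<int> ids) : ids_(std::move(ids)) {
        std::sort(ids_.begin(), ids_.end());
        ids_.erase(std::unique(ids_.begin(), ids_.end()), ids_.end());
    }

    // O(log n) lookup over contiguous memory, no pointer chasing.
    bool contains(int id) const {
        return std::binary_search(ids_.begin(), ids_.end(), id);
    }

    std::size_t size() const { return ids_.size(); }

private:
    std::vector<int> ids_;
};
```

The trade-off is that inserting into the middle of a sorted vector is O(n), so this layout suits data that changes rarely relative to how often it is searched.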

Page 17:

Avoid Function Call Overhead

Function call overhead was a surprising cause of performance issues on Xbox
The same is true on Xbox 360 and PS3
Fortunately, there are lots of solutions
Research compiler settings. On Xbox 360:
  Inline “any suitable”
  Enable link-time code generation
Spend time ensuring the compiler is inlining the right things

Page 18:

Avoid Virtual Functions

Weigh the limitations of virtual functions
  Adds a branch instruction
  Branch is always mispredicted
  Compiler is limited in how it can optimize
Consider replacing
  virtual void Draw() = 0;
With
  Xbox360.cpp: void Draw() { ... }
  Windows.cpp: void Draw() { ... }
  PS3.cpp: void Draw() { ... }
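
The per-platform file layout can be approximated in a single translation unit as sketched below. Everything here is an assumption made for illustration: DrawBackendName stands in for the slide's Draw(), and the GAME_PLATFORM_* macros are invented names, not real SDK macros. In a real project the build system would compile exactly one of the three .cpp files instead of using the preprocessor.

```cpp
// Renderer.h (shared by all platforms): a plain function, no vtable.
// DrawBackendName is a hypothetical stand-in for the slide's Draw().
const char* DrawBackendName();

// In a real project each definition would sit in its own file
// (Xbox360.cpp, Windows.cpp, PS3.cpp) and the build would compile one.
// The GAME_PLATFORM_* macros are assumptions for this sketch.
#if defined(GAME_PLATFORM_XBOX360)
const char* DrawBackendName() { return "Xbox360"; }
#elif defined(GAME_PLATFORM_PS3)
const char* DrawBackendName() { return "PS3"; }
#else
const char* DrawBackendName() { return "Windows"; }   // default for the sketch
#endif
```

Because the target is fixed at build time, the call is an ordinary direct call that the compiler is free to inline.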

Page 19:

Maximize Leaf Functions

Leaf functions don’t call other functions, ever
If a potential leaf function calls another function, the high-level function:
  Is much less likely to be inlined
  Must set up a stack frame
  Must set up registers
Potential solutions
  Remove the inner function completely
  Inline the inner function
  Provide two versions of the outer function
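
The "two versions" option might look like the sketch below. The function names are hypothetical and the example is deliberately small: the checked version calls a reporting helper and so is not a leaf, while the leaf version computes the same result with no calls and is meant for inner loops.

```cpp
#include <cstdio>

// Hypothetical helper; calling it is what stops clampChecked
// from being a leaf function.
void reportOutOfRange(float v) {
    std::fprintf(stderr, "value out of range: %f\n", v);
}

// Checked version: calls another function on the slow path.
float clampChecked(float v, float lo, float hi) {
    if (v < lo) { reportOutOfRange(v); return lo; }
    if (v > hi) { reportOutOfRange(v); return hi; }
    return v;
}

// Leaf version for inner loops: same result, no calls.
inline float clampLeaf(float v, float lo, float hi) {
    const float t = (v < lo) ? lo : v;
    return (t > hi) ? hi : t;
}
```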

Page 20:

Unroll Inner Loops

Compiler can’t unroll loops where n is variable
Even unrolling from ++i to i+=4 can be a significant gain
  Eliminates three branch instructions
  Increases opportunity for code scheduling
Don’t forget to hoist invariants out, too
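
A minimal sketch of manual unrolling by four with the invariant hoisted, assuming a hypothetical applyGain function that is not from the talk. The second loop picks up the leftover elements when n is not a multiple of four.

```cpp
#include <cstddef>

// Hypothetical inner loop: out[i] = in[i] * (gain * master).
void applyGain(float* out, const float* in, std::size_t n,
               float gain, float master) {
    const float g = gain * master;      // invariant hoisted out of the loop

    std::size_t i = 0;
    // Main body: four elements per iteration, one loop branch instead of four.
    for (; i + 4 <= n; i += 4) {
        out[i + 0] = in[i + 0] * g;
        out[i + 1] = in[i + 1] * g;
        out[i + 2] = in[i + 2] * g;
        out[i + 3] = in[i + 3] * g;
    }
    // Remainder: handles n that is not a multiple of 4.
    for (; i < n; ++i) {
        out[i] = in[i] * g;
    }
}
```

Whether this beats what the compiler generates on its own is something to confirm with a profiler on the target hardware.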

Page 22:

Pass Native Types by Value

Tradition says that “large” types are passed by pointer or reference, but be careful
  New consoles have really large registers
Native types include
  64-bit int (__int64)
  VMX vector (__vector4) – 128 bits!
Pass structs by pointer or reference
  One exception: pass structs consisting of bitfields <= 64 bits by value

Page 23:

Know Data Type Performance

int32 and int64 have equivalent perf
float and double have equivalent perf
int8 and int16 are slower than int
  They generate extra instructions
  High bits cleared or sign-extended
  Example: int32 adds 2X faster than int16 adds
Recommendations
  Store as smallest type required
  Load into int32, int64 or double for calculations
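
The "store small, compute wide" recommendation can be sketched as below. The function name and the sample data are hypothetical; the point is only that the array stays compact in memory while the running total lives in a native-width integer.

```cpp
#include <cstddef>
#include <cstdint>

// Samples are stored as int16_t to keep the working set small.
// The accumulator is int32_t so the arithmetic happens at native width.
std::int32_t sumSamples(const std::int16_t* samples, std::size_t n) {
    std::int32_t total = 0;
    for (std::size_t i = 0; i < n; ++i) {
        total += samples[i];            // widened once, on load
    }
    return total;
}
```

A side benefit is that the wider accumulator can hold totals that would not fit in the storage type.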

Page 24:

Use Native Vector Types

In CS 101, you learned to create abstract data types, such as matrices
  typedef std::vector<float,4> vec;
  typedef std::vector<vec,4> matrix;
This code is an abomination
  At least on Xbox 360 and PS3
Xbox 360 and PS3 have dedicated vector math units called VMX units
  Use them!

Page 25:

Your Math Buddies

__vector4 (4 32-bit floats; 128-bit register)
XMVECTOR (typedef for __vector4)
XMMATRIX (array of 4 __vector4s)
XMVECTOR operators (+,-,*,/)
Hundreds of XMVECTOR and XMMATRIX functions
Xbox 360-specific, but similar constructs in PS3 compilers

Page 26:

Avoid Floating-Point Branches

FP branches are slow
  Cache has to be flushed
  ~10X slower than int branches
Avoid loops with float test expressions
Eliminate altogether if possible
  Can be faster to calculate values you won’t use!
Compare integers instead
Replace with fsel when possible
  10-20X performance gain

Page 27:

The fsel Option in Detail

Definition of hardware implementation:

float fsel(float a, float b, float c)
{
    // PowerPC fsel selects b when a is greater than or equal to zero
    return ( a >= 0.0f ) ? b : c;
}

You can replace expressions like
  v = ( w >= x ) ? y : z; // slow
With faster expressions like
  v = fsel( w - x, y, z ); // turbo
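
The select idea can be tried on any platform with a portable stand-in. In the sketch below, selectGeZero, maxBranch and maxSelect are hypothetical helpers, and the argument order of selectGeZero is this sketch's own convention (the first value is chosen when the test is non-negative); check your platform's intrinsic documentation for its actual operand order. The rewrite is intended for finite inputs, since a - b is NaN when both values are the same infinity.

```cpp
// Hypothetical portable stand-in for a hardware select:
// yields ifNonNegative when test >= 0, otherwise ifNegative.
// A compiler may lower this ternary to a select instruction rather than
// a branch, but that is not guaranteed.
inline float selectGeZero(float test, float ifNonNegative, float ifNegative) {
    return (test >= 0.0f) ? ifNonNegative : ifNegative;
}

// Branch-style maximum, for comparison.
inline float maxBranch(float a, float b) {
    if (a >= b) return a;
    return b;
}

// Select-style maximum: both candidates already exist, and the
// comparison is folded into the sign of (a - b).
inline float maxSelect(float a, float b) {
    return selectGeZero(a - b, a, b);
}
```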

Page 28:

Prefer Platform-Specific Funcs

The C runtime (CRT) is not usually the best option when performance matters
Xbox 360 examples
  Prefer CreateFile to fopen or C++ streams
    Options for asynchronous reads and other goodness
  Prefer XMemCpy to memcpy
    2-6X faster
  Prefer XMemSet to memset
    8-14X faster

Page 29:

Avoid Hidden C++ Inefficiencies

C++ rocks the house!
C++ can bring your game to its knees!
Consider these innocuous snippets
  Quaternion q;
  s.push_back( k );
  if( (float)i > f )
  obj->Draw();
  GameObject arr[1000];
  a = b + c;
  i++;

Page 30:

C++ is Dangerous

With power comes responsibility
Beware constructors
  Is initialization the right thing to do?
Beware hidden allocations
Conversion casts may have significant cost
Use virtual functions with care
Beware overloaded operators
Stick to known idioms
  Operator++ should be a constant-time operation. Really.
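
Hidden allocations are easy to observe with std::vector. The sketch below uses a hypothetical helper, countReallocations, that counts how often push_back has to grow the buffer, with and without an up-front reserve. The exact count without reserve depends on the standard library's growth policy, so no specific number is assumed.

```cpp
#include <cstddef>
#include <vector>

// Counts how many times push_back had to grow the buffer
// (each growth is a hidden allocation plus a copy of the elements).
std::size_t countReallocations(std::size_t n, bool reserveFirst) {
    std::vector<int> v;
    if (reserveFirst) v.reserve(n);     // one allocation, done up front

    std::size_t reallocations = 0;
    std::size_t lastCapacity = v.capacity();
    for (std::size_t i = 0; i < n; ++i) {
        v.push_back(static_cast<int>(i));
        if (v.capacity() != lastCapacity) {
            ++reallocations;
            lastCapacity = v.capacity();
        }
    }
    return reallocations;
}
```

With the reserve call the loop triggers no growth at all, which is the behavior you want inside a game loop.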

Page 31:

Summary

There absolutely are many things you can do to efficiently program next-gen consoles
Two key issues: L2/memory and in-order processing
Treat memory as you would a hard disk
Watch out for those branches; use tricks like fsel
Prefer a light C++ touch

Page 32:

What’s Next

Our games are only as good as the weakest member of the team
Share what you’ve learned
“The sharing of ideas allows us to stand on one another’s shoulders instead of on one another’s feet” – Jim Warren

Page 33:

Questions

[email protected]
Fill out your feedback forms