Feeding the Monster
Advanced Data Packaging for Consoles
Bruno Champoux
Nicolas Fleury

Outline
  - Next-Generation Loading Problems
  - LIP: A Loading Solution
  - Packaging C++ Objects
  - Demo: LIP Data Viewer
  - Questions

Some things never change…
  - Optical Disc
  - RAM
  - ...since 1992!

Next-Gen Loading Problem
  - Processing power up by 10-40X
  - Memory size up by 8-16X
  - Optical drive performance up by 1-4X

Next-Gen Loading Problem
  - Xbox 360
    - 12X dual-layer DVD drive
    - Outer edge speed: 15 MB/s
    - Average seek: 115 ms
  - PlayStation 3
    - Blu-ray performance still unknown
    - 1.5X is the most likely choice
    - CAV drive should give 6 to 16 MB/s
    - Average seek time might be worse than DVD

Next-Gen Loading Problem
  Platform   Memory Size   Maximum Bandwidth   Time to fill
  PS2        32 MB         4.4 MB/s            7.3 s
  PS3        192 MB*       16 MB/s             12 s
  Xbox       64 MB         6.9 MB/s            9.3 s
  Xbox 360   480 MB**      15 MB/s             32 s

Next-Gen Loading Problem
  - In order to feed the next-gen data needs, loading will need to be more frequent
  - Hard drives are optional for PS3 and Xbox 360
  - Optical drive performance does not scale with the memory/CPU power increase
  - Conclusion: Loading performance must be optimal; any processing other than raw disc transfers must be eliminated

Did I Hear “Loading Screen”?
  - Disruptive
  - Boring as hell
  - Non-skippable cutscenes are not better!
  - Conclusion: Loading screens must not survive the current generation

Background Loading
  - Game assets are loaded during gameplay
  - Player immersion is preserved
  - Solution: Use blocking I/O in a thread or another processor

Background Loading
  - Requirements:
    - Cannot be much slower than a loading screen
    - Must have low CPU overhead
    - Must not block other streams
  - Conclusion: Once again, loading performance must be optimal; any processing other than raw disc transfers must be eliminated

Proposing A Solution
  - Requirements for a next-generation packaging and loading solution:
    - Large amounts of assets must be loaded at speeds nearing the hardware transfer limit
    - Background loading must be possible at little CPU cost
    - Data assets must be streamed in and phased out without causing memory fragmentation

Understanding Loading Times
  - Freeing memory space
    - Unloading
    - Defragmenting
  - Seek time
  - Read time
  - Allocations
  - Parsing
  - Relocation (pointers, hash ID lookups)
  - Registration (e.g. physics system)

Reducing Loading Times
  - Always load compressed files
    - Using N:1 compression will load N times faster
    - Double-buffering hides decompression time
    - Plenty of processing power available for decompression on next-gen consoles

Reducing Loading Times
  - Compression algorithm choice
    - Favor incremental approach
    - Use an algorithm based on bytes, not bits
    - Lempel-Ziv family
    - LZO

Reducing Loading Times
  - Take advantage of spatial and game flow coherence
    - Batch related data together in one file to save seek time
    - Place related files next to each other on the disc to minimize seek time

Reducing Loading Times
  - Take advantage of optical disc features
    - Store frequently accessed data in the outer section of the disc
    - Store music streams in the middle (prevents full seek)
    - Store single-use data near the center (videos, cutscenes, engine executable)
    - Beware of layer switching (0.1 seconds penalty)

Reducing Loading Times
  - Use the “flyweight” design pattern
    - Geometry instancing
    - Animation sharing
  - Favor procedural techniques
    - Parametric surfaces
    - Textures
      (fire, smoke, water)

Reducing Loading Times
  - Always prepare data offline
    - Eliminate text or intermediate format parsing in the engine
    - Engine time spent converting or interpreting data is wasted
    - Load native hardware and middleware formats
    - Load C++ objects directly

Why Load C++ Objects?
  - More natural way to work with data
  - Removes any need for parsing or interpreting assets
  - Creation is inexpensive
    - Pointer relocation
    - Hash ID conversion
    - Object registration

Loading C++ Objects
  - Requires a very smart packaging system
    - Member pointers
    - Virtual tables
    - Base classes
    - Alignment issues
    - Endianness

Loading Non-C++ Objects
  - Must be in a format that is ready to use after being read to memory
    - Texture/normal maps
    - Havok structures
    - Audio
    - Script bytecode
  - Pretty straightforward

Load-In-Place (LIP)
  - Our solution for packaging and loading game assets
  - Framework for defining, storing and loading native C++ objects
  - Dynamic Storage: a self-defragmenting game asset container
  [Diagram: Maya Exporter, Level Editor, LIP Generator, LIP Packaging, Game Assets, LIP Loading, Dynamic Storage, Engine]

Load-In-Place: “LIP Item”
  - 1 LIP item = 1 game asset
  - 1 LIP item = unique hash ID (64-bit)
    - 32 bits for the type ID and properties
    - 32 bits for the hashed asset name (CRC-32)
  - The smallest unit of data that can be
    - queried
    - moved by defragmentation
    - unloaded
  - Supports both C++ objects and binary blocks

Examples of LIP Items
  - Joint Animation
  - Character Model
  - Environment Model Section
  - Collision Floor Section
  - Game Object (hero, enemy, trigger, etc.)
  - Script
  - Particle Emitter
  - Texture

C++-Based LIP Items
  - Can be made of any number of C++ objects and arrays
  - On the disc, all internal pointers are kept relative to the LIP item block
  - Pointer relocation starts with a placement new on a “relocation constructor”
  - Internal pointers are relocated automatically through “constructor chaining”

Placement “new” Operator
  - Syntax

    new(<address>) <type>;

  - Calls the constructor but does not allocate memory
  - Initializes the virtual table
  - Called once for each LIP item on the main class relocation constructor

Relocation Constructors
  - Required by all classes and structures that
    - can get loaded by the LIP framework
    - contain members that require relocation
  - 3 constructors
    - Loading relocation constructor
    - Moving relocation constructor (defragmentation)
    - Dynamic constructor (optional, can be dummy)
  - No default constructor!

Object Members Relocation
  - Internal pointer
    - Must point within the LIP item block
    - Converted into absolute pointer
  - External reference (LIP items only)
    - Stored as a LIP item hash ID
    - Converted into a pointer in the global asset table entry that points to the referenced LIP item
  - LIP framework provides wrapper classes with appropriate constructors for all pointer types

Relocation Example

    class GameObject
    {
    public:
        GameObject(const LoadContext& ctx);
        GameObject(const MoveContext& ctx);
        GameObject(HASHID id, Script* pScript);
    protected:
        lip::RelocPtr<Transfo> mpLocation;
        lip::LipItemPtr<Script> mpScript;
    };

Relocation Example (cont’d)

    GameObject::GameObject(const LoadContext& ctx)
        : mpLocation(ctx), mpScript(ctx) {}

    GameObject::GameObject(const MoveContext& ctx)
        : mpLocation(ctx), mpScript(ctx) {}

    GameObject::GameObject(HASHID id, Script* pScript)
        : mpLocation(new Transfo), mpScript(pScript)
    {
        SetHashId(id);
    }

Relocation Example (cont’d)

    template<typename LipItemT>
    void PlacementNew(lip::LoadContext& loadCtx)
    {
        new(loadCtx.pvBaseAddr) LipItemT(loadCtx);
    }

    loadCtx.pvBaseAddr = pvLoadMemory;
    PlacementNew<GameObject>(loadCtx);

Relocation Example
(cont’d)
  [Diagram: Placement new, GameObject, Constructors, RelocPtr<Transfo>, LipItemPtr<Script>, Placement new, hash ID lookup, Transfo, Script, Constructors]

Load-In-Place: “Load Unit”
  - Group of LIP items
  - The smallest unit of data that can be loaded
  - 1 load unit = 1 load command
  - Number of files is minimized
    - 1 language-independent file
      Models, animations, scripts, environments, …
    - N language-dependent files
      Fonts, in-game text, some textures, audio, …
  - Load unit files are compressed

Load Unit Table
  - Each LIP item has an entry in the table
    - Hash ID
    - Offset to LIP Item
  [Diagram: Table, LIP items]

Dynamic Storage
  - Loading process
    - Load unit files are read and decompressed to available storage memory
    - Load unit table offsets are relocated
    - Load unit table entries are merged in the global asset table
    - A placement new is called for each LIP item
    - Some LIP item types may require a second initialization pass (e.g. registration)

Dynamic Storage
  - Unloading process
    - Each LIP item can be removed individually
    - All LIP items of a load unit can be removed together
    - Destructors are called on C++ LIP items
    - Dynamic storage algorithm will defragment the new holes later
  - Locking
    - LIP items can be locked
    - Locked items cannot be moved or unloaded

Platform-Specific Issues
  - GameCube
    - Special ARAM load unit files
      - Animations
      - Collision floors
    - Small disc compression
  - Xbox/Xbox 360
    - Special LIP items for DirectX buffers
      - Vertex, index and texture buffers
      - 4KB-aligned LIP items (binary blocks)
      - Buffer headers in separate LIP items (C++ objects)

Load-In-Place: Other Uses
  - Network-based asset editing
    - LIP items can be transferred from and to our level editor during gameplay
    - Changes in asset sizes do not matter
  - Used by Maya exporters to store our intermediate art assets
  - LIP is much more efficient than parsing XML!
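As a rough illustration of the load unit table and global asset table described in this first part, the sketch below builds 64-bit hash IDs from a type ID and a CRC-32 of the asset name, relocates load unit table offsets against the load address, and merges the entries into a global table. All names here (`Crc32`, `MakeHashId`, `LoadUnitEntry`, `GlobalAssetTable`, `MergeLoadUnit`, `FindAsset`) are hypothetical, and putting the type ID in the upper 32 bits is an assumption: the slides only say that each half is 32 bits wide. A `std::unordered_map` stands in for whatever table structure the real framework uses.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>

using HashId = std::uint64_t;

// Standard reflected CRC-32 (polynomial 0xEDB88320), computed bit by bit.
std::uint32_t Crc32(const char* text) {
    std::uint32_t crc = 0xFFFFFFFFu;
    for (; *text != '\0'; ++text) {
        crc ^= static_cast<unsigned char>(*text);
        for (int bit = 0; bit < 8; ++bit) {
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
        }
    }
    return ~crc;
}

// Assumption: type ID and properties in the upper half, hashed name in the lower half.
HashId MakeHashId(std::uint32_t typeAndProps, const char* assetName) {
    return (static_cast<HashId>(typeAndProps) << 32) | Crc32(assetName);
}

// One load unit table entry as stored on disc: hash ID plus an offset that is
// relative to the start of the load unit.
struct LoadUnitEntry {
    HashId id;
    std::uint32_t offset;
};

using GlobalAssetTable = std::unordered_map<HashId, void*>;

// Relocate each offset against the load address and merge it into the global table.
void MergeLoadUnit(GlobalAssetTable& table, unsigned char* loadAddress,
                   const LoadUnitEntry* entries, std::size_t count) {
    for (std::size_t i = 0; i < count; ++i) {
        table[entries[i].id] = loadAddress + entries[i].offset;
    }
}

// Query a LIP item by hash ID; returns nullptr when it is not loaded.
void* FindAsset(const GlobalAssetTable& table, HashId id) {
    auto it = table.find(id);
    return it == table.end() ? nullptr : it->second;
}
```

In this sketch an external reference stored as a hash ID would be resolved with `FindAsset` once the load unit has been merged; placement new on each LIP item would follow as a separate step.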
Packaging C++ Objects
Nicolas Fleury

Our Previous Python-Based System

    class MyClass(LipObject):
        x = Member(UInt32)
        y = Member(UInt32, default=1)
        p = Member(makeArrayType(makePtrType(SomeClass)))

Cool Things with this System
  - Not too complex to implement.
  - Python is easy to use.
  - Introspection support.
  - A lot of freedom in corresponding C++ classes.

Problems with this System
  - Python and C++ structures must be synchronized.
  - Exporters must be written, at least partly, in Python.
  - Validations limited (unless you parse C++ code).
  - We just invented a Python/C++ hybrid.

C++-based system

    class MyClass : public MyBaseCls
    {
        ...
        LIP_DECLARE_REGISTER(MyClass);
        uint32 x;
    };

    // In .cpp
    LIP_DEFINE_REGISTER(MyClass)
    {
        LIP_REGISTER_BASE_CLASS(MyBaseCls);
        LIP_REGISTER_MEMBER(x);
    }

Consequences
  - Exporters are now written in C++.
  - Class content written twice, but synchronization fully validated.
  - Dummy engine .DLL must be compiled (not a working engine, provides only reflection/introspection).
  - Need a good build system.
  - We just added reflection/introspection to C++.

Member Registration Information
  - Name
  - Offset
  - Type
  - Special flags (exposed in level editor, etc.)

(Non Empty) Base Class Registration Information
  - Name
  - Type
  - Offset; calculated with:

    (size_t)(BaseClassType*)(SubClassType*)1 - 1

Member Type Deduction
  - In IntrospectorBase class:

    template <typename TypeT, typename MemberT>
    void RegisterMember(
        const char* name,
        MemberT(TypeT::*memberPtr));

Member Type Deduction (Arrays)
  - In IntrospectorBase class:

    template <typename TypeT, typename MemberTypeT, int sizeT>
    void RegisterMember(
        const char* name,
        MemberTypeT(TypeT::*memberPtr)[sizeT]);

Needed Information in Tools
  - Every class's base classes (to write their members too).
  - Every class member.
  - Every base class offset (to detect missing base class registration).
  - Every member name, size, type and special flags.
  - For every type, the necessary alignment and whether it is polymorphic.
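The member type deduction and base class offset ideas above can be sketched in standard C++ as follows. This is only an illustration under stated assumptions: `MemberRegistry`, `MemberRecord`, `OffsetOf`, `BaseOffset` and the sample structs are hypothetical names, not the LIP API. Offsets are measured on a live, default-constructed object instead of using the slide's cast-from-1 expression, so the sketch stays within defined behaviour (the real LIP classes have no default constructor, so this is a simplification).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical record of what registration learns about one member.
struct MemberRecord {
    std::string name;
    std::size_t offset;
    std::size_t elementSize;
    std::size_t count;  // 1 for scalars, array length for fixed arrays
};

// Measure a member's byte offset on a temporary instance.
template <typename TypeT, typename MemberT>
std::size_t OffsetOf(MemberT TypeT::*memberPtr) {
    TypeT instance{};
    const char* base = reinterpret_cast<const char*>(&instance);
    const char* member = reinterpret_cast<const char*>(&(instance.*memberPtr));
    return static_cast<std::size_t>(member - base);
}

// Measure where a base class subobject sits inside a derived object.
template <typename SubClassT, typename BaseClassT>
std::size_t BaseOffset() {
    SubClassT instance{};
    const char* sub = reinterpret_cast<const char*>(&instance);
    const char* base =
        reinterpret_cast<const char*>(static_cast<BaseClassT*>(&instance));
    return static_cast<std::size_t>(base - sub);
}

class MemberRegistry {
public:
    // Scalar members: MemberT is deduced from the pointer-to-member.
    template <typename TypeT, typename MemberT>
    void RegisterMember(const char* name, MemberT TypeT::*memberPtr) {
        mMembers.push_back({name, OffsetOf(memberPtr), sizeof(MemberT), 1});
    }

    // Fixed-size arrays: this more specialized overload also deduces the length.
    template <typename TypeT, typename MemberT, std::size_t sizeT>
    void RegisterMember(const char* name, MemberT (TypeT::*memberPtr)[sizeT]) {
        mMembers.push_back({name, OffsetOf(memberPtr), sizeof(MemberT), sizeT});
    }

    const std::vector<MemberRecord>& Members() const { return mMembers; }

private:
    std::vector<MemberRecord> mMembers;
};

// Sample types used only to exercise the sketch.
struct Sample {
    std::uint32_t x;
    std::uint16_t arr[4];
};
struct FirstBase { std::uint32_t a; };
struct SecondBase { std::uint32_t b; };
struct Derived : FirstBase, SecondBase { std::uint32_t d; };
```

Calling `registry.RegisterMember("arr", &Sample::arr)` selects the array overload through partial ordering, so the record holds the element size and the count separately, which is the information the tools list above asks for.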
Introspection Classes (Members)
  [Class diagram: IntrospectorBase (1) to MemberInfoBase (*); Introspector<TypeT>; MemberInfoTypedBase<TypeT>; MemberInfo<TypeT, MemberTypeT>; ArrayMemberInfo<TypeT, MemberTypeT, sizeT:int>]

Introspection Classes (Base Classes)
  [Class diagram: IntrospectorBase (1) to BaseClassInfoBase (*); Introspector<TypeT>; BaseClassTypedInfoBase<SubClassTypeT>; BaseClassInfo<SubClassTypeT, BaseClassTypeT>]

Result: Member Introspection
  - Able to know all types and their members.
  - Can be used for both writing and reading binary data.
  - Same class used in tools to fill the data as in engine.

LipViewer
  - Data of any platform, endianness, pointer size (binary files have a header with platform id).
  - Both for engine data and tools binary formats.
  - Hexadecimal viewer integration, editing support.
  - Excellent learning and debugging tool.

LipViewer Demo

Restrictions for Simplification of Implementation
  - Polymorphic types must begin with a vtable pointer (their first non-empty base class must be polymorphic).
  - Can’t inherit twice from the same class indirectly (or the offset trick doesn’t work).
  - No virtual base classes.
  - All padding is explicit.

Explicit Padding

    class MyClass
    {
        ...
        LIP_PADDING(mP1, LIP_PS3(12));
        uint16 mSomeMember;
        lip::Padding<4> mP2;
        uint32 mSomeOtherMember;
        LIP_PADDING(mP3,LIP_PC(4) LIP_PS3(8));
        ...
    };

Particular Things to Handle
  - Endianness.
  - 64-bit pointers (no more!).
  - VTable padding.
  - Type alignment.

VTable Padding

    __declspec(align(16)) class Matrix {…};

    class MyClass
    {
        uint32 x, y, z;
        Matrix m;
    };

  [Diagram: two memory layouts of MyClass, each showing vtable, x, y, z, m]
  - 32 bytes on PS3, 48 bytes on PC.

Automatic Versioning
  - Create a huge string with member names/types and member names/types of pointed classes.
  - In the case of polymorphic pointers, all sub-types must also be included.
  - Hash the huge string.
  - Can be integrated into the tools' dependency tree mechanism.

Needed Information in Engine
  - A hash map of objects to do the placement new of the appropriate type.
  - Smart pointers/arrays handle the rest.

Type Ids
  - VTable pointers are replaced by a type id.
  - LIP_DECLARE_TYPE_ID(MyClass, id) in .hpp.
    - Defines a compile-time mechanism to get id.
    - Declares a global object.
  - LIP_DEFINE_TYPE_ID(MyClass) in .cpp.
    - Defines the global object. Its constructor adds itself as a hash node to a hash map.
    - This object class is templated to perform operations with the right type (example: placement new).

Hash Map Overview
  [Class diagram: TypeManager (1) to TypeConstructorBase (*)]
  - TypeManager
    +PolymorphicPlacementNew()
    +PolymorphicMovePlacementNew()
    +GetTypeSize()
    +StreamObject()
    +AddTypeConstructor()
  - TypeConstructorBase
    +mVtableId
    +mpNext : TypeConstructorBase
    +PlacementNew()
    +MovePlacementNew()
    +GetTypeSize()
    +StreamObject()
  - TypeConstructorBase, templated on TypeT
    +PlacementNew()
    +MovePlacementNew()
    +GetTypeSize()
    +StreamObject()

Pointers
  - Normal pointers like T* must be set to 0 when exporting an object.
  - For relocated pointers, smart pointer classes must be used.
  - Different types of smart pointers/arrays:
    - Ownership of pointed data?
    - Relocation of pointed data?

Smart Members
  - Classes can derive from a lip::SmartMember class to implement custom writing/reading in tools.
  - The class is only used as a tag; it doesn’t have any virtual function.
  - Classes deriving from lip::SmartMember are expected to implement a compile-time interface.
  - Useful for smart pointers (normal pointers cannot be load-in-placed).

Smart Members: Full Control Over Writing and Reading.

    class MySmartArray : lip::SmartMember
    {
    public:
        void Write(lip::LipWriter&) const;
        void WriteExternalData(lip::LipWriter&) const;
        void Stream(lip::LipReader&);
        void StreamExternalData(lip::LipReader&);
    };

WeakRelocPtr<Type>
  [Diagram: Parent class, mpPtr, not owned pointed data]

RelocPtr<Type>
  - Relocation assumes pointed data is not of a sub-type and directly does a placement new of Type.
  [Diagram: Parent class, mpPtr]

RelocPolymorphicPtr<Type>
  - Relocation looks in the hash map to do a placement new of the right type (involves a search and a virtual function call).
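The type-id hash map and the polymorphic placement new it enables might look roughly like the sketch below. Everything here is an assumption made for illustration: `LipObject`, `Shape`, `Circle`, `TypeConstructor`, `LoadPolymorphic`, `DemoLoadName` and the numeric ids are hypothetical, the type id is assumed to be a 32-bit value at the start of the block, and a `std::unordered_map` replaces the intrusive hash nodes (`mpNext`) shown on the slide.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <new>
#include <unordered_map>

struct LoadContext {};  // hypothetical stand-in for lip::LoadContext

// Hypothetical common base so the registry can hand back a typed pointer.
struct LipObject {
    virtual ~LipObject() = default;
    virtual const char* Name() const = 0;
};
struct Shape : LipObject {
    explicit Shape(const LoadContext&) {}
    const char* Name() const override { return "Shape"; }
};
struct Circle : Shape {
    explicit Circle(const LoadContext& ctx) : Shape(ctx) {}
    const char* Name() const override { return "Circle"; }
};

class TypeConstructorBase {
public:
    virtual ~TypeConstructorBase() = default;
    virtual LipObject* PlacementNew(void* addr, const LoadContext& ctx) const = 0;
    virtual std::size_t GetTypeSize() const = 0;
};

class TypeManager {
public:
    static TypeManager& Instance() {
        static TypeManager manager;  // function-local static: safe during static init
        return manager;
    }
    void AddTypeConstructor(std::uint32_t id, const TypeConstructorBase* ctor) {
        mConstructors[id] = ctor;
    }
    // One search plus one virtual call, as described for RelocPolymorphicPtr.
    LipObject* PolymorphicPlacementNew(std::uint32_t id, void* addr,
                                       const LoadContext& ctx) const {
        auto it = mConstructors.find(id);
        return it == mConstructors.end() ? nullptr : it->second->PlacementNew(addr, ctx);
    }
private:
    std::unordered_map<std::uint32_t, const TypeConstructorBase*> mConstructors;
};

// Templated so that the placement new is done with the right concrete type.
template <typename TypeT>
class TypeConstructor : public TypeConstructorBase {
public:
    explicit TypeConstructor(std::uint32_t id) {
        TypeManager::Instance().AddTypeConstructor(id, this);  // self-registration
    }
    LipObject* PlacementNew(void* addr, const LoadContext& ctx) const override {
        return new (addr) TypeT(ctx);
    }
    std::size_t GetTypeSize() const override { return sizeof(TypeT); }
};

// Global objects, in the spirit of LIP_DEFINE_TYPE_ID (ids chosen arbitrarily).
static const TypeConstructor<Shape> gShapeType(1);
static const TypeConstructor<Circle> gCircleType(2);

// Read the type id stored where the vtable pointer will go, then construct in place.
LipObject* LoadPolymorphic(void* block, const LoadContext& ctx) {
    std::uint32_t id = 0;
    std::memcpy(&id, block, sizeof id);
    return TypeManager::Instance().PolymorphicPlacementNew(id, block, ctx);
}

// Small driver: writes a type id into a block, loads it, reports the dynamic type.
const char* DemoLoadName(std::uint32_t typeId) {
    alignas(std::max_align_t) unsigned char block[64];
    static_assert(sizeof(Circle) <= sizeof block, "block too small for the demo types");
    std::memcpy(block, &typeId, sizeof typeId);
    LoadContext ctx;
    LipObject* object = LoadPolymorphic(block, ctx);
    if (object == nullptr) return "unknown";
    const char* name = object->Name();
    object->~LipObject();
    return name;
}
```

The placement new overwrites the stored id with the real vtable pointer, which is why the id has to be read before construction in this sketch.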
  [Diagram: Parent class, mpPtr]

RelocFixedArray<Type, size>
  [Diagram: Parent class, elements [0] [1] [2] [3]]

WeakRelocArray
  [Diagram: Parent class, mpPtr, muiCount, owned pointed array (not needing relocation)]

RelocArray<Type>
  [Diagram: Parent class, mpPtr, muiCount, owned pointed array (with relocation)]

RelocPolymorphicArray<Type>
  [Diagram: Parent class, mppPtr, muiCount, owned array of pointers, owned array of objects (can be of different sub-types)]

RelocWeakPtrArray<Type>
  [Diagram: Parent class, mppPtr, muiCount, owned array of pointers, not owned pointed objects (can be of different sub-types)]

BinaryBlockPtr<alignment=4>
  [Diagram: Parent class, mpPtr, owned pointed data]

Enums
  - Exclusivity group masks let values within a mask be grouped as a radio-button group in the GUI.

    LIP_REGISTER_ENUM(MyEnum)
    {
        LIP_DEFINE_ENUM_VALUE(eNO);
        LIP_DEFINE_ENUM_VALUE(eYES);
    }

Other Solutions
  - Parse debug info files.
  - Compile as C++/CLI.
  - Parse source code.

Questions?

Links
  - Latest slides
    http://www.a2m.com/gdc/
  - How to reach us
    bruno.champoux@a2m.com
    nicolas.fleury@a2m.com