Multi-User Virtual Worlds Accomplishments

Multi-user Extensible
Virtual Worlds
Increasing the complexity of objects and interactions along with world size, number of users, number of objects, and types of interactions.
Sheldon Brown, Site Director CHMPR, UCSD
Daniel Tracy, Programmer, Experimental Game Lab
Erik Hill, Programmer, Experimental Game Lab
Todd Margolis, Technical Director, CRCA
Kristen Kho, Programmer, Experimental Game Lab
Current schemes using compute clusters break virtual worlds into small "shards," each with only a few dozen interacting objects. Compute systems with large amounts of coherent addressable memory alleviate cluster-node jumping and can create worlds with several orders of magnitude higher data complexity: tens of thousands of entities versus dozens per shard. This takes advantage of hybrid compute techniques for richer object dynamics.
Central server manages world state
changes
Number of clients and amount of activity
determines world size and shape
City road schemes are computed for
each player when they enter a new city,
using Hybrid multicore compute
accelerators
Each player has several views of the world:
• Partial view of one city
• Total view of one city
• Partial view of two cities
• View of entire globe
Within a city are several thousand objects. The dynamics of these objects are computed on the best available resource, balancing computability and coherency and alleviating world sharding.
Many classes of computing devices are used:
• z10 mainframe: transaction processing, state management
• Server-side compute accelerators: NVIDIA Tesla, Cell processor, and x86
• Multi-core portable devices (e.g., Snapdragon-based cell phones)
• Varied desktop computation, including hybrid multicore
• Computing cloud data storage
Multiple 10 Gb interfaces to compute accelerators, storage clusters, and the compute cloud.
Cell Processor, x86, and GPU compute accelerators for asset transformation, physics, and behaviors.
Server services are distributed across cloud clusters and redistributed across clients as performance or local work necessitates. Coherency with the overall system is maintained by a centralized server. Virtual world components have dynamic tolerance levels for incoherency and latency.
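As a rough illustration of such per-component tolerance levels, here is a minimal Python sketch (all names and thresholds are invented for this example): the server broadcasts an update only when a value drifts past a component's incoherency tolerance, or when its last broadcast is staler than the component allows.

```python
class ReplicatedValue:
    """Server-side value replicated to clients. Each component declares
    how much incoherency (drift) and staleness (latency) it tolerates."""
    def __init__(self, value, drift_tolerance, max_staleness):
        self.value = value
        self.sent_value = value         # last value broadcast to clients
        self.last_sent = 0.0            # time of last broadcast
        self.drift_tolerance = drift_tolerance
        self.max_staleness = max_staleness

    def needs_update(self, now):
        drift = abs(self.value - self.sent_value)
        stale = now - self.last_sent
        return drift > self.drift_tolerance or stale > self.max_staleness

    def mark_sent(self, now):
        self.sent_value = self.value
        self.last_sent = now

# A fast-moving physics object tolerates little drift; background scenery
# tolerates a lot, so it generates far fewer network updates.
physics = ReplicatedValue(0.0, drift_tolerance=0.1, max_staleness=0.05)
scenery = ReplicatedValue(0.0, drift_tolerance=5.0, max_staleness=10.0)
physics.value = 0.5   # both moved half a unit this frame
scenery.value = 0.5
```

Only the physics object would be broadcast this frame; the scenery update is deferred until its drift or staleness budget runs out.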
Development Server Framework 5/2010
• 2 QS22 blades: 4 Cell Processors
• 2 HS22 blades: 4 Xeons
• 4 QS20 blades
• NVIDIA Tesla accelerator: 4 GPUs on a Linux host, external dual PCI connection
• 3 10 Gb interfaces to compute accelerators
• 1 10 Gb interface to the internet
• Many clients
• Z10 mainframe computer at the San Diego Supercomputer Center:
  – 2 IFLs with 128 MB RAM, z/VM virtual OS manager with Linux guests
  – 6 TB of fast local storage (15K disks)
  – 4 SR and 2 LR 10 Gb Ethernet interfaces
SDSC View
Multi-user Extensible
Virtual Worlds
Producing a multi-user
networked virtual world from a
single-player environment
Goals
• Feasibility
  – Transformation from a single-player program to client/server multi-player networking is non-trivial
  – A structured methodology for the transformation is required
• Scalability
  – Support large environments, massively multi-player
  – After a working version, iteratively tackle bottlenecks
• Multi-platform server
  – Explore z10, x86, Cell BE, Tesla accelerators
  – Cross-platform communication required
• Evaluate "drop in" solutions
  – Benefits and liabilities of client/server-side schemes such as OpenSim and Darkstar
The (Original) Scalable City
Technology Infrastructure
OpenGL
Direct3D
Ogre3D
Custom virtual reality engine
real-time 3D rendering engine
ERSATZ
ODE, Newton
Open source physics libraries
fmod
Sound library
Intel OpenCV
Real time computer vision
CGAL
Computational Geometry
Library
Autodesk Maya, 3DMax
Procedural assets creation
through our own plug-ins
Loki, Xerces, Boost
Utilities Libraries
Chromium, DMX, Sage
Distributed rendering Libraries
NVIDIA FX Composer, ATI RenderMonkey
IDEs for HLSL and GLSL, GPU programming
Serial pipeline: increase performance by increasing CPU speed.
Moore’s law computational gains
have not been achievable via
faster clock speeds for the
past 8 years.
Multicore computing is the tactic:
• New computing architectures
• New algorithmic methods
• New software engineering
• New systems designs
NVIDIA Fermi GPGPU: 16 units with 32 cores each
IBM System z processor: 4 cores, 1 service processor
Sony/Toshiba/IBM Cell BE Processor: 1 PPU, 8 SPUs per chip
Intel Larrabee Processor: 32 x86 cores per chip
The Scalable City Next Stage Technology Infrastructure
Cell Processors compute Dynamic Assets
Intel OpenCV: real-time computer vision
ERSATZ ENGINE
Computational Geometry Library
Input Data → Data Parallel (n threads + SIMD) → Thread Barrier → Output Data

Abstract physics to use multiple physics libraries (ODE, Bullet, etc.); replace computational bottlenecks in these libraries with data-parallel operations.

Fmod Sound library

Convert assets to data-parallel meshes after the physics transformation; boosts rendering ~33%.
Ogre3D Scene graph
Open Source Libraries – needs work for adding data level parallelism
The Scalable City Next Stage Technology Infrastructure
Cell Processors compute Dynamic Assets
Intel OpenCV: real-time computer vision
ERSATZ ENGINE
Computational Geometry Library
Input Data → DarkStar Server → Data Parallel (n threads + SIMD) → Thread Barrier → Output Data

Abstract physics to use multiple physics libraries (ODE, Bullet, etc.); replace computational bottlenecks in these libraries with data-parallel operations.

Fmod Sound library

Convert assets to data-parallel meshes after the physics transformation; boosts rendering ~33%.
Ogre3D Scene graph
Maxes out at about 12 clients for a world as complex as Scalable City
Open Sim
Server
Real Xtend
or Linden
Client
ERSATZ
ENGINE
These systems are not designed for the interaction of tens of thousands of dynamic objects; even a handful of complex objects overloads the dynamics computation. Extensive re-engineering is needed to provide this capability and to use the hybrid multicore infrastructure, defeating their general-purpose platform.
Challenges & Approach
• Software Engineering Challenges:
  – SC: large, complex, with many behaviors.
  – Code consisted of tightly coupled systems not conducive to separation into client and server.
  – Multi-user support takes time, and features will be expanded by others simultaneously!
• Basic Approach - Agile methodology:
  – Incrementally evolve single-user code into a system that can be trivially made multi-user in the final step.
  – Always have a running and testable program.
  – Test for unwanted behavioral changes at each step.
  – Allows others to expand features simultaneously.
Step by Step Conversion
1. Data-structure focused: is it client or server?
   – Some data structures may have to be split.
Data Structures (from the system diagram): Landscape Manager, BlackBoard (Singleton), Rendering, Clouds, Physics, Player, Inverse Kinematics, Camera, House Piece, Audio, Road Animation, House Lots, User Input, Visual Component, MeshHandler
Abstracting Client & Server
Object Representations
• Server: Visual Component
– Visual asset representation on the server side
– Consolidates task of updating clients
– Used for house pieces, cyclones, landscape, roads, fences, trees,
signs (animated, static, dynamic).
– Dynamic, run-time properties control update behavior
• Client: Mesh
– Mesh properties communicated from Visual Component
– Used to select rendering algorithm
– Groups assets per city for quick de-allocation
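A minimal sketch of this server/client split (class and field names are invented for illustration, not taken from the Scalable City code): the server-side Visual Component batches property changes and flushes them as one update, which the client-side Mesh consumes.

```python
class VisualComponent:
    """Server-side representation of a visual asset. It consolidates the
    task of updating clients: dirty properties are batched per tick."""
    def __init__(self, asset_id, city, dynamic=False):
        self.asset_id, self.city, self.dynamic = asset_id, city, dynamic
        self._dirty = {}

    def set_property(self, name, value):
        self._dirty[name] = value          # record change for the next flush

    def flush_updates(self):
        """Return one message describing all changes since the last flush."""
        msg, self._dirty = dict(self._dirty), {}
        return {"asset": self.asset_id, "city": self.city, "props": msg}

class Mesh:
    """Client-side mesh. Properties arrive from the Visual Component and
    would select the rendering algorithm; meshes carry their city so a
    whole city's assets can be de-allocated quickly."""
    def __init__(self, update):
        self.asset_id = update["asset"]
        self.city = update["city"]
        self.props = update["props"]

vc = VisualComponent("house_piece_7", city="city_3", dynamic=True)
vc.set_property("position", (1.0, 0.0, 2.0))
mesh = Mesh(vc.flush_updates())
```

Batching the dirty set means a house piece that moves ten times between flushes costs one client update, not ten.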
Step by Step Conversion
1. Data-structure focused: is it client or server?
   – Some data structures may have to be split.
2. All data access paths must be segmented into c/s
   – Cross-boundary calls recast as buffered communication.
Data Access Paths
• Systems access world state via the
Blackboard (singleton pattern)
• After separating into Client & Server
Blackboard, Server systems must be weaned
off of Client Blackboard and vice versa.
• Cross-boundary calls recast as buffered
communication.
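A toy Python version of this weaning process (names are illustrative): each side keeps its own Blackboard, and a cross-boundary read becomes a posted message that is drained into the other side's Blackboard at a defined point in the loop.

```python
class Blackboard:
    """Shared world-state store (a singleton in the original code);
    after the split, client and server each own one."""
    def __init__(self):
        self.state = {}

class BoundaryBuffer:
    """Cross-boundary calls recast as buffered communication: instead of
    a client system reading the server Blackboard directly, the server
    posts a message that is applied on the client side once per cycle."""
    def __init__(self):
        self.pending = []

    def post(self, key, value):
        self.pending.append((key, value))

    def drain_into(self, blackboard):
        for key, value in self.pending:
            blackboard.state[key] = value
        self.pending.clear()

server_bb, client_bb = Blackboard(), Blackboard()
to_client = BoundaryBuffer()

server_bb.state["player_pos"] = (3, 4)
to_client.post("player_pos", server_bb.state["player_pos"])  # no direct read
to_client.drain_into(client_bb)   # applied once per game cycle
```

Once every cross-boundary access goes through such a buffer, replacing the buffer with a network socket is a local change.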
Step by Step Conversion
1. Data-structure focused: is it client or server?
   – Some data structures may have to be split.
2. All data access paths must be segmented into c/s
   – Cross-boundary calls recast as buffered communication.
3. Initialization & run loop separation
   – Dependencies on order must be resolved.
Initialization & Run-loop

Before (interleaved):
Initialize Graphics → Initialize Physics → Init Loading Screen → Load Landscape Data → Initialize Clouds → Create Roads → Place Lots → Place House Pieces → Place Player → Get Camera Position

After (separated):
Initialize Graphics → Init Loading Screen → Initialize Clouds → Get Camera Position
Initialize Physics → Load Landscape Data → Create Roads → Place Lots → Place House Pieces → Place Player
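Resolving the order dependencies can be made explicit with a topological sort; the dependency edges below are illustrative guesses from the step names, not the project's actual constraints.

```python
from graphlib import TopologicalSorter

# Illustrative dependencies between init steps. Making them explicit lets
# graphics-side and physics-side steps be regrouped (as in the "after"
# ordering) without breaking hidden ordering assumptions.
deps = {
    "Init Loading Screen": {"Initialize Graphics"},
    "Initialize Clouds":   {"Initialize Graphics"},
    "Load Landscape Data": {"Initialize Physics"},
    "Create Roads":        {"Load Landscape Data"},
    "Place Lots":          {"Create Roads"},
    "Place House Pieces":  {"Place Lots"},
    "Place Player":        {"Place House Pieces"},
}
order = list(TopologicalSorter(deps).static_order())
```

Any schedule consistent with these edges is valid, which is exactly the freedom needed to pull the graphics steps onto the client and the physics steps onto the server.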
Step by Step Conversion
1. Data-structure focused: is it client or server?
   – Some data structures may have to be split.
2. All data access paths must be segmented into c/s
   – Cross-boundary calls recast as buffered communication.
3. Initialization & run loop separation
   – Dependencies on order must be resolved.
4. Unify cross-boundary comm. to one subsystem.
   – This will interface with network code in the end.
Unify Communication

ReadClient/ReadServer and WriteClient/WriteServer bracket the per-frame systems: MovePlayer, Transforms, Animations, Render, Physics/IK, UserInput.

Single buffer, common format, ordered messages.
Communicate in one stage: solves the addiction to immediate answers.
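A minimal sketch of the "single buffer, common format, ordered messages" idea (message fields are invented for illustration): systems enqueue during the frame and never get an immediate answer; the other side reads everything in one communication stage.

```python
import collections

class MessageBuffer:
    """One ordered buffer with a common message format. Sequence numbers
    preserve ordering; read_all() is the single communication stage."""
    def __init__(self):
        self.queue = collections.deque()
        self.seq = 0

    def write(self, system, payload):
        self.seq += 1
        self.queue.append({"seq": self.seq,
                           "system": system,
                           "payload": payload})

    def read_all(self):
        msgs = list(self.queue)
        self.queue.clear()
        return msgs

client_to_server = MessageBuffer()
client_to_server.write("UserInput", {"key": "W"})
client_to_server.write("MovePlayer", {"dx": 0.1})
frame = client_to_server.read_all()
```

Because no system blocks waiting for a reply, the same buffer can later be backed by a socket without changing any caller.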
Step by Step Conversion
1. Data-structure focused: is it client or server?
   – Some data structures may have to be split.
2. All data access paths must be segmented into c/s
   – Cross-boundary calls recast as buffered communication.
3. Initialization & run loop separation
   – Dependencies on order must be resolved.
4. Unify cross-boundary comm. to one subsystem.
   – This will interface with network code in the end.
5. Final separation of client & server into two programs
   – Basic networking code allows communication
Separate
Two programs, plus basic synchronous networking code
Loops truly asynchronous (previously one called the other)
Step by Step Conversion
1. Data-structure focused: is it client or server?
   – Some data structures may have to be split.
2. All data access paths must be segmented into c/s
   – Cross-boundary calls recast as buffered communication.
3. Initialization & run loop separation
   – Dependencies on order must be resolved.
4. Unify cross-boundary comm. to one subsystem.
   – This will interface with network code in the end.
5. Final separation of client & server into two programs
   – Basic networking code allows communication
6. Optimize!
   – New configuration changes behavior even for single player
Experience
• Positives
– Smooth transition to multi-user possible
– All features/behaviors retained or explicitly disabled
– Feature development continued successfully during
transition (performance, feature, and behavioral
enhancements on both client and server side, CAVE
support, improved visuals, machinima engine, etc).
• Negatives
– Resulting code structure not ideal for a client/server application (no MVC framework, some legacy structure).
– Feature development and client/server work sometimes clash, requiring re-working in a client/server fashion.
Initial Optimizations
Basic issues addressed in
converting to a massively multi-user
networked model
Multi-User Load Challenges
• Communications
• Graphics Rendering
  – Geometry Processing
  – Shaders
  – Rendering techniques
• Dynamics Computation
  – Physics
  – AI or other application-specific behaviors
  – Animation
Communication
• In a unified system, subsystems can share
data and communicate quickly.
• In a Client/Server model, subsystems on
different machines have to rely on
messages sent over the network
– Data marshalling overhead
– Data unmarshalling overhead
– Bandwidth/latency limitations
New Client Knowledge Model
• Stand-Alone version had all cities in memory
– All clients received updates for activity in all cities
– Increased memory & bandwidth use as environment scales
• Now: Clients only given cities they can see
– City assets dynamically loaded onto client as needed
– Reduces the updates the clients need
• Further Challenge: Dynamically loading cities
without server or client hiccups.
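The knowledge model can be sketched as a visibility filter on the server (the neighbor-based visibility rule here is an assumption for the example; the real rule follows the players' city views listed earlier): only clients that can see an event's city receive its updates.

```python
def visible_cities(player_city, world_width):
    """Illustrative visibility rule: a player sees their own city and its
    immediate neighbors on a ring of cities."""
    return {(player_city - 1) % world_width,
            player_city,
            (player_city + 1) % world_width}

def recipients(event_city, player_positions, world_width):
    """Only clients that can see the event's city get its updates,
    cutting both per-client bandwidth and client memory."""
    return [p for p, city in player_positions.items()
            if event_city in visible_cities(city, world_width)]

players = {"alice": 0, "bob": 5, "carol": 1}
sent_to = recipients(1, players, world_width=10)
```

Bob, five cities away, receives nothing for this event; his client also never loads that city's assets until he approaches it.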
Communication Challenges
• More clients leads to:
  – More activity: physics object movements, road/land animations, house construction
  – More communication per client due to the increase in activity
  – More clients for the server to keep up to date
  – Server communication = activity x clients!
• Dynamically loading large data sets (cities in this case) without server or client hiccups
Communication Subsystem
• Code generation for data marshalling
  – Fast data structure serialization
  – Binary transforms for cross-platform use; token- or text-based formats are too slow
  – Endian issues resolved during serialization
  – Tested on z10 and Intel
• Asynchronous reading and writing
  – Dedicated threads perform communication
  – Catch up on all messages each game cycle
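The endian point matters because the z10 is big-endian and Intel hosts are little-endian. A minimal Python sketch of an endian-safe binary wire format (the field layout is invented for illustration): fixing network byte order in the format string resolves endianness during serialization on both platforms.

```python
import struct

# Fixed big-endian ("network order") wire format: both big-endian (z10)
# and little-endian (Intel) hosts agree on the byte layout.
WIRE = struct.Struct("!IHff")   # object id, message type, x, y

def marshal(obj_id, msg_type, x, y):
    return WIRE.pack(obj_id, msg_type, x, y)

def unmarshal(buf):
    return WIRE.unpack(buf)

packet = marshal(42, 1, 1.5, -2.0)
fields = unmarshal(packet)
```

The real subsystem generates such pack/unpack code from the data structure definitions rather than writing each format by hand.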
Reducing Data Marshalling Time
• Reduce use of per-player queues:
– Common messages sent to a queue associated
with the event’s city
– Players receive buffers of each city they see, in
addition to their player-specific queue.
– Perform buffer allocation, data marshalling, &
copy once for many players.
– Significantly reduces communication overhead
for server.
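A small Python sketch of the per-city queue idea (names are illustrative): each common event is marshalled once into its city's buffer, and every player who sees that city shares the same buffer alongside their private queue.

```python
class CityBroadcast:
    """Marshal each common event once per city, not once per player."""
    def __init__(self):
        self.city_buffers = {}
        self.marshal_count = 0      # counts marshalling passes performed

    def post(self, city, event):
        self.marshal_count += 1     # one marshalling pass per event
        self.city_buffers.setdefault(city, []).append(event)

    def gather(self, visible, private_queue):
        """A player's outgoing data: their private queue plus the shared
        buffer of each city they can see."""
        out = list(private_queue)
        for city in visible:
            out.extend(self.city_buffers.get(city, []))
        return out

bc = CityBroadcast()
bc.post("city_1", "house_built")
bc.post("city_1", "road_moved")
# Two players see city_1, yet each event was marshalled only once.
p1 = bc.gather(["city_1"], ["p1_private"])
p2 = bc.gather(["city_1"], [])
```

With N players watching a busy city, per-city buffers turn N marshalling passes per event into one, which is where the server-side savings come from.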
Preventing Stutters
• Send smaller chunks of data
– Break up large messages
• Incrementally load cities as a player
approaches them
– Space out sending assets over many cycles
– Large geometry (landscape) subdivided
– If player arrives, finish all transfers
• Prevent disk access on client
– Pre-load resources
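The incremental-loading policy above can be sketched as follows (chunk sizes and the trickle rate are illustrative): large assets are split into small messages, a few are sent per cycle as the player approaches, and arrival forces the remainder through at once.

```python
def chunk(data, size):
    """Break a large asset (e.g. landscape geometry) into small messages
    so no single transfer stalls a frame."""
    return [data[i:i + size] for i in range(0, len(data), size)]

class CityTransfer:
    """Trickle chunks out over many cycles; if the player arrives before
    the transfer completes, finish all remaining chunks immediately."""
    def __init__(self, asset, chunk_size, per_cycle):
        self.pending = chunk(asset, chunk_size)
        self.per_cycle = per_cycle
        self.received = []          # stands in for the client's buffer

    def tick(self, player_arrived=False):
        n = len(self.pending) if player_arrived else self.per_cycle
        self.received.extend(self.pending[:n])
        del self.pending[:n]

t = CityTransfer(bytes(range(10)), chunk_size=3, per_cycle=1)
t.tick()                      # one chunk trickles in this cycle
t.tick(player_arrived=True)   # player showed up: flush the rest
```

Combined with client-side pre-loading of resources, this keeps both the server's send loop and the client's frame time free of large blocking transfers.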