Design Patterns for Parallel Programming
Research work by:
Kurt Keutzer, EECS, UC Berkeley
Tim Mattson, Intel Corporation
Presented to faculty at Tsinghua Multicore Workshop
Michael Wrinn, Intel Corporation
1
Outline
• A Programming Pattern Language – patterns: why and what?
• Structural patterns
• Computational patterns
• Composing patterns: examples
• Concurrency patterns and PLPP
• Examples: pattern language case studies
• Examples: pattern language in Smoke Demo
• Motivation
2
Outline
• Motivation – what is the problem we are trying to solve?
• How do programmers think? Psychology meets computer science.
• Pattern Language for Parallel Programming
  • Toy problems showing some key PLPP patterns
  • A PLPP example (molecular dynamics)
• Expanding the effort: patterns for engineering parallel software
• A survey of some key patterns
• Case studies
3
Microprocessor trends
Many-core processors are the “new normal”.
• ATI RV770: 160 cores
• Intel Terascale research chip: 80 cores
• NVIDIA Tesla C1060: 240 cores
• IBM Cell: 1 CPU + 6 cores
3rd party names are the property of their owners.
The State of the Field
A harsh assessment…
• We have turned to multi-core chips not because of the success of our parallel software but because of our failure to continually increase CPU frequency.
• Result: a fundamental and dangerous (for the computer industry) mismatch:
  • Parallel hardware is ubiquitous.
  • Parallel software is rare.
• The multi-core challenge: how do we make parallel software as ubiquitous as parallel hardware?
• This could be the greatest challenge ever faced by the computer industry.
5
We need to create a new generation of parallel programmers
• Parallel programming is difficult, error-prone, and only accessible to a small cadre of experts… Clearly we haven’t been doing a very good job of educating programmers.
• Consider the following:
  • All known programmers are human beings.
  • Hence, human psychology, not hardware design, should guide our work on this problem.
• We need to focus on the needs of the programmer, not the needs of the computer.
6
Outline
• Motivation – what is the problem we are trying to solve?
• How do programmers think? Psychology meets computer science.
• Pattern Language for Parallel Programming
  • Toy problems showing some key PLPP patterns
  • A PLPP example (molecular dynamics)
• Expanding the effort: patterns for engineering parallel software
• A survey of some key patterns
• Case studies
7
Cognitive Psychology and human reasoning
• Human beings are model builders:
  • We build hierarchical complexes of mental models.
  • We understand sensory input in terms of these models.
  • When input conflicts with the models, we tend to believe the models.
• Consider the following slide… Why did most of you see motion?
  • Your brain’s visual system contains a model that says this combination of geometric shapes and colors implies motion.
  • Your brain believes the model and not what your eyes see.
To understand people and how we work, you need to understand the models we work with.
Programming and models
• Programming is a process of successive refinement of a problem over a hierarchy of models. [Brooks83]
• The models represent the problem at different levels of abstraction:
  • The top levels express the problem in the original problem domain.
  • The lower levels represent the problem in the computer’s domain.
• The models are informal, but detailed enough to support simulation.
Model based reasoning in programming
Models and their example domains:
• Specification: problem-specific – polygons, rays, molecules, etc.
• Programming: OpenMP’s fork/join, Actors
• Computation: threads – shared memory; processes – shared nothing
• Machine (AKA cost model): registers, ALUs, caches, interconnects, etc.
Programming process: getting started
• The programmer starts by constructing a high-level model from the specification.
• Successive refinement starts by using some combination of the following techniques:
  • The problem’s state is defined in terms of objects belonging to abstract data types with meaning in the original problem domain.
  • Key features of the solution are identified and emphasized; these features are sometimes referred to as “beacons”. [Wiedenbeck89]
The programming process
• Programmers use an informal, internal notation based on the problem, mathematics, programmer experience, etc.
  • Within a class of programming languages, the program generated is only weakly dependent on the language. [Robertson90] [Petre88]
• Programmers think about code in chunks or “plans”. [Rist86]
  • Low-level plans code a specific operation, e.g. summing an array.
  • High-level or global plans relate to the overall program structure.
Programming Process: Strategy + Opportunistic Refinement
• Common strategies are:
  • Backwards goal chaining: start at the result and work backwards to generate subgoals; continue recursively until plans emerge.
  • Forward chaining: directly leap to plans when a problem is familiar. [Rist86]
• Opportunistic refinement: [Petre90]
  • Progress is made at multiple levels of abstraction.
  • Effort is focused on the most productive level.
Programming Process: the role of testing
• Programmers test the emerging solution throughout the programming process.
• Testing consists of two parts:
  • Hypothesis generation: the programmer forms an idea of how a portion of the solution should behave.
  • Simulation: the programmer runs a mental simulation of the solution within the problem models at the appropriate level of abstraction. [Guindom90]
From psychology to software
• Hypothesis: a design pattern language provides the roadmap to apply results from the psychology of programming to software engineering:
  • Design patterns capture the essence of plans.
  • The structure of patterns in a pattern language should mirror the types of models programmers use.
  • Connections between patterns must fit well with goal chaining and opportunistic refinement.
References
[Brooks83] R. Brooks, "Towards a theory of the comprehension of computer programs", International Journal of Man-Machine Studies, vol. 18, pp. 543-554, 1983.
[Guindom90] R. Guindon, "Knowledge exploited by experts during software system design", International Journal of Man-Machine Studies, vol. 33, pp. 279-304, 1990.
[Hoc90] J.-M. Hoc, T.R.G. Green, R. Samurcay and D.J. Gilmore (eds.), Psychology of Programming, Academic Press Ltd., 1990.
[Petre88] M. Petre and R.L. Winder, "Issues governing the suitability of programming languages for programming tasks", People and Computers IV: Proceedings of HCI-88, Cambridge University Press, 1988.
[Petre90] M. Petre, "Expert Programmers and Programming Languages", in [Hoc90], p. 103, 1990.
[Rist86] R.S. Rist, "Plans in programming: definition, demonstration and development", in E. Soloway and S. Iyengar (eds.), Empirical Studies of Programmers, Norwood, NJ: Ablex, 1986.
[Rist90] R.S. Rist, "Variability in program design: the interaction of process with knowledge", International Journal of Man-Machine Studies, vol. 33, pp. 305-322, 1990.
[Robertson90] S.P. Robertson and C. Yu, "Common cognitive representations of program code across tasks and languages", International Journal of Man-Machine Studies, vol. 33, pp. 343-360, 1990.
[Wiedenbeck89] S. Wiedenbeck and J. Scholtz, "Beacons: a knowledge structure in program comprehension", in G. Salvendy and M.J. Smith (eds.), Designing and Using Human-Computer Interfaces and Knowledge-Based Systems, Amsterdam: Elsevier, 1989.
18
People, Patterns, and Frameworks
• Application developer:
  • Uses application design patterns (e.g. feature extraction) to design the application.
  • Uses application frameworks (e.g. CBIR) to develop the application.
• Application-framework developer:
  • Uses programming design patterns (e.g. MapReduce) to design the application framework.
  • Uses programming design patterns (e.g. MapReduce) to develop the application framework.
Eventually
Three layers:
1. Domain experts create end-user application programs using application patterns & frameworks.
2. Domain-literate programming gurus (1% of the population) create application frameworks using parallel patterns & programming frameworks.
3. Parallel programming gurus (1-10% of programmers) create the parallel programming frameworks.
The hope is for domain experts to create parallel code with little or no understanding of parallel programming.
Leave hardcore “bare metal” efficiency-layer programming to the parallel programming experts.
Today
(The same three layers as before: (1) domain experts building end-user application programs, (2) domain-literate programming gurus (1% of the population) building application frameworks, and (3) parallel programming gurus (1-10% of programmers) building parallel programming frameworks.)
• For the foreseeable future, domain experts, application framework builders, and parallel programming gurus will all need to learn the entire stack.
• That’s why you all need to be here today!
Definitions - 1
• Design patterns: “Each design pattern describes a problem which occurs over and over again in our environment, and then describes the core of the solution to that problem, in such a way that you can use this solution a million times over, without ever doing it the same way twice.“ Page x, A Pattern Language, Christopher Alexander.
• Structural patterns: design patterns that provide solutions to problems associated with the development of program structure.
• Computational patterns: design patterns that provide solutions to recurrent computational problems.
Definitions - 2
• Library: the software implementation of a computational pattern (e.g. BLAS) or a particular sub-problem (e.g. matrix multiply).
• Framework: an extensible software environment (e.g. Ruby on Rails) organized around a structural pattern (e.g. model-view-controller) that allows programmer customization only in harmony with the structural pattern.
• Domain-specific language: a programming language (e.g. Matlab) that provides language constructs that particularly support a particular application domain. The language may also supply library support for common computations in that domain (e.g. BLAS). If the language is restricted to maintain fidelity to a structure and provides library support for common computations, then it encompasses a framework (e.g. NPClick).
Getting our software act together: first step… define a conceptual roadmap to guide our work.
13 dwarfs
Alexander’s Pattern Language
Christopher Alexander’s approach to (civil) architecture:
• “Each pattern describes a problem which occurs over and over again in our environment, and then describes the core of the solution to that problem, in such a way that you can use this solution a million times over, without ever doing it the same way twice.“ Page x, A Pattern Language, Christopher Alexander.
• Alexander’s 253 (civil) architectural patterns range from the creation of cities (2. Distribution of Towns) to particular building problems (232. Roof Cap).
• A pattern language is an organized way of tackling an architectural problem using patterns.
• Main limitation: it’s about civil, not software, architecture!
25
Alexander’s Pattern Language (95-103)
Lay out the overall arrangement of a group of buildings: the height and number of these buildings, the entrances to the site, main parking areas, and lines of movement through the complex.
95. Building Complex
96. Number of Stories
97. Shielded Parking
98. Circulation Realms
99. Main Building
100. Pedestrian Street
101. Building Thoroughfare
102. Family of Entrances
103. Small Parking Lots
26
Family of Entrances (102)
May be part of Circulation Realms (98).
Conflict: When a person arrives in a complex of offices or services or workshops, or in a group of related houses, there is a good chance he will experience confusion unless the whole collection is laid out before him, so that he can see the entrance of the place where he is going.
Resolution: Lay out the entrances to form a family. This means:
  1) They form a group, are visible together, and each is visible from all the others.
  2) They are all broadly similar, for instance all porches, or all gates in a wall, or all marked by a similar kind of doorway.
May contain Main Entrance (110), Entrance Transition (112), Entrance Room (130), Reception Welcomes You (149).
27
Family of Entrances
http://www.intbau.org/Images/Steele/Badran5a.jpg
28
Computational Patterns
The dwarfs from “The Berkeley View” (Asanovic et al.) form our key computational patterns.
29
Patterns for Parallel Programming
• PLPP is the first attempt to develop a complete pattern language for parallel software development.
• PLPP is a great model for a pattern language for parallel software.
• PLPP mined scientific applications that utilize a monolithic application style.
• PLPP doesn’t help us much with horizontal composition.
• Much more useful to us than: Design Patterns: Elements of Reusable Object-Oriented Software, Gamma, Helm, Johnson & Vlissides, Addison-Wesley, 1995.
30
Structural programming patterns
• In order to create more complex software it is necessary to compose programming patterns.
• For this purpose, it has been useful to induct a set of patterns known as “architectural styles”.
• Examples:
  • pipe and filter
  • event based/event driven
  • layered
  • agent and repository/blackboard
  • process control
  • model-view-controller
31
Putting it all together…
13 dwarfs
Elements of a Pattern Description
• Name
• Problem: classes of problems this pattern addresses
• Context: context in which this problem occurs
• Forces: trade-offs that crop up in this situation
• Solution: solution the pattern embodies
• Invariants: properties that need to always be true for this pattern to work
• Examples
• Known uses
• Related Patterns
33
Programming Pattern Language 1.0 Keutzer & Mattson
(The language is split into a productivity layer, the upper levels, and an efficiency layer, the lower levels.)
• Applications
• Choose your high-level structure – what is the structure of my application? Guided expansion:
  Pipe-and-filter, Agent and Repository, Process Control, Event based implicit invocation, Model-view-controller, Iterator, MapReduce, Layered systems, Arbitrary Static Task Graph
• Identify the key computational patterns – what are my key computations? Guided instantiation:
  Graph Algorithms, Graphical Models, Dynamic Programming, Finite State Machines, Dense Linear Algebra, Sparse Linear Algebra, Backtrack Branch and Bound, N-Body Methods, Unstructured Grids, Structured Grids, Circuits, Spectral Methods
• Choose your high-level architecture – Guided decomposition:
  Task Decomposition ↔ Data Decomposition; Group Tasks, Order Groups, data sharing, data access
• Refine the structure – what concurrent approach do I use? Guided re-organization:
  Event Based, Data Parallelism, Pipeline, Task Parallelism, Divide and Conquer, Geometric Decomposition, Discrete Event, Graph Algorithms
• Utilize supporting structures – how do I implement my concurrency? Guided mapping:
  Fork/Join, CSP, Master/Worker, Loop Parallelism, BSP, Distributed Array, Shared Queue, Shared Hash Table, Shared Data, Digital Circuits
• Implementation methods – what are the building blocks of parallel programming? Guided implementation:
  Thread creation/destruction, Process creation/destruction, Message passing, Collective communication, Speculation, Transactional memory, Barriers, Mutex, Semaphores
34
Architecting Parallel Software
• Identify the software structure.
• Identify the key computations.
• Decompose tasks: group tasks, order tasks.
• Decompose data: identify data sharing, identify data access.
35
Identify the SW Structure
Structural patterns:
• Pipe-and-Filter
• Agent-and-Repository
• Event-based coordination
• Iterator
• MapReduce
• Process Control
• Layered Systems
These define the structure of our software, but they do not describe what is computed.
36
Analogy: Layout of Factory Plant
37
Identify Key Computations
Computational patterns describe the key computations, but not how they are implemented.
38
Analogy: Machinery of the Factory
39
Architecting Parallel Software
• Decompose tasks/data, order tasks, identify data sharing and access.
• Identify the software structure: Pipe-and-Filter, Agent-and-Repository, Event-based, Bulk Synchronous, MapReduce, Layered Systems, Arbitrary Task Graphs.
• Identify the key computations: Graph Algorithms, Dynamic Programming, Dense/Sparse Linear Algebra, (Un)Structured Grids, Graphical Models, Finite State Machines, Backtrack Branch-and-Bound, N-Body Methods, Circuits, Spectral Methods.
40
Analogy: Architected Factory
Raises appropriate issues like scheduling, latency, throughput,
workflow, resource management, capacity etc.
41
Outline
• A Programming Pattern Language – patterns: why and what?
• Structural patterns
• Computational patterns
• Composing patterns: examples
• Concurrency patterns and PLPP
• Examples: pattern language case studies
• Examples: pattern language in Smoke Demo
• Motivation
42
Inventory of Structural Patterns
1. pipe and filter
2. iterator
3. MapReduce
4. blackboard/agent and repository
5. process control
6. model-view-controller
7. layered
8. event-based coordination
Elements of a structural pattern
• Components are where the computation happens.
• Connectors are where the communication happens.
• A configuration is a graph of components (vertices) and connectors (edges).
• A structural pattern may be described as a family of graphs.
44
Pattern 1: Pipe and Filter
• Filters embody computation; they only see inputs and produce outputs.
• Pipes embody communication.
• May have feedback.
(Diagram: Filters 1-7 connected by pipes.)
Examples?
45
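To make the pattern concrete, here is a minimal sketch, not from the slides: each filter sees only its input stream and produces an output stream, and the "pipes" are generator composition. Filter names and the word-processing task are illustrative.

```python
def tokenize(lines):          # Filter 1: split raw lines into tokens
    for line in lines:
        yield from line.split()

def lowercase(tokens):        # Filter 2: normalize tokens
    for tok in tokens:
        yield tok.lower()

def drop_short(tokens, n=3):  # Filter 3: keep tokens of length >= n
    for tok in tokens:
        if len(tok) >= n:
            yield tok

def run_pipeline(lines):
    # Compose the filters; in a parallel setting each stage could run
    # concurrently, connected by queues instead of generators.
    return list(drop_short(lowercase(tokenize(lines))))
```

Because each filter touches only its own input and output, the stages can be distributed across threads or processes without shared state.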
Examples of pipe and filter
• Almost every large software program has a pipe-and-filter structure at the highest level: a compiler, an image retrieval system, a logic optimizer.
46
Pattern 2: Iterator Pattern
(Diagram: from an initialization condition, iterate: a variety of functions performed asynchronously, then synchronize the results of the iteration; if the exit condition is not met, iterate again, otherwise stop.)
Examples?
47
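The iterate/synchronize/test loop above can be sketched as follows. This is an illustrative sketch, not from the slides: the function names are assumptions, and the per-iteration functions are run sequentially here where a parallel implementation would run them concurrently before the synchronization step.

```python
def iterate_until(state, step_fns, converged, max_iters=100):
    """Iterator pattern: run step functions, synchronize, test exit condition."""
    for _ in range(max_iters):
        # Variety of functions, potentially performed asynchronously
        partial = [fn(state) for fn in step_fns]
        # Synchronize the results of the iteration (here: average them)
        state = sum(partial) / len(partial)
        if converged(state):      # exit condition met?
            break
    return state

# Illustrative use: averaging x and 2/x converges to sqrt(2)
root2 = iterate_until(1.0,
                      [lambda x: x, lambda x: 2.0 / x],
                      lambda x: abs(x * x - 2.0) < 1e-9)
```

The essential structure is that every iteration ends at a synchronization point before the exit test, which is what makes the per-iteration work safe to parallelize.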
Example of Iterator Pattern: Training a Classifier (SVM Training)
(Iterator structural pattern: iterate – identify the outlier, update the surface – until all points are within acceptable error.)
48
Pattern 3: MapReduce
To us, it means:
• A map stage, where data is mapped onto independent computations.
• A reduce stage, where the results of the map stage are summarized (i.e. reduced).
Examples?
49
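The two-stage structure can be sketched in a few lines. This is an illustrative sketch under the slide's definition, not a distributed implementation: `map_fn` is applied independently to each element (the parallelizable part) and `reduce_fn` summarizes the results.

```python
from functools import reduce

def map_reduce(data, map_fn, reduce_fn, initial):
    # Map stage: independent computations over the data (parallelizable)
    mapped = [map_fn(x) for x in data]
    # Reduce stage: summarize the mapped results
    return reduce(reduce_fn, mapped, initial)

# Illustrative use: farthest point from the origin on a number line,
# map = distance, reduce = max
farthest = map_reduce([3, -7, 2], abs, max, 0)
```

Because the map stage has no cross-element dependencies, it can be split across workers, with the reduction performed as a tree to preserve parallelism.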
Examples of MapReduce
General structure: map a computation across distributed data sets; reduce the results to find the best/(worst), maxima/(minima).
• Support-vector machines (ML): map to evaluate distance from the frontier; reduce to find the greatest outlier from the frontier.
• Speech recognition: map HMM computation to evaluate word match; reduce to find the most likely word sequences.
50
Pattern 4: Agent and Repository
(Diagram: Agents 1-4 interacting with a repository/blackboard, i.e. a database.)
Agent and repository: blackboard structural pattern. Agents cooperate on a shared medium to produce a result.
Key elements:
• Blackboard: repository of the resulting creation that is shared by all agents (circuit database).
• Agents: intelligent agents that will act on the blackboard (optimizations).
• Manager: orchestrates agents’ access to the blackboard and creation of the aggregate results (scheduler).
Examples?
51
Example: Compiler Optimization
(Diagram: agents – common-subexpression elimination, constant folding, loop fusion, software pipelining, strength reduction, dead-code elimination – acting on the internal program representation.)
Optimization of a software program:
• The intermediate representation of the program is stored in the repository.
• Individual agents have heuristics to optimize the program.
• The manager orchestrates the access of the optimization agents to the program in the repository.
• The resulting program is left in the repository.
52
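The blackboard/agent/manager roles can be sketched as below. This is an illustrative sketch, not the compiler example itself: the repository holds a shared artifact, each agent applies its own transformation, and a manager serializes access (a real manager would also handle locking and conflict resolution). All names are assumptions.

```python
class Repository:
    """Blackboard holding the shared artifact (e.g. an IR or a circuit)."""
    def __init__(self, data):
        self.data = data

def dedup_agent(repo):
    # Agent 1: remove duplicate entries, keeping first occurrences
    seen, out = set(), []
    for x in repo.data:
        if x not in seen:
            seen.add(x)
            out.append(x)
    repo.data = out

def sort_agent(repo):
    # Agent 2: canonicalize the ordering
    repo.data = sorted(repo.data)

def manager(repo, agents, rounds=2):
    # Manager: orchestrates agents' access to the blackboard
    for _ in range(rounds):
        for agent in agents:
            agent(repo)
    return repo.data
```

The result is left in the repository, as the slide describes; agents never talk to each other, only to the blackboard.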
Example: Logic Optimization
(Diagram: timing optimization agents 1…N acting on the circuit database.)
• Optimization of integrated circuits.
• The integrated circuit is stored in the repository.
• Individual agents have heuristics to optimize the circuitry of an integrated circuit.
• The manager orchestrates the access of the optimization agents to the circuit repository.
• The resulting optimized circuit is left in the repository.
53
Pattern 5: Process Control
(Diagram: a controller reads input variables and controlled variables from the process and sets manipulated variables/control parameters. Source: adapted from Shaw & Garlan 1996, pp. 27-31.)
Process control:
• Process: underlying phenomena to be controlled/computed.
• Actuator: task(s) affecting the process.
• Sensor: task(s) which analyze the state of the process.
• Controller: task which determines what actuators should be effected.
Examples?
54
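A minimal sketch of the sensor/controller/actuator loop, assuming a toy thermostat as the process (the numbers and names are illustrative, not from the slides): the sensor observes the controlled variable, the controller decides the actuator setting, and the actuator affects the process.

```python
def control_loop(temp, set_point, steps=20):
    history = []
    for _ in range(steps):
        # Sensor: observe the controlled variable
        error = set_point - temp
        # Controller: decide the actuator setting (bang-bang heater)
        heater_on = error > 0
        # Actuator affects the process; the process also loses heat
        temp += (1.0 if heater_on else 0.0) - 0.25
        history.append(temp)
    return temp, history
```

Even this toy loop shows the pattern's key property: the controller never manipulates the process directly, only through the actuator, and never observes it directly, only through the sensor.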
Examples of Process Control
(Diagram: the process-control structural pattern applied to circuit timing optimization: the user supplies timing constraints, a timing-constraints controller compares them against the circuit, and the circuit is adjusted accordingly.)
55
Pattern 6: Model-View-Controller
• Model: embodies the data and “intelligence” (aka business logic) of the system.
• View: renders the current state of the model for the user.
• Controller: captures all user input and translates it into actions on the model.
Examples?
56
Example of Model-View-Controller
(Diagram: a control form shows values a = 50%, b = 30%, c = 20%. The user updates the values and presses the ‘View 1’ button; the controller turns this into a model state change and selects a view; each view queries the model’s state to render it, e.g. as a bar chart or pie chart.)
Extended from: Design Patterns: Elements of Reusable Object-Oriented Software.
57
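The three roles in the diagram can be sketched as follows. This is an illustrative sketch of the slide's percentage example, with assumed names: the model owns the values, the controller translates a user event into a model update and selects a view, and the view renders the model's current state.

```python
class Model:
    """Owns the data and business logic."""
    def __init__(self):
        self.values = {"a": 50, "b": 30, "c": 20}
    def update(self, key, pct):
        self.values[key] = pct

def bar_view(model):
    # View: renders the model's current state as text
    return [f"{k}:{v}%" for k, v in sorted(model.values.items())]

class Controller:
    def __init__(self, model):
        self.model = model
    def handle(self, event):
        # Controller: translate user input into a model action,
        # then select a view of the new state
        key, pct = event
        self.model.update(key, pct)
        return bar_view(self.model)
```

Because the view only reads the model, several different views can render the same state without the model knowing about any of them.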
Pattern 7: Layered Systems
• Individual layers are big, but the interface between two adjacent layers is narrow.
• Non-adjacent layers cannot communicate directly.
Examples?
58
Example: ISO Network Protocol
59
Pattern 8: Event-based Systems
(Diagram: agents connected through a medium, coordinated by an event manager.)
• Agents interact via events/signals in a medium.
• The event manager manages events.
• Interaction among agents is dynamic – no fixed connection.
Examples?
60
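A minimal publish/subscribe sketch of the pattern (illustrative names, not from the slides): agents register interest in event kinds with the event manager, and interaction stays dynamic because no agent holds a fixed connection to another.

```python
class EventManager:
    def __init__(self):
        self.subscribers = {}          # event kind -> list of handlers

    def subscribe(self, kind, handler):
        # An agent registers interest in an event kind
        self.subscribers.setdefault(kind, []).append(handler)

    def publish(self, kind, payload):
        # The medium delivers the event to every interested agent
        return [handler(payload)
                for handler in self.subscribers.get(kind, [])]
```

Agents can subscribe or drop out at any time, which is what distinguishes this pattern from the fixed topology of pipe-and-filter.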
Example: The Internet
• The Internet is the medium.
• Computers (e.g. 128.0.0.56) are agents.
• Signals are IP packets.
• The control plane of the router is the event manager.
61
Remember the Analogy: Layout of Factory Plant
We have only talked about structure; we haven’t described computation.
62
Architecting Parallel Software
• Identify the software structure.
• Identify the key computations.
• Decompose tasks: group tasks, order tasks.
• Decompose data: identify data sharing, identify data access.
63
Outline
• A Programming Pattern Language – patterns: why and what?
• Structural patterns
• Computational patterns
• Composing patterns: examples
• Concurrency patterns and PLPP
• Examples: design patterns in Smoke Demo
• Motivation
64
Computational Patterns
The dwarfs from “The Berkeley View” form our key computational patterns.
65
Problem and Context
• Problem: In many situations the proper behavior of a system can naturally be described by a language of finite, or perhaps infinite, strings. The problem is to define a piece of software that distinguishes between valid input strings (associated with proper behavior) and invalid input strings (improper behavior). The system may have a set of pre-defined responses to proper input and another set of responses to improper input.
• Context: As inputs arrive, a system must respond depending on the input. Alternatively, depending on inputs, changes in a process may be actuated. The proper response of the system may be to idle after receiving a sequence of inputs, or the input stream may be presumed to be infinite.
66
Solution: Finite State Machines
Transducer FSM: Huffman decoding

Symbol | Codeword
A      | 0
B      | 10
C      | 1100
D      | 1101
E      | 1110
F      | 1111

• Input alphabet: 0, 1
• Output alphabet: A-F
• States: a, b, c, d, e
• Transitions (state, input) → (next state, output), e.g. ((a, 0), (a, A)), ((a, 1), (b, -)), …
• Initial state: a
After: Multimedia Image and Video Processing by Ling Guan, Jan Larsen
67
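The transducer FSM above can be written directly as a transition table. The codeword table is the slide's; the state names follow the slide's states a-e, but the exact assignment of intermediate states is a reconstruction, and the function names are illustrative.

```python
# (state, input bit) -> (next state, output symbol or None)
TRANSITIONS = {
    ("a", "0"): ("a", "A"),   # codeword 0
    ("a", "1"): ("b", None),
    ("b", "0"): ("a", "B"),   # codeword 10
    ("b", "1"): ("c", None),
    ("c", "0"): ("d", None),
    ("c", "1"): ("e", None),
    ("d", "0"): ("a", "C"),   # codeword 1100
    ("d", "1"): ("a", "D"),   # codeword 1101
    ("e", "0"): ("a", "E"),   # codeword 1110
    ("e", "1"): ("a", "F"),   # codeword 1111
}

def decode(bits, start="a"):
    """Run the transducer: consume bits, emit symbols on accepting moves."""
    state, out = start, []
    for bit in bits:
        state, symbol = TRANSITIONS[(state, bit)]
        if symbol is not None:
            out.append(symbol)
    return "".join(out)
```

Each transition consumes one input bit and optionally emits one output symbol, which is exactly the transducer behavior the slide describes.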
Example: Traffic Light State Machine
(State diagram: car sensors A and B drive the transitions between light states, e.g. “IF A=1 AND B=0”, “IF A=0 AND B=1”, “Otherwise”, and some transitions fire “Always”. Note: the clock beats every 4 seconds, so the light is yellow for 4 seconds.)
68
Problem and Context
• Problem: In many problems the output is a simple logical function, or bit-wise permutation, of the input.
• Context: A vector of Boolean values is applied as input and another set, defined by combinational operators or “wiring”, is given as output.
69
Solution: Circuits
• Describe the desired function as a circuit.
70
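As an illustrative sketch (not from the slides), a combinational circuit is just a pure Boolean function of its input vector; here a 1-bit full adder built from AND, OR, and XOR gates:

```python
def full_adder(a, b, cin):
    """1-bit full adder as a small combinational circuit (inputs 0/1)."""
    s1 = a ^ b                      # XOR gate
    total = s1 ^ cin                # sum bit
    carry = (a & b) | (s1 & cin)    # carry-out
    return total, carry
```

Larger circuits compose such gate-level functions, and since there is no state, every gate at the same depth can be evaluated in parallel.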
Example: Data Encryption Standard (DES)
(Figure: a DES-based circuit with round keys Kp, Kp-1, Kp-2, …, K3, K2, K1; a chosen plaintext is selected via a MASK and the constant ‘1’, fed through DES, and the output is tested (“test DP?”) against the expected pattern EP, producing a new SP.)
71
Outline
• A Programming Pattern Language – patterns: why and what?
• Structural patterns
• Computational patterns
• Composing patterns: examples
• Concurrency patterns and PLPP
• Examples: pattern language case studies
• Examples: pattern language in Smoke Demo
• Motivation
72
Architecture of Logic Optimization
(Diagram: the data is decomposed and tasks are grouped and ordered at several levels; the resulting computations are instances of the graph-algorithm pattern.)
73
Architecting Speech Recognition
(Diagram: a top-level pipe-and-filter structure: voice input → signal processing → inference engine → most likely word sequence. The inference engine runs beam-search iterations (Iterator) whose active-state computation steps form a pipe-and-filter of Graphical Model, Dynamic Programming, and MapReduce stages, driven by the recognition network.)
Large Vocabulary Continuous Speech Recognition Poster: Chong, Yi.
Work also to appear at Emerging Applications for Manycore Architecture.
74
CBIR Application Framework
(Diagram: new images → feature extraction → learn classifier, with user-chosen examples; exercise classifier → results → user feedback, which loops back into choosing examples.)
Catanzaro, Sundaram, Keutzer, “Fast SVM Training and Classification on Graphics Processors”, ICML 2008.
75
Feature Extraction
Image histograms are common to many feature extraction procedures, and are an important feature in their own right.
• Agent and Repository: each agent computes a local transform of the image, plus a local histogram.
• Results are combined in the repository, which contains the global histogram.
• The data-dependent access patterns found when constructing histograms make them a natural fit for the agent-and-repository pattern.
76
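The local-histogram/global-merge structure can be sketched as below. This is an illustrative sketch, assuming the image has been split into tiles of pixel values; the function names are assumptions.

```python
from collections import Counter

def local_histogram(tile):
    # Agent: compute a local histogram over its tile of the image
    # (a real agent would also apply its local transform first)
    return Counter(tile)

def global_histogram(tiles):
    # Repository: merge the local histograms into the global one.
    # The per-tile work is independent and could run in parallel.
    repo = Counter()
    for tile in tiles:
        repo.update(local_histogram(tile))
    return repo
```

Merging per-agent local histograms avoids the contention that a single shared histogram would suffer under the data-dependent access pattern the slide mentions.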
Learn Classifier: SVM Training
(Bulk Synchronous structural pattern: iterate – select the working set and solve the QP (MapReduce), then update the optimality conditions (MapReduce).)
77
Exercise Classifier: SVM Classification
(Diagram: test data and support vectors (SV) feed a Dense Linear Algebra stage that computes dot products; a MapReduce stage computes kernel values, then sums and scales them to produce the output.)
78
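The two stages above can be sketched for a single test point. This is an illustrative sketch, not the paper's implementation: it assumes a linear kernel and toy data, and the names are assumptions. The dot products are the dense-linear-algebra stage; scaling by the coefficients and summing is the MapReduce stage.

```python
def classify(x, support_vectors, coeffs, bias=0.0):
    # Map: one kernel value per support vector (dot product = linear kernel)
    kernel_vals = [sum(xi * si for xi, si in zip(x, sv))
                   for sv in support_vectors]
    # Reduce: scale by the coefficients and sum, then threshold
    score = sum(a * k for a, k in zip(coeffs, kernel_vals)) + bias
    return 1 if score >= 0 else -1
```

In the batched case the map stage becomes a matrix-matrix product, which is why the slide labels it Dense Linear Algebra.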
Outline
• A Programming Pattern Language – patterns: why and what?
• Structural patterns
• Computational patterns
• Composing patterns: examples
• Concurrency patterns and PLPP
• Examples: pattern language case studies
• Examples: pattern language in Smoke Demo
• Motivation
79
Recap: what we’re trying to accomplish
• Ultimately, we want domain-expert programmers to routinely create parallel programs.
• Domain-expert programmers:
  • Driven by the need to ship robust solutions on schedule.
  • New features are more important than performance.
  • Little or no background in computer science or concurrency.
  • Finished with formal education … “on the job” learning in burst mode.
• Note: we are not trying to tell concurrency experts how to write parallel programs:
  • Expert parallel programmers already have what they need.
  • HPC programmers want performance at all costs and will work as close to the hardware as needed to hit performance goals.
80
How do we support domain experts?
(The three layers again: (1) domain experts build end-user application programs from application frameworks; (2) domain-literate programming gurus (1% of the population) build application frameworks from parallel programming frameworks; (3) parallel programming gurus (1-10% of programmers) build the parallel programming frameworks.)
The hope is for domain experts to create parallel code with little or no understanding of parallel programming.
Leave hardcore “bare metal” efficiency-layer programming to the parallel programming experts.
81
How will we make this happen?
• Software architecture systematically described in terms of a design pattern language:
  • Everyone … even the “gurus” … needs direction:
    • The “parallel programming gurus” need to know the variety of parallel programming patterns in use.
    • The “domain literate programming gurus” need to know which application patterns to support.
    • … and we need to document these so everyone can work with them at the appropriate level of detail.
• By expressing parallel programming in terms of design pattern languages:
  • We provide this direction to the gurus.
  • We document the frameworks and how they work.
  • We lay out a roadmap to solving the parallel programming problem.
82
Programming Pattern Language 1.0 Keutzer & Mattson
(This slide repeats the full pattern-language stack shown earlier: structural patterns, computational patterns, decomposition, concurrency patterns, supporting structures, and implementation methods across the productivity and efficiency layers.)
83
Programming Pattern Language 1.0 Keutzer & Mattson
• Top levels (structural and computational patterns): note that in many cases these pertain to good software practices, both serial and parallel.
• Lower levels: these patterns are all about parallel algorithms and how to turn them into code. Many books talk about how to use a particular programming language … we instead focus on how to use these languages and “think parallel”.
(The slide shows these annotations over the same pattern-language stack as before.)
84
PLPP’s structure:
Four design spaces in parallel software development

Original Problem
  → Decomposition (Finding Concurrency)
  → Tasks, shared and local data
  → Units of execution + new shared data for extracted dependencies
  → Implementation & building blocks
  → Corresponding source code

The slide shows the resulting source code as the same SPMD program replicated across the units of execution:

   Program SPMD_Emb_Par ()
   {
      TYPE *tmp, *func();
      global_array Data(TYPE);
      global_array Res(TYPE);
      int Num = get_num_procs();
      int id  = get_proc_id();
      if (id==0) setup_problem(N, Data);
      for (int I = id; I < N; I = I + Num){
         tmp = func(I, Data);
         Res.accumulate(tmp);
      }
   }

85
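The slide’s C-like pseudocode is not runnable as written. A minimal Python sketch of the same SPMD structure: every unit of execution runs the same code, branching on its id and striding by the number of UEs. The names `func`, `run_spmd`, and the squaring workload are illustrative, not from the slide.

```python
import threading

def func(i):
    # Stand-in for the per-iteration work func(I, Data) in the pseudocode.
    return i * i

def spmd_body(uid, num, n, res, lock):
    # Every UE runs this same program; behavior branches on its id.
    partial = 0
    for i in range(uid, n, num):       # the strided loop: I = id; I += Num
        partial += func(i)
    with lock:                         # Res.accumulate(tmp)
        res[0] += partial

def run_spmd(n, num_ues=4):
    res = [0]
    lock = threading.Lock()
    ues = [threading.Thread(target=spmd_body, args=(uid, num_ues, n, res, lock))
           for uid in range(num_ues)]
    for t in ues:
        t.start()
    for t in ues:
        t.join()
    return res[0]

print(run_spmd(100))  # → 328350, the same answer as a serial loop
```

The strided iteration space is the key SPMD idea: no iteration is computed twice regardless of how many UEs run.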
Decomposition
(Finding Concurrency)
Start with a specification that solves the original problem -- finish with the problem decomposed into tasks, shared data, and a partial ordering.

Start → Decomposition Analysis (Task Decomposition, Data Decomposition) → Dependency Analysis (Group Tasks, Order Groups, Data Sharing) → Design Evaluation

[Design-space map: decomposition → structural → computational → concurrency strategy → implementation strategy → parallel programming building blocks]

86
Concurrency Strategy
(Algorithm Structure)

Start:
- Organize By Tasks – Linear? → Task Parallelism; Recursive? → Divide and Conquer
- Organize By Data – Linear? → Geometric Decomposition; Recursive? → Recursive Data
- Organize By Flow of Data – Regular? → Pipeline; Irregular? → Event Based Coordination

87
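Two leaves of this decision tree can be sketched for a single problem (summing an array). Both functions below are illustrative sequential skeletons; comments mark where the independent tasks or data chunks would run in parallel.

```python
def divide_and_conquer_sum(xs):
    # "Organize by tasks, recursive": split until a base case; the two
    # recursive calls are independent and could run as parallel tasks.
    if len(xs) <= 2:
        return sum(xs)
    mid = len(xs) // 2
    return divide_and_conquer_sum(xs[:mid]) + divide_and_conquer_sum(xs[mid:])

def geometric_sum(xs, chunks=4):
    # "Organize by data, linear": partition the data geometrically into
    # chunks; each chunk's partial sum could be computed in parallel.
    step = max(1, (len(xs) + chunks - 1) // chunks)
    partials = [sum(xs[i:i + step]) for i in range(0, len(xs), step)]
    return sum(partials)

data = list(range(1000))
print(divide_and_conquer_sum(data), geometric_sum(data))  # → 499500 499500
```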
Implementation strategy
(Supporting Structures)
High level constructs impacting large scale organization of the source code.
Program Structure: SPMD, Master/Worker, Loop Parallelism, Fork/Join
Data Structures: Shared Data, Shared Queue, Distributed Array
88
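The Master/Worker program structure and the Shared Queue data structure combine naturally. A minimal Python sketch (the task contents and the squaring workload are illustrative, not part of the pattern):

```python
import queue
import threading

def master_worker(tasks, nworkers=3):
    # Master fills a shared queue; workers pull from it until it is empty.
    work = queue.Queue()
    results = queue.Queue()
    for t in tasks:
        work.put(t)

    def worker():
        while True:
            try:
                t = work.get_nowait()
            except queue.Empty:
                return                 # queue drained: worker retires
            results.put(t * t)         # illustrative work item: square it

    workers = [threading.Thread(target=worker) for _ in range(nworkers)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    out = []
    while not results.empty():
        out.append(results.get())
    return sorted(out)

print(master_worker(range(10)))  # → [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
```

Because workers pull work on demand, the structure load-balances automatically when task costs vary.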
Parallel Programming building blocks
Low level mechanisms that implement specific operations used in parallel computing. Examples are given in Java, OpenMP and MPI. These are not properly design patterns, but they are included to make the pattern language self-contained.

UE* Management: Thread control, Process control
Synchronization: Memory sync/fences, Barriers, Mutual Exclusion
Communications: Message Passing, Collective Comm, Other Comm

* UE = Unit of Execution
89
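A small Python illustration of two of these building blocks: a mutex protecting a shared counter, and a barrier that holds every UE until all have arrived. The thread count and the shared counter are illustrative.

```python
import threading

counter = 0
lock = threading.Lock()                  # mutual exclusion building block
barrier = threading.Barrier(4)           # barrier building block
after_barrier = []

def ue():
    global counter
    with lock:                           # mutex protects the shared update
        counter += 1
    barrier.wait()                       # no UE proceeds until all arrive
    after_barrier.append(counter)        # every UE now sees the full count

threads = [threading.Thread(target=ue) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter, after_barrier)  # → 4 [4, 4, 4, 4]
```

All four increments complete before the barrier releases anyone, so every UE observes the same counter value afterwards.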
Case Studies
- Simple Examples
  • Linear Algebra: managing recursive parallelism
  • Molecular dynamics: multi-level parallelism
  • Branch and Bound: abstracting complexity in data structures
  • Spectral Methods
  • Numerical Integration: a complete example from design to code
- Game Engines
  • A complex example showing that there are multiple ways to parallelize a problem
90
The heart of a game is the “Game Engine”

[Block diagram: Input and the network feed the Front End; Media (disk, optical media) supplies assets; the Sim exchanges data/state with Animation, Render, and Audio, and holds the game state in its internally managed data structures.]

Source: Conversations with developers at Electronic Arts
91
The heart of a game is the “Game Engine”

[Block diagram: as on slide 91, with the Sim expanded to show Physics, Collision Detection, Particle System, Animation, AI, and a central Time Loop.]

Sim is the integrator: it integrates from one time step to the next, calling update methods for the other modules inside a central time loop, and holds the game state in its internally managed data structures.

Source: Conversations with developers at Electronic Arts
92
Finding concurrency: functional parallelism

[Diagram: the game engine modules (Front End, Input, Sim, Animation, Render, Audio) collected into groups.]

- Combine modules into groups, assign one group to a thread.
- Asynchronous execution … interact through events.
- Coarse grained parallelism dominated by flow of data between groups … event based coordination pattern.

93
The FindingConcurrency Design Space

Start with a specification that solves the original problem -- finish with the problem decomposed into tasks, shared data, and a partial ordering.

[Flowchart: Start → Decomposition Analysis (Task Decomposition, Data Decomposition) → Dependency Analysis (Group Tasks, Order Groups, Data Sharing) → Design Evaluation]

Many cores need many concurrent tasks … functional parallelism just doesn’t expose enough concurrency in this case. We need to go back and rethink the design.

94
More sophisticated parallelization strategy

[Diagram: each engine module decomposed into a pool of fine-grained tasks.]

- Decompose computation into a pool of tasks – finer granularity exposes more concurrency.
- Work stealing critical to support good load balancing.

95
Parallel Execution Strategy: Task parallelism pattern

[Diagram: the engine modules (Front End, Input, Sim, Animation, Render, Audio) emitting tasks.]

- Dependencies create an acyclic graph of tasks.
- Tasks are assigned priorities, with some labeled as optional.
- A pool of threads executes tasks based on:
  • when dependencies are met
  • task priority
- The challenge is scheduling based on bus usage, load balancing, and managing concurrency for correctness.

96
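The scheduling policy described above (run a task once its dependencies are met, preferring higher-priority ready tasks) can be sketched sequentially. The game-flavored task names and priority values below are invented for illustration; a real engine would execute ready tasks concurrently on the thread pool.

```python
import heapq

def schedule(tasks, deps, priority):
    # deps: task -> set of prerequisites; priority: lower number = more urgent.
    done, order, ready = set(), [], []
    for t in tasks:
        if not deps.get(t):
            heapq.heappush(ready, (priority[t], t))
    while ready:
        _, t = heapq.heappop(ready)          # highest-priority ready task
        order.append(t)
        done.add(t)
        for u in tasks:                      # newly unblocked tasks become ready
            if u not in done and u not in [r for _, r in ready] \
               and all(d in done for d in deps.get(u, ())):
                heapq.heappush(ready, (priority[u], u))
    return order

tasks = ["Input", "AI", "Physics", "Graphics", "Audio"]
deps = {"AI": {"Input"}, "Physics": {"AI"},
        "Graphics": {"Physics"}, "Audio": {"Input"}}
priority = {"Input": 0, "AI": 1, "Physics": 1, "Graphics": 2, "Audio": 5}
print(schedule(tasks, deps, priority))
# → ['Input', 'AI', 'Physics', 'Graphics', 'Audio']
```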
The AlgorithmStructure Design Space

Solution: a composition of two patterns:
- Event Based Coordination for the high level architecture (functional decomposition)
- Task Parallelism for the collections of tasks generated by the modules

[Decision tree as on slide 87: Organize By Tasks (Linear → Task Parallelism; Recursive → Divide and Conquer), Organize By Data (Linear → Geometric Decomposition; Recursive → Recursive Data), Organize By Flow of Data (Regular → Pipeline; Irregular → Event Based Coordination)]

97
The Supporting Structures Design Space

Program Structure: SPMD, Master/Worker, Loop Parallelism, Fork/Join
Data Structures: Shared Data, Shared Queue, Distributed Array

Annotations on the pattern table:
- Top level functional decomposition with MPI.
- Manage task queues in a global pool (to support work stealing).
- A standard queue won’t work; we will need to build more general data structures to support prioritized queue and interrupt capabilities.

98
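What the annotation asks for (a queue that orders work by priority and can be "interrupted" to flush deferrable tasks) can be sketched as below. The class and its API are invented for illustration, and a production version would also need to be thread-safe.

```python
import heapq

class PrioritizedTaskQueue:
    # Sketch of the more general structure the slide calls for: priorities
    # plus an interrupt operation that flushes deferrable (optional) work.
    def __init__(self):
        self._heap = []
        self._seq = 0                  # tie-breaker preserves insertion order

    def push(self, task, priority, optional=False):
        heapq.heappush(self._heap, (priority, self._seq, optional, task))
        self._seq += 1

    def pop(self):
        return heapq.heappop(self._heap)[3]

    def flush_optional(self):
        # "Interrupt capability": drop deferrable work, keep required tasks.
        self._heap = [e for e in self._heap if not e[2]]
        heapq.heapify(self._heap)

    def __len__(self):
        return len(self._heap)

q = PrioritizedTaskQueue()
q.push("render", priority=0)
q.push("extra_particles", priority=5, optional=True)
q.push("physics", priority=1)
q.flush_optional()                     # frame deadline looms: drop optional work
print(q.pop(), q.pop(), len(q))        # → render physics 0
```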
Architecture/Framework implications

- Top level structural patterns
  • Event based, implicit invocation
- Inside each module
  • Fine grained task decomposition
  • Master worker pattern extended with:
    – work stealing
    – task priorities
    – soft real time constraints … interrupt, update state, flush queues

This would fit nicely into a framework … in fact it needs a framework, since the task queues would be very complicated to get right.

99
Outline
- The parallel computing landscape
- A Programming Pattern Language
  • Overview
  • Structural patterns
  • Computational patterns
  • Composing patterns: examples
  • Concurrency patterns and PLPP
  • Examples: simple case studies
- Applications in CAD
  • The CAD algorithm landscape
  • Parallel patterns in CAD: overview
  • Detailed case studies: architecting parallel CAD software
100
How does architecting SW help?
- A good software architecture should provide:
  • Composability: allow you to build something complicated by composing simple pieces
  • Modularity: elements of the software are firewalled from each other
  • Locality: easy to identify what’s happening where
  • May provide a basis for building a more general framework (e.g. Model-view-controller and Ruby-on-Rails)
- Patterns help by:
  • Clearly identifying the problem
  • Offering solutions that capture and embody a lot of shared experience with solving the problem
  • Offering a common vocabulary for this problem/solution process
  • Also capturing common pitfalls of the solutions in the pattern (e.g. the repository will always become the bottleneck in agent-and-repository)
101
Overall Summary
- Manycore microprocessors are on the horizon, and computationally intensive applications, like computer-aided design, will want to exploit them
- Incremental methods will not scale to exploit >= 32 processors:
  • use of threads (OpenMP, Win32, others)
  • incremental parallelization of serial codes
  • use of threading tools and library support
- Need to re-architect software
- The key to re-architecting software for manycore is patterns and frameworks
- We identified two key computational patterns for CAD:
  • graph algorithms and
  • backtracking/branch-and-bound
- There is some software support, in the form of a framework, Cilk, for backtracking/branch-and-bound
- The key to parallelizing graph algorithms is netlist/graph partitioning
- We presented a number of approaches for effective graph partitioning
- We aim to use this to produce a framework for graph algorithms
Design Patterns in education:
lessons from history?

Early OO days (1994):
- Perception: “Object oriented? Isn’t that just an academic thing?”
- Usage: specialists only. Mainstream avoids with indifference, anxiety, or worse.
- Performance: not so good.

Now:
- Perception: OO = programming. “Isn’t this how it was always done?”
- Usage: cosmetically widespread, some key concepts actually deployed.
- Performance: so-so, masked by CPU advances until now.

103
Copyright © 2006, Intel Corporation. All rights reserved. Prices and availability subject to change without notice.
*Other brands and names are the property of their respective owners
Design Patterns in education:
lessons from history?

Early PP days (now, 2010?):
- Perception: “Parallel programming? Isn’t that just an HPC thing?”
- Usage: specialists only. Mainstream avoids with indifference, anxiety, or worse.
- Performance: very good, for the specialists.

Future:
- Perception: PP = programming. “Isn’t this how it was always done?”
- Usage: cosmetically widespread, some key concepts actually deployed.
- Performance: broadly sufficient. Application domains greatly expanded.

104
Smoke Pattern Decomposition
Michael Anderson, Mark Murphy, Kurt Keutzer
Copyright © 2009, Intel Corporation. All rights reserved.
*Intel and the Intel logo are registered trademarks of Intel Corporation. Other brands and names are the property of their respective owners
Agenda
Characterize the architecture and computation of Smoke in terms of Our Pattern Language
• What is an effective multi-threaded game architecture?
• Can the architecture and computation in Smoke be described in Our Pattern Language?
• Which components of Smoke can be easily sped up on manycore devices?
Games as a potential ParLab app
• Where can we as grad students have the most impact?
• What else can manycore enable?
106
Our Pattern Language
Hierarchy of patterns that describe reusable components of
software.
107
Agenda
Characterize the architecture and computation of Smoke in terms of Our Pattern Language
• What is an effective multi-threaded game architecture?
• Can the architecture and computation in Smoke be described in Our Pattern Language?
• Which components of Smoke can be easily sped up on manycore devices?
Games as a potential ParLab app
• Where can we as grad students have the most impact?
• What else can manycore enable?
108
Structural and Computational
Patterns
First, let’s talk about the structural and computational patterns we found in Smoke.
109
Top-Level: Architecture
Smoke is composed of several subsystems. Some (Physics, Graphics, …) are large independent engines.

[Diagram: Input passes input data to AI; AI passes pos, velocity, goal to Physics; Physics returns pos, velocity, collision; positions flow on to Graphics (pos, verts, texture) and Audio (pos, SFX).]

Brad Werth, GDC 2009
110
Top-Level v1: Task Graph Pattern
- There exist fixed dependencies between subsystems
- Can be modeled as an arbitrary task graph
- Example: moving the zombie
  • Keyboard -> AI -> Physics -> Graphics

[Diagram: task graph over Input, Physics, AI, Effects, Graphics]
111
Top-Level v1: Iterator Pattern
- Iterates over consecutive frames
- Data from the previous frame is used in the next

[Diagram: the subsystem task graph repeated each frame]
112
Top-Level v2: Puppeteer
- Subsystems have private data, but need access to other subsystems' data
- Don't want to write N*(N-1) interfaces
- Common use is to manage communication between multiple simulators, each with different data structures
113
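A toy sketch of the puppeteer idea: a central mediator that subsystems register with once, so N subsystems need N interfaces to the mediator rather than N*(N-1) pairwise ones. The publish/subscribe API below is invented for illustration; Smoke's actual mechanism is its Change Control Manager.

```python
class Puppeteer:
    # Central mediator: each subsystem registers once with the puppeteer
    # instead of keeping pairwise interfaces to every other subsystem.
    def __init__(self):
        self._subscribers = {}

    def subscribe(self, topic, callback):
        self._subscribers.setdefault(topic, []).append(callback)

    def publish(self, topic, data):
        for callback in self._subscribers.get(topic, []):
            callback(data)

seen = []
bus = Puppeteer()
bus.subscribe("position", lambda p: seen.append(("graphics", p)))
bus.subscribe("position", lambda p: seen.append(("ai", p)))
bus.publish("position", (3, 4))   # e.g. physics announces a moved object
print(seen)  # → [('graphics', (3, 4)), ('ai', (3, 4))]
```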
Task Graph (v1) vs. Puppeteer (v2)
- Subsystem modularity is important (swappable)
- Want to reduce the complexity of communication for a scalable multi-threaded design
- Motivates the use of the puppeteer pattern instead of an arbitrary task graph for subsystem communication

[Diagram: a Framework containing a Change Control Manager and Interfaces, mediating among Input, Physics, Graphics, Effects, and AI]

Brad Werth, GDC 2009
114
Agenda
Characterize the architecture and computation of Smoke in terms of Our Pattern Language
• What is an effective multi-threaded game architecture?
• Can the architecture and computation in Smoke be described in Our Pattern Language?
• Which components of Smoke can be easily sped up on manycore devices?
Games as a potential ParLab app
• Where can we as grad students have the most impact?
• What else can manycore enable?
115
Puppeteer Subsystem Patterns
Now that we’ve described the top-level structural patterns, let’s look at the subsystems.

[Diagram: the Framework (Change Control Manager, Interfaces) mediating among Input, Physics, Graphics, Effects, and AI]
116
Physics Subsystem
Pipe-and-filter structural pattern

From Havok User Guide
117
Physics Subsystem
- Constraint Solver – minimize penalties and forces in a system subject to constraints.
- Dense Linear Algebra computational pattern

[Figure: a joint-angle constraint (> 180°); robbocode.com]
118
Physics Subsystem
- Collision Detection – interpolate paths and test whether overlap occurs during the current timestep
- N-body computational pattern
119
AI Subsystem
- Agent and Repository structural pattern
  • Multiple agents access and modify shared data
  • Examples: version control systems, logic optimization, parallel SAT solvers

[Diagram: Agent 1, Agent 2, …, Agent N all attached to a Shared Repository (i.e., a database)]
120
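A minimal sketch of the Agent and Repository structure: independent agent threads posting state into a lock-protected shared repository. The `Repository` class and the chicken names are illustrative, not Smoke's actual data structures.

```python
import threading

class Repository:
    # Shared repository guarded by a lock; agents read and modify it.
    def __init__(self):
        self._data = {}
        self._lock = threading.Lock()

    def update(self, key, value):
        with self._lock:
            self._data[key] = value

    def read(self, key):
        with self._lock:
            return self._data.get(key)

repo = Repository()

def agent(name, steps):
    # Each agent is an independent state machine posting its latest state.
    for i in range(steps):
        repo.update(name, i)

agents = [threading.Thread(target=agent, args=("chicken%d" % i, 10))
          for i in range(3)]
for a in agents:
    a.start()
for a in agents:
    a.join()
```

With one writer per key, readers never block each other for long; the lock only serializes individual updates, matching the slide's point about reducing load on the repository controller.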
AI Subsystem
- Zombies and chickens update their locations in the POI repository
- One writer (zombie), many readers (chickens) reduces the burden on the repository controller
- Agents (animals) are independent state machines

Brad Werth, GDC 2009
121
Effects Subsystem
- Procedural fire: two particle simulators
  • visible fire
  • invisible “heat” particles
- N-body computational pattern

Hugh Smith, Intel
122
Parallel Implementation Patterns
We’ve described the structural and computational patterns.
What about the lower layers?
123
Parallel Algorithm Strategy Patterns (1)
- At the top level, subsystems can run concurrently
  • Communication within a single frame is eliminated by double-buffering
- Smoke exploits this task parallelism (subject of Lab 2)

[Diagram: Graphics, Physics, AI, … all read Frame 1 data and write Frame 2 data]
124
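Double-buffering in miniature: within a frame every subsystem reads the previous frame's buffer and writes the next frame's, so no intra-frame communication is needed, and the buffers swap at the frame boundary. A sequential sketch with an invented one-variable "state":

```python
def step(state):
    # Illustrative per-frame update: every read is from the previous frame.
    return {"pos": state["pos"] + state["vel"], "vel": state["vel"]}

def run_frames(nframes):
    read_buf = {"pos": 0, "vel": 2}   # frame N-1 data: subsystems only read it
    for _ in range(nframes):
        write_buf = step(read_buf)    # frame N data: subsystems only write it
        read_buf = write_buf          # buffers swap at the frame boundary
    return read_buf

print(run_frames(5)["pos"])  # → 10
```

Because readers and writers never touch the same buffer inside a frame, the subsystems could update concurrently without locks.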
Parallel Algorithm Strategy Patterns (2)
- Procedural fire is highly data parallel
- Smoke spreads the computation over 8 threads (ideal hardware utilization)

Hugh Smith, Intel; Brad Werth, GDC 2009
125
Parallel Algorithm Strategy Patterns (3)
- The AI subsystem exploits task parallelism among independent agents (chickens)

[Thread profiles before and after: AI work, once serialized with Render, now overlaps it]

Brad Werth, GDC 2009
126
Implementation Strategy and Concurrent Execution Patterns
- Task queue -> thread pool
- Intel Threading Building Blocks

Brad Werth, GDC 2009
127
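The task-queue-feeding-a-thread-pool structure (which Threading Building Blocks provides in C++) can be sketched with Python's standard thread pool; the per-agent workload below is illustrative.

```python
from concurrent.futures import ThreadPoolExecutor

def update_agent(agent_id):
    # Stand-in for one agent's per-frame update.
    return agent_id * 2

# Tasks are queued to a fixed pool of threads, as with a TBB task scheduler.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(update_agent, range(8)))

print(results)  # → [0, 2, 4, 6, 8, 10, 12, 14]
```

The pool decouples the number of tasks from the number of threads: eight agent updates share four worker threads here.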
Agenda
Characterize the architecture and computation of Smoke in terms of Our Pattern Language
• What is an effective multi-threaded game architecture?
• Can the architecture and computation in Smoke be described in Our Pattern Language?
• Which components of Smoke can be easily sped up on manycore devices?
Games as a potential ParLab app
• Where can we as grad students have the most impact?
• What else can manycore enable?
128
What was done in the demo?
- Top level
  • task parallelism among subsystems
- Procedural fire
  • data parallel particle simulator
- AI subsystem
  • task parallelism among agents
129
What else can be done? (1)
- Add data parallel effects: rain, ragdoll corpses, more things breaking apart
  • Similar to procedural fire
  • Can easily be added without changing gameplay
  • Objects may or may not be registered with the scene graph

[Image: particle simulator; nvidia.com]
130
What else can be done? (2)
- Speed up and enhance the physics subsystem
  • Independent “simulation islands” can be executed in parallel
  • Parallel constraint solver & collision detection
  • Speedup allows for more detailed and realistic simulation
    – more active independent objects in simulation
    – bring in new algorithms from scientific computing
  • Must ensure the worst-case computation can fit in a single frame
  • Nvidia PhysX is working on this
131
What else can be done? (3)
- Enable more complex AI
  • Smoke’s AI is very simple. What is AI like in larger games?
  • If AI is just state machines, we can add more
- Add new interfaces
  • Computer vision pose detection as input
  • Overlaps well with our current research
  • Microsoft Xbox – Project Natal (Xbox.com/projectnatal)
132
Agenda
Characterize the architecture and computation of Smoke in terms of Our Pattern Language
• What is an effective multi-threaded game architecture?
• Can the architecture and computation in Smoke be described in Our Pattern Language?
• Which components of Smoke can be easily sped up on manycore devices?
133
Summary and conclusions
Characterize the architecture and computation of Smoke in terms of Our Pattern Language:
- What is an effective multi-threaded game architecture?
  • The Puppeteer pattern allows for modularity, ease of development, and evolution
- Can the architecture and computation in Smoke be described in Our Pattern Language?
  • Yes, the description is natural
- Which components of Smoke can be easily sped up on manycore devices?
  • Data parallel effects, physics, AI, and new interfaces all show potential benefit
134