GAP: Graphical Asymmetric Processing
Prototype Presentation
December 13, 2004
Team Organization
Mohammed Iraqi, Web Developer
Sunny Nanda, Budget Analyst
Joseph Williams, Technical Writer
Gene Hill Price, General Manager
John Zareno, Project Manager
Thomas James, Team Lead
Roberta Serbenescu, Software Analyst
Tiffany Williams, Research Analyst
http://www.apple.com/education/science/profiles/vatech/
Problem Statement
• Computationally intensive environments
underutilize Graphical Processing Units.
Background Information
• Discussed since 1996, but never implemented.
• GPU Performance
– Multiplied at a rate of 2.8 times per year since 1993
– Expected to increase at this rate for another 5 years
• The more GPU performance increases, the more helpful
our product becomes
http://www.computer.org/computer/homepage/1003/entertainment
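As a rough illustration of the growth figures above, the sketch below compounds the quoted 2.8x yearly rate. It assumes the rate holds exactly every year, which the slide only projects, and the function name is illustrative.

```python
# Compound the 2.8x-per-year GPU growth rate quoted on the slide.
# Assumption: the rate is constant over the whole period.
ANNUAL_RATE = 2.8


def growth_factor(years: int, rate: float = ANNUAL_RATE) -> float:
    """Total performance multiplier after `years` of compounding."""
    return rate ** years


if __name__ == "__main__":
    print(f"1993 to 2004 (11 years): about {growth_factor(11):,.0f}x")
    print(f"Next 5 years: about {growth_factor(5):,.0f}x")
```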
Solution: G.A.P.
Create a usable, extensible, and
maintainable API that leverages the unused
computing power of graphics processors
to increase the performance of scientific,
database, and other processor-intensive applications.
Solution: G.A.P.
Utilizing existing hardware:
– Improve computing power
– Improve computing time
– Improve computing responsiveness
Solution Implementation
By:
– Creating an SDK to utilize the GPU
– Selling that SDK to NVIDIA
What is an SDK?
• S.D.K. – Software Development Kit
• A set of programs that allows software
developers to create products to run on a
particular platform or to work with an API.
• Typically includes: manual, examples, libraries
• Other examples, both free and commercial:
– Java, OS/2, AW, Windows, DirectX
Phase 1 Product Goals
• Demonstrate the amount of power in current
GPUs
– Also: the ability to utilize that power
• Secure funding to continue development
• Secure interested parties
– universities and research labs
• Take first steps towards NVIDIA
partnership
Phase 1 Product Objectives
• Leverage the GPU for additional power
• Improve throughput on workstation
machines
• Ease programming difficulty for utilizing
the GPU
• Maintain current program compatibility
• Preserve system stability
Product Risks & Mitigations
• Vendor Support
– NVIDIA sets aside $1 billion to use on
• Acquisitions
• R&D
• Writing the Software
– Time-intensive product
• “Build first and optimize later”
Product Functional Diagram
[Diagram blocks: USER, GAP, SOFTWARE, OUTPUT]
Product Dataflow Diagram
[Diagram nodes: USER, CONTEXT BUILD, CONTEXT REQUEST, VERSION CHECK, CAPABILITIES TABLE, GAP BEGIN, CONTEXT GENERATOR, GAP COMMAND, GAP FLUSH, GAP PROCESSING, CONTEXT, CPU, QUEUE, GAP END, GPU, RESULTS]
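One way to read the dataflow labels above is as a begin / command / flush / end cycle around a context chosen from a capabilities table. The sketch below is hypothetical: none of the class, method, or key names come from the GAP SDK, and both "devices" simply run on the CPU here.

```python
# Hypothetical sketch of the begin / command / flush / end flow.
# None of these names come from the GAP SDK; they only mirror the diagram labels.
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Context:
    """Holds queued commands and the device chosen from a capabilities table."""
    device: str
    queue: List[Callable[[], float]] = field(default_factory=list)
    results: List[float] = field(default_factory=list)
    active: bool = False

    def begin(self) -> None:
        self.active = True

    def command(self, op: Callable[[], float]) -> None:
        if not self.active:
            raise RuntimeError("command issued outside begin/end")
        self.queue.append(op)

    def flush(self) -> None:
        # Run everything queued so far (on the CPU in this sketch).
        self.results.extend(op() for op in self.queue)
        self.queue.clear()

    def end(self) -> List[float]:
        self.flush()
        self.active = False
        return self.results


def build_context(capabilities: Dict[str, int], required_version: int) -> Context:
    """Version check against a capabilities table, falling back to the CPU."""
    gpu_ok = capabilities.get("gpu_version", 0) >= required_version
    return Context(device="GPU" if gpu_ok else "CPU")


if __name__ == "__main__":
    ctx = build_context({"gpu_version": 2}, required_version=1)
    ctx.begin()
    ctx.command(lambda: 2.0 * 3.0)
    print(ctx.device, ctx.end())
```

Keeping the device decision inside `build_context` means calling code stays the same whether the work ends up on the GPU or the CPU.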
Prototype
Navier-Stokes Equations
• Often used to refer to the incompressible form of
the momentum equation
• A full and general set of differential
equations governing the motion of a fluid
http://www.navier-stokes.net/nsdef.htm
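For reference, the incompressible form mentioned above is usually written as follows. The symbols are the conventional ones and are not taken from the slides: u is the velocity field, p the pressure, ρ the density, ν the kinematic viscosity, and f an external force per unit mass.

```latex
\begin{aligned}
\frac{\partial \mathbf{u}}{\partial t} + (\mathbf{u} \cdot \nabla)\,\mathbf{u}
  &= -\frac{1}{\rho}\nabla p + \nu \nabla^{2}\mathbf{u} + \mathbf{f}, \\
\nabla \cdot \mathbf{u} &= 0.
\end{aligned}
```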
Navier-Stokes Equations
• Simulation of fluid-like behavior
– Example of an application used within
computationally intensive environments
– Several Old Dominion PhD candidates’ thesis
topics focus on Navier-Stokes
• Will serve as a baseline application to show the
efficiency of the GPU over the CPU
– Shows an average 60% gain in efficiency
Prototype Functional Diagram
[Diagram, two parallel paths: USER, FLUID SIMULATION GPU VERSION, OUTPUT; USER, FLUID SIMULATION CPU VERSION, OUTPUT]
Dataflow Diagram for Prototype
GAP VERSION FUNCTIONAL PROTOTYPE
[Diagram nodes: CONTEXT BUILD, CONTEXT REQUEST, VERSION CHECK, CAPABILITIES TABLE, CONTEXT GENERATOR, STREAM OPERATION, CONTEXT, GPU PROCESSING, RESULTS]
Dataflow Diagram for Prototype
CPU VERSION FUNCTIONAL PROTOTYPE
[Diagram nodes: CONTEXT BUILD, CONTEXT REQUEST, VERSION CHECK, CAPABILITIES TABLE, CONTEXT GENERATOR, STREAM OPERATION, CONTEXT, CPU, RESULTS; annotation: USE OF CPU ONLY WITH NO GPU PROCESSING]
Demonstration
• Two versions of an executable
– CPU vs GPU
• Navier-Stokes on a vector field with four jets
– Demonstration will consist of firing the jets for
different lengths of time and observing performance
– Observe CPU alone
– Observe GPU alone
– Observe both simultaneously
On the CPU
With GAP on the GPU
Risks
• Main research issue: floating-point
precision
– GPU numbers are ‘single precision,’ not double.
• Works best when ‘batched,’ which requires
a relatively ‘parallel’ system
– This is already an issue in multithreading.
Solutions exist in both programmer practice
and compiler design.
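To make the single-versus-double concern concrete, the sketch below rounds ordinary Python floats (double precision) to 32-bit values using only the standard library. It illustrates the number format in general, not GAP or any particular graphics card, and the helper name is illustrative.

```python
# Round-trip a Python float (double precision) through a 32-bit float
# to show what single precision loses. Uses only the standard library.
import struct


def to_single(x: float) -> float:
    """Round x to the nearest IEEE 754 single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


big = 16_777_216.0  # 2**24: above this, single precision skips some integers

print(to_single(big + 1.0) == big)  # compares the single-precision sum with big
print(big + 1.0 == big)             # same comparison in double precision
print(abs(to_single(0.1) - 0.1))    # rounding error introduced for 0.1
```

Errors of this kind accumulate over many simulation steps, which is why precision is listed as a research issue.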
Risks Mitigated (Prototype)
• Floating Point Quality:
– Distributed the field densely enough that
single-precision floating point was accurate.
• Batching:
– Used a “Stream” operator that ensured the
command size was sufficient before it flushed
the results.
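The batching idea can be sketched as a small buffer that hands work to the processor only once enough of it has accumulated. This is a hypothetical illustration, not the prototype's “Stream” operator: the class name, the `min_batch` threshold, and the `process` callback are all assumptions.

```python
# Hypothetical batching helper: values accumulate until the batch is
# large enough, then are handed to the processing function in one call.
from typing import Callable, List


class Stream:
    def __init__(self, process: Callable[[List[float]], List[float]], min_batch: int):
        self.process = process      # stands in for the work sent to the GPU
        self.min_batch = min_batch  # smallest batch worth sending
        self.pending: List[float] = []
        self.results: List[float] = []

    def push(self, value: float) -> None:
        """Queue one value and flush automatically once the batch is full."""
        self.pending.append(value)
        if len(self.pending) >= self.min_batch:
            self.flush()

    def flush(self) -> None:
        """Process whatever is pending, even if the batch is not full."""
        if self.pending:
            self.results.extend(self.process(self.pending))
            self.pending = []


if __name__ == "__main__":
    stream = Stream(lambda batch: [2.0 * v for v in batch], min_batch=4)
    for v in range(10):
        stream.push(float(v))
    stream.flush()  # pick up the final partial batch
    print(stream.results)
```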
Risk Mitigation (Product)
• Floating Point
– NVIDIA says cards will include double
precision upon demand
– An NVIDIA partnership would expedite this.
• Batching
– The Context system has an internal,
self-optimizing queue, with the “flush” instruction
for programmer flexibility.
Testing and Evaluation
• 20 frames correspond to 1 “real world second”
– This translates to:
• 0.75-1.75 times real-world speed on the GPU
– Can be faster than a “real world second”!
• 0.025-0.25 times real-world speed on the CPU
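The speed figures above can be converted to frame rates with one multiplication. The sketch assumes that "speed" means simulated seconds per real second at 20 frames per simulated second, which is an interpretation of the slide rather than something it states.

```python
# Convert the quoted simulation speeds to frames per wall-clock second.
# Assumption: "speed" is simulated seconds per real second, at 20 frames
# per simulated second.
FRAMES_PER_SIM_SECOND = 20


def frames_per_second(speed: float) -> float:
    return speed * FRAMES_PER_SIM_SECOND


for label, (low, high) in {"GPU": (0.75, 1.75), "CPU": (0.025, 0.25)}.items():
    print(f"{label}: {frames_per_second(low):g} to {frames_per_second(high):g} frames/s")
```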
Suitability
• What does this prove?
– Gives magnitude of performance increase
– Efficiency gain with no new hardware
– “Real world” problem solved
– Standard interface any program could use
Degree of Completeness
Similarities
Shared by Prototype and Release:
• General access functions
• “Context” based input
• Demonstrated performance gain
• Utilizes GPU for as much work as possible
Degree of Completeness
Differences
Prototype:
• Specific to GF5 platform
• Limited GAP commands
• “All or Nothing” GPU use
Release:
• General platform
• Wide array of GAP commands
• Dynamic GPU use based on capabilities
Budget Reports
Phase I Funding
• Phase I SBIR
– Completed at the end of Phase 0
Phase I Budget
Staff
Resource Name         Initials  Standard Rate  Hours *    Total
Project Manager       PM        $40.00/hr      88x8=704   $28,160
Technical Documenter  TD        $15.00/hr      24x8=192   $2,880
Web Developer         WD        $15.00/hr      10x8=80    $1,200
Programmer-1          P1        $25.00/hr      39x8=312   $7,800
Programmer-2          P2        $25.00/hr      39x8=312   $7,800
Programmer-3          P3        $25.00/hr      39x8=312   $7,800
Programmer-4          P4        $25.00/hr      39x8=312   $7,800
TOTAL                                                     ~$63,500
Phase I Budget
Length        88 days
Staffing      $63,500
40% Overhead  $25,400
Non-Staff     $3,600
Total         $92,500
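The Phase I figures can be recomputed from the rates and hours on the slides. The sketch below reads "88x8" as days times hours per day and applies the 40% overhead to the rounded staffing figure. Both are assumptions about how the slide totals were derived.

```python
# Recompute the Phase I budget from the figures on the slides (whole dollars).
# Assumption: "88x8" means days worked times hours per day.
staff = {  # initials: (hourly rate, days, hours per day)
    "PM": (40, 88, 8), "TD": (15, 24, 8), "WD": (15, 10, 8),
    "P1": (25, 39, 8), "P2": (25, 39, 8), "P3": (25, 39, 8), "P4": (25, 39, 8),
}
staff_total = sum(rate * days * hours for rate, days, hours in staff.values())

rounded_staff = 63_500                 # staffing figure carried to the summary slide
overhead = rounded_staff * 40 // 100   # 40% overhead
non_staff = 3_600
phase1_total = rounded_staff + overhead + non_staff

print(f"Staff (exact): ${staff_total:,}")
print(f"Overhead: ${overhead:,}  Phase I total: ${phase1_total:,}")
```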
Major Milestones Phase I
• Organize Project Group
• Produce Project Descriptive Paper
• Develop Contracts
• Produce Budget White Paper
• Produce Project User Manual
• Develop Prototype
• Produce SBIR Phase II Proposal
• Produce Project Website
Phase I Schedule
Phase II Funding
• Phase II SBIR
– Completed at the end of Phase I
Phase II Budget
Staff
Resource Name                 Initials  Standard Rate  Hours *    Total
Project Manager               PM        $50.00/hr      90x8=720   $36,000
Marketing Business Expert     BE        $45.00/hr      90x8=720   $32,400
Communication Specialist      CS        $45.00/hr      90x8=720   $32,400
Programmer*                   P1        $35.00/hr      30x8=240   $8,400
Web Developer                 WD        $25.00/hr      15x8=120   $3,000
Lawyer 1                      L1        $40.00/hr      60x8=480   $19,200
Lawyer 2                      L2        $40.00/hr      60x8=480   $19,200
Technical Document Writer     TD        $25.00/hr      15x8=120   $3,000
Software Quality Assurance 1  SQA 1     $30.00/hr      30x8=240   $7,200
Software Quality Assurance 2  SQA 2     $30.00/hr      30x8=240   $7,200
TOTAL                                                             $165,600
*4 programmers needed
Patent Acquisition
Length                                   90 days
Preliminary Patent Search                $1,500
Preparing and Filing Patent Application  $1,000
Patent Abstract                          $7,000
Filing Fee                               $500
Patent Prosecution Phase                 $6,000
Patent Issue Phase                       $2,000
Patent Maintenance Fee                   $6,000
TOTAL                                    $38,000
Phase II Budget
Length              90 days
Staffing            $165,600
40% Overhead        $66,200
Non-Staff           $0 (*)
Patent Acquisition  $38,000
Travel Expenses     $15,000
TOTAL               $314,800
* Purchased in Phase 1
Major Milestones Phase II
• Production
• Marketing
• Legal Negotiation
• Final Preproduction Alterations
Phase II Schedule
Phase III
• We plan to sell the product to NVIDIA at
the end of Phase II
• Doing so would relieve us of the responsibilities
and risk factors that may arise on the market
– While increasing the company’s profit by over
$6.5 million
Profit Margin/Break Even
• Immediate Profit
• $70 million average profit for acquisitions
– If we obtain only 1/10 of that average,
we would still make a gain of about $6.5 million
http://nvidia.com/object/IO_20010612_6602.html
http://nvidia.com/object/IO_8086.html
Profit Margin/Break Even
Phase 1 Budget             <$93,000>
Phase 2 Budget             <$315,000>
Total                      <$408,000>
GAP Acquisition by NVIDIA  $7,000,000
NET PROFIT                 $6,592,000
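The break-even arithmetic above can be checked in a few lines. The variable names are illustrative, and the acquisition price is taken as one tenth of the quoted $70 million average, as the slides assume.

```python
# Break-even arithmetic from the slides, in whole dollars.
phase1_cost = 93_000
phase2_cost = 315_000
acquisition_price = 70_000_000 // 10  # one tenth of the quoted $70 million average

total_cost = phase1_cost + phase2_cost
net_profit = acquisition_price - total_cost

print(f"Total cost: ${total_cost:,}  Net profit: ${net_profit:,}")
```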
Conclusion
• Through our prototype we have achieved
“proof of concept”
• The overall efficiency gain obtained within
computationally intensive environments
proves a need for GAP
Graphical Asymmetric Processing