(Intel® DPDK) Overview Packet Processing on Intel® Architecture

TRANSFORMING
COMMUNICATIONS
COMMUNICATIONS INFRASTRUCTURE DIVISION
Intel® Data Plane Development Kit (Intel® DPDK)
Overview
Packet Processing on Intel® Architecture
December 2012
Legal Disclaimer
INFORMATION IN THIS DOCUMENT IS PROVIDED IN CONNECTION WITH INTEL PRODUCTS. NO LICENSE, EXPRESS OR IMPLIED, BY ESTOPPEL OR OTHERWISE, TO ANY INTELLECTUAL PROPERTY RIGHTS IS
GRANTED BY THIS DOCUMENT. EXCEPT AS PROVIDED IN INTEL'S TERMS AND CONDITIONS OF SALE FOR SUCH PRODUCTS, INTEL ASSUMES NO LIABILITY WHATSOEVER AND INTEL DISCLAIMS ANY EXPRESS OR
IMPLIED WARRANTY, RELATING TO SALE AND/OR USE OF INTEL PRODUCTS INCLUDING LIABILITY OR WARRANTIES RELATING TO FITNESS FOR A PARTICULAR PURPOSE, MERCHANTABILITY, OR INFRINGEMENT
OF ANY PATENT, COPYRIGHT OR OTHER INTELLECTUAL PROPERTY RIGHT.
A "Mission Critical Application" is any application in which failure of the Intel Product could result, directly or indirectly, in personal injury or death. SHOULD YOU PURCHASE OR USE INTEL'S PRODUCTS FOR ANY
SUCH MISSION CRITICAL APPLICATION, YOU SHALL INDEMNIFY AND HOLD INTEL AND ITS SUBSIDIARIES, SUBCONTRACTORS AND AFFILIATES, AND THE DIRECTORS, OFFICERS, AND EMPLOYEES OF EACH,
HARMLESS AGAINST ALL CLAIMS COSTS, DAMAGES, AND EXPENSES AND REASONABLE ATTORNEYS' FEES ARISING OUT OF, DIRECTLY OR INDIRECTLY, ANY CLAIM OF PRODUCT LIABILITY, PERSONAL INJURY, OR
DEATH ARISING IN ANY WAY OUT OF SUCH MISSION CRITICAL APPLICATION, WHETHER OR NOT INTEL OR ITS SUBCONTRACTOR WAS NEGLIGENT IN THE DESIGN, MANUFACTURE, OR WARNING OF THE INTEL
PRODUCT OR ANY OF ITS PARTS.
Intel may make changes to specifications and product descriptions at any time, without notice. Designers must not rely on the absence or characteristics of any features or instructions marked "reserved" or
"undefined". Intel reserves these for future definition and shall have no responsibility whatsoever for conflicts or incompatibilities arising from future changes to them. The information here is subject to
change without notice. Do not finalize a design with this information.
The products described in this document may contain design defects or errors known as errata which may cause the product to deviate from published specifications. Current characterized errata are available
on request.
Contact your local Intel sales office or your distributor to obtain the latest specifications and before placing your product order.
Copies of documents which have an order number and are referenced in this document, or other Intel literature, may be obtained by calling 1-800-548-4725, or go to:
http://www.intel.com/design/literature.htm
All information provided related to future Intel products and plans is preliminary and subject to change at any time, without notice. All dates provided are subject to change without notice. Intel may make
changes to specifications and product descriptions at any time, without notice. Intel product plans in this presentation do not constitute Intel plan of record product roadmaps. Please contact your Intel
representative to obtain Intel’s current plan of record product roadmaps. Intel product plans in this presentation do not constitute Intel plan of record product roadmaps.
Celeron, Intel, Intel logo, Intel Core, Intel Inside, Intel Inside logo, Intel. Leap ahead., Intel. Leap ahead. logo, Intel NetBurst, Intel SpeedStep, Intel XScale, Itanium, Pentium, Pentium Inside, VTune, Xeon, and
Xeon Inside are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and other countries.
Intel® Active Management Technology requires the platform to have an Intel® AMT-enabled chipset, network hardware and software, as well as connection with a power source and a corporate network
connection. With regard to notebooks, Intel AMT may not be available or certain capabilities may be limited over a host OS-based VPN or when connecting wirelessly, on battery power, sleeping, hibernating
or powered off. For more information, see http://www.intel.com/technology/iamt.
Enhanced Intel SpeedStep® Technology for specified units of this processor available Q2/06. See the Processor Spec Finder at http://processorfinder.intel.com or contact your Intel representative for more
information
64-bit computing on Intel architecture requires a computer system with a processor, chipset, BIOS, operating system, device drivers and applications enabled for Intel® 64 architecture. Performance will vary
depending on your hardware and software configurations. Consult with your system vendor for more information.
No computer system can provide absolute security under all conditions. Intel® Trusted Execution Technology is a security technology under development by Intel and requires for operation a computer system
with Intel® Virtualization Technology, an Intel Trusted Execution Technology-enabled processor, chipset, BIOS, Authenticated Code Modules, and an Intel or other compatible measured virtual machine
monitor. In addition, Intel Trusted Execution Technology requires the system to contain a TPMv1.2 as defined by the Trusted Computing Group and specific software for some uses. See
http://www.intel.com/technology/security/ for more information.
†Hyper-Threading Technology (HT Technology) requires a computer system with an Intel® Pentium® 4 Processor supporting HT Technology and an HT Technology-enabled chipset, BIOS, and operating system.
Performance will vary depending on the specific hardware and software you use. See www.intel.com/products/ht/hyperthreading_more.htm for more information including details on which processors
support HT Technology. Intel® Virtualization Technology requires a computer system with an enabled Intel® processor, BIOS, virtual machine monitor (VMM) and, for some uses, certain platform software
enabled for it. Functionality, performance or other benefits will vary depending on hardware and software configurations and may require a BIOS update. Software applications may not be compatible with all
operating systems. Please check with your application vendor. Intel® AES-NI requires a computer system with an AES-NI enabled processor, as well as non-Intel software to execute the instructions in the
correct sequence. AES-NI is available on select Intel® processors. For availability, consult your reseller or system manufacturer. For more information, see Intel® Advanced Encryption Standard Instructions
(AES-NI)
* Other names and brands may be claimed as the property of others.
Other vendors are listed by Intel as a convenience to Intel's general customer base, but Intel does not make any representations or warranties whatsoever regarding quality, reliability, functionality, or
compatibility of these devices. This list and/or these devices may be subject to change without notice.
Copyright © 2012, Intel Corporation. All rights reserved.
2
2
Intel Restricted Secret
TRANSFORMING COMMUNICATIONS
TRANSFORMING
COMMUNICATIONS
Agenda
1. Intel’s Packet Processing
Motivation and Value Proposition
2. Overview of Intel® DPDK
3. Intel® DPDK Performance
Benchmarks
4. Lead Ecosystem Offerings
5. Intel® DPDK Website and
Collateral
6. Summary
3
3
Intel Restricted Secret
TRANSFORMING COMMUNICATIONS
TRANSFORMING
COMMUNICATIONS
CID Mission “TRANSFORMING COMMUNICATIONS”
ACCESS NETWORKS
EDGE/CORE NETWORKS
ENTERPRISE NETWORKS
+
Intel Architecture
Ecosystem + Standards
INTELLIGENT
EDGE
WIRELESS BASE
STATION
WIRELESS
INFRASTRUCTURE
MEDIA
PROCESSING
NETWORK
APPLIANCES
NETWORK
SECURITY
ROUTERS AND
SWITCHES
>10 Years Serving Communications and Networking Segments
4
4
Intel Restricted Secret
TRANSFORMING COMMUNICATIONS
TRANSFORMING
COMMUNICATIONS
4:1 Workload Consolidation Assets
Unleashed by Multi-Core IA and Software
2011
Application
Processing
Control
Processing
5
Intel® Xeon®
Processor
C3500/C5500
Intel® QuickAssist
Software Library
2012
Intel®
Communications
Chipset Series
89xx for
Cryptography,
Compression
Packet
Processing
NPU/ASIC
Signal
Processing
DSP
DSP
One Instruction
Set Architecture
One Tool Suite
2013
Next
Generation
Intel® Data Plane
Development Kit
TRANSFORMING COMMUNICATIONS
Intel® Signal
Processing
Development Kit
Multiple
Opportunities
Intel’s Data Plane Value Proposition
Datacom/Telecom convergence increasing Data Plane processing
requirements exponentially
TCO concerns leading customers to seek single architecture design topto-bottom
Intel addresses TCO and TTM concerns with single architecture, multiworkload IA capability, allied to an industry-leading beat-rate of
process and uArchitectural advancements (Tick-Tock Model)
Optimized Data Plane Software solutions will help unleash IA platform
potential
6
6
Intel Restricted Secret
TRANSFORMING COMMUNICATIONS
TRANSFORMING
COMMUNICATIONS
The Engineering Problem Statement …
From a CPU perspective:
10Gbps Line Rate
• A Last-level-cache (L3) hit on Intel® Xeon® processor 5500
is ~40 cycles
 L3 miss, memory read is ~70ns (201 cycles at 3 GHz)
 This problem gets much worse at 40 Gb and 100 Gb rates
16,000,000
Packets per second
14,000,000
12,000,000
10,000,000
Intel Silicon and Software advances are proactively
addressing this problem statement, making high
performance Packet Processing on IA a reality!
8,000,000
6,000,000
4,000,000
PPS
2,000,000
Network Infrastructure Packet Sizes
7
7
1504
1468
1432
1396
1360
1324
1288
1252
1216
1180
1144
1108
1072
1036
1000
964
928
892
Packet Size
856
820
784
748
712
676
640
604
568
532
496
460
424
388
352
316
280
244
208
172
136
64
100
0
Typical Server Packet Sizes
Packet Size
64 bytes
Packet Size
1024 bytes
10G Packets/second
14.88 Million each way
10G Packets/second
1.2 Million each way
Packet arrival rate
67.2 ns
Packet arrival rate
835 ns
2 GHz Clock cycles
135 cycles
2 GHz Clock cycles
1670 cycles
3 Ghz Clock cycles
201 cycles
3 Ghz Clock cycles
2505 cycles
Intel Restricted Secret
TRANSFORMING COMMUNICATIONS
TRANSFORMING
COMMUNICATIONS
Ongoing Silicon Architectural Enhancements
Packet Processing Enhancements:
Pipeline Depth
Direct Cache Access
Integration of Memory Controller
Integration of High Bandwidth PCIe Gen3
New AVX Extensions
Intel® Virtualization Technology (Intel® VT)
Intel® Data Direct I/O Technology (Intel® DDIO)
Intel® Xeon® processor E52600 Microarchitecture
Intel® Core™ i5 / i7
Microarchitecture
Enhanced Intel® Core
Microarchitecture
8C, 6C, 4C, 2C Product Choices
1S, 2S, 4S Configurations
Intel® Hyper-Threading Technology
Integrated Memory Controller(s)
Integrated High BW PCIe gen3
2 QPI Links for 2S Configurations
Up to 20MB of L3 Cache
Intel® Core™ 2
Microarchitecture
Hyper-Threading, Smart
Cache, QuickPath,
Integrated Memory
Controller, SSE2/SSE3/SSE4
Instructions
Intel® Pentium® 4
Processor Extreme
Edition
Supporting Hyper-Threading
Technology
Netburst microarchitecture
Intel® Extended Memory 64
Technology, Hyperthreading
8
8
Intel Restricted Secret
Multi-Core Introduction
Advanced Smart Cache
Wide Dynamic Execution
SSE2/SSE3, Power
Management
TRANSFORMING COMMUNICATIONS
TRANSFORMING
COMMUNICATIONS
Improved
Packet
Processing
Capability
IA Performance over the Years
IPv4 Layer 3 Forwarding on an IA Platform
Introduction of
Integrated PCIe*
Controller
Introduction of Integrated
Memory Controller
+
Intel® DPDK
90
80
70
80 Mpps
DPDK R1.1
60
50
40
30
20
10
0
Mpps
2006
2007
2008
DP Intel® Xeon®
DP Intel® Xeon®
DP Intel® Xeon®
Processor LV
Processor E5345
Processor E5410
2 x 2 Core
2 x 4 Core
2 x 4 Core
2.0 GHz, 667 MHz FSB 2.33 GHz, 1333 MHz FSB 2.33 GHz, 1333 MHz FSB
2009
2S Intel® Xeon®
processor E5540
2 x 4 Core
2.53 GHz
2010
2S Intel® Xeon®
processor E5645
2 x 6 Core
2.40 GHz
2011
1S Intel® Xeon®
Processor E5-2600 (B0
stepping)
1 x 8 Core
2.0 GHz
PCIe* Gen2 Performance
Standard “off-the-shelf” IA platform can deliver huge performance.
Performance jump can be attributed to Core, Memory architecture (iMC) + Intel® DPDK
9
9
Intel Restricted Secret
TRANSFORMING COMMUNICATIONS
TRANSFORMING
COMMUNICATIONS
Evolution of Data Plane Support
Software & Hardware performance enhancements over the next 2-3 years
Core
Faster Cache
Data Direct IO
1 GB pages
Faster AES instructions
Integrated Memory
Controller
SSE 4.2
128-bit SSE
CRC32
Increasing Cache Sizes
Increasing Core Count
Integrated PCI-E (Latency)
PCIe* Gen 3
PCIe* Gen 2
Platform
40G Networking silicon (Intel)
Increasing I/O bandwidth
Increasing memory bandwidth
Acceleration
AES-NI instruction
2010/2011
Intel® Xeon® processors E5540/E5640
1010
Intel Restricted Secret
Intel® Communications
Chipset Series 89xx for
Cryptography, Compression
2012/2013
Intel® Xeon® processor E5-2600
TRANSFORMING COMMUNICATIONS
TRANSFORMING
COMMUNICATIONS
Next Gen Chipset for
Cryptography, Compression
2013/14
Next Tick/Tock Processors
Management, Control & Data Plane Environments
Typically more Data Plane elements than Management/Control Plane
Packet Processing
Packet Processing
Processor
Switch
Switch
Processor
Packet Processing
Packet Processing
Control
Plane
Processor Blade
Control Plane
Data Plane
Intel Provided
Intel®
DPDK
Packet Processing Blade
Data Plane
Rack Mount Server, Enterprise Servers
AdvancedTCA*
Typically have Control and Data Plane on same board
Typically Control and Data Plane on different boards
Value Proposition is Consolidation of DP + CP
Value Proposition single blade for multiple purposes
• Customers have come up with solutions to add
additional 3rd party boards for Data Plane
• IA can fill both roles
• Normally many more Packet Processing Blades
• IA allows a single Blade Architecture for both now
Value proposition of Intel® DPDK is workload consolidation:
Provides framework and performance for NPU workloads on IA cores
1111
Ecosystem or Data
Proprietary Plane
Intel Restricted Secret
TRANSFORMING COMMUNICATIONS
TRANSFORMING
COMMUNICATIONS
Agenda
1. Intel’s Packet Processing
Motivation and Value Proposition
2. Overview of Intel® DPDK
3. Intel® DPDK Performance
Benchmarks
4. Lead Ecosystem Offerings
5. Intel® DPDK Website and
Collateral
6. Summary
1212
Intel Restricted Secret
TRANSFORMING COMMUNICATIONS
TRANSFORMING
COMMUNICATIONS
Goals of this Overview Section
•
•
•
1313
Look into the Intel® DPDK architecture and see how it’s
designed to squeeze the best packet processing
performance out of an IA-based platform
Be able to articulate the most common performance
bottlenecks for packet processing software on IA
Understand the optimization tricks to remove them
Intel Restricted Secret
TRANSFORMING COMMUNICATIONS
TRANSFORMING
COMMUNICATIONS
The Intel® DPDK Philosophy
Intel® DPDK Fundamentals
• Implements a run to completion model or pipeline
model
• No scheduler - all devices accessed by polling
• Supports 32-bit and 64-bit with/without NUMA
• Scales from Intel® Atom™ to Intel® Xeon®
processors
• Number of Cores and Processors not limited
• Optimal packet allocation across DRAM channels
Control
Plane
Data Plane
• Must run on any IA CPU
‒ From Intel® Atom™ processor to the latest
Intel® Xeon® processor family
‒ Essential to the IA value proposition
• Focus on the fast-path
‒ Sending large number of packets to the Linux
Kernel /GPOS will bog the system down
1414
Intel Restricted Secret
Provide software examples that address
common network performance deficits
‒
‒
‒
‒
Best practices for software architecture
Tips for data structure design and storage
Help the compiler generate optimum code
Address the challenges of achieving 80 Mpps
per CPU Socket
TRANSFORMING COMMUNICATIONS
TRANSFORMING
COMMUNICATIONS
Intel® Data Plane Development Kit (Intel® DPDK)
Intel® DPDK embeds optimizations for the IA
platform:
- Data Plane Libraries and Optimized NIC Drivers in Linux
User Space
Intel® DPDK Libraries
Buffer Management
Queue & Buffer Management, Packet Flow Classification, Poll-Mode NIC
Drivers (1/10GbE), and more!
Simple API Interface, Uses standard tool chain (gcc/icc, gdb, profiling
tools)
Queue/Ring Functions
Packet Flow Classification
NIC Poll Mode Library
- Run-time Environment
Low overhead, run-to-completion model optimized for fastest possible data
plane performance
Customer
Application
Customer
Application
Customer
Application
Environment Abstraction Layer
- Environment Abstraction Layer and Boot Code
Primarily platform-specific boot guidelines and initialization code, eases
application porting effort
- BSD-licensed & source downloadable from Intel and
leading ecopartners
User Space
Kernel Space
Environment Abstraction Layer
Linux Kernel
Provided under a very flexible BSD licensing model
Offered as a free, unsupported standalone solution by Intel or as part of
commercial solutions and offerings from leading ecopartners
Platform Hardware
The Intel® DPDK is a great starting point for customers and the industry in general –
delivering breakthrough packet processing performance.
1515
Intel Restricted Secret
TRANSFORMING COMMUNICATIONS
TRANSFORMING
COMMUNICATIONS
Intel® DPDK Libraries and Drivers
•Memory Manager: Responsible for allocating pools of objects in memory. A pool is created in
huge page memory space and uses a ring to store free objects. It also provides an alignment
helper to ensure that objects are padded to spread them equally on all DRAM channels.
•Buffer Manager: Reduces by a significant amount the time the operating system spends allocating
and de-allocating buffers. The Intel® DPDK pre-allocates fixed size buffers which are stored in
memory pools.
•Queue Manager:: Implements safe lockless queues, instead of using spinlocks, that allow different
software components to process packets, while avoiding unnecessary wait times.
•Flow Classification: Provides an efficient mechanism which incorporates Intel® Streaming SIMD
Extensions (Intel® SSE) to produce a hash based on tuple information so that packets may be
placed into flows quickly for processing, thus greatly improving throughput.
•Poll Mode Drivers: The Intel® DPDK includes Poll Mode Drivers for 1 GbE and 10 GbE Ethernet*
controllers which are designed to work without asynchronous, interrupt-based signaling
mechanisms, which greatly speeds up the packet pipeline.
1616
Intel Restricted Secret
TRANSFORMING COMMUNICATIONS
TRANSFORMING
COMMUNICATIONS
Component Overviews
•
•
•
•
1717
EAL Memory Management Overview
Queue/Ring Overview
Buffer Management Overview
Flow Classification Overview
Intel Restricted Secret
TRANSFORMING COMMUNICATIONS
TRANSFORMING
COMMUNICATIONS
Memory Usage
• Basic unit for runtime object allocation is the memory zone
• Zones contain rings, pools, LPM routing tables, or any other
performance-critical structures
• Always backed by Huge Page (2 MB/1 GB page) memory
Ring:
Ring:
RX_RING_0
TX_RING_0
Memory Zone:
Memory Zone:
RG_RX_RING_0
Memory Pool: mbuf_pool
Memory Zone: MP_mbuf_pool
RG_TX_RING_0
Memory Segment 0
2MB
page
2MB
page
Memory Segment 1
2MB
2MB
page
page
2MB
page
Physically contiguous
memory
1818
Intel Restricted Secret
2MB
2MB
page
page
2MB
2MB
page
page
2MB
2MB
page
page
Physically contiguous
memory
TRANSFORMING COMMUNICATIONS
TRANSFORMING
COMMUNICATIONS
Memory Segment N
2MB
2MB
page
page
Queue/Ring Management API
• Effectively a FIFO implementation in software
‒ Lockless implementations for single or multi-producer, single or multi- consumer
enqueue/dequeue
‒ Supports bulk enqueue/dequeue to support packet-bunching
‒ Implements high & low watermark thresholds for back-pressure/flow control
• Essential to optimizing throughput
‒ Used to decouple stages of a pipeline
1919
Intel Restricted Secret
TRANSFORMING COMMUNICATIONS
TRANSFORMING
COMMUNICATIONS
The Buffer Management API (mempool)
• The Buffer Manager allocates memory from
the EAL and creates pools with fixed element
sizes.
‒ Typical usage is packet buffers, descriptor ring
buffers, etc.
‒ Intent is to speed up runtime allocation/deallocation
‒ Does not support runtime resizing of pools
• Multi-producer/multi-consumer safe
‒ Pools are based on Intel® DPDK rings so are multiproducer and multi-consumer safe
‒ No locking; use CAS instructions
‒ Pools can also be used in multi-process
environments
Memory Pool
Pkt Buffers
(60K 2K buffers)
Rings for
Cached
buffers
• Optimized for performance
‒ Cache alignment
‒ Per core buffer caches for each buffer pool so that
allocation/freeing can be done without using
shared variables
‒ Bulk allocation/freeing support
2020
Intel Restricted Secret
Events
(2K 100B buffers)
TRANSFORMING COMMUNICATIONS
TRANSFORMING
COMMUNICATIONS
Events
(2K 100B buffers)
Processor 0
Data
Plane
Data
Plane
Data
Plane
Data
Plane
Intel®
DPDK
Intel®
DPDK
Intel®
DPDK
C1
C2
C3
Intel®
DPDK
C4
10G
10G
Flow Classification API
•
The Intel® DPDK provides a Flow Classification API
•
•
Not expecting every customer to use it – more of a showcase to demonstrate how to do
optimization for IA
Classification is something that is very customer-specific – Each customer/segment has
different needs
‒
‒
•
The Flow classification API is designed to take advantage of current and
future hardware-based flow classification capabilities
•
•
2121
Router implementations typically use “longest-prefix-match”
Security implementations need to identify individual flows and can use flow classification
Intel® 82599 10GbE Ethernet Controller implements flow classification (limited in number of
flows)
Future Chipsets are expected to implement an extensive classifier
Intel Restricted Secret
TRANSFORMING COMMUNICATIONS
TRANSFORMING
COMMUNICATIONS
30,000 ft Overview of Packet Flow
Initialization
1. Initialization
– Initialize memory zones and pools
– Initialize devices and device queues
– Start the packet forwarding application
2. Packet Reception (RX)
Polling
RX
Packets to
send
– Poll devices’ RX queues and receive packets in
bursts
– Allocate new RX buffers from per queue
memory pools to stuff into descriptors
3. Packet Transmission (TX)
TX
2222
Intel Restricted Secret
–
–
Transmit the received packets from RX
Free the buffers used to store the packets
TRANSFORMING COMMUNICATIONS
TRANSFORMING
COMMUNICATIONS
Overcoming The Challenge of Achieving 80 Mpps (and More…) Per
CPU Socket
Memory and PCIe* access is really, really slow compared to
CPU operation
Process a bunch of packets (e.g. 4 packets at a time) to minimize
external memory and PCIe bandwidth. Avoid read-modify-write
transactions in favour of single write, and multiple reads in favour
of single read.
Data doesn’t seem to be near the CPU when it needs it (and
so it waits)
The system can’t keep up with the amount of interrupts
for packet Rx
For memory access, use HW or SW controlled prefetching and
align data structures to cache line size (64 Byte) to minimize
external memory and PCIe* bandwidth, as all external memory
accesses are in cache line increments; for PCIe access, use Direct
Data IO (available on Intel® Xeon® processor E5 Product Family)
to read data directly into cache.
Switch from an interrupt-driven network device driver to a
polled-mode driver.
Access to shared data structures is a bottleneck in the
application
The out-of-the-box Linux* Scheduler causes too much
overhead to task switch
Figure out clever access schemes that reduce the amount of
sharing (e.g. use lockless queues for message passing as
semaphores/spinlocks are costly).
Bind a single software thread to a logical core. Use CPU core
isolation and thread affinities for 1:1 mapping of SW threads
to HW threads.
Intel® PTU Tool indicates that page tables are constantly
evicted (D-TLB Thrashing)
Use 2MB or 1G Huge Pages in Linux* to reduce TLB misses.
The Challenge can be Overcome with Smart Programming and Hardware assists!!
2323
Intel Restricted Secret
TRANSFORMING COMMUNICATIONS
TRANSFORMING
COMMUNICATIONS
PCIe* connectivity and core usage
Using run-to-completion or pipeline software models
NUMA
Pool Caches
Queue/Rings
Buffers
NUMA
QPI
Pkt
Physical
PMD Packet I/O
Core 2
Flow work
Intel® DPDK
Rx
Tx
Physical
PMD Packet I/O
Core 3
Flow Classification
Intel® DPDK App A, B, C
Rx
Tx
Physical
PMD Packet I/O
Core 4
Flow Classification
Intel® DPDK App A, B, C
Rx
Tx
10 GbE
Intel Restricted Secret
10 GbE
Physical
Rx
Core 0
Tx
Intel® DPDK
Physical
Core 1
Intel® DPDK
PMD Packet I/O
Flow Classification
App A
App B
App C
App A
App B
App C
Pkt
Physical
Core 2
Intel® DPDK
Physical
Core 3
Intel® DPDK
Pkt
Physical
Core 4
Intel® DPDK
RSS Mode
Rx
Tx
Physical
Core 5
Intel® DPDK
Run to Completion model
• I/O and Application workload can be handled on a single core
• I/O can be scaled over multiple cores
2424
Pkt
10 GbE
Pkt
PCIe
Physical
PMD Packet I/O
Core 5
Flow Classification
Intel® DPDK App A, B, C
10 GbE
10 GbE
Pkt
PCIe
Rx
Tx
PCIe
PMD Packet I/O
Packet work
Physical
Core 1
Intel® DPDK
Pkt
Pkt
PCIe
Physical
Core 0
Linux* Control Plane
Processor 1
PCIe
Processor 0
Pool Caches
Queue/Rings
Buffers
Pipeline model
• I/O application disperses packets to other cores
• Application work performed on other cores
TRANSFORMING COMMUNICATIONS
TRANSFORMING
COMMUNICATIONS
Tools for Optimizing Intel® DPDK
Intel® VTune™ Amplifier XE profiles performance
Intel® Performance Tuning Utility (Intel® PTU) offers
specific tuning advice
Download both tools at whatif.intel.com
2525
Intel Restricted Secret
TRANSFORMING COMMUNICATIONS
TRANSFORMING
COMMUNICATIONS
Agenda
1. Intel’s Packet Processing
Motivation and Value Proposition
2. Overview of Intel® DPDK
3. Intel® DPDK Performance
Benchmarks
4. Lead Ecosystem Offerings
5. Intel® DPDK Website and
Collateral
6. Summary
2626
Intel Restricted Secret
TRANSFORMING COMMUNICATIONS
TRANSFORMING
COMMUNICATIONS
Performance Going Forward
IPv4 Layer 3 Forwarding on an IA Platform
250
200
150
Introduction of Integrated
Memory Controller
+
Intel® DPDK
Testing with PCIe* Gen3
expected to be higher
Introduction of Integrated
PCIe* Controller
164.9 Mpps
DPDK R1.2
100
93 Mpps
DPDK R1.2
50
0
Mpps
2009
2S Intel® Xeon® processor
E5540
2 x 4 Core
2.53 GHz
2010
2S Intel® Xeon® processor
E5645
2 x 6 Core
2.40 GHz
2011
1S Intel® Xeon® E5-2658
processors C1 Stepping
1 x 8 Core
2.1 GHz
PCIe Gen2 Performance
2012
2S Intel® Xeon® E5-2658
processors C1 Stepping
2 x 8 Core
2.1 GHz
PCIe Gen2 Performance
2012
2S Intel® Xeon® E5-2658
processors C1 Stepping
2 x 8 Core
2.1 GHz
PCIe Gen3 Performance
(Estimate ONLY)
Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and
MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results
to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that
product when combined with other products.
PCIe* Gen 3 will offer better performance and more options
In the case of IPv4 Layer 3 forwarding, we are still I/O limited – i.e. cores capability is not maxed out!!
2727
Intel Restricted Secret
TRANSFORMING COMMUNICATIONS
TRANSFORMING
COMMUNICATIONS
Intel® DPDK Native and Virtualized Forwarding
Performance
2828
Intel Restricted Secret
TRANSFORMING COMMUNICATIONS
TRANSFORMING
COMMUNICATIONS
Agenda
1. Intel’s Packet Processing
Motivation and Value Proposition
2. Overview of Intel® DPDK
3. Intel® DPDK Performance
Benchmarks
4. Lead Ecosystem Offerings
5. Intel® DPDK Website and
Collateral
6. Summary
2929
Intel Restricted Secret
TRANSFORMING COMMUNICATIONS
TRANSFORMING
COMMUNICATIONS
Intel® DPDK Go To Market Options
Intel® DPDK free,
unsupported
standalone
package for
integration with
proprietary
customer stacks
Data Plane Applications
Dispatch Loop
Services
PMD
Memory & Buffer API
Intel® DPDK
Core 1 - n
Ecosystem Provided: Intel® DPDK Integrated into Commercial Solutions and Intel® DPDK
services offerings
For more information about ecosystem solutions, visit www.intel.com/go/dpdk
3030
Intel Restricted Secret
TRANSFORMING COMMUNICATIONS
TRANSFORMING
COMMUNICATIONS
Intel Provided: Intel®
DPDK –
Free, Unsupported
Package
Agenda
1. Intel’s Packet Processing
Motivation and Value Proposition
2. Overview of Intel® DPDK
3. Intel® DPDK Performance
Benchmarks
4. Lead Ecosystem Offerings
5. Intel® DPDK Website and
Collateral
6. Summary
3131
Intel Restricted Secret
TRANSFORMING COMMUNICATIONS
TRANSFORMING
COMMUNICATIONS
Intel® DPDK
www.intel.com/go/dpdk
Your One-Stop-Shop for:
• Documentation and articles, white
papers, pod casts
• Ecosystem information and articles
Examples
See the Video: “Intel® Data Plane Development Kit (Intel® DPDK)”.
Found on EDC site at www.intel.com/go/dpdk under Video: Intel®
Data Plane Development Kit (Intel® DPDK)
3232
Intel Restricted Secret
TRANSFORMING COMMUNICATIONS
TRANSFORMING
COMMUNICATIONS
Agenda
1. Intel’s Packet Processing
Motivation and Value Proposition
2. Overview of Intel® DPDK
3. Intel® DPDK Performance
Benchmarks
4. Lead Ecosystem Offerings
5. Intel® DPDK Website and
Collateral
6. Summary
3333
Intel Restricted Secret
TRANSFORMING COMMUNICATIONS
TRANSFORMING
COMMUNICATIONS
Summary
Intel® DPDK enables multi-workload/single architecture potential by
making IA extremely competitive for packet processing workloads
Distribution of enabling software under flexible and cost-free licensing
model enabling maximum customer usability
Fully featured and supported IA Data Plane software solutions via
Intel’s lead Ecosystem partners
3434
Intel Restricted Secret
TRANSFORMING COMMUNICATIONS
TRANSFORMING
COMMUNICATIONS
3535
Intel Restricted Secret
TRANSFORMING COMMUNICATIONS
TRANSFORMING
COMMUNICATIONS