Improving Digital Signal Processing
Performance and Lowering Costs
With the IntervalZero RTX Real-Time Platform Used
on Multi-Core-Based Intel® Processors
White Paper
Executive Summary
Intel Corporation
Digital signal processors (DSPs), with their specialized architectures optimized for heavy math computation,
IntervalZero Inc.
have been a mainstay of the real-time digital signal processing marketplace as a cost-effective solution for
many complex designs.
Although DSPs remain a viable choice for many systems, developers are finding that proprietary DSPs are
no longer required to perform real-time digital signal processing. Instead, developers are migrating to the
IntervalZero RTX Real-Time Platform, which comprises Intel® multi-core processors, the Microsoft Windows*
operating system and RTX real-time software. This platform is designed to outperform DSPs, reduce costs
and streamline development cycles.
This paper compares DSP-based systems to systems based on the RTX Real-Time Platform and provides
an overview of the economic and operational benefits that can serve as a catalyst for the migration away
from DSPs.
Among other findings, this paper presents benchmark test results showing that in terms
of floating-point operations per second, the performance of a multi-core DSP does not
match the raw performance and speed of a device based on the 2nd generation Intel®
Core™ i7 processor.
IntervalZero’s symmetric multiprocessing-enabled RTX software provides hard real-time functionality for
Windows operating systems, including Windows 7 with its touch-and-gesture capabilities.
White Paper: Improving Digital Signal Processing Performance and Lowering Costs
Contents
System Diagrams. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
DSP-Based Platform . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
RTX Real-Time Platform. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
Hardware Design. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
Processor Selection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
Board Design. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
Communication. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
Memory Selection. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
Software Design. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
General-Purpose Operating System. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
Real-Time Operating System . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
Development Tools. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
Code Base . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
DSP Libraries. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
Benchmarks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
Conclusion. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
2
White Paper: Improving Digital Signal Processing Performance and Lowering Costs
System Diagrams
DSP-Based Platform
DSPs rely on specialized architectures designed for heavy math
computation, but not for general processing. Due to the computational
focus of DSPs, general-purpose processors (GPPs) are typically used
for human machine interface (HMI) and general-purpose functions,
TRADITIONAL DSP SYSTEM DESIGN
Dual-Core System
DSP Subsystem
Windows OS
Human
Machine
Interface
(HMI)
UNUSED
Real-Time
Processing
Real-Time
Processing
Real-Time
Processing
Processor 1
DSP 1
DSP 2
DSP 3
such as input and output (see Figure 1). In addition, it should be noted
that DSPs require separate memory devices for each device, and separate buses are needed for all inter-processor communications (i.e., PCIe
or serial interface).
Processor 0
Figure 1. Traditional DSP-based system.
RTX REAL-TIME PLATFORM
RTX Real-Time Platform
The RTX Real-Time Platform takes a different approach, dividing up
the multiple cores for dedicated real-time processing and generalpurpose processing. Figure 2 illustrates an Intel® processor-based
Application Platform
World-Class
User Interface
Real-Time Application
multi-core device with cores dedicated to Windows and cores dediRTX Platform
cated to RTX for real-time signal processing functions. Unlike the
DSP design, memory is shared using the same high-speed bus, allowing for data sharing and inter-processor communications.
The following chart compares the two architectures (see Table 1).
Subsequent sections of this paper will discuss the merits of the
architectures in detail.
Windows Real-Time
Kernel Extension − RTX
Windows Kernel
Core
0
Intel® Atom™
Processor
Core
X
Core
X+1
Core
X+2
x86 Single
or Multi-core
...
Core
X+31
x86 64-bit Multi-core
Figure 2. RTX real-time platform.
3
White Paper: Improving Digital Signal Processing Performance and Lowering Costs
Table 1. Comparing features and benefits of DSP and RTX platform architectures.
Category
Hardware Design
Drivers/BSP
System Communication
Memory
RTOS
Development Environment
Code Base
DSP Libraries
Benchmarks
DSP Platform
• Custom board design
• Long development cycle
• Difficult-to-modify hardware
• Requires In Circuit Emulators (ICE)
• Custom drivers – device-specific
RTX Real-time Platform
• COTS
• Short development cycle
• Easy-to-change hardware
• Local Debug – no ICE needed
• Standard drivers
• Custom protocol
• Custom buses
• Chip down – hard to change
• Limited selection
• Limited memory reach (1 GB)
• No MMU
• Proprietary
• Standard memory
• Standard APIs
• PnP/COTS Modules – easy to change
• Large selection
• Large memory reach (>4 GB)
• MMU – Virtual memory
• winAPI
• Proprietary tools
• Manufacture-specific
• Proprietary compilers
• Assembly and C
• Architecture-specific
• Processing libraries from OEM
• Standard x86 tools
• Microsoft Visual Studio*
• Intel and Microsoft compilers
• C, C++, C#
• Universal/Portable
• Intel-supplied IPP library
• 1.25 GHz – Max core speed
• 20 GFLOPS/Device max
• 3.0 GHz – Max core speed
• >50 GFLOPS/Device max
RTX Benefit
• Reduced cost
• Reduced risk
• Faster time to market
• Cost
• Productivity
• Ease-of-use
• Future proof
• Reduced system size, cost
and complexity
• Familiar API
• Common tool chain throughout
the product
• Use other higher level languages
for non real-time
• No requirement to use OEM
proprietary libraries
• Faster
• Higher speed
Hardware Design
Board Design
Intel’s multi-core advances are changing the way that systems develop-
In DSP-based systems, engineers spend a great deal of time selecting
ers meet real-time signal processing needs. The latest Intel processors
the right processor so that the board can be designed and built in time
not only have multiple cores with extremely high clock speeds, but also
for the software. The need to make hardware commitments early in
perform complex math efficiently. They can deliver several times the
the design process often creates challenges. Any miscalculation can
performance of standard DSPs, as can be seen from the benchmark
lead to a hardware redesign, which will have a negative impact on the
numbers in Table 1.
delivery schedule.
In addition, Intel multi-core devices are delivered with commercial-
Using the IntervalZero RTX Real-time Platform imposes no strict dead-
off-the-shelf (COTS) hardware, which is not typically an option for
lines for choosing and designing the hardware. Because RTX is Intel®
DSP users. COTS hardware, when compared to custom hardware,
architecture-based, standard COTS hardware is available for develop-
translates into reduced costs, shorter time to market and less risk.
ment and production. Since no custom boards or drivers are required for
development, the final hardware selection can be performed later in the
Processor Selection
design cycle — which lowers risk and improves the chances of releasing
When replacing DSPs with Intel multi-core devices, attention should be
products on time.
focused on the number of cores needed, as well as which Intel processor
family best fits the system requirements. As a rough guideline, developers should aim for a 1:1 relationship between DSPs and Intel cores.
Because Intel cores are capable of extremely high clock speeds, the
design will likely require fewer cores. However, the 1:1 ratio is a conservative start that allows room for optimization.
Intel offers a variety of processor families in a range of cost and power
options. For example, Intel® Atom™ processor-based devices are focused
on embedded systems that require low power and minimized cost.
4
Also, because RTX is based on Intel architecture and Windows,
engineers can use PCs as both development and target machines. No
separate target system with specialized in-circuit emulators is needed
to develop and debug code. As the chart in Table 1 shows, the use of
COTS hardware in RTX reduces engineering costs and lowers risks.
White Paper: Improving Digital Signal Processing Performance and Lowering Costs
Communication
support to satisfy most complex system requirements. As a result, GPPs
DSP-based systems rely on custom hardware and software interfaces
running general-purpose operating systems are typically found next to
when communicating between devices. Serial lines and sometimes PCI
DSPs to handle all of the I/O demands and the complex user interfaces.
buses are used for inter-processor communications. These custom interfaces can be troublesome to design on custom hardware, and they can
be difficult to upgrade as requirements change.
The strong demand for powerful GUIs is making Windows the operating
system of choice due to its abundant tools support and standardization.
The use of standardized tools such as Visual Studio, combined with the
The RTX platform uses shared memory and formalized APIs to commu-
large support structure of the Microsoft Developer Network (MSDN),
nicate between processes. Because everything runs on a single device,
help make transitioning from DSPs to RTX a straightforward process.
sharing data and messages between threads running on different cores
is easier. This communication architecture helps make programming easy
and scalable.
Real-Time Operating System
DSPs typically use proprietary real-time operating systems (RTOSs) from
device manufacturers such as DSP/BIOS* from Texas Instruments Inc. or
Memory Selection
VDK from Analog Devices Inc. These DSP-based RTOSs are capable, but
Whereas DSPs require separate memory devices, the RTX platform
they have significant limitations because of the lack of communications
consolidates memory into a single memory device. In addition, an Intel
among the various DSP cores. Each DSP runs its own application, and
processor-based platform can use standard off-the-shelf memory
little inter-processor communication is used.
modules (e.g., 8 GB modules), which can be easily changed as system
requirements dictate. DSPs require on-board memory chips because of
strict timing and routing requirements. Switching out memory chips is
difficult and requires board changes to add or remove memory. These
differences make the DSP-based approach a more costly, more complex
and less flexible alternative than the RTX platform.
Software Design
The IntervalZero RTX Real-Time Platform implements a single symmetric multiprocessing scheduler across all of the cores in the real-time
subsystem (see Figure 3). Instead of a separate RTOS and application
running on each real-time core, the RTX platform uses a single real-time
scheduler to schedule threads across a number of cores. This flexible
SMP scheduling model helps make synchronization and communications
among the various cores and threads simple and easy to implement.
Software advances have also had a significant impact on DSP-based
Load balancing can also be easily implemented from the APIs that the
systems. The demand for standardized tooling in lieu of proprietary
RTX subsystem provides. The RTX APIs enable threads to be moved
toolsets is a recent change affecting DSP developers. Programming
between processors during runtime.
DSPs requires proprietary tools and lower-level languages such as C
and Assembly. Many engineering teams find that proprietary DSP
tools and low-level languages require specialized expertise, which is
often difficult to find and expensive to acquire. By contrast, developers are discovering that the use of standardized development tools and
Quad-Core System
Windows OS
Core
RTX
Core
Core
Core
higher-level languages, such as Microsoft Visual Studio* and C++, can
increase productivity and greatly reduce engineering costs.
Figure 3. RTX SMP-enabled scheduler.
General-Purpose Operating System
The demand for complex graphical user interfaces (GUIs) has also
affected DSP-based system development. Increasingly, customers are
seeking elaborate touch-and-gesture-based user interfaces on top of
a real-time subsystem. DSPs have traditionally not had the GUI or I/O
5
White Paper: Improving Digital Signal Processing Performance and Lowering Costs
Development Tools
The RTX platform’s strong support for C++ is designed to provide
DSP manufacturers provide proprietary tools such as Texas Instruments’
distinct benefits. Because RTX makes use of powerful Intel-based
Code Composer Studio* and Analog Devices’ VisualDSP++*. These tools
devices, programmers can use C++ to enable increased productivity.
are useful for developing DSP-based applications, but market demand
By keeping the code base in higher-level languages such as C and C++,
and cost pressures are providing impetus for more standardized tool-
programmers can take advantage of portability and code reuse to
ing. In addition, the low-level tooling and proprietary coding practices
reduce their costs and risks.
inherent in DSPs do not promote code reuse and portability. This drawback typically translates into longer development times and higher risks.
DSP Libraries
Many developers have discovered that working with proprietary DSP
Texas Instruments and Analog Devices both offer optimized DSP
tools is not only challenging, but also costly to maintain.
libraries to help developers with performance and time to market. As
an alternative to these DSP libraries, Intel created the Intel® Integrated
The RTX platform uses Visual Studio, a standardized tool for program-
Performance Primitives (IPP). The Intel IPP library is essentially a collec-
ming in an Intel architecture environment. Finding engineering talent for
tion of optimized functions (i.e., DSP, imaging, video and so forth) for Intel
Microsoft tools is generally easier and more cost-effective than is the
multi-core architectures. The Intel IPP library is designed to help increase
case with DSP tools. Many engineering teams prefer to use standardized
productivity and performance by providing optimized routines for the
tools such as Visual Studio and MSDN for their robustness and support,
most common DSP-based functions.
and they find that taking advantage of these tools enables increased
productivity and reduced costs.
Benchmarks
Full development tool suites that integrate into Visual Studio are also
Performance differences between processors can be measured in
available from Intel, including proven C/C++ compilers; performance
and parallel libraries; and tools for error checking, code robustness and
performance profiling. Examples include the Intel® Parallel Studio XE
and Intel® C++ Studio XE Suites.
Code Base
a number of ways. Because the focus of this paper is on DSPs, the
following comparison measures the number of floating-point operations per second (GFLOPS).
Table 2. Performance differential: DSP vs. Intel® multi-core processor
Benchmarks
Migrating code base to the RTX platform is a straightforward process.
1.25 GHz - Max core speed
3.0 GHz - Max core speed
20 GFLOPS per core
32 GFLOPS per core
When considering a port, the developer needs to be aware that an
The benchmark test results shown in Table 2 compare the performance
RTX application has two parts — general-purpose, or non-real-time
of a Texas Instruments C66xx multi-core DSP and a device based on the
processing, and DSP real-time processing. The non-real-time, general-
2nd generation Intel® Core™ i7 processor.
purpose code will directly port over using a standard Visual Studio
project, and it will run on Windows. The DSP real-time code will also
be built using Visual Studio, but it will run on the RTX-controlled cores
(the real-time subsystem).
As the chart indicates, the TI C66xx at 1.25 GHz outputs 20 singleprecision GFLOPS per core. The 2nd generation Intel Core i7 processor
outputs 32 double-precision GFLOPS per core. Although the GFLOPS
performance is strong with the TI DSP, these benchmark tests indi-
DSPs are usually programmed in both C and Assembly. The Assembly
cate that it does not match the raw performance and speed of
code is not portable and will need to be written in C or C++, but the C
Intel processors.
code can be ported over using Visual Studio. The RTX platform enables
real-time users to code efficiently in C++ rather than having to start
with C and Assembly programming. Although DSPs do have some C++
support, object-oriented languages typically lose too much efficiency
to be useful when compiled for DSP architectures.
6
White Paper: Improving Digital Signal Processing Performance and Lowering Costs
Conclusion
The IntervalZero RTX Real-time Platform marries the power of the
Windows 7 interface with the real-time signal processing capabilities
of Intel multi-core architectures. The use of standardized COTS
hardware and standardized tooling is designed to greatly simplify
engineering efforts and reduce cost. The RTX Platform is also
designed to increase innovation, portability and scalability.
The Intel® Embedded Design Center provides qualified developers with Web-based access to technical resources. Access
Intel Confidential design materials, step-by step guidance,
application reference solutions, training, and Intel’s tool loaner
program, and connect with an e-help desk and the embedded
community. Design Fast. Design Smart. Get started today.
edc.intel.com.
The need for digital signal processing can be expected to continue,
but recent hardware and software advances have made dedicated
DSPs an option instead of a requirement. For complex systems requiring powerful GUIs and real-time signal processing, other options are
available that offer higher performance, more scalability and less cost.
7
INFORMATION IN THIS DOCUMENT IS PROVIDED IN CONNECTION WITH INTEL® PRODUCTS. NO LICENSE, EXPRESS OR IMPLIED, BY ESTOPPEL OR OTHERWISE, TO ANY INTELLECTUAL PROPERTY RIGHTS IS GRANTED
BY THIS DOCUMENT. EXCEPT AS PROVIDED IN INTEL’S TERMS AND CONDITIONS OF SALE FOR SUCH PRODUCTS, INTEL ASSUMES NO LIABILITY WHATSOEVER, AND INTEL DISCLAIMS ANY EXPRESS OR IMPLIED
WARRANTY, RELATING TO SALE AND/OR USE OF INTEL PRODUCTS INCLUDING LIABILITY OR WARRANTIES RELATING TO FITNESS FOR A PARTICULAR PURPOSE, MERCHANTABILITY, OR INFRINGEMENT OF ANY PATENT,
COPYRIGHT OR OTHER INTELLECTUAL PROPERTY RIGHT. UNLESS OTHERWISE AGREED IN WRITING BY INTEL, THE INTEL PRODUCTS ARE NOT DESIGNED NOR INTENDED FOR ANY APPLICATION IN WHICH THE FAILURE
OF THE INTEL PRODUCT COULD CREATE A SITUATION WHERE PERSONAL INJURY OR DEATH MAY OCCUR.
Intel may make changes to specifications and product descriptions at any time, without notice. Designers must not rely on the absence or characteristics of any features or instructions marked “reserved” or “undefined.” Intel reserves
these for future definition and shall have no responsibility whatsoever for conflicts or incompatibilities arising from future changes to them. The information here is subject to change without notice. Do not finalize a design
with this information.
The products described in this document may contain design defects or errors known as errata which may cause the product to deviate from published specifications. Current characterized errata are
available on request. Contact your local Intel sales office or your distributor to obtain the latest specifications and before placing your product order. Copies of documents which have an order number
and are referenced in this document, or other Intel literature, may be obtained by calling 1-800-548-4725, or by visiting Intel’s Web site at www.intel.com.
Copyright © 2011 Intel Corporation. All rights reserved. Intel, the Intel logo, Core, and Atom are trademarks of Intel Corporation in the U.S. and other countries.
*Other names and brands may be claimed as the property of others.
Printed in USA
0911/MGA/OCG/XX/PDF
Please Recycle
326162-001US