Multicore Processors make real time embedded systems more realistic
Abstract:
With the ever-increasing demands placed on embedded devices, multi-core solutions are becoming more
prevalent. The use of multiple cores increases the complexity of software design in many respects.
Every developer of multicore system software will ask which hardware architectures can be implemented
and which of them are the most realistic and cost-effective. Most will want a simple, straightforward
approach to accomplishing the task.
Background :
An Embedded System:
• Is a system built to perform its duty, completely or partially independent of human
intervention.
• Is specially designed to perform a few tasks in the most efficient way.
• Interacts with physical elements in our environment, viz. controlling and driving a motor,
sensing temperature, etc.
An embedded system can be defined as a control system
or computer system designed to perform a specific task.
Common examples of embedded systems include MP3
players, navigation systems on aircraft and intruder
alarm systems. An embedded system can also be
defined as a single purpose computer.
Embedded systems are often required to provide a Real-Time response. A Real-Time system is defined
as a system whose correctness depends on the timeliness of its response. Examples of such systems
are the flight control systems of an aircraft and the sensor systems in nuclear reactors and power
plants. For these systems, a delayed response counts as a fatal error. A more relaxed kind of
Real-Time system is one in which a response with small delays is still acceptable. An example of
such a system is the schedule display system on a railway platform. In technical terminology,
Real-Time Systems can be classified as:
• Hard Real-Time Systems - systems with severe constraints on the timeliness of the
response.
• Soft Real-Time Systems - systems which tolerate small variations in response times.
• Hybrid Real-Time Systems - systems which exhibit both hard and soft constraints on their
performance.
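The classification above can be sketched as a small check on a single response. This is only an illustration: the names `Criticality`, `judge_response` and `tolerance_ms` are hypothetical and do not come from any RTOS API, and the tolerance stands in for whatever "small variation" a soft constraint permits.

```python
from enum import Enum


class Criticality(Enum):
    HARD = "hard"
    SOFT = "soft"


def judge_response(criticality, response_ms, deadline_ms, tolerance_ms=0.0):
    """Classify one response against its deadline.

    tolerance_ms is a hypothetical slack that only soft constraints may use;
    a hard constraint treats any overrun as a failure.
    """
    if response_ms <= deadline_ms:
        return "on time"
    if criticality is Criticality.SOFT and response_ms <= deadline_ms + tolerance_ms:
        return "late but acceptable"
    return "failure"
```

A hybrid system, in these terms, is simply one that runs some tasks tagged `HARD` and others tagged `SOFT` side by side.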
Problem Statement :
Multicore for Real-time Embedded Applications
Methodology :
There are three main types of multiprocessing architecture: distributed processing (DP),
symmetric multiprocessing (SMP) and asymmetric multiprocessing (AMP). Each has its own set of
characteristics, advantages and disadvantages.
Distributed processing (DP) is based on independent nodes. With DP, each node has its own
processor and memory, and the nodes communicate over buses or a fabric. Each DP
node may have different peripherals, and a separate copy of the operating system
runs on each node. Advantages of a DP approach include predictable performance and
higher memory bandwidth, since memory is not shared. The DP approach often works well for
multi-channel applications. The disadvantages of DP include the fact that load balancing must be
performed by the application and that the application is tied to a specific number of nodes. Also, DP
typically supports a smaller quantity of memory per node than SMP and AMP designs.
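To illustrate what "load balancing must be performed by the application" means for a multi-channel DP design, here is a minimal sketch in which the application itself decides which node handles which channel. The function name `partition_channels` and the round-robin policy are assumptions made for the example; a real application could use any static policy.

```python
def partition_channels(channels, node_count):
    """Statically assign channels to DP nodes in round-robin order.

    The node count is baked into the assignment, which is why a DP
    application ends up tied to a specific number of nodes.
    """
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    assignment = {node: [] for node in range(node_count)}
    for index, channel in enumerate(channels):
        assignment[index % node_count].append(channel)
    return assignment
```

Changing `node_count` changes every node's workload, so the mapping has to be recomputed and redeployed whenever the hardware changes.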
In SMP architectures, each node may have two or more processors, and memory is global to all
processors. In addition, the processors may have both local cache and shared cache, and the
cache is kept coherent between all processors and memory. A single O/S is used to control all
the nodes. The advantages of SMP include a large global memory and, thanks to the use of fewer
memory controllers, better performance per watt, which is important for SWaP (size, weight and
power) sensitive applications. Instead of splitting memory between multiple CPUs, SMP’s large
global memory is accessible to all of the processor cores. Data-intensive applications, such as
image processing and data acquisition systems, often prefer large global memories that can be
accessed at data rates of up to hundreds of Mbytes/sec. These large-memory applications benefit
from the single large memory common in most multi-core designs.
SMP also provides simpler node-to-node communication, and SMP applications can be
programmed to be independent of node count. SMP especially lends itself to the use of new multicore processor designs. The disadvantages of SMP include the fact that the memory latency and
bandwidth of a given node can be affected by other nodes, and cache “thrashing” may occur in
some applications.
SMP architectures differ from AMP in that a single block of memory is shared by the multiple
processors or by multiple cores on a single multi-core processor. A single OS image runs across
all the cores, enabling truly parallel processing. A big advantage of SMP operating systems is
that they perform load-balancing for the tasks between all available cores.
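The load-balancing behaviour described above can be approximated with a toy simulation. This is a sketch under simplifying assumptions, not how any particular SMP kernel schedules: tasks are independent, their costs are known, and each task goes to whichever core becomes free first. The name `balance_tasks` is hypothetical.

```python
import heapq


def balance_tasks(task_costs, core_count):
    """Assign each task, in order, to the core that becomes free first.

    Returns (schedule, makespan): schedule maps core id -> list of task
    indices, and makespan is the time at which the last core finishes.
    """
    if core_count < 1:
        raise ValueError("core_count must be at least 1")
    # Heap entries are (time the core becomes free, core id).
    free_at = [(0, core) for core in range(core_count)]
    heapq.heapify(free_at)
    schedule = {core: [] for core in range(core_count)}
    for task_id, cost in enumerate(task_costs):
        busy_until, core = heapq.heappop(free_at)
        schedule[core].append(task_id)
        heapq.heappush(free_at, (busy_until + cost, core))
    makespan = max(busy_until for busy_until, _ in free_at)
    return schedule, makespan
```

For task costs `[8, 1, 1, 1, 1, 4]` on two cores, this policy finishes at time 8, whereas splitting the same list alternately between the two cores would finish at time 10. The application never mentions the core count except as a parameter.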
Asymmetric multiprocessing
designs use SMP hardware architecture where a common global memory is shared between the
various processors. To the system software, this makes the SMP architecture look like a DP
architecture. In AMP designs, application tasks are sent to the system’s separate processors. These
processors may all be located on different boards or collocated on the same board, but each is
essentially a separate computing system with its own OS and memory partition within the
common global memory. One advantage of an AMP design is that asymmetric memory partitions
can be assigned from one large global memory, making more efficient use of memory resources
and potentially reducing system cost.
Asymmetric multi-processing provides a sort of hybrid approach between DP and SMP by
implementing distributed processing on an SMP architecture. In AMP, application memory is
partitioned between the nodes, and independent copies of the O/S can run on each node.
Advantages of AMP include the fact that it is simple to migrate existing (non-SMP) O/Ss to the
model and that it offers superior node-to-node communication compared to a distributed
architecture. Also, AMP supports the sharing of a large global memory asymmetrically between
nodes. The disadvantages of AMP include some of the downsides of both DP and SMP: load balancing
must be performed by the application, memory latency and bandwidth can be affected by other
nodes, cache “thrashing” may occur in some applications, and the application is tied to a
specific number of nodes.
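The idea of assigning asymmetric partitions out of one global memory can be sketched as a simple layout calculation. Everything here is illustrative: `carve_partitions`, the node names and the sizes are hypothetical, and a real AMP system would fix this layout in boot configuration or a hypervisor rather than in application code.

```python
MiB = 1024 * 1024


def carve_partitions(global_size, requests):
    """Lay out per-node partitions back to back inside one global memory.

    requests maps a node name to its requested size in bytes; the result
    maps each node name to a (base_offset, size) pair. Partitions may be
    of unequal size, which is the "asymmetric" part.
    """
    layout = {}
    base = 0
    for node, size in requests.items():
        if size <= 0:
            raise ValueError(f"partition for {node!r} must be positive")
        if base + size > global_size:
            raise MemoryError(f"partition for {node!r} does not fit")
        layout[node] = (base, size)
        base += size
    return layout
```

For example, `carve_partitions(1024 * MiB, {"rtos": 64 * MiB, "gui": 768 * MiB})` gives the real-time node a small partition and the general-purpose node a large one from the same memory, instead of fitting each node with its own fixed-size memory.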
Key Results
For single board computers (SBCs), integrating two or more processors onto one device saves board
real estate for other important I/O features such as an integrated mass storage module or a
high-performance serial backplane interface.
Embedded system developers can consolidate multiple embedded systems onto a single hardware
platform by dedicating some of the CPU cores of a multicore processor to real-time tasks.
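One way to picture dedicating cores to real-time tasks is a core allocation plan plus an optional pinning step. The sketch below is an assumption-laden illustration: `plan_cores` and `pin_current_process` are hypothetical helper names, the "one core per real-time task" policy is just one possible choice, and `os.sched_setaffinity` is only available on some platforms (notably Linux), so the code checks for it before calling it.

```python
import os


def plan_cores(total_cores, rt_task_names):
    """Reserve one core per real-time task, counting down from the highest
    core id; the remaining cores are shared by general-purpose work."""
    if len(rt_task_names) >= total_cores:
        raise ValueError("need at least one core left for general-purpose work")
    plan = {}
    next_core = total_cores - 1
    for name in rt_task_names:
        plan[name] = {next_core}
        next_core -= 1
    plan["general"] = set(range(next_core + 1))
    return plan


def pin_current_process(cores):
    """Pin the calling process to the given cores where the OS supports it.

    Returns True if the affinity was set, False if the platform lacks
    os.sched_setaffinity.
    """
    if hasattr(os, "sched_setaffinity"):
        os.sched_setaffinity(0, cores)
        return True
    return False
```

On a four-core part, `plan_cores(4, ["motor_loop", "sensor_loop"])` reserves cores 3 and 2 for the two control loops and leaves cores 0 and 1 for everything else.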
Multi-core processors are especially well suited to SMP since they are ideal for the intensive
multitasking applications common to signal processing, mission computing and industrial control
which typically have multiple processes and multiple tasks or threads running in parallel within a
process. These types of applications are often best addressed with an SMP operating system.
Recent industry trends have combined to make SMP very attractive. Processor performance that
was once gained by increasing the chip’s clock frequency has become more difficult to achieve.
Higher currents are required to drive signals faster, increasing the amount of power
dissipated in ever smaller chip real estate. A related issue is that leakage current becomes
problematic at higher frequencies, and the associated thermal solutions are difficult to
accommodate in embedded applications. The trend reflects the inexorable march of Moore’s Law, as
silicon density continues to double every couple of years. Unfortunately, smaller geometries no
longer lend themselves to faster frequencies, but rather to more circuitry. As a result, the
major processor manufacturers are now moving to multi-core processors, which feature larger
on-chip caches and enhanced instruction sets. This trend suits SMP/real-time system architectures.
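The power argument above can be made concrete with the standard first-order model of dynamic power in CMOS logic. The symbols are not from the original text: α is the switching activity factor, C the switched capacitance, V the supply voltage and f the clock frequency. The voltage reduction to 0.8 V at half the frequency is an assumed, illustrative figure, and leakage is ignored.

```latex
P_{\text{dyn}} = \alpha \, C \, V^{2} f

% One core at full frequency f and voltage V:
P_{\text{1 core}} = \alpha C V^{2} f

% Two cores, each at f/2, with the supply assumed lowered to 0.8V:
P_{\text{2 cores}} = 2 \cdot \alpha C \,(0.8V)^{2} \cdot \frac{f}{2}
                   = 0.64 \, \alpha C V^{2} f
```

Under these assumptions, two slower cores provide the same aggregate number of clock cycles for about 64% of the dynamic power, provided the workload can actually be split across both cores.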
Discussion
The key to making multi-OS embedded systems work on a multicore CPU is an RTOS that supports
virtualization. Virtualization provides isolation between multiple operating environments and
also enables legacy real-time systems to be integrated with new functionality with minimal impact
on legacy software.
The latest Intel multicore processors include a feature called Intel Virtualization Technology,
or Intel VT, that enables hardware-enforced isolation of the processor’s I/O and memory. With
virtualization, multiple control loops can run simultaneously.
Scope for Future Work:
With the increasing demands on embedded devices, it is no wonder that more processing power is
required. The move to multicore platforms is a natural evolution for embedded devices, considering
the convergence of functionality being placed on what have traditionally been known as single
purpose devices. Take the phone, for example. It has evolved from a device whose main purpose
was to place a simple point-to-point call into a device that functions as a mobile phone, gaming
unit, camera, media server, web browser and more, all in one.
While the convergence of functionality on embedded devices may not in and of itself
necessitate the move to multicore, additional considerations such as footprint, energy
consumption, heat dissipation and other consequences of driving a single processor at higher and
higher frequencies demand the transition.
The addition of multiple cores also creates the need for communication and synchronization.
Software developers want an easy to use mechanism that allows them to take advantage of
multicore systems efficiently. They also want that method to be extensible in the future.
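A common easy-to-use mechanism of the kind described above is a message queue between workers. The sketch below uses Python threads and `queue.Queue` purely as stand-ins for two cores and an inter-core mailbox; the function name `run_pipeline`, the queue depth of 4 and the doubling step are all assumptions made for illustration.

```python
import queue
import threading


def run_pipeline(samples):
    """Pass samples from a producer worker to a consumer worker via a queue."""
    mailbox = queue.Queue(maxsize=4)  # bounded, so the producer cannot run far ahead
    results = []
    done = object()  # sentinel that marks the end of the stream

    def producer():
        for sample in samples:
            mailbox.put(sample)  # blocks while the mailbox is full
        mailbox.put(done)

    def consumer():
        while True:
            item = mailbox.get()  # blocks until a message arrives
            if item is done:
                break
            results.append(item * 2)  # stand-in for real processing

    workers = [threading.Thread(target=producer), threading.Thread(target=consumer)]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    return results
```

The queue handles both communication (the samples) and synchronization (blocking `put` and `get`), so neither worker needs explicit locks. The same pattern extends to more consumers without changing the producer.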
With all these factors in mind, future work should explore the communication mechanisms,
hardware architectures and RTOS designs that multicore embedded systems need, including what
each should support and how it should be implemented.
Conclusion:
Multicore platforms for real-time embedded systems are here to stay and will only become more
prevalent in the future. Multicore solutions today mostly contain two to four cores on a
chip/processor, but have the potential to grow to a very large number in the future. Some
devices today have more than 300 DSPs per chip. This number will continue to grow and be
accompanied by combinations of different specialized cores.
Whether one opts for an AMP or SMP RTOS architecture, the cost of developing real-time
embedded systems will fall and response times will also decrease, making them more effective.
Acknowledgement
We thank all those who have contributed to this research article. We thank our friends and family
members for their support and encouragement. We would also like to give a special mention of
the efforts and the encouragement of the faculty members of Department of Computer Science
and Engineering, CMRIT and especially our faculty mentor Mr. Sudhakar K.N. for his able guidance,
without whom we could not have completed this article.
Last but not least, we would like to thank Intel Corporation for conducting this contest;
we learnt a great deal by participating in it.