ZigBee Smart Sensors Networks
Claudiu Lung, Sebastian Sabou, Ioan Orha
Andrei Buchman
Electronic and Computer Engineering Department
North University of Baia Mare
Baia Mare, Romania
[email protected]
Basis of Electronics Department
Technical University of Cluj Napoca
Cluj Napoca, Romania
Abstract— The main goal of this project is to implement a
multiprocessor system, which is able to control multiple wireless
ZigBee networks.
Multi-processor architectures have gained interest recently
because of their ability to exploit programmable silicon parallelism
at acceptable power-efficiency figures. Despite the potential benefit
they offer over single-processor architectures, it is unresolved how
one can write compact and efficient programs for multiple parallel
cores. In this paper, we propose the use of a synchronous hardware
description language to program a network of small PicoBlaze
processors [1].
Keywords: FPGA, ZigBee, PicoBlaze
While parallel cores have been hailed as the next big step in
computer micro-architecture development, it has not been
obvious how to write efficient programs for multiprocessors.
The sequential programming model of a single core is
unable to address multiple cores at once. Designers therefore
have to make use of programming extensions such as threads
and message passing libraries, which quickly become
cumbersome to use. In addition, no systematic transformation of
a sequential C program into a parallel program that can run on
multiple cores is known, and this remains a highly skilled
design activity.
We therefore looked at the use of a hardware description
language as a means to program multiple cores. The advantage
is that a hardware description language is inherently parallel.
The mapping problem from specification to architecture thus
does not require the detection of parallelism, but rather
requires an efficient sequentialization. In this paper, we will
show a feasible solution for this problem.
Hardware description languages are often associated with a
low level of abstraction, unsuited for programming activities.
A major revolution in digital design has taken place over
the past decade. Field programmable gate arrays (FPGAs) can
now contain over a million equivalent logic gates and tens of
thousands of flip-flops. Today, digital systems are designed by
writing software in the form of hardware description
languages (HDLs). The most common HDLs used today are
VHDL and Verilog. Both are in widespread use. When using
these hardware description languages the designer typically
describes the behavior of the logic circuit rather than writing
traditional Boolean logic equations. Computer-aided design
tools are used both to simulate the VHDL or Verilog design
and to synthesize the design to actual hardware.
Programmable logic circuits meet the demands mentioned
above, allowing the implementation of both the control
algorithms and the other auxiliary functions on a single chip,
at high speeds, with all operations being carried out in
hardware. Another advantage brought by programmable logic
circuits is the short development time. This results in a fast
time to market.
More and more often we need powerful computer
architectures to solve complex problems in a short period of
time. From an economic point of view, it is less expensive to
have a dozen processing cores working together than one
stand-alone, state-of-the-art high-performance processor. It is
therefore a challenge to design the hardware architecture and
to implement algorithms in software for multiprocessor systems.
Currently we can see the emergence of multiple processor cores
combined in a single chip. In this way, higher performance can
be achieved at a lower clock frequency.
The partitioning of a multiprocessor program over multiple
cores is straightforward because the input specification is fully
parallel. A systematic transformation process converts the
parallel input specification into concurrent PicoBlaze programs.
The proposed parallel architecture consists of up to four
PicoBlaze soft cores, which Xilinx provides free of charge,
running a parallel algorithm.
PicoBlaze is chosen because it is a compact, capable, and
cost-effective fully embedded 8-bit RISC microcontroller core
optimized for several Xilinx FPGA families: Spartan-3, Virtex-II
and Virtex-II Pro. Compared to other microcontrollers it is
extremely flexible. Its basic functionality can be easily extended
by connecting additional FPGA logic to its input/output ports.
Spartan-3 has adopted the 18 Kbit Block RAM elements
previously seen in the Virtex-II devices. This enables KCPSM3
to support programs of up to 1024 locations, which overcomes the
most commonly encountered limit of KCPSM with Spartan-II(E).
PicoBlaze supports a program up to a length of 1024
instructions utilizing one block memory. Requirements for
larger program space are typically addressed by using multiple
KCPSM3 processors each with an associated block memory to
distribute the various system tasks.
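The relation between the 18 Kbit block memory and the 1024-instruction limit can be checked with a short calculation. The sketch below is only illustrative: it assumes the 18-bit instruction width documented in the KCPSM3 manual (this width is not stated in the text above), and the function names are hypothetical.

```python
# Assumption: KCPSM3 instructions are 18 bits wide (KCPSM3 manual).
INSTRUCTION_BITS = 18
BLOCK_RAM_BITS = 18 * 1024  # one 18 Kbit block RAM


def max_program_length(block_ram_bits=BLOCK_RAM_BITS,
                       instruction_bits=INSTRUCTION_BITS):
    """Number of instructions that fit in one block RAM."""
    return block_ram_bits // instruction_bits


def processors_needed(total_instructions, per_core=None):
    """Cores required when a larger program is split over several KCPSM3
    instances, each with its own block RAM."""
    per_core = per_core or max_program_length()
    return -(-total_instructions // per_core)  # ceiling division
```

Under these assumptions one block RAM holds exactly 1024 instructions, and a hypothetical 3000-instruction application would have to be distributed over three processors.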
The microcontroller effectively has 256 input ports and
256 output ports. The port being accessed is indicated by an
8-bit address value provided on 'PORT_ID'. The port address
can be specified in the program as an absolute value (pp), or
may be indirectly specified as the contents of any of the 16
registers ((sY)).
During an ‘INPUT’ operation the value provided at the
input port is transferred into any of the 16 registers. An input
operation is indicated by a pulse being output on the
READ_STROBE. It is not always necessary to use this signal
in the input interface logic, but it can be useful to indicate that
data has been acquired by the processor. During an
‘OUTPUT’, the contents of any of the 16 registers are
transferred to the output port. An output operation is indicated
by a pulse being output on the WRITE_STROBE. This strobe
signal will be used by the interface logic to ensure that only
valid data is passed to external systems. Typically,
WRITE_STROBE will be used as a clock enable or write enable.
Figure 1. PicoBlaze internal structure
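The port mechanism described above can be illustrated with a small behavioral model. This is a sketch, not the processor implementation: it is not cycle accurate, the class and attribute names are hypothetical, and strobes are recorded as log entries rather than as one-clock pulses.

```python
class PicoBlazePorts:
    """Behavioral model of the PicoBlaze I/O interface (not cycle accurate)."""

    def __init__(self):
        self.registers = [0] * 16   # the 16 general-purpose registers s0..sF
        self.inputs = {}            # port address -> value driven by external logic
        self.outputs = {}           # port address -> last value written
        self.strobes = []           # log of (strobe name, PORT_ID) pulses

    def _port_id(self, port, indirect):
        # Absolute form uses the constant pp; indirect form uses the
        # contents of register sY. PORT_ID is 8 bits wide (256 ports).
        address = self.registers[port] if indirect else port
        return address & 0xFF

    def input(self, sx, port, indirect=False):
        """INPUT sX, pp  or  INPUT sX, (sY)."""
        port_id = self._port_id(port, indirect)
        self.registers[sx] = self.inputs.get(port_id, 0) & 0xFF
        self.strobes.append(("READ_STROBE", port_id))

    def output(self, sx, port, indirect=False):
        """OUTPUT sX, pp  or  OUTPUT sX, (sY)."""
        port_id = self._port_id(port, indirect)
        self.outputs[port_id] = self.registers[sx]
        self.strobes.append(("WRITE_STROBE", port_id))
```

For example, `cpu.input(0, 0x10)` models `INPUT s0, 10`, and `cpu.output(0, 1, indirect=True)` models `OUTPUT s0, (s1)`, where the port address is taken from register s1.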
A Universal Asynchronous Receiver Transmitter (UART)
component is used to convert serial data to parallel data, and
parallel data to serial data. A UART is extremely useful when
dealing with serial input and output between a gate array and
another component. The serial data transferred into the UART
is placed on an output bus after the UART converts it into
parallel information [3]. This bus can then be used as input to
other logic in the gate array. The resulting data can then be
sent back out serially by using the UART again.
The UART component includes two main functions: parallel
to serial conversion, and serial to parallel conversion. The
receiving portion of the UART handles the serial to parallel
conversion, while the transmitting portion of the UART
handles the parallel to serial conversion.
The UART has two main functional blocks that can be thought
of as two separate circuits packaged in one component. The
UART includes a circuit for receiving serial information, and
a circuit for transmitting serial information. The receiver takes
in a byte of serial data transmitted to the RXD port, and
converts it into one byte of parallel information. This byte is
then placed on the DBOUT port. The transmitter takes a byte
of parallel information found on the DBIN port, and converts
it into a byte of serially transmitted data. This data is
transmitted on the TXD port. The UART, separated into its two
circuits, can be seen in Figure 2.
Figure 2. Receiver and Transmitter circuits.
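The two conversions can be sketched in a few lines. The frame format is an assumption, since it is not specified here: one start bit (0), eight data bits sent least significant bit first, no parity, and one stop bit (1). Baud-rate timing and oversampling are omitted, and the function names are hypothetical.

```python
def uart_transmit(byte):
    """Parallel to serial: bits placed on TXD for the byte found on DBIN.

    Assumed frame: start bit, 8 data bits (LSB first), stop bit.
    """
    data_bits = [(byte >> i) & 1 for i in range(8)]
    return [0] + data_bits + [1]


def uart_receive(bits):
    """Serial to parallel: rebuild the byte for DBOUT from one frame on RXD."""
    if len(bits) != 10 or bits[0] != 0 or bits[9] != 1:
        raise ValueError("framing error")
    return sum(bit << i for i, bit in enumerate(bits[1:9]))
```

Feeding the transmitter output back into the receiver returns the original byte, which mirrors the loop of receiving data, processing it in the gate array, and sending it out again through the same component.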
The Port_selector block, written in VHDL, allows the selection
of any 8-bit port out of the 256 ports that the PicoBlaze
microcontroller can access, depending on the address
generated. This block exploits the microcontroller's ability
to access a large number of 8-bit ports. Data transfer is
timed by the WRITE_STROBE signal on the rising edge of the
clock signal [2].
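The behavior of such a selector can be sketched as a bank of address-decoded registers. The model below is an assumption about how the block behaves, not a transcription of the VHDL source; the class name, method name, and port addresses are hypothetical.

```python
class PortSelector:
    """Behavioral model of an address-decoded bank of 8-bit output registers."""

    def __init__(self, addresses):
        # One 8-bit register per implemented port address, cleared at start.
        self.ports = {address: 0 for address in addresses}

    def clock(self, port_id, out_port, write_strobe):
        """Evaluate one rising clock edge with the current processor outputs.

        A register is updated only when WRITE_STROBE is active and PORT_ID
        matches one of the implemented addresses.
        """
        if write_strobe and port_id in self.ports:
            self.ports[port_id] = out_port & 0xFF
        return self.ports
```

Gating each register with WRITE_STROBE keeps the selected port stable between writes, so the external logic only sees values that the program explicitly output.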
Figure 3. Main module implementation
The main module acts as a smart sensor in the network and as
the initiator of the ZigBee network.
The main module, implemented with a PicoBlaze
microcontroller, sends serial data to the ZigBee Coordinators,
which send it over wireless connections to the second control
modules, which filter the information and send an ACK response to
The functionality of the MAIN module is given by the
PicoBlaze microcontroller. The program is stored in a ROM block
generated from the .PSM file.
First, some constants are set and some internal registers are
renamed, which makes the program easier to understand, upgrade
and service.
Next, some important parameters of the radio communication
connection are displayed. In this part of the program, these
parameters are initialized for start-up.
The initialization parameters are sent to the ZigBee module over
the serial port and displayed on the LCD.
At its first start, each coordinator device performs a
complete scan of all available channels, from 11 to 26, and
creates a network on the most noise-free channel. This
network is characterized by a 14-bit identifier (PAN ID) equal
to the 14 LSBs of the coordinator MAC address, which can be
altered by a proper command. At network creation, the
coordinator chooses, through a random number generator
algorithm, a 128-bit data encryption key.
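The start-up sequence of a coordinator can be summarized in a short sketch. It only illustrates the steps described above and does not reproduce the module firmware: the noise measurements are passed in as a hypothetical dictionary, the function names are invented for the example, and Python's `secrets` module stands in for the module's own random number generator.

```python
import secrets

CHANNELS = range(11, 27)  # channels 11 to 26


def select_channel(noise_by_channel):
    """Pick the channel with the lowest measured noise level."""
    return min(CHANNELS, key=lambda channel: noise_by_channel[channel])


def default_pan_id(mac_address):
    """Default PAN ID: the 14 least significant bits of the MAC address."""
    return mac_address & 0x3FFF


def new_network_key():
    """Random 128-bit (16-byte) data encryption key."""
    return secrets.token_bytes(16)


def form_network(mac_address, noise_by_channel):
    """Collect the parameters chosen by a coordinator at its first start."""
    return {
        "channel": select_channel(noise_by_channel),
        "pan_id": default_pan_id(mac_address),
        "key": new_network_key(),
    }
```

With a hypothetical MAC address of 0x0011223344556677, the default PAN ID is 0x2677, because only the lower 14 bits of the address are kept.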
When a remote device, Router or End Device, is started for
the first time, it scans all channels, without any constraint,
for a network to join.
When a remote device has joined, a Short address
(16 bit) is assigned to it, and this is how the network
packets are addressed.
The Short address of a Router stays the same unless a
device reset is performed.
The End Device changes its Short address according to
the Router it has joined (called "Father"). So if the End Device
leaves the radio range of its "Father" and joins another Router,
its Short address will change.
Likewise, if its Father is turned off and the End Device
finds another Router to join, a new Short address is
assigned to it [4].
Regardless of the particular physical synthesis flow
chosen, the steps required to translate the VHDL or EDIF
output from an RTL synthesis software program into a
physically downloadable bit file are essentially the same and
are listed below:
1. Translate
2. Map
3. Place
4. Route
5. Generate accurate timing models and reports
6. Create binary files for download to device
The proper functionality of the circuit was checked, as a
first step, by simulation with the ISE Simulator included in
the software package dedicated to programmable logic circuit
implementations. To run the simulations it was necessary to
generate a test bench file, which must check all situations
that may arise in a real circuit [5].
Figure 4. Project flow
In practice, the test bench is simply a VHDL model that
generates the required stimuli and checks the responses. This
can be done in such a way that the designer can view the
waveforms and manually check them, or by using VHDL constructs
to check the design responses automatically [5].
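The idea of automatic checking can be sketched outside VHDL as well. The example below is a behavioral analogy written in Python, not the test bench used in this project: the parity circuit, the reference model, and all function names are hypothetical and serve only to show stimuli generation and response comparison.

```python
def run_testbench(dut, reference, stimuli):
    """Apply each stimulus to the design under test and compare the
    response with a reference model. Returns the list of mismatches."""
    failures = []
    for stimulus in stimuli:
        actual = dut(stimulus)
        expected = reference(stimulus)
        if actual != expected:
            failures.append((stimulus, expected, actual))
    return failures


def parity_dut(value):
    """Hypothetical design under test: XOR reduction of an 8-bit value."""
    bit = 0
    for i in range(8):
        bit ^= (value >> i) & 1
    return bit


def parity_reference(value):
    """Independent reference model: parity from the count of set bits."""
    return bin(value & 0xFF).count("1") % 2
```

Running `run_testbench(parity_dut, parity_reference, range(256))` exercises every input combination and returns an empty list when all responses match, which corresponds to a test bench that reports no assertion failures.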
Synthesis is the key stage between high-level design and
the physical place and route, which is the final product of the
design flow. There are several different types of synthesis,
ranging from behavioral to RTL and finally physical synthesis.
Figure 5. Synthesis flow
The experimental platform, which is presented in Figure 6, is
built using a Spartan 3E Starter Board, four
Coordinator ZigBee Modules and four Router ZigBee
Modules. The main module controls all four ZigBee networks; each
network consists of one Coordinator module and one Router
module.
Each wireless sensor is connected to a solar panel, which serves
as its power supply.
Figure 7. Data read from wireless sensors connected to network.
The main module reads data from all four networks and
displays them on the LCD. At this stage of project development,
temperature and battery voltage are read from each wireless
sensor.
The application presented here is a component of a wider
project, which proposes the creation of a dedicated system
implemented with intelligent programmable logic circuits to be
used in assistance to persons with
Further development of the project involves:
Implementation of an algorithm for the regular transmission of
data on the patient's vital functions, in order to record the
evolution of the patient's health status and to prevent an
imbalance.
Figure 6. Experimental platform
Pengyuan Yu, Patrick Schaumont, "Executing hardware as parallel
software for PicoBlaze networks".
KCPSM3_Manual.pdf.
***, UART Component.pdf.
Design recipes for FPGAs.pdf.