Cost Effective and High Speed Design of DMA Memory Copy Accelerator

International Journal of Engineering Trends and Technology (IJETT) – Volume 6 Number 3- Dec 2013
Jhansirani Koti (1)*, G. Mani Kumar (2)
(1) PG Student (M. Tech), Dept. of ECE, Chirala Engineering College, Chirala, A.P., India.
(2) Assistant Professor, Dept. of ECE, Chirala Engineering College, Chirala, A.P., India.
Abstract: Direct memory access (DMA) is a feature of modern computers that allows certain
hardware subsystems within the computer to access system memory independently of the
central processing unit (CPU). Without DMA, when the CPU is using programmed input/output,
it is typically fully occupied for the entire duration of the read or write operation, and is thus
unavailable to perform other work. With DMA, the CPU initiates the transfer, does other
operations while the transfer is in progress, and receives an interrupt from the DMA controller
when the operation is done. This feature is useful any time the CPU cannot keep up with the
rate of data transfer, or where the CPU needs to perform useful work while waiting for a
relatively slow I/O data transfer. A Universal Asynchronous Receiver/Transmitter (UART) is a
piece of computer hardware that translates data between parallel and serial forms. The
"universal" designation indicates that the data format and transmission speeds are configurable
and that the actual electric signaling levels and methods are typically handled by a special
driver circuit external to the UART. A UART is usually an individual integrated circuit (or part
of one) used for serial communications over a computer or peripheral device serial port. UARTs
are now commonly included in microcontrollers. In this paper a UART IP soft core based on DMA
mode is proposed and elaborated using the characteristics of DMA. The entire UART IP soft core
in DMA mode mainly includes the following 5 sub-modules: UART send controller, UART receive
controller, register file with the Avalon-MM Slave interface, Master Read type DMA controller
with the Avalon-MM Master interface and Master Write type DMA controller with the Avalon-MM
Master interface. The design of the UART IP soft core based on DMA mode is simulated using the
Modelsim tool and synthesized using the Xilinx tool.
ISSN: 2231-5381, http://www.ijettjournal.org
1. Introduction
This paper presents a novel architecture of the Universal Asynchronous Receiver Transmitter.
UARTs are used for asynchronous serial data communication between remote embedded systems.
The UART is used for interfacing computers or microprocessors to an asynchronous serial data
channel. The receiver converts serial start, data, parity and stop bits to parallel data. The
transmitter converts parallel data into serial form and automatically adds start, parity and
stop bits. The data word length can be 5, 6, 7 or 8 bits. Parity may be odd or
even. Parity checking and generation can be inhibited. The stop bits may be one or two, or one
and one-half when transmitting a 5-bit code.
The UART can be used in a wide range of applications including modems, printers, peripherals
and remote data acquisition systems. Use of the Intersil advanced scaled SAJI IV CMOS process
permits operating clock frequencies up to 8.0 MHz (500K baud). Power requirements, by
comparison, are reduced from 300 mW to 10 mW. Status logic increases flexibility and simplifies
the user interface.
The Universal Asynchronous Receiver Transmitter (UART) is a popular and widely used device for
data communication in the field of telecommunication. There are different versions of UARTs in
the industry. Some of them contain FIFOs for receiver/transmitter data buffering and some of
them have a 9 data bits mode (Start bit + 9 Data bits + Parity + Stop bits). This application
note describes a fully configurable UART optimized for and implemented in a variety of Lattice
devices, which have superior performance and architecture compared to existing semiconductor
ASSPs (application-specific standard products). This design can also be instantiated many times
to get multiple UARTs in the same device. For easily embedding the design into a larger
implementation, instead of using tri-state buffers, the bi-directional data bus is separated
into two buses, DIN and DOUT. The transmitter and receiver both share a common internal Clk16X
clock. This internal clock, which needs to be 16 times the desired baud rate clock frequency,
is obtained directly from the on-board clock through the MCLK input.
This UART reference design contains a receiver and a transmitter. The receiver performs
serial-to-parallel conversion on the asynchronous data frame received from the serial data
input SIN. The transmitter performs parallel-to-serial conversion on the 8-bit data received
from the CPU. In order to synchronize the asynchronous serial data and to ensure data
integrity, Start, Parity and Stop bits are added to the serial data. An example of the UART
frame format is shown in Figure 1 below.
Figure 1 Basic Application of UART
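The frame format described above (start bit, 5 to 8 data bits, optional odd or even parity, stop bits) can be illustrated with a short behavioural sketch. This is not the paper's hardware description: the function name `build_uart_frame` and its parameters are hypothetical, the data bits are assumed to be sent least-significant bit first, and the one-and-one-half stop bit option is left out because this sketch works in whole bit times.

```python
def build_uart_frame(data, data_bits=8, parity=None, stop_bits=1):
    """Return the line levels of one UART character in transmission order.

    data      -- integer value to send
    data_bits -- word length, 5 to 8 bits
    parity    -- None (parity inhibited), "even" or "odd"
    stop_bits -- number of whole stop bits (1 or 2)
    """
    if not 5 <= data_bits <= 8:
        raise ValueError("data_bits must be 5, 6, 7 or 8")
    if not 0 <= data < (1 << data_bits):
        raise ValueError("data does not fit in the word length")
    if stop_bits not in (1, 2):
        raise ValueError("this sketch supports 1 or 2 stop bits")

    # Assumption: data bits go out least-significant bit first (D0 first).
    payload = [(data >> i) & 1 for i in range(data_bits)]

    frame = [0]                       # start bit: line driven low
    frame += payload
    if parity is not None:
        ones = sum(payload)
        if parity == "even":
            frame.append(ones % 2)        # total number of ones becomes even
        elif parity == "odd":
            frame.append((ones + 1) % 2)  # total number of ones becomes odd
        else:
            raise ValueError("parity must be None, 'even' or 'odd'")
    frame += [1] * stop_bits          # stop bit(s): line returns high
    return frame
```

For example, `build_uart_frame(0x55)` yields ten line levels: one start bit, eight data bits and one stop bit.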
Serial Data Transmission & Reception
The Universal Asynchronous Receiver/Transmitter (UART) takes bytes of data and transmits the
individual bits in a sequential fashion. At the destination, a second UART re-assembles the
bits into complete bytes. Each UART contains a shift register, which is the fundamental method
of conversion between serial and parallel forms. Serial transmission of digital information
(bits) through a single wire or other medium is much more cost effective than parallel
transmission through multiple wires.
The UART usually does not directly generate or receive the external signals used between
different items of equipment. Separate interface devices are used to convert the logic level
signals of the UART to and from the external signaling levels. External signals may be of many
different forms. Examples of standards for voltage signaling are RS-232, RS-422 and RS-485
from the EIA. Historically, current (in current loops) was used in telegraph circuits. Some
signaling schemes do not use electrical wires. Examples of such are optical fiber, IrDA
(infrared), and (wireless) Bluetooth in its Serial Port Profile (SPP). Some signaling schemes
use modulation of a carrier signal (with or without wires). Examples are modulation of audio
signals with phone line modems, RF modulation with data radios, and the DC-LIN for power line
communication.
Communication may be simplex (in one direction only, with no provision for the receiving
device to send information back to the transmitting device), full duplex (both devices send
and receive at the same time) or half duplex (devices take turns transmitting and receiving).
Receiver
All operations of the UART hardware are controlled by a clock signal which runs at a multiple
of the data rate. For example, each data bit may be as long as 16 clock pulses. The receiver
tests the state of the incoming signal on each clock pulse, looking for the beginning of the
start bit. If the apparent start bit lasts at least one-half of the bit time, it is valid and
signals the start of a new character. If not, the spurious pulse is ignored. After waiting a
further bit time, the state of the line is again sampled and the resulting level clocked into
a shift register. After the required number of bit periods for the character length (5 to 8
bits, typically) have elapsed, the contents of the shift register are made available (in
parallel fashion) to the receiving system.
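The start-bit check and mid-bit sampling described above can be sketched in software. The following is a minimal behavioural model, assuming a receiver clock of 16 samples per bit, least-significant bit first, and no parity or stop-bit checking; the names `oversample_bits` and `receive_character` are hypothetical and are not taken from the paper.

```python
def oversample_bits(bits, oversample=16):
    """Repeat each line level `oversample` times, as the receiver clock sees it."""
    return [level for level in bits for _ in range(oversample)]


def receive_character(samples, data_bits=8, oversample=16):
    """Decode the first character found in `samples`, or return None."""
    half = oversample // 2
    i = 0
    while i + half <= len(samples):
        if samples[i] == 1:              # line idle: keep looking for a start bit
            i += 1
            continue
        if any(samples[i:i + half]):     # low pulse shorter than half a bit time
            i += 1                       # spurious pulse: ignore it
            continue
        centre = i + half                # middle of the validated start bit
        value = 0
        for bit in range(data_bits):
            centre += oversample         # wait a further bit time
            if centre >= len(samples):
                return None              # ran out of samples mid-character
            value |= samples[centre] << bit   # shift the sampled level in
        return value
    return None
```

A usage example: build a frame by hand, expand it with `oversample_bits`, prepend some idle-high samples, and pass the result to `receive_character`. A low glitch of three samples ahead of the real start bit is skipped because it is shorter than half a bit time.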
The UART will set a flag indicating new data is available, and may also generate a processor
interrupt to request that the host processor transfers the received data.
The best UARTs "resynchronize" on each change of the data line that is more than a half-bit
wide. In this way, they reliably receive when the transmitter is sending at a slightly
different speed than the receiver. (This is the normal case, because communicating units
usually have no shared timing system apart from the communication signal.) Simplistic UARTs
may merely detect the falling edge of the start bit, and then read the center of each expected
data bit. A simple UART can work well if the data rates are close enough that the stop bits
are sampled reliably.
It is a standard feature for a UART to store the most recent character while receiving the
next. This "double buffering" gives a receiving computer an entire character transmission time
to fetch a received character. Many UARTs have a small first-in, first-out (FIFO) buffer
memory between the receiver shift register and the host system interface. This allows the host
processor even more time to handle an interrupt from the UART and prevents loss of received
data at high rates.
Transmitter
Transmission operation is simpler since it is under the control of the transmitting system. As
soon as data is deposited in the shift register after completion of the previous character,
the UART hardware generates a start bit, shifts the required number of data bits out to the
line, generates and appends the parity bit (if used), and appends the stop bits. Since
transmission of a single character may take a long time relative to CPU speeds, the UART will
maintain a flag showing busy status so that the host system does not deposit a new character
for transmission until the previous one has been completed; this may also be done with an
interrupt. Since full-duplex operation requires characters to be sent and received at the same
time, practical UARTs use two different shift registers for transmitted characters and
received characters.
DMA CONTROLLER
Data bytes from devices such as magnetic or optical disks arrive faster than they can be read
into memory by a software program. In such applications a dedicated hardware device called a
DMA controller is used to manage the data transfer. The DMA controller temporarily borrows the
data bus,
address bus and control bus from the microprocessor and transfers the data bytes directly from
the disk controller to a series of memory locations. Because the data transfer is handled
totally in hardware, it is much faster than it would be if done by program instructions. A DMA
controller can also transfer data from memory to a port. Some DMA devices can even do memory
to memory transfers to implement fast block transfers.
INTEL 8257 DMA CONTROLLER
The 8257 is a programmable, 4-channel direct memory access controller, in the sense that 4
peripherals can request data transfer. Each channel can transfer 16,384 (16K) bytes. The MPU
provides a 16-bit starting address, a 14-bit count for the number of bytes, the direction of
data transfer and control words to the 8257.
FUNCTIONAL DESCRIPTION
The 8257 is a programmable DMA device which, when coupled with a single Intel 8212 I/O port
device, provides a complete 4-channel DMA controller for use in Intel microcomputer systems.
After being initialized by software, the 8257 can transfer data to or from peripheral devices
directly, on receiving a DMA transfer request from an enabled peripheral.
The transfer of data is carried out in the following way. The 8257:
1) Acquires control of the system bus.
2) Acknowledges the requesting peripheral that is connected to the highest priority channel.
3) Outputs the LSB of the memory address onto system address lines A0-A7 and outputs the MSB
of the memory address to the 8212 I/O port via the data bus (the 8212 places these bits on
lines A8-A15).
4) Generates the appropriate memory and I/O read/write control signals that cause the
peripheral to receive or deliver a data byte directly from or to the addressed location in
memory.
The 8257 will retain control of the system bus and repeat the transfer sequence as long as a
peripheral maintains its DMA request. Thus the 8257 can transfer a block of data to/from a
high-speed peripheral in a single "burst". When the specified number of data bytes have been
transferred, the 8257 activates its terminal count (TC) output, informing the CPU that the
operation is complete.
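The transfer sequence above can be illustrated with a simplified software model of one DMA write burst (peripheral to memory). This is a sketch under stated assumptions, not a model of the actual 8257 register set: the function name `dma_write_burst` is hypothetical, the count is treated as a plain number of bytes, and bus arbitration and channel priority are not modelled.

```python
MAX_COUNT = 1 << 14          # 14-bit byte count: up to 16,384 bytes per channel


def dma_write_burst(peripheral, memory, start_address, count):
    """Copy up to `count` bytes from `peripheral` into `memory`.

    Returns True when the programmed count has been reached, which models
    the terminal count (TC) output being activated.
    """
    if not 1 <= count <= MAX_COUNT:
        raise ValueError("count must be in the range 1..16384")
    if not 0 <= start_address <= 0xFFFF:
        raise ValueError("start address must fit in 16 bits")

    address, remaining = start_address, count
    for byte in peripheral:               # one iteration per serviced DMA request
        if remaining == 0:
            break
        low = address & 0xFF              # driven on A0-A7 by the controller
        high = (address >> 8) & 0xFF      # latched by the 8212 onto A8-A15
        memory[(high << 8) | low] = byte  # I/O read, memory write
        address = (address + 1) & 0xFFFF
        remaining -= 1
    return remaining == 0
```

If the peripheral stops requesting before the count is exhausted, the function returns False, which corresponds to the burst ending without terminal count.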
MODES OF OPERATION
The 8257 can operate in three modes of operation: DMA write, DMA read and DMA verify. The two
data-transfer modes are:
a) DMA write
b) DMA read
a) DMA WRITE
In this mode, data is transferred from the peripheral device to the memory (that is, I/O read,
memory write).
b) DMA READ
In this mode, data is transferred from the memory to the peripheral device (that is, memory
read, I/O write).
2. Block diagram for UART-DMA
Figure 2 Block diagram of DMA controller
ABOUT ARCHITECTURE
The UART IP soft core is designed here using DMA transmission. Its overall architecture is
shown in Figure 2. The entire UART IP soft core in DMA mode mainly includes the following 5
sub-modules: UART send controller, UART receive controller, register file with the Avalon-MM
Slave interface, Master Read type DMA controller with the Avalon-MM Master interface and
Master Write type DMA controller with the Avalon-MM Master interface.
When the NIOSII processor sends data through the serial port, it must first configure the UART
send controller and the Master Read type DMA controller through the register file with the
Avalon-MM Slave interface, setting the baud rate of the serial port, the number of bytes of
data to be sent and the base address of the data stored in memory.
Secondly, the processor writes the data to be sent to the specified location in memory and
then starts the Master Read type DMA controller, so that the data stored in memory is sent out
byte by byte through the UART send controller. When all the data has been sent, an interrupt
is generated to the NIOSII processor to inform it that the transmission of serial data is
completed, so that the next data transmission can be started. Since the whole sending process
is managed by the Master Read type DMA controller, the NIOSII processor can concentrate on
other work without being disturbed, and the utilization of the NIOSII CPU increases greatly.
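The sending procedure above (configure, place data in memory, start the Master Read DMA, wait for the interrupt) can be sketched from the processor's point of view. The class below is a toy software model, not the soft core itself. The register offsets, the `CTRL_START_TX` bit and all names are hypothetical, since the register layout of the paper's Table 1 is not reproduced in this text.

```python
class UartDmaModel:
    """Toy model of the register file plus the Master Read DMA path."""

    # Hypothetical byte offsets of the four 32-bit registers.
    REG_BAUD, REG_LENGTH, REG_BASE, REG_CONTROL = 0x0, 0x4, 0x8, 0xC
    CTRL_START_TX = 0x1          # hypothetical "start Master Read DMA" bit

    def __init__(self, memory):
        self.memory = memory
        self.regs = {self.REG_BAUD: 0, self.REG_LENGTH: 0,
                     self.REG_BASE: 0, self.REG_CONTROL: 0}
        self.sent = []           # bytes handed to the UART send controller
        self.irq = False         # interrupt line to the processor

    def write_reg(self, offset, value):
        if offset not in self.regs:
            raise ValueError("unknown register offset")
        self.regs[offset] = value & 0xFFFFFFFF
        if offset == self.REG_CONTROL and value & self.CTRL_START_TX:
            self._master_read()

    def _master_read(self):
        base, length = self.regs[self.REG_BASE], self.regs[self.REG_LENGTH]
        for i in range(length):                  # read memory byte by byte
            self.sent.append(self.memory[base + i])
        self.irq = True                          # whole block has been sent


def send_block(dev, data, base, baud):
    """Processor-side steps for one DMA transmission; returns the IRQ state."""
    dev.irq = False
    dev.write_reg(dev.REG_BAUD, baud)            # step 1: configure the core
    dev.write_reg(dev.REG_LENGTH, len(data))
    dev.write_reg(dev.REG_BASE, base)
    dev.memory[base:base + len(data)] = data     # step 2: place data in memory
    dev.write_reg(dev.REG_CONTROL, dev.CTRL_START_TX)  # start the DMA
    return dev.irq
```

In this model the DMA completes inside `write_reg`, so the interrupt flag is already set when `send_block` returns. In hardware the processor would continue with other work and be notified later.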
When the NIOSII processor needs to receive data through the serial port, it must first
configure the UART receive controller and the Master Write type DMA controller through the
register file with the Avalon-MM Slave interface, setting the baud rate of the serial port,
the number of bytes of data to be received and the base address at which the data will be
stored in memory. Secondly, it starts the Master Write type DMA controller, so that the data
received through the UART receive controller is stored byte by byte at the specified location
in memory. When all the data has been received, an interrupt is generated to the NIOSII
processor to inform it that the reception of serial data is completed, so that it can read the
received data from memory for processing and start the next data transmission. Since the whole
reception process is managed by the Master Write type DMA controller, the NIOSII processor can
concentrate on other work without being disturbed, and the utilization of the NIOSII CPU
increases substantially.
3. DESCRIPTION OF MODULES
A. UART Tx controller Design:
Serial port transmission uses the basic frame format. First, the line is driven low for the
start bit; then, under the control of the baud rate clock, the 8 data bits are sent from D0 to
D7; finally the line is driven high for the stop bit. In this paper, the UART sending
controller is designed as a finite state machine in the Verilog HDL hardware description
language, which completes the timing control of the serial port data transmission. Its state
transition diagram is shown in Figure 3.
Figure 3 The state transition diagram of the UART sending controller
As can be seen from Figure 3, the state machine is initially in the idle state. When the
Master Read type DMA controller starts a data transmission, the state machine moves into the
data_valid state. The three states data_valid, read_fifo and load are mainly used to access a
byte that has been read from memory by the Master Read type DMA controller, splice it together
with the start bit and stop bit, and send it to the shift register. After that, the state
machine enters the send state. The two states send and finish are mainly used to send the data
in the transmit shift register bit by bit under the control of the serial port baud clock.
When the process of sending the data in the transmit shift register is completed, the state
machine enters the block_finish state.
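The state sequence of the sending controller can be sketched as a small software state machine. The state names follow this section's description (idle, data_valid, read_fifo, load, send, finish, block_finish, master_done). Everything else is an assumption made for illustration: the function name `run_tx_fsm`, the exact work done in each state, and the way the byte counter is compared in block_finish. The paper implements this in Verilog HDL; Python is used here only to show the control flow.

```python
def run_tx_fsm(block):
    """Step through one DMA block; return (line_bits, state_trace, irq)."""
    state, sent, irq = "idle", 0, False
    byte, shift_reg, line, trace = None, [], [], []
    if not block:
        return line, trace, irq          # nothing to send: remain idle

    while not irq:
        trace.append(state)
        if state == "idle":
            state = "data_valid"         # Master Read DMA has started a transfer
        elif state == "data_valid":
            state = "read_fifo"          # a byte fetched from memory is available
        elif state == "read_fifo":
            byte = block[sent]           # take the byte from the DMA side
            state = "load"
        elif state == "load":
            # Splice start bit (0), D0..D7 and stop bit (1) into the shift register.
            shift_reg = [0] + [(byte >> i) & 1 for i in range(8)] + [1]
            state = "send"
        elif state == "send":
            line.append(shift_reg.pop(0))    # one bit per baud clock
            if not shift_reg:
                state = "finish"
        elif state == "finish":
            state = "block_finish"
        elif state == "block_finish":
            sent += 1                    # one more byte has gone out
            state = "data_valid" if sent < len(block) else "master_done"
        elif state == "master_done":
            irq = True                   # DMA transfer complete: raise interrupt
            state = "idle"
    return line, trace, irq
```

Calling `run_tx_fsm(b"\x55\x01")` walks through the byte loop twice and ends in master_done, with the interrupt flag set and twenty line levels produced.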
In this state, the state machine checks the number of bytes that have been sent. If this
number is less than the number of bytes that should be sent, not all the data has been sent
yet, so the count is incremented by 1 and the state machine enters the data_valid state to
read and send the next data byte. The state machine does not enter the master_done state until
all the data bytes have been sent. In the master_done state, the state machine detects whether
this DMA transfer is completed. If it is, the state machine generates an interrupt signal and
returns to the idle state. At this point, a full serial data transmission in DMA transfer mode
is complete.
B. UART receiver controller Design:
Serial port reception uses the basic frame format. First, the low level of the start bit is
detected. Then the data byte is received bit by bit under the control of the baud rate clock.
Finally, the high level of the stop bit is received. In this paper, the UART receiver
controller is designed as a finite state machine in the Verilog HDL hardware description
language, which completes the timing control of the serial port data reception. Its state
transition diagram is shown in Figure 4.
Figure 4 The state transition diagram of the UART receiver controller
As can be seen from Figure 4, the state machine is in the idle state at the beginning. When
the Master Write type DMA controller is started to receive data, the state machine enters the
start state. The two states start and ready are mainly used to clear the receiver shift
register and the bit counter and to prepare for receiving a byte of data. When it is ready,
the state machine enters the recv state. The two states recv and finish are mainly used to
receive the byte of data bit by bit under the control of the serial port baud rate clock and
store it in the reception shift register. When the reception of a byte of data is completed,
the state machine enters the load state. The two states load and buffer_ready are mainly used
to move the data byte to the Master Write type DMA controller so that it can be written to
memory. Then the state machine enters the block_finish state. In this state, the state machine
checks the
number of bytes that have been received. If this number is less than the number of bytes that
should be received, not all the data has been received yet, so the count is incremented by 1
and the state machine enters the ready state to receive the next data byte and send it to the
Master Write type DMA controller. The state machine does not enter the master_done state until
all the data bytes have been received. The two states master_done and get_done detect whether
this DMA transfer is completed. If it is, the state machine generates an interrupt signal and
returns to the idle state. At this point, a full serial data reception in DMA transfer mode is
complete.
C. The design of the register file with the interface of Avalon-MM Slave:
The AVALON bus is an open interconnect bus that can be used to connect master (main)
peripherals and slave (minor) peripherals. A master peripheral can initiate bus transfers on
the AVALON bus, while a slave peripheral can only respond to bus transfers. A master
peripheral connects to the AVALON bus using the Avalon-MM Master interface, while a slave
peripheral uses the Avalon-MM Slave interface. The register file designed in this paper is a
peripheral with the Avalon-MM Slave interface. It contains a total of four 32-bit registers,
whose specific structure and function are shown in Table 1. The NIOSII processor accesses
these 4 registers by base address plus address offset, and through them controls the
configuration of the UART IP soft core in DMA mode as well as the reception and transmission
of the serial data.
D. The design of the Master Read type of DMA controller with the interface of Avalon-MM Master:
The Master Read type DMA controller designed in this paper is a peripheral with Avalon-MM
Master ports. It performs basic read transfers through the switching fabric between the
Avalon-MM Master ports and the AVALON bus, so that it can read the specified length of data
from memory, beginning at the specified starting address, and send it byte by byte to the UART
send controller.
E. The design of the Master Write type of DMA controller with the interface of Avalon-MM Master:
The Master Write type DMA controller designed in this paper is a peripheral with Avalon-MM
Master ports. It performs basic write transfers through the switching fabric between the
Avalon-MM Master ports
and the AVALON bus, so that it can continuously store the specified length of data received
from the UART receive controller in memory, beginning at the specified starting address.
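The behaviour of the Master Write type DMA controller described above (store a specified number of received bytes at consecutive addresses from a specified base, then signal completion) can be sketched as follows. The class name `MasterWriteDma` and its `push` method are hypothetical and stand in for the hardware handshake with the UART receive controller. The Avalon-MM bus protocol itself is not modelled.

```python
class MasterWriteDma:
    """Toy model: write `length` received bytes to memory starting at `base`."""

    def __init__(self, memory, base, length):
        if length <= 0 or base < 0 or base + length > len(memory):
            raise ValueError("block does not fit in memory")
        self.memory = memory
        self.base = base
        self.length = length
        self.offset = 0          # number of bytes stored so far
        self.done = False        # True models the completion interrupt

    def push(self, byte):
        """Accept one byte from the UART receive controller.

        Returns True once the whole block has been stored.
        """
        if self.done:
            raise RuntimeError("transfer already complete")
        self.memory[self.base + self.offset] = byte & 0xFF
        self.offset += 1
        self.done = self.offset == self.length
        return self.done
```

Each call to `push` corresponds to one byte handed over by the receive state machine; the return value becomes True only for the last byte of the block.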
4. Results & Conclusions
The entire UART IP soft core in DMA mode is designed with the following 5 sub-modules: UART
send controller, UART receive controller, register file with the Avalon-MM Slave interface,
Master Read type DMA controller with the Avalon-MM Master interface and Master Write type DMA
controller with the Avalon-MM Master interface.
The write and read operations of the UART IP soft core based on DMA mode are simulated using
the Modelsim tool and the design is synthesized using the Xilinx tool. From the reports it can
be concluded that the proposed design using the UART can achieve better speed performance, up
to the GHz range. It is therefore a cost effective solution where cost is the major issue.
Figures 5 and 6 show the simulation results of the read and write operations of the designed
DMA cycle. Figures 7 and 8 show the block diagram and RTL schematic of the designed DMA unit.
Figure 5 Simulation Result for Top Module Read operation
Figure 6 Simulation Result for Top module Write operation
Figure 7 Block diagram of UART_DMA soft core
Figure 8 RTL schematic of UART_DMA soft core
Acknowledgements
The authors would like to thank the anonymous reviewers for their comments, which were very
helpful in improving the quality and presentation of this paper.
Authors Profile:
Jhansirani Koti is pursuing her M. Tech from Chirala Engineering College, Chirala, in the
department of Electronics & Communications Engineering (ECE) with specialization in VLSI &
Embedded systems.
G. Mani Kumar is working as an Assistant Professor in the Department of ECE in CEC Chirala. He
has 5 years of Teaching Experience and 1 year Industrial Experience in various organizations.