Demonstration System for A Low ... Compression Integrated Circuit Charatpong Chotigavanich

Demonstration System for A Low Power Video
Compression Integrated Circuit
by
Charatpong Chotigavanich
Submitted to the Department of Electrical Engineering and Computer
Science
in partial fulfillment of the requirements for the degree of
Master of Engineering
at the
MASSACHUSETTS INSTITUTE OF TECHNOLOGY
February 2000
© Charatpong Chotigavanich, MM. All rights reserved.
The author hereby grants to MIT permission to reproduce and
distribute publicly paper and electronic copies of this thesis document
in whole or in part.
Author ...........................
Department of Electrical Engineering and Computer Science
January 20, 2000
Certified by .................................
Anantha P. Chandrakasan
Associate Professor of Electrical Engineering
Thesis Supervisor
Accepted by ...............................
Arthur C. Smith
Chairman, Department Committee on Graduate Theses
Demonstration System for A Low Power Video Compression
Integrated Circuit
by
Charatpong Chotigavanich
Submitted to the Department of Electrical Engineering and Computer Science
on January 20, 2000, in partial fulfillment of the
requirements for the degree of
Master of Engineering
Abstract
This thesis demonstrates a video compression integrated circuit that consumes
ultra low power. The system digitizes an analog video signal and compresses
it using the video compression integrated circuit, which uses the wavelet transform and
zero-tree coding algorithm to achieve a high compression ratio at ultra low power.
The compressed data is then sent to a PC, where it is decoded and played as a movie
in real time.
Thesis Supervisor: Anantha P. Chandrakasan
Title: Associate Professor of Electrical Engineering
Acknowledgments
My thesis could not be accomplished without the following people.
First of all, I would like to thank Prof. Anantha Chandrakasan for agreeing to
supervise this thesis. I greatly appreciate his help, suggestions, and also patience.
Throughout the term, he has been an excellent consultant who gave me advice on
not only academic issues but also life after school in general.
I also would like to thank Rex Min for his great assistance with almost everything.
I learned a lot from his past work and his hands-on experience. He has been a great
instructor who leaves his desk at his busiest time of the day just to help me figure
out some minor bugs. This thesis would not even exist without him.
I thank Jim MacArthur. He is the guy who actually educated me and taught me
the real-world engineering lessons. In addition, Jim is a kind of entertainer whose
characters provide a great relief for me in the lab. We share a lot of thoughts about
many projects, business, laws, and even startups. If I ever become a millionaire in
the future, this is the man I will give my first million to.
I would like to thank Keith Fife for lending me an ISR cable to program CPLDs.
He also helped me with the board layout and other circuit problems.
I also thank Thomas Simon for his advice and his patience. Although I was
literally an annoyance to him during debugging, he was still calm and continued to
help me out.
I am thankful to many people in the lab. They have been very nice and helpful in
general. Thanks to: Alice Wang, Raj Amirtharajah, Vadim Gutnik, and Jim Goodman.
I thank all my friends who spent time with me when I was not in the lab. Thanks to
Sandia Ren, Laurie Qian, and Duncan Bryce, who also helped me review this thesis.
Thanks to Nuwong Chollacoop for his help in moving my stuff to my new apartment
while I was busy with my thesis.
I owe tremendous gratitude to Preeyanuch Sangtrirutnugul. I could actually say
that she is a co-author of this thesis. After all, she stayed up with me until 7am
every day, making sure that I wouldn't do something foolish.
Finally, I am grateful to my parents and my family for their continuing support
throughout my entire life. Without them, I would not even be writing this thesis.
If I have left out anyone I should be thankful to, I would like to apologize here.
Contents
1 Introduction
  1.1 Background on Low-Power Video Processing
  1.2 System Overview
  1.3 Design Considerations
2 Hardware System
  2.1 High-Level Block Diagram
  2.2 Image EPROM
  2.3 Input Frame Buffer
  2.4 Video Compression Integrated Circuit (EZW Chip)
  2.5 EZW Output Frame Buffer
    2.5.1 Bit Packing Finite State Machine
    2.5.2 Synchronizer Finite State Machine
    2.5.3 Read/Write Buffer Switch
  2.6 Parallel Port Interface
  2.7 Complex Programmable Logic Devices (CPLD)
3 Software System
  3.1 Direct Memory Access Device Driver
  3.2 Decoder
  3.3 Video Player
  3.4 User Interface (UI)
4 System Implementation
  4.1 Hardware System
  4.2 Software System
5 System Performance
6 Conclusion and Future Improvement
  6.1 Conclusion
  6.2 Ideas for the Future
Bibliography
A Schematic Diagrams
B VHDL Code
  B.1 VHDL Code for CPLD 1
    B.1.1 i2c.vhd
    B.1.2 ntsc.vhd
    B.1.3 ezw-sram-drive.vhd
    B.1.4 top-a.vhd
  B.2 VHDL Code for CPLD 2
    B.2.1 sram-switch.vhd
    B.2.2 ezw-out.vhd
    B.2.3 parallel-sram.vhd
    B.2.4 top-b.vhd
C Decoder Code
  C.1 Decoder Code in C
    C.1.1 StdAfx.h
    C.1.2 StdAfx.cpp
    C.1.3 vdodma3.h
    C.1.4 vdodma3.cpp
    C.1.5 resource.h
    C.1.6 makefrm.c
List of Figures
2-1 Overview block diagram of the hardware system
2-2 Timing diagram of output signals from Bt829A
2-3 An example of a buffer using a one-port SRAM
2-4 A mechanism used to prevent image corruption
2-5 Finite state machine of the left control logic module
2-6 Finite state machine of the right control logic module
2-7 How a frame is loaded
2-8 Addressing scheme of the input buffer
2-9 Combinational logic which controls the tri-state outputs of SRAM buffer and image EPROM
2-10 Timing Diagram of Input Signals to Video Compression Integrated Circuit
2-11 Timing Diagram of Output Signals from Video Compression Integrated Circuit
2-12 64MHz to 500KHz clock divider
2-13 Schematic Diagram of the Output Frame Buffer Controller
2-14 Bit Packing Finite State Machine
2-15 Bit Storage Format
2-16 An alternative of how to store a bit
2-17 State diagram of parts of Read/Write Buffer Switch
2-18 Schematic Diagram of parts of Read/Write Buffer Switch
2-19 Timing Diagram of the Parallel Port Protocol in ECP mode
2-20 State diagram of finite state machine that controls the parallel port interface
2-21 Logic blocks in the first CPLD
2-22 Logic blocks in the second CPLD
3-1 Block Diagram of the Software System
3-2 Double frame buffer
3-3 The Working System
4-1 The front side of the unpopulated PCB
4-2 The back side of the unpopulated PCB
4-3 The Finished PCB with all components in place
A-1 Schematic Diagram of Analog to Digital Converter and Its Control Logic
A-2 Schematic diagram of EZW chip
A-3 Three instruction EPROMs for the EZW chip
A-4 Schematic diagram of the image EPROM
A-5 Schematic diagram of the SRAM buffer and its control logic modules
A-6 Schematic Diagram of i2c programmer
A-7 Schematic Diagram of Output Frame Buffer
A-8 Schematic Diagram of Parallel Port Interface
A-9 Schematic Diagram of Reset Control Logic
List of Tables
2.1 Bt829A pin descriptions
2.2 Pin descriptions of the EZW chip
2.3 Mapping Between Centronics pinouts and D-SUB pinouts
5.1 The relationship between the number of frames decoded and the latency
5.2 The performance of this EZW video chip
Chapter 1
Introduction
1.1 Background on Low-Power Video Processing
Video compression has been an important research topic for the past several years,
and a large number of algorithms and hardware devices have been developed as a result.
These algorithms and devices usually target better image quality and a higher
compression ratio. Many video compression algorithms were developed for desktop
computers, where the power required for computation is not a major concern. However,
as portable applications have become widespread, researchers have recognized power
consumption as an important issue, and the need for low-power algorithms and hardware
has become unavoidable. For portable video devices, data compression is one obvious
way of saving power. Data compression reduces not only the capacity required for data
storage but also power consumption, because there is less data to be transmitted.
A wireless camera is a clear example of a system that benefits from data compression:
the RF transmitter consumes less power when it sends a smaller amount of data through
a wireless network.
Thus, many algorithms for video data compression, such as JPEG and MPEG, have been
developed. Most fall under the category of "lossy compression," where image quality
is traded for a higher compression ratio. However, having a powerful algorithm is not
enough for a low-power application. How an algorithm is implemented in hardware
actually accounts for most of the power consumption.
There are many general purpose processors (GPPs) capable of running virtually any
video compression algorithm. But those GPPs usually consume too much power, and
portable devices which use such processors, e.g. laptop computers, usually have large
batteries and require lengthy recharging almost daily. Therefore, a low-power
integrated circuit is a necessity for this video application.
This thesis demonstrates the performance of a "wavelet transform and zero-tree
coding" video compression integrated circuit which consumes little power and yields
a high compression ratio. Designed by Thomas Simon as a part of his PhD thesis[4],
the chip is a massively parallel SIMD processor which uses the wavelet transform and
zero-tree coding algorithm[2] to compress and encode digital video data. The parallel
nature of the processor is key to achieving low power. When parallelism is introduced
into a system, the computation can be done faster. But for an application whose
input or output is stream data with a fixed rate of arrival or departure, finishing
early brings no benefit. Instead, the power supply voltage can be reduced, which slows
the circuit just enough that the parallel system still meets the required rate. At
the lower supply voltage, the circuit consumes less power. In this way, the speed-up
from parallelism is traded for power savings. Using this technique, the core of this
video compression chip operates at 1.5V, consumes 300-400 µW, and yields compression
ratio up to 300:1 for acceptable image quality.
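The voltage-scaling argument above can be illustrated with the standard first-order
model of CMOS dynamic power, P = a * C * V^2 * f. The sketch below is only an
illustration under stated assumptions: the function names, the 3.3V reference supply,
and the capacitance and frequency placeholders are hypothetical and do not come from
the chip's design; only the 1.5V core voltage is taken from the text. It also assumes
that N parallel units have N times the switched capacitance and can each run at 1/N
of the original clock rate at the lower voltage.

```python
def dynamic_power(c_farads, vdd_volts, f_hertz, activity=1.0):
    """First-order CMOS dynamic power: P = a * C * V^2 * f."""
    return activity * c_farads * vdd_volts ** 2 * f_hertz


def parallel_power_ratio(n_units, v_ref, v_scaled):
    """Power of n_units parallel units at v_scaled, relative to one unit at v_ref.

    Assumption: the parallel design switches n_units times the capacitance,
    but each unit is clocked n_units times slower, so throughput is unchanged.
    """
    c, f = 1e-9, 1e6  # arbitrary placeholders; they cancel in the ratio
    p_ref = dynamic_power(c, v_ref, f)
    p_par = dynamic_power(n_units * c, v_scaled, f / n_units)
    return p_par / p_ref


if __name__ == "__main__":
    # Hypothetical comparison: 3.3V reference versus the 1.5V core voltage.
    print(parallel_power_ratio(4, 3.3, 1.5))
```

Under this simple model the ratio depends only on the square of the voltage ratio;
real designs also pay for routing and control overhead, which the sketch ignores.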
1.2 System Overview
The system presented in this thesis is a tool for demonstrating how much
power can be saved by the parallel architecture. Its function is to
digitize and compress a real-time video signal from a camera, send the data to a PC,
decode the data, and finally play the video on the PC's monitor. The entire
process is performed in real time with a latency of a few seconds.
This demonstration system comprises hardware and software. The hardware
is a 7.5"x7.5" printed circuit board carrying the "wavelet transform and zero-tree
coding" video compression integrated circuit and several other electrical components.
The board handles NTSC signal digitization, digital video compression, and data
transmission to a PC.
The software part of this demonstration is developed for the Microsoft Windows
platform and runs on Windows 95, Windows 98, and Windows NT. It is responsible for
data reception from the board, data decompression, and video playback.
1.3 Design Considerations
* Ease of Use. Since this system will be frequently employed to demonstrate the
performance of the video compression integrated circuit, the system has to be
easy to operate. The hardware is designed to have a few controls on the circuit
board, and the user-interface of the software is developed to be user-friendly.
* Ease of Debugging. The system is designed with modularity. There are
several small modules in this circuit and they are abstracted away from one
another. The interface between each module is consistent throughout the design
process. In addition, the circuit board has numerous accessible ports for logic
probes for debugging. The electrical components on the board are placed on
sockets so that they can be changed if damaged.
* Extensibility. This demonstration can potentially be a submodule of another
system, such as a wireless camera; therefore, it has to be extensible. The circuit
board utilizes surface-mounted complex programmable logic devices(CPLD)
which can be programmed on-board. These CPLDs provide great flexibility and
extensibility to modify the code in the future. The circuit also uses EPROMs
for data storage so the contents can be changed if necessary.
* Small Printed Circuit Board Area. The area of the printed circuit board
is minimized. All electrical components are placed tightly next to one another
to achieve minimal board area.
Chapter 2
Hardware System
This chapter discusses the hardware part of this system, which handles NTSC signal
digitization, digital video compression, and data transmission to a PC. The
architecture of the system is divided into several circuit modules. The interface
specifications between modules are kept consistent throughout the design process so
that an internal change of one module does not affect the others. The modular design
of the hardware reduces the possibility of bugs and greatly speeds up the design process.
Figure 2-1: Overview block diagram of the hardware system (blocks shown: NTSC video
source, A2D, input buffer, EPROM, video compression integrated circuit, output buffer,
and parallel port interface, with the A2D controller, buffer controllers, EPROM
controller, and video IC controller implemented in the CPLDs)
2.1 High-Level Block Diagram
Figure 2-1 illustrates a high-level block diagram of the hardware system. The system
can receive video input from two sources. One is an analog video signal from an NTSC
video source, and the other is a digital video signal programmed on an EPROM. The
analog input is digitized into a digital video stream, which is buffered and
appropriately formatted for further processing. The digital video input from the
EPROM, on the other hand, requires neither analog-to-digital conversion nor frame
buffering, and thus can be processed right away. Selected by
a controller, one of these two sources is then passed on to the video compression
integrated circuit, where the compression and encoding operations take place. The
output, a stream of bits, is again buffered before it is sent to a PC through a
parallel port interface circuit which governs the transmission process.
The A2D converter receives an NTSC signal from an NTSC source and produces an
8-bit gray-scale image that can be further processed. The NTSC source can be any
device, such as a video player or video camera, which provides a standard NTSC video
signal. Because NTSC is a widespread standard in North America, using the NTSC
interface allows this system to accept input from a variety of video devices.
The A2D module is built around the Bt829A video decoder chip, and its design is a
slightly modified version of the one from [3]. More details about the Bt829A can be
found in [3] and [8].
Manufactured by Rockwell Semiconductor, the Bt829A chip is a widely used video
decoder in several video appliances including personal computers. In addition to its
ease of use, the chip supports a variety of video signal formats, such as NTSC and
PAL. It is also capable of adjusting frame size, frame resolution, and zooming. In
short, this chip is very powerful, versatile, and cheap. Table 2.1, replicated from [3],
shows the pin descriptions of this chip.
Pin         Input/Output   Description
YIN         Analog Input   NTSC video signal input.
SCL, SDA    I/O            Clock and data lines for I2C serial bus for device
                           programming.
I2CCS       I              LSB of 8-bit I2C device address.
RST         I              Reset.
XTOI        I              NTSC clock (28.636 MHz).
VD[15..8]   O              Digitized luminance output in 8-bit mode.
VD[7..0]    O              Digitized luminance output in addition to
                           VD[15..8] in 16-bit mode.
DVALID      O              Data Valid. High when valid data (image pixel
                           or blanking) is being output. Low during
                           blanking intervals or when no pixel is output due
                           to scaling.
ACTIVE      O              Active Video. High when an active image area is
                           being output.
VACTIVE     O              Vertical blanking. Low during active vertical lines.
FIELD       O              Odd/even field indication. "1" denotes that
                           an odd field is being digitized.
HRESET      O              Horizontal Reset. Falling edge denotes new
                           horizontal scan line.
VRESET      O              Vertical Reset. Falling edge denotes a new field.
QCLK        O              "Qualified Clock." Gated such that edges occur
                           only when valid, active image pixels are being
                           output.
CLKx1       O              14.31818 MHz clock output. All output signals
                           are timed with respect to this clock.
OE          I              Tri-state control for certain outputs.
Table 2.1: Bt829A pin descriptions
(a) Signals shown: VD[15..0], DVALID, ACTIVE, CLKx1. Pixels are valid when DVALID
and ACTIVE are both high. All signals are synchronized to CLKx1, which is
14.31818MHz. The falling edge of DVALID signifies the new field is being output.
(b) Signals shown: HRESET, DVALID, ACTIVE. This timing diagram, a zoomed-out view
of (a), displays the relationship between HRESET, DVALID, and ACTIVE.
(c) Signals shown: VRESET, HRESET, VACTIVE. This timing diagram, a zoomed-out view
of (b), shows the relationship between VRESET, HRESET, and VACTIVE.
Figure 2-2: Timing diagram of output signals from Bt829A
The Bt829A digitizes the analog NTSC signal from input YIN. Synchronized to CLKx1,
the digital outputs VD[15..8] and VD[7..0] represent luminance and chrominance
respectively. Because this system only needs 8 bits of data, the chrominance
information is ignored, and only VD[15..8] is used, as shown in Figure A-1. The timing
diagram of the output signals is illustrated in Figure 2-2.
Since the Bt829A requires programming upon startup, a control logic module provides
a programming interface to the chip using the I2C protocol[7]. This control logic is
implemented on a CPLD to allow flexible design and ease of debugging. It is a finite
state machine which programs the Bt829A through its SCL and SDA pins with the
desired parameters. The implementation of the FSM was obtained from [3] with some
parameters adjusted for this demonstration system. Several sets of parameters yield
128x128 output frames, and each set results in different image size and quality.
In this demonstration system, two sets of parameters were tested. The first set
programs the Bt829A to digitize a frame at resolution 256x256 and then scale it
down to 128x128 both vertically and horizontally. This method creates jagged
horizontal lines in every frame because the Bt829A does not scale interlaced frames
very well. The other set of parameters eliminates this problem by using non-interlaced
mode or decimating the even fields of the digitized frame, so that the frame has
128x256 video resolution and is then scaled down horizontally to 128x128. As a result,
the second parameter set yields much better and clearer video output.
2.2 Image EPROM
One of the significant challenges in implementing a system that involves real-time
data is debugging. In a system whose input is real-time and nondeterministic, the
output is also nondeterministic. It is exceptionally difficult to determine whether or
not the system is operating correctly from nondeterministic output because there
is nothing to compare the output to. A good method for debugging such a system
is to feed it known test vectors and check whether the output is as expected. The
image EPROM serves precisely this purpose. The schematic diagram of the image
EPROM is shown in Figure A-4.
This EPROM stores the digital pixels of a video frame. It provides an alternative
input to the NTSC signal, as mentioned in the previous section. The data to
be programmed on the EPROM is extracted from an 8-bit 128x128 black-and-white
PGM image. The PGM image format is basically an array of raw digital pixels preceded
by a text header which specifies the dimensions of the image and the gray-scale
depth. The header is simply skipped to extract the raw digital pixels. A program
written in C converts the PGM image into the appropriate size and format
for the EPROM. This program is included in the Appendix.
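The header-skipping step described above can be sketched as follows. This is not the
C program from the Appendix; it is a minimal Python illustration that assumes the
image is a binary ("P5") PGM with one byte per pixel, and the function name
read_pgm_pixels is hypothetical. The sketch returns the raw pixel bytes that would
then be formatted for the EPROM programmer.

```python
def read_pgm_pixels(data):
    """Parse a binary (P5) PGM held in `data` (bytes).

    Returns (width, height, maxval, pixels), where pixels is the raw
    one-byte-per-pixel payload that follows the text header.
    """
    tokens = []
    pos = 0
    while len(tokens) < 4:
        # Skip whitespace between header tokens.
        while data[pos:pos + 1].isspace():
            pos += 1
        # Skip comment lines, which start with '#'.
        if data[pos:pos + 1] == b"#":
            while data[pos:pos + 1] not in (b"\n", b""):
                pos += 1
            continue
        start = pos
        while pos < len(data) and not data[pos:pos + 1].isspace():
            pos += 1
        tokens.append(data[start:pos])
    pos += 1  # exactly one whitespace byte separates the header from the pixels
    magic = tokens[0]
    width, height, maxval = int(tokens[1]), int(tokens[2]), int(tokens[3])
    if magic != b"P5" or maxval > 255:
        raise ValueError("expected an 8-bit binary PGM (P5)")
    pixels = data[pos:pos + width * height]
    if len(pixels) != width * height:
        raise ValueError("pixel data is shorter than the header claims")
    return width, height, maxval, pixels


if __name__ == "__main__":
    tiny = b"P5\n# 2x2 test image\n2 2\n255\n" + bytes([0, 64, 128, 255])
    print(read_pgm_pixels(tiny))
```

For the demonstration board, the expected input would be a 128x128 image, giving
16384 pixel bytes.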
2.3 Input Frame Buffer
The input frame buffer is simply an SRAM used to ensure that pixels are delivered in
the appropriate order from the A2D to the video compression chip. Furthermore, to
prevent video image corruption, the buffer also absorbs the difference between
the output rate of the A2D and the input rate of the video compression chip. The
image EPROM, however, does not need this buffer because an EPROM is itself a
pre-programmed buffer. The pixels from the image EPROM are sent directly to the
video compression chip without buffering. Eliminating the buffering step for the
EPROM makes the datapath less complicated and, therefore, more efficient.
There are several ways to implement a buffer. One typical way is displayed
in Figure 2-3. In this implementation, the controller determines when to read and
write the SRAM using two tri-state buffers. The controller cannot read and write
simultaneously because there is only one data port. Although this implementation might
be sufficient, the controller has to operate at a very high frequency to keep up with
the data rate.
Figure 2-3: An example of a buffer using a one-port SRAM (a controller drives the
ADDR and R/W lines of the SRAM, and the WRITE DATA and READ DATA paths share the
SRAM's single I/O port)
Another implementation uses dual-port SRAM. One port is used solely
to write data from the A2D, and the other is used to read data. Figure A-5 shows the
schematic diagram of the SRAM and its control logic modules. The module on the
left, UCYP_1:D, controls the write buffer, and the one on the right, UCYP_1:E, controls
the read buffer and also the output enable for the image EPROM discussed earlier. To
reduce the complexity of the system, both control logic modules are implemented on
a CPLD. With this implementation, the buffer can be read and written simultaneously,
and the controllers can operate at half of the frequency required for a one-port
SRAM (or 28.63MHz).
One of the main functions of the buffer is to handle the rate difference between
the A2D and the video compression chip. The output rate of the A2D converter is 30
frames per second, but the input rate of the video compression chip is 30.518 frames
per second (discussed below). If the write address and read address are not
controlled, every once in a while the two overlap, causing image corruption.
Therefore, a control mechanism is necessary to ensure that a frame cannot
be read and written at the same time. To prevent image corruption,
the SRAM buffer is divided into 4 memory banks, each capable of storing exactly 1
frame. The read and write control logic modules are designed never to access the same
bank simultaneously. Figure 2-4 explains how the mechanism works.
Figure 2-4: A mechanism used to prevent image corruption. The SRAM is drawn as four
frame banks: FRAME #0 (lowest address, from 0 to 2^17 - 1), FRAME #1 (from 2^17 to
2^18 - 1), FRAME #2 (from 2^18 to 2^19 - 1), and FRAME #3 (highest address, from 2^19
to 2^20 - 1). In the example, FRAME #1 is being read by the video compression chip and
FRAME #2 is being written from the A2D. When a frame is completely read, the next
frame is read if it is not being written; if it is being written, the current frame
is repeated. When a frame is completely written, the next frame is written; after
frame #3, frame #0 is next.
Figure 2-5: Finite state machine of the left control logic module. States: Waiting
(set Address Count = 0) and Loading. Waiting goes to Loading if VRESET = 0 and
FIELD = 1. Loading goes back to Waiting when Address Count = 16383 (128x128 - 1);
on this transition, if BANK = 3 then BANK = 0, else BANK = BANK + 1.
As shown in Figure 2-4, there are four frames. Each frame contains 128x128 pixels and
occupies address bits 0 - 13, or 2^14 bytes of space, in the SRAM buffer. The two
highest address bits, 14 and 15, identify the frame number. The frame
being written is always kept at least one frame ahead of the frame being read. Since
the reading rate is faster than the writing rate, every 1 or 2 seconds a frame
finishes being read while the next one is still being written. In this case, the frame
just read has to be repeated to give the next frame time to finish writing. This frame
repetition happens only once every few seconds and has no significant effect
on the final image. In fact, it is not noticeable at all in the final video output.
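The bank addressing and the repeat rule can be summarized in a few lines. The sketch
below is an illustrative Python model, not the CPLD logic itself; the function names
are hypothetical. It also estimates how often a frame must be repeated, under the
assumption that the reader gains on the writer at the difference between the two
frame rates quoted above.

```python
FRAME_PIXELS = 128 * 128   # 2**14 one-byte pixels per bank
NUM_BANKS = 4


def sram_address(bank, pixel):
    """Address bits 15..14 select the bank; bits 13..0 select the pixel."""
    if not (0 <= bank < NUM_BANKS and 0 <= pixel < FRAME_PIXELS):
        raise ValueError("bank or pixel index out of range")
    return (bank << 14) | pixel


def next_read_bank(read_bank, write_bank):
    """Move on to the next bank unless it is being written; otherwise repeat."""
    candidate = (read_bank + 1) % NUM_BANKS
    return read_bank if candidate == write_bank else candidate


def seconds_between_repeats(write_fps=30.0, read_fps=30.518):
    """Average time for the reader to gain one whole frame on the writer."""
    return 1.0 / (read_fps - write_fps)


if __name__ == "__main__":
    print(hex(sram_address(3, FRAME_PIXELS - 1)))
    print(next_read_bank(1, 2), next_read_bank(1, 3))
    print(seconds_between_repeats())
```

With the rates given in the text, the estimate comes out a little under two seconds,
which is consistent with the "every 1 or 2 seconds" observation.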
In the schematic diagram in Figure A-5, there are two controllers implemented
on a CPLD for design flexibility. The controller on the left of the SRAM is the "write
controller" and the one on the right is the "read controller". The write controller is
a simple finite state machine which receives inputs from the A2D and writes the data
to the SRAM buffer starting from frame 0. As shown in Figure 2-5, the machine has
only 2 states: WAITING and LOADING. The machine is synchronized to the CLKx1 output of
the Bt829A chip. At first, the machine waits in the WAITING state. When the Bt829A
starts delivering the pixels of a frame, that is, when VRESET and FIELD from the
Bt829A equal 0 and 1 respectively, the machine proceeds to the LOADING state, where
it loads pixels into the SRAM buffer. When all pixels are stored, the machine goes
back to the WAITING state and prepares to write to the next frame of the buffer.
Figure 2-6: Finite state machine of the right control logic module. States, in order:
Reset (send reset signal to the video chip; leave after one clock cycle), Delay Start
(idle for 4 cycles), Waiting (until the left control logic has finished writing a
frame), Start Signal (send a pulse of STARTFRM to the video chip; one clock cycle),
and Loading (loop forever, delivering pixels continuously and repeating a frame when
necessary).
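A behavioral model of the two-state write controller may make the transitions easier
to follow. The following Python sketch is an illustration only, not the VHDL used on
the CPLD. It assumes, for simplicity, that one valid pixel arrives on every clock
tick while the machine is in the LOADING state (the DVALID and ACTIVE gating is
ignored), and the class and method names are hypothetical.

```python
WAITING, LOADING = "WAITING", "LOADING"
LAST_PIXEL = 128 * 128 - 1   # address count 16383


class WriteController:
    """Behavioral model of the WAITING/LOADING write state machine."""

    def __init__(self):
        self.state = WAITING
        self.bank = 0        # frame bank currently being written
        self.count = 0       # address count within the frame
        self.sram = {}       # address -> pixel, stands in for the SRAM

    def clock(self, vreset, field, pixel):
        """Advance one clock tick with the sampled Bt829A signals."""
        if self.state == WAITING:
            self.count = 0
            if vreset == 0 and field == 1:
                self.state = LOADING
        else:
            # Assumption: every tick in LOADING carries a valid pixel.
            self.sram[(self.bank << 14) | self.count] = pixel
            if self.count == LAST_PIXEL:
                self.bank = (self.bank + 1) % 4   # bank 3 wraps to bank 0
                self.state = WAITING
            else:
                self.count += 1


if __name__ == "__main__":
    wc = WriteController()
    wc.clock(vreset=0, field=1, pixel=0)          # start of a frame
    for i in range(LAST_PIXEL + 1):
        wc.clock(vreset=1, field=1, pixel=i & 0xFF)
    print(wc.state, wc.bank, len(wc.sram))
```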
Unlike the write controller, the read controller has 2 functions: delivering pixels in
a specific order to the video compression chip, using the mechanism above to prevent
image corruption, and controlling the tri-state outputs of the image EPROM and of
the SRAM buffer. The part that delivers pixels is essentially a finite state machine,
shown in Figure 2-6. Synchronized to the clock of the video chip, the machine starts
in the RESET state, where the video chip is also reset, and then goes to the
DELAY START state. The DELAY START state does nothing but idle for a few cycles to
give the video chip time to become ready. After the DELAY START state, the machine
waits in the WAITING state until the write control logic finishes writing frame 0,
then it moves to the START state. In this state, the machine asserts the start signal
for the video chip, and it then begins loading pixels continuously in the LOADING
state. Looping forever in the LOADING state, the machine repeats a frame whenever the
next frame is still being written, as described above.
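The read-side state sequence can be modeled in the same behavioral style. The Python
sketch below is illustrative only and is not the CPLD implementation: the class name,
the method signature, and the way the write controller's status is passed in
(frame0_written, write_bank) are assumptions made for the example. It also counts
addresses linearly and leaves out the address crossing used for pixel ordering.

```python
LAST_PIXEL = 128 * 128 - 1


class ReadController:
    """Behavioral model: RESET -> DELAY_START -> WAITING -> START -> LOADING."""

    def __init__(self):
        self.state = "RESET"
        self.delay = 0
        self.bank = 0
        self.count = 0

    def clock(self, frame0_written, write_bank):
        """Advance one tick; return the SRAM address read, or None if idle."""
        if self.state == "RESET":            # video chip is reset here
            self.state, self.delay = "DELAY_START", 0
        elif self.state == "DELAY_START":    # idle for 4 cycles
            self.delay += 1
            if self.delay == 4:
                self.state = "WAITING"
        elif self.state == "WAITING":        # wait for frame 0 to be written
            if frame0_written:
                self.state = "START"
        elif self.state == "START":          # one-cycle start pulse
            self.state = "LOADING"
        else:                                # LOADING: deliver pixels forever
            address = (self.bank << 14) | self.count
            if self.count == LAST_PIXEL:
                self.count = 0
                candidate = (self.bank + 1) % 4
                if candidate != write_bank:  # otherwise repeat this frame
                    self.bank = candidate
            else:
                self.count += 1
            return address
        return None


if __name__ == "__main__":
    rc = ReadController()
    for _ in range(7):
        rc.clock(frame0_written=True, write_bank=1)
    print(rc.state)
```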
Due to the SIMD architecture of the video chip, frame pixels cannot be loaded
in sequential order. Instead, the pixels are loaded as shown in Figure 2-7. One image
frame is divided into 1024 4x4 sub-frames. The first pixel of each sub-frame is loaded
first, one sub-frame at a time, starting from the top left sub-frame (e.g. pixel A1,
then B1, then C1, ..., then D1, then E1, then F1, ...). After that, the next pixel of
each sub-frame is loaded (e.g. A2, then B2, then C2, ..., then D2, then E2, then
F2, ...), and so on until all the pixels are loaded. Although this method of pixel
loading seems complicated, it is actually easy to implement. The addressing scheme is
simply a crossing of the address lines, as shown in Figure 2-8. Using this address
crossing scheme, when the Count Address counts up from 0 to 16383, the SRAM is
automatically accessed in the order described above.
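One way to see why a crossing of address lines is sufficient is to write the mapping
as a pure bit permutation. The Python sketch below is a reconstruction under stated
assumptions, not a copy of the wiring in Figure 2-8: it assumes that the frame is
stored in the SRAM in row-by-row order (address = 128 * row + column), that the 16
pixels of a sub-frame are numbered row by row, and that sub-frames are visited left
to right, top to bottom. The function name is hypothetical.

```python
def count_to_sram(count):
    """Map the 14-bit counter value to a 14-bit SRAM address by moving bit fields.

    Assumed field layout of the counter (an illustration, see the text):
      bits 4..0   sub-frame column (0..31)
      bits 9..5   sub-frame row (0..31)
      bits 11..10 pixel column inside the 4x4 sub-frame
      bits 13..12 pixel row inside the 4x4 sub-frame
    """
    sub_col = count & 0x1F
    sub_row = (count >> 5) & 0x1F
    pix_col = (count >> 10) & 0x3
    pix_row = (count >> 12) & 0x3
    x = (sub_col << 2) | pix_col      # column in the 128-pixel-wide frame
    y = (sub_row << 2) | pix_row      # row in the 128-pixel-high frame
    return (y << 7) | x


if __name__ == "__main__":
    # First pixel of sub-frames A, B, C, then the second pixel of sub-frame A.
    print([count_to_sram(c) for c in (0, 1, 2, 1024)])
```

Because every counter bit is routed to exactly one SRAM address bit, the mapping
visits each of the 16384 addresses exactly once and needs no logic beyond wiring.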
The other function of the read control logic, which controls the tri-state outputs
of the SRAM and the EPROM, is implemented as a set of combinational logic, as
illustrated in Figure 2-9. Since the output pins of the image EPROM are connected
directly to the output ports of the SRAM buffer, this control logic is necessary to
prevent bus contention. Taking its input from a switch, the logic simply determines
whether the source of data is the SRAM buffer or the image EPROM.
¹ Although the idle state is not required, it is recommended in [4].
Figure 2-7: How a frame is loaded (sub-frames are labeled A, B, C, ..., D across the first row and E, F, ... across the second row; the 16 pixels of each 4x4 sub-frame are numbered 1 to 16, left to right and top to bottom)
Figure 2-8: Addressing scheme of the input buffer (the Count Addr 0-13 lines are cross-connected to the SRAM Addr 0-13 lines)
Figure 2-9: Combinational logic which controls the tri-state outputs of SRAM buffer and image EPROM (input from a switch; outputs: CS and OE of the SRAM, CS and OE of the image EPROM)
2.4 Video Compression Integrated Circuit (EZW Chip)
The video compression integrated circuit is the heart of this demonstration system. Its
core performs the most complex processing in the system, yet consumes the least power.
Utilizing the wavelet transform and the zero-tree coding algorithm [2], this chip is a
massively parallel video processor designed by Thomas Simon as part of his PhD thesis
[4]. The chip is designed specifically to compress an 8-bit, 128x128-pixel digital video
stream with 8 levels of adjustment for image quality (i.e. compression ratio). It
compresses a group of 16 frames at a time and outputs a series of bits for the entire 16
compressed frames. It consumes 300-400 µW of power at a compression ratio of
approximately 200:1 for good image quality.
The key to the low power of this EZW chip is its parallel SIMD architecture. Contrary to
intuition, parallelism can sometimes save power, especially when the required output
rate is fixed. A circuit with parallelism can generally finish a calculation more
rapidly. However, it is unnecessary for the circuit to compute faster when the data rate
is fixed. Therefore, the circuit can trade the extra time for lower power consumption by
decreasing its operating voltage. When the operating voltage becomes lower, the
computation time is lengthened. The voltage is reduced just enough that the circuit can
still satisfy the required output rate. Although a parallel architecture usually
consumes more power, a low operating voltage can offset the power increase and generally
results in lower overall power consumption.
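The trade-off can be made concrete with the standard first-order model of CMOS dynamic power. The model and the symbols below are assumptions brought in for illustration and are not taken from the thesis or from [4]: C is the switched capacitance of one processing unit, f the clock rate of a serial design, N the degree of parallelism, and V and V' the supply voltages of the serial and parallel designs. Overhead capacitance and leakage are ignored.

```latex
\begin{align*}
P_{\text{serial}}   &= C\,V^{2} f \\
P_{\text{parallel}} &= (N C)\,V'^{2}\,\frac{f}{N} = C\,V'^{2} f \\
\frac{P_{\text{parallel}}}{P_{\text{serial}}} &= \left(\frac{V'}{V}\right)^{2}
\end{align*}
```

With N units each running at f/N, the throughput is unchanged, but each unit has N times longer per operation and can therefore run at a lower supply voltage V'. Under these assumptions the power ratio is below 1 whenever V' is lower than V.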
Packaged in a 208-pin PGA, this chip requires 3 external instruction EPROMs, which store
several sets of instructions used to compute different levels of compression. Each
instruction set is burned onto the EPROMs at a different location, which can be selected
by 4 on-board switches. These switches allow real-time adjustment of compression ratio
and image quality.
Despite the complexity of the integrated circuit itself, its input and output interface
is simple enough that building a system around it is easy. Table 2.2 shows the pin
descriptions of the chip. In addition to the input and output pins described in the
table, the EZW chip also has many debugging pins which are not used in this
demonstration system.
Figure 2-10: Timing Diagram of Input Signals to Video Compression Integrated Circuit (signals shown: RESET, START, EZW CLK IN, and the 8-bit EZW DATA IN bus carrying Pixel 0, Pixel 1, Pixel 2, ...)
The input interface of this chip is very simple. Figure 2-10 shows the timing diagram of
input signals to the video compression integrated circuit. When the chip is powered up,
it has to be reset before any computation is performed. All input signals are
synchronized to the EZW CLK IN clock signal, which is fixed at 500KHz. The reset signal,
RESETFRM, has to last at least 1 clock cycle. After the chip is reset, it waits for the
STARTFRM signal, which should also last at least 1 clock cycle. After the STARTFRM
signal is asserted, each pixel is read continuously into the chip at the next rising
edge of the CLK input signal. When all pixels of a frame have been delivered, the first
pixel of the next frame is delivered immediately at the next clock cycle. At 500KHz, the
chip can process 500x10^3/(128x128) = 30.518 frames per second.

Pin          Input/Output  Description
CLK          I             500KHz clock input. All inputs are synchronized to this clock.
RESETFRM     I             Reset signal. It puts the chip into a reset state to wait for the start signal.
STARTFRM     I             Start pulse. It signifies that the next data is valid at the next clock cycle.
PIX[7..0]    I             Digital video pixel.
INST[19..0]  I             Instructions for computation. In this system, the EZW chip receives instructions from 3 EPROMs.
CLKOUT       O             Clock for DOUT.
DOUT         O             Data output bit with respect to CLKOUT.
ENDGRP       O             A pulse that signifies the end of a compressed group.
VCC          I             5V power supply for output driver.
Vdd          I             Power supply for the core of the chip. It can range from 1.5V-2.5V.
VWW          I             Power supply. VWW should be Vdd + 1V.
Table 2.2: Pin descriptions of the EZW chip

Figure 2-11: Timing Diagram of Output Signals from Video Compression Integrated Circuit (signals shown: EZW DATA OUT, EZW DATA OUT CLK, EZW GROUP CLK)
The output interface of the EZW chip, shown in Figure 2-11, is even simpler than the
input interface. There are only 3 output pins from the chip, as described in Table 2.2.
DOUT is the output data bit and should be read at the rising edge of the CLKOUT signal.
ENDGRP marks the end of a 16-frame group. The chip can assert CLKOUT sparingly or in
bursts, depending on the spatial and temporal content of the input frames.
As described previously, the rate of the output bits depends on the voltage level of
Vdd. The period of the EZW DAT CLK increases when Vdd decreases. At the lowest Vdd of
1.5V, the EZW DAT CLK has its longest period of 200 ns. The period of the EZW DAT CLK
decreases to 90 ns when Vdd is at its highest value of 2.5V.
This EZW video compression chip requires only a small controller. It needs a
reset signal and a start signal which are provided by the Input Frame Buffer module.
Figure 2-12: 64MHz to 500KHz clock divider (a 7-bit register clocked at 64MHz is loaded with D[6..0] = Q[6..0] + 1; Q[6] is the 500KHz clock)
The 500KHz clock is generated by dividing the 64MHz system clock by 128 as shown
in Figure 2-12. To save some circuit board space, the clock divider is implemented
on a CPLD.
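The divider can be checked with a small behavioral model. This is a sketch only, not the CPLD code: the function name `divided_clock` and the constants are illustrative. It simulates a free-running 7-bit counter, takes its top bit as the output clock, and also derives the frame rate that results from a 500KHz pixel clock.

```python
SYSTEM_CLOCK_HZ = 64_000_000
COUNTER_BITS = 7

def divided_clock(ticks):
    """Return Q[6] of a free-running 7-bit counter after each system clock tick."""
    q = 0
    out = []
    for _ in range(ticks):
        q = (q + 1) % (2 ** COUNTER_BITS)    # D[6..0] = Q[6..0] + 1, wrapping at 128
        out.append(q >> (COUNTER_BITS - 1))  # Q[6] is the divided clock
    return out

wave = divided_clock(256)                             # two full output periods
ezw_clock_hz = SYSTEM_CLOCK_HZ / 2 ** COUNTER_BITS    # 64MHz / 128
frames_per_second = ezw_clock_hz / (128 * 128)        # one pixel per clock
```

The output is high for 64 of every 128 system clock ticks, so the divided clock has a 50% duty cycle.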
2.5 EZW Output Frame Buffer
The EZW Output Frame Buffer buffers the output from the video compression chip before it
is sent to a PC. Similar to the Input Frame Buffer, the Output Frame Buffer is comprised
of SRAM chips and control logic implemented on a CPLD. The EZW Output Frame Buffer
receives serial data output from the video chip and writes it to one part of the memory
buffer while, at the same time, another part of the buffer is being read out to the PC.
In other words, this buffer is essentially a ping-pong buffer.
Unlike the Input Frame Buffer described in the previous section, the size of this buffer
needs to be estimated. Since the number of bits from the video compression chip varies,
the capacity of the buffer has to be large enough to hold all the output bits in the
worst case. A conservative estimate is computed through a simple calculation:
Figure 2-13: Schematic Diagram of the Output Frame Buffer Controller (blocks shown include the bit packing FSM, the Read/Write Buffer Switch, 1Mb SRAM #0 and #1 (8 bits wide), and the Parallel Port Interface Module)
The video chip compresses a group of 16, or 2^4, frames.
Each frame has 128 x 128 x 8, or 2^(7+7+3) = 2^17, bits.
Therefore, 16 uncompressed frames have 2^21 = 2M bits.
For both read and write buffers, the SRAM should have 4M bits of capacity.
However, a dual port SRAM with 4M bits of capacity is not easy to find. In fact, the
estimate above is too conservative. In practice, the video compression has a compression
ratio of about 100:1 or more, and even in the worst case the ratio is certainly greater
than 2:1, which reduces the capacity requirement by at least half. Nonetheless, even
2M-bit dual port SRAMs are still rare. To get around this problem, a simpler buffer
structure is used. Since 1M-bit, 8-bit-wide SRAMs are commercially available from IDT,
the buffer is split into 2 SRAM chips with another controller implemented on a Cypress
CPLD. The final design of the EZW Output Frame Buffer is shown in Figure 2-13.
Furthermore, choosing an 8-bit-wide datapath also simplifies the system greatly because
the parallel port datapath is 8 bits wide as well.
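The sizing argument reduces to a few lines of arithmetic. The sketch below only restates the numbers given in the text; the variable names are illustrative.

```python
FRAME_BITS = 128 * 128 * 8                  # 2**17 bits per uncompressed frame
GROUP_FRAMES = 16                           # frames compressed together
group_bits = GROUP_FRAMES * FRAME_BITS      # 2**21 bits = 2M bits, uncompressed
pingpong_bits = 2 * group_bits              # one read buffer plus one write buffer

WORST_CASE_RATIO = 2                        # "certainly greater than 2:1"
per_buffer_bits = group_bits // WORST_CASE_RATIO   # 2**20 bits = 1M bits
per_buffer_bytes = per_buffer_bits // 8            # 2**17 bytes on an 8-bit bus
```

Each of the two 1M-bit SRAMs therefore holds one worst-case compressed group of frames.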
In Figure 2-13, there are three additional control modules worth discussing here: Bit
Packing FSM, Synchronizer FSM, and Read/Write Buffer Switch. Similar to all other parts
of the system, these three modules are implemented on a CPLD for design flexibility and
ease of debugging.
Figure 2-14: Bit Packing Finite State Machine
2.5.1 Bit Packing Finite State Machine
As described in the previous section, the output from the video compression chip is
serial. However, the data path of this system is 8 bits wide. Implemented on a Cypress
CPLD, this Bit Packing Finite State Machine packs 8 bits together to form a byte and
pulses a WE signal to write the byte to a buffer.
Figure 2-14 shows how the finite state machine functions. Although the state diagram
looks complicated, the concept is simple: the machine waits for output data from the
video chip and writes it to the buffer. The machine starts from the RESET state, then
moves directly to the WAIT FOR ONE state to wait for data. When EZW DAT CLK=1, which
means that EZW DAT BIT holds a valid bit, the machine grabs that bit and writes it to a
temporary buffer, as shown in Figure 2-15. The machine writes that bit to the most
significant slot available, and each following bit is written to the next slot to the
right. When the temporary space is full, meaning all 8 bits have been stored, the
machine asserts a WE signal to the Read/Write Buffer Switch and clears the temporary
space. If the temporary space is not full, it simply stores the new bit. The machine
then waits for EZW DAT CLK to become 0 in the WAIT FOR ZERO state. The process repeats
until EZW GRP CLK=1. When EZW GRP CLK=1 (that is, when all the bits of the group have
been received from the EZW video chip), the machine asserts the WE signal to write
whatever it has to the buffer and notifies the Read/Write Buffer Switch by asserting a
READY signal. The machine then waits for the OK signal from the Read/Write Buffer Switch
module in the WAIT FOR OK state. The assertion of the OK signal, which occurs only after
both the Parallel Port Interface and the Bit Packing FSM have finished transmitting and
storing bits respectively, means that the Read/Write Buffer Switch has swapped the read
and write buffers. Only after the OK signal is received can the Bit Packing FSM begin
capturing output bits again. In some cases, a new bit can arrive from the EZW video chip
while the machine is waiting for the OK signal. This situation occurs because the EZW
video chip operates independently from the EZW Output Frame Buffer. In such a case, the
machine simply throws all the bits away in the THROW AWAY GROUP state.
Figure 2-15: Bit Storage Format (incoming bits fill a 1-byte buffer from the MSB, bit 7, toward the LSB, bit 0)
Figure 2-16: An alternative of how to store a bit
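The packing rule itself, first bit into the most significant slot, can be sketched in software. This is a behavioral illustration, not the CPLD implementation: the function name `pack_bits` is hypothetical, the handshake with the Read/Write Buffer Switch is left out, and padding a final partial byte with zeros is an assumption, since the text only says the machine writes "whatever it has".

```python
def pack_bits(bits):
    """Pack a serial bit stream into bytes, first bit in the MSB.

    A trailing partial byte is flushed with its unused low bits left at 0
    (an assumption; the text does not specify the padding).
    """
    out = bytearray()
    temp, filled = 0, 0
    for bit in bits:
        temp |= (bit & 1) << (7 - filled)   # most significant free slot
        filled += 1
        if filled == 8:                     # temporary space full: pulse WE
            out.append(temp)
            temp, filled = 0, 0
    if filled:                              # end of group: flush the remainder
        out.append(temp)
    return bytes(out)
```

For example, the bit sequence 1, 0, 1, 1, 0, 0, 0, 1 packs into the single byte 0xB1, so the byte read by the PC preserves the chip's bit order from left to right.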
Since the Bit Packing FSM captures bits by sampling EZW DAT CLK, the clock that drives
the FSM must be fast enough that the FSM does not lose any data. From the finite state
machine diagram in Figure 2-14, the longest loop between 2 consecutive incoming bits
occurs when the machine traverses the following states in order: WAIT FOR ONE, WRITE
SRAM, and WAIT FOR ZERO. This means that the clock that drives this state machine has to
be fast enough to finish writing a byte to the SRAM while the EZW DAT CLK still holds
its value; otherwise, the system would lose 1 bit of information. At the maximum Vdd of
the video compression chip, Vdd = 2.5V, the EZW DAT CLK yields its shortest positive
pulse, which is 90 ns. So, to guarantee that the system captures all the bits, this FSM
has to traverse at least 3 states within 90 ns. Because 64MHz corresponds to 5.76 cycles
within 90 ns, this clock frequency was chosen for this FSM. In addition, 64 is
conveniently a power of 2, so the clock can easily be divided to generate other clock
speeds.
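The timing margin can be verified numerically. The sketch below assumes one state transition per clock cycle, which the text implies but does not state outright; the variable names are illustrative.

```python
FSM_CLOCK_HZ = 64e6          # clock driving the Bit Packing FSM
SHORTEST_PULSE_S = 90e-9     # EZW DAT CLK pulse at Vdd = 2.5V
STATES_IN_LONGEST_LOOP = 3   # WAIT FOR ONE, WRITE SRAM, WAIT FOR ZERO

# Clock cycles that fit inside the shortest pulse.
cycles_available = FSM_CLOCK_HZ * SHORTEST_PULSE_S

# Slowest clock that still fits three states into the pulse,
# assuming one state per clock cycle.
min_clock_hz = STATES_IN_LONGEST_LOOP / SHORTEST_PULSE_S
```

Under that assumption any clock above roughly 33.3MHz would suffice, so 64MHz leaves almost a factor of two of margin.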
An alternative way to implement this finite state machine is to write the incoming bit
to the least significant bit instead of the most significant bit of the 8-bit temporary
space, as shown in Figure 2-16. However, this alternative would increase the complexity
of the system, especially the decoding software. The decoding software expects a stream
of bits in the same order as the output bits from the video compression chip. Therefore,
if the least significant bit were written first, the software would have to reverse the
order of bits every time it received a byte of data from the parallel port.
Figure 2-17: State diagram of parts of Read/Write Buffer Switch (states: WAIT FOR DONE and WAIT FOR CLEAR)
2.5.2 Synchronizer Finite State Machine
As the name suggests, this finite state machine synchronizes the Bit Packing FSM
and the Parallel Port Interface module so that they start their processes at the same
time. It ensures that the read buffer has completely been read and that the write
buffer has completely been written before they are swapped. Figure 2-17 shows the
state transition diagram of this FSM.
This machine consists of just two states, WAIT FOR DONE and WAIT FOR CLEAR. The machine
waits for a finish signal from both the Bit Packing FSM and the Parallel Port Interface
in the WAIT FOR DONE state. When both of them have sent their finish signals to the
Synchronizer (not necessarily at the same time), it proceeds to the WAIT FOR CLEAR
state. In this state, the machine sends a swap signal to the Read/Write Buffer Switch
and simultaneously sends back an OK to the two circuit modules to signify that the read
and write buffers have been swapped. The machine goes back to the WAIT FOR DONE state
when the two circuit modules de-assert their finish signals.
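The two-state behavior can be sketched as a small class. This is an illustrative software model rather than the CPLD logic: the class name `Synchronizer`, the `tick` method, and the exact moment at which `ok` and `toggle` change within a state are assumptions.

```python
class Synchronizer:
    """Two-state model: WAIT_FOR_DONE -> WAIT_FOR_CLEAR -> WAIT_FOR_DONE."""

    def __init__(self):
        self.state = "WAIT_FOR_DONE"
        self.toggle = 0   # selects which SRAM currently acts as the write buffer
        self.ok = 0       # acknowledgement sent back to both modules

    def tick(self, write_finish, read_finish):
        """Advance one clock tick given the two finish inputs."""
        if self.state == "WAIT_FOR_DONE":
            if write_finish and read_finish:
                self.toggle ^= 1      # swap the read and write buffers
                self.ok = 1           # tell both modules the swap is done
                self.state = "WAIT_FOR_CLEAR"
        else:  # WAIT_FOR_CLEAR
            if not write_finish and not read_finish:
                self.ok = 0
                self.state = "WAIT_FOR_DONE"
```

Waiting for both finish signals to clear before returning prevents a stale finish signal from triggering a second swap.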
2.5.3 Read/Write Buffer Switch
Implemented on a CPLD, this circuit module is basically a switch whose function is to
alternate the roles of the read and write SRAM buffers upon receiving a toggle signal
from the Synchronizer FSM. The swapping process is nothing but re-routing signals, as
shown in Figure 2-18.
As the schematic diagram in Figure 2-18 shows, the circuit uses tri-state buffers to
route the signals to their appropriate destinations. The tri-state buffers are
controlled by the TOGGLE signal sent from the Synchronizer FSM. The roles of buffers 1
and 0 alternate according to the TOGGLE signal. In other words, when TOGGLE = 1, buffers
1 and 0 are the write and read buffers, respectively, and vice versa.
2.6 Parallel Port Interface
Implemented in a Cypress CPLD, this circuit module is the connection between the
demonstration system and the computer. The main function of this module is to transmit
data from the EZW Output Frame Buffer to the PC via the parallel port. Although the
parallel port is normally used to transfer data from a PC to a peripheral, in this
demonstration system it transmits data from the board to the PC by operating in ECP
mode. The protocol to transfer data is a simple hand-shaking protocol, shown in Figure
2-19.
Figure 2-18: Schematic Diagram of parts of Read/Write Buffer Switch (TOGGLE from the Synchronizer FSM routes the WRITE and READ ADDRESS[15..0], DATA[7..0], and CS, OE, WE signals to SRAM BUFFER 0 and SRAM BUFFER 1)
Figure 2-19: Timing Diagram of the Parallel Port Protocol in ECP mode. (signals shown: D0-D7 (bidirectional), PeriphAck (board to PC), Ack (board to PC), HostAck (PC to board), HostClk (PC to board), ReverseReq (PC to board), AckReverseReq (board to PC))
Here are the steps to transfer data from the board to a PC:
1. When the board is ready, it brings AckReverseReq high and PeriphAck low.
2. When the PC is ready to transfer the data, it brings HostClk high and HostAck low.
3. After a delay of at least 0.5 ms, the PC pulls ReverseReq low to request incoming
data.
4. The board acknowledges by bringing AckReverseReq low.
5. The parallel port can send either "data" or "command" depending on the PeriphAck
signal. The difference between a "data" byte and a "command" byte is that the byte is
written into a different memory space on the PC. PeriphAck is high for transmitting
"data", and low for "command". Because this demonstration board always sends data,
PeriphAck is always high.
6. The board brings Ack low to notify the PC that the data is ready to be read. At this
point the data bus D0-D7 must hold a valid value.
7. The PC acknowledges that it has read the data by pulling HostAck high.
8. The board confirms that the PC has accepted the data by bringing Ack high.
9. The PC finishes one byte transmission by bringing HostAck low.
10. The process repeats starting from step 5 until all the bytes are transferred. Since
this demonstration system only sends data to the PC, ReverseReq is always low.

D-SUB Pin  Signal         Function             Source   Register  Centronics Pin
1          HostClk        Strobe D0-D7         PC       Control   1
2          D0             Data Bit 0           PC       Data      2
3          D1             Data Bit 1           PC       Data      3
4          D2             Data Bit 2           PC       Data      4
5          D3             Data Bit 3           PC       Data      5
6          D4             Data Bit 4           PC       Data      6
7          D5             Data Bit 5           PC       Data      7
8          D6             Data Bit 6           PC       Data      8
9          D7             Data Bit 7           PC       Data      9
10         Ack            Acknowledge          Printer  Status    10
11         PeriphAck      Printer Busy         Printer  Status    11
12         AckReverseReq  Out of Paper         Printer  Status    12
13         Select         Printer Online       Printer  Status    13
14         HostAck        Automatic Line Feed  PC       Control   14
15         Error          Error                Printer  Status    32
16         ReverseReq     Initialize Printer   PC       Control   31
17         P1284          Select Printer       PC       Control   36
18-25      Gnd            Ground                                  19-30
Table 2.3: Mapping Between Centronics pinouts and D-SUB pinouts
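The handshake steps listed above can be written out as an ordered sequence of signal changes. The sketch below is a plain software model for illustration only: the generator name `reverse_transfer` and the event tuples are hypothetical, timing (including the delay in step 3) is not modeled, and no real parallel port API is used.

```python
def reverse_transfer(data):
    """Yield (actor, signal, value) events for a board-to-PC transfer."""
    # Steps 1-4: negotiate the reverse (board-to-PC) direction.
    yield ("board", "AckReverseReq", 1)
    yield ("board", "PeriphAck", 0)
    yield ("pc", "HostClk", 1)
    yield ("pc", "HostAck", 0)
    yield ("pc", "ReverseReq", 0)        # issued after the required delay
    yield ("board", "AckReverseReq", 0)
    # Step 5: PeriphAck high marks every byte as "data", not "command".
    yield ("board", "PeriphAck", 1)
    for byte in data:
        # Steps 6-9: four-phase handshake for one byte.
        yield ("board", "D0-D7", byte)   # bus must be valid before Ack falls
        yield ("board", "Ack", 0)
        yield ("pc", "HostAck", 1)
        yield ("board", "Ack", 1)
        yield ("pc", "HostAck", 0)

events = list(reverse_transfer(b"\x12\x34"))
```

Each byte costs four signal transitions plus the bus update, which is why the per-byte loop in steps 6 to 9 dominates the transfer time.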
The state diagram of the finite state machine that controls this transfer process is
illustrated in Figure 2-20. While maintaining the hand-shaking process with the PC, the
machine reads from the read buffer of the EZW Output Frame Buffer. Since the maximum
size of each DMA transfer is 32K bytes, the data has to be divided into 4 blocks of 32K
bytes, so the total number of transferred bits is 1M bits.
This demonstration system uses a 36-pin Centronics (IEEE 1284-B) connector because it is
easy to wire and compatible with many other types of connectors. Because a standard PC
usually uses a 25-pin D-SUB (IEEE 1284-A) receptacle, the Parallel Port Interface Module
has to map Centronics pinouts to D-SUB pinouts using the information in Table 2.3. This
table was originally intended for printer pinout mapping, but the mapping is compatible
with this demonstration system as well.
Figure 2-20: State diagram of finite state machine that controls the parallel port interface (states: WAIT FOR REQ, SEND DATA, WAIT FOR HOSTACK UP, WAIT FOR HOSTACK DOWN, UPDATE ADDR, WAIT FOR NEXT DMA, WAIT FOR OK)
Figure 2-21: Logic blocks in the first CPLD (Bt829A Control Logic, Input Frame Buffer Controller, and EZW Controller and Clock Divider, connected to the Bt829A, the image EPROM, and the EZW video compression chip)
2.7 Complex Programmable Logic Devices (CPLD)
The programmability of the Complex Programmable Logic Device (CPLD) allows flexibility
of design and ease of implementation and debugging. As mentioned in several of the
preceding sections, most of the logic blocks are implemented on a CPLD. These logic
blocks are usually control logic modules, finite state machines, or other complicated
logic blocks. The CPLD used in this system is the Ultra 37256 manufactured by Cypress
Semiconductor. The device has 16 logic blocks, each with 16 macrocells, for a total of
256 macrocells. The code is written in VHDL, which is compiled by the Galaxy VHDL
compiler and then programmed into the device by Cypress's ISR programming software tool.
The VHDL code is simulated using Cypress's Warp VHDL simulation to ensure the
functionality and timing properties of the generated logic. In addition, each Cypress
CPLD can be re-programmed on-board via a JTAG connector. This on-board re-programmable
capability simplifies the debugging process tremendously.
Figure 2-22: Logic blocks in the second CPLD (Bit Packing FSM, Synchronizer FSM, Read/Write Buffer Switch, and Parallel Port Interface, connected to the two SRAM buffers, the RESET and AUX inputs, the data from the EZW video compression chip, and the PC)
Due to the limited size and number of I/O pins of a Cypress CPLD, this demonstration
system uses 2 Ultra 37256 Cypress CPLDs. The first CPLD contains the logic blocks shown
in Figure 2-21, and the second contains those shown in Figure 2-22. All the logic blocks
shown in both figures have been discussed in the previous sections. The number of
macrocells used in each CPLD is balanced so that both of them still have some space left
for debugging and future expansion. In this system, both CPLDs have approximately 60%
macrocell utilization.
The only devices which can control these two CPLDs are two on-board buttons, RESET and
AUX. The RESET button resets both CPLDs to their default or start state. The AUX button
is an extra button for debugging purposes. The AUX button has proven to be extremely
useful for debugging, as it was heavily used to generate several test vectors. Using
this AUX button would not be possible if the CPLDs could not be programmed on-board.
Chapter 3
Software System
This chapter discusses the software part of this demonstration system. The function of
the software is to read data from the parallel port, decode the data, and display the
decoded images as a movie.
Similar to the hardware system, modularity of the software system helps in developing
the software so that it is fast, readable, and easy to debug. The software for this
demonstration system is divided into 4 parts: Direct Memory Access (DMA) Device Driver,
Decoder, Video Player, and User Interface. The software is built into an executable
(.exe) file which can be executed on any Microsoft Windows platform.
Figure 3-1: Block Diagram of the Software System (blocks shown include User Interface, Decode, and Display)
3.1 Direct Memory Access Device Driver
This software module is the connection between the hardware and the PC. In order to
display the video in real time, the transmission rate from the circuit board to the PC
has to be fast enough. For an uncompressed video stream, the rate required is
128x128x8x30, or about 4M bits per second. Given a compression ratio of about 10:1, the
expected data rate for compressed video is approximately 0.4M bps, which can be handled
by a parallel port in Direct Memory Access (DMA) mode.
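The data-rate estimate is simple arithmetic. The sketch below restates it with illustrative variable names and uses the 10:1 ratio quoted in the text.

```python
# Uncompressed 128x128, 8-bit video at 30 frames per second.
raw_bps = 128 * 128 * 8 * 30

# Compression ratio of about 10:1, as assumed in the text.
compressed_bps = raw_bps / 10
```

The exact figures are 3,932,160 and 393,216 bits per second, which the text rounds to 4M and 0.4M.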
In DMA mode, the parallel port controller circuit in a PC bypasses the operating system
memory routines and directly accesses the PC's main memory (thus, direct memory access).
The DMA mode helps reduce the workload of the processor and increase the speed of data
transfer. While the processor is decoding the data, the DMA controller can receive data
from the parallel port and store it in memory. The processor only needs to initiate the
transfer, after which it can perform other tasks while waiting for all the data to be
completely stored in memory.
The DMA device driver used in this demonstration system is the WinRT DMA driver from
BlueWater Systems Inc. BlueWater Systems Inc. provides a tool suite to use together
with the Microsoft Visual C++ compiler to build the software. The driver has two
versions: WinRT for Windows 9x and WinRT for Windows NT.
3.2 Decoder
This module takes the data stored in memory by the WinRT DMA driver and decodes it into
raw images. It is based on a decoder developed by Thomas Simon [4] in C on a Unix
platform. The functionality of that original decoder is very limited. It can only decode
a stream of "0" and "1", which represent bit 0 and bit 1 respectively, and that stream
must be in a file with a specific name. The output of the decoder is a series of 128x128
PGM image files. In addition, the original decoder decodes only one group of frames and
terminates without freeing the memory it has allocated.
Modified for compatibility, the decoder used in this demonstration system loops forever
and frees all the allocated memory so that it does not run out of memory. However, even
after this modification, the decoder was still not fast enough to keep up with the input
from the circuit board. To resolve this difficulty, some frames had to be ignored. So,
instead of decoding all 16 images, the decoder decodes only the first few images, stops,
discards the rest of the data, and starts with the next group of frames. This solution
works well and speeds up the process significantly, although the final video output
shows some discontinuities.
3.3 Video Player
This module of the software displays the final stream of image frames on the display
device. The final image is a 128x128 grey-scale pixel bitmap. To display the images as a
movie, the PC simply draws a series of output images one after another. If the software
draws each image directly to the screen, however, the video will not be smoothly
continuous. In fact, if such a drawing algorithm were implemented, one would actually be
able to see each pixel being drawn. To prevent this problem, the image should first be
drawn in the background and then displayed when it is finished. Figure 3-2 illustrates
how using a background page helps smooth the video playback.
As shown in Figure 3-2, there are two memory pages for the video display. The "primary
page" is being displayed while the "background page" is being drawn. During the vertical
sync, the monitor's electron gun is brought back to the upper left corner. While no
image is being drawn on the screen during this vertical sync, the processor can swap the
"primary page" and "background page" pointers. As a result, every video frame displayed
is clear and complete. The video display module of this demonstration system utilizes
the Microsoft DirectDraw library, which supports such a drawing method.
Figure 3-2: Double frame buffer (the pointer to the page being displayed and the pointer to the page in background are swapped during vertical sync)
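The page-swapping idea can be illustrated independently of any graphics library. The sketch below is a plain in-memory model and does not use the DirectDraw API: the class name `DoubleBuffer` and its methods are hypothetical, and the vertical sync is represented only by the caller choosing when to invoke `flip`.

```python
class DoubleBuffer:
    """Two pages: one shown on screen, one drawn off-screen."""

    def __init__(self, width=128, height=128):
        self.pages = [bytearray(width * height), bytearray(width * height)]
        self.primary = 0                     # index of the page being displayed

    @property
    def background(self):
        """The page that is not currently displayed."""
        return self.pages[1 - self.primary]

    def draw(self, frame):
        """Draw a complete frame into the background page only."""
        self.background[:] = frame

    def flip(self):
        """Swap the page pointers; intended to be called during vertical sync."""
        self.primary = 1 - self.primary

    def displayed(self):
        """Return a copy of the page currently on screen."""
        return bytes(self.pages[self.primary])
```

Because `flip` only exchanges which page is treated as primary, no pixel data is copied at swap time, and a partially drawn frame is never visible.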
3.4 User Interface (UI)
The user interface is the layer on top of all the modules. It is the connection between
a user and the software, and it makes the software easy to manage. The UI is written to
conform to the Microsoft Windows API so that it is compatible with all Microsoft Windows
operating systems.
The look and feel of the UI is shown in Figure 3-3. The UI is composed of 3 Windows
components:
1. Radio buttons. These buttons let users select the image quality (or compression
ratio) of the image.
2. Text area. The text area is located at the top left corner of the window. It
shows the frame number being displayed and the current compression ratio.
3. Video panel. It is a 128x128 image panel displaying the actual video image.
Figure 3-3: The Working System
Chapter 4
System Implementation
After the design was completed, the system was implemented. The circuit board was
designed specifically to facilitate the testing and debugging process. The software was
developed to be portable and user-friendly. After testing and debugging, the system
operated properly, and the functionality of the EZW video chip was verified.
The implementation process started immediately after the completion of the initial
design. The hardware system was more complicated and, therefore, more prone to mistakes
than the software. Unlike in software, a bug that occurs in hardware is usually
difficult and expensive to fix. Thus, it is crucial to pay special attention to the
hardware system to minimize the number of bugs.
4.1 Hardware System
Owing to a careful, modular design, the implementation was very straightforward. The
hardware system was implemented on a single printed circuit board using a design tool
suite from Accel Technologies. A schematic diagram was drawn graphically in Accel
Schematic. Accel PCB then extracted the board layout from the schematic diagram. All the
components were hand-placed to optimize the area of the printed circuit board. The
connections between components were generated automatically by the Accel Auto-Route
tool. Unfortunately, most of the time Accel Auto-Route was unable to find traces for all
the connections, and the remaining connections were finished manually. The result was a
7.5" x 7.5" 8-layer PCB, as shown in Figures 4-1, 4-2, and 4-3.
Figure 4-1: The front side of the unpopulated PCB
The board consists of 8 signal layers. Since there are numerous power and ground
connections, power and ground planes are necessary to help reduce ground bounce. In
addition to the 5V power supply and ground planes, the board has another power supply
plane specifically for the EZW chip because it requires a stable power supply. The
ground plane is placed between the 2 power supply planes to form bypass capacitance,
which stabilizes the system even more. Located at the top left corner of the board is
the Bt829A chip, which has both analog and digital pins. To isolate the analog signals
from digital noise, all the planes are notched around the Bt829A chip.
Figure 4-2: The back side of the unpopulated PCB
Signal traces are 6 mils in width. The power supply and ground traces (necessary for
surface-mounted components) are 8 mils. The board has 101 electrical components, 1382
pads, and 621 via holes (each 14 mils in diameter). The board was fabricated by
Compunetics Inc.
4.2 Software System
The software was implemented with Microsoft Visual C++, which has good technical support
and easily accessible on-line documentation.
The DMA device driver chosen for this system is the WinRT DMA Driver from BlueWater
Systems Inc. The WinRT DMA Device Driver is designed for the parallel port in DMA mode.
The driver has a set of C libraries which can be easily included in the program. Reading
data from the parallel port is done through a simple function call, and the data is then
transferred without adding more workload on the processor.
Figure 4-3: The Finished PCB with all components in place
The decoder was ported from Thomas Simon's original code to Microsoft Visual C++. Since
there is not much difference between Unix C and Microsoft Visual C++, the porting was
easy and seamless.
The user interface is written to conform to the Microsoft Windows application
programming interface (API). By using a standard API, the software can operate on any
Microsoft Windows operating system, which makes it more portable.
51
Chapter 5
System Performance
The circuit captures images from a camera, compresses them, and sends them to a
PC, which then displays the video in real time. There is a frame latency of about
2 seconds because of the delay in the software decoder.
Although the EZW chip can keep up with the real-time input, the decoder written
for the PC is extremely slow. The software takes approximately 30 seconds to decode
about half a second of compressed video. Clearly, the software cannot maintain
real-time video playback at this decoding rate. Therefore, to achieve a real-time
result, the decoder was modified so that only a few frames are decoded and displayed
every 0.5 seconds.
The decoder delay also accounts for the frame latency. While the PC is decoding the data, the board has to wait until decoding completes before a new set
of bits can be transmitted to the PC. The latency therefore depends on the number of frames
decoded every 0.5 seconds. Table 5.1 shows the relationship
between the number of frames decoded and the latency. The latency increases roughly linearly
as the number of frames increases. By keeping the number of frames decoded at 1
or 2, the system can sustain real-time video playback at an acceptable rate of 2
frames per second with a latency of approximately 2 seconds.
Number of Frames Decoded    Approximate Latency (seconds)
1                           2
2                           3
3                           5
4                           7
5                           9
6                           11
7                           12
8                           13
9                           14
10                          15
Table 5.1: The relationship between the number of frames decoded and the latency
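The approximately linear trend in Table 5.1 can be checked with a small least-squares fit. The sketch below is written in Python purely for illustration and is not part of the thesis software; the measurements are copied from Table 5.1, and `fit_line` is a hypothetical helper name.

```python
# Latency measurements copied from Table 5.1:
# frames decoded per 0.5 s interval -> approximate latency in seconds.
LATENCY_S = {1: 2, 2: 3, 3: 5, 4: 7, 5: 9, 6: 11, 7: 12, 8: 13, 9: 14, 10: 15}


def fit_line(points):
    """Ordinary least-squares fit y = slope * x + intercept (hypothetical helper)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


slope, intercept = fit_line(sorted(LATENCY_S.items()))
print(f"about {slope:.2f} s of latency per additional frame decoded")
```

With these values the fitted slope is roughly 1.5 seconds per additional frame. The step is 2 seconds per frame between two and six frames and 1 second per frame elsewhere, so the relationship is only approximately linear.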
EZW Passes    Compression Ratio
6             82:1
5             130:1
4             250:1
3             490:1
2             1020:1
Table 5.2: The performance of this EZW video chip
This EZW video chip yields an excellent compression ratio, as shown in Table 5.2.
With good image quality, the chip gives a compression ratio of 82:1 on average. The
compression ratio increases as the number of EZW passes decreases. As the table shows,
the chip can even achieve a ratio of about 1000:1 with acceptable image quality. The
power consumed by the chip ranged from 450 to 750 µW, depending on the spatial and
temporal content of the video image.
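The trend in Table 5.2 can be made concrete with a short calculation. The Python sketch below is illustrative only: the ratios are copied from Table 5.2, while the raw frame size (128 x 128 pixels at 8 bits per pixel) and the helper names are assumptions introduced for the example.

```python
# Average compression ratios copied from Table 5.2 (EZW passes -> ratio : 1).
RATIO_BY_PASSES = {6: 82, 5: 130, 4: 250, 3: 490, 2: 1020}


def compressed_bits(raw_bits, passes):
    """Expected compressed size for a given raw size, using the average ratio."""
    return raw_bits / RATIO_BY_PASSES[passes]


def growth_per_dropped_pass():
    """Factor by which the ratio grows each time one EZW pass is removed."""
    passes = sorted(RATIO_BY_PASSES, reverse=True)  # 6, 5, 4, 3, 2
    return {
        (hi, lo): RATIO_BY_PASSES[lo] / RATIO_BY_PASSES[hi]
        for hi, lo in zip(passes, passes[1:])
    }


# Hypothetical raw frame: 128 x 128 pixels at 8 bits per pixel (an assumption).
RAW_FRAME_BITS = 128 * 128 * 8
for n in sorted(RATIO_BY_PASSES, reverse=True):
    print(n, "passes ->", round(compressed_bits(RAW_FRAME_BITS, n)), "bits per frame")
```

On the tabulated values, removing one pass multiplies the compression ratio by a factor between about 1.6 and 2.1.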
Chapter 6
Conclusion and Future Improvement
6.1 Conclusion
This thesis presents a demonstration system for an ultra-low-power video compression and encoding integrated circuit. The system was designed and implemented
using the video chip and off-the-shelf parts. The datapath of the system was carefully
planned to achieve real-time video playback. The most challenging part was
designing the hardware and the software together so that they were compatible
and resulted in a smooth, reliable system. The resulting system is
user-friendly and requires only minimal setup.
Several design methodologies were used in designing this
demonstration system. Modularity, abstraction, and hierarchy significantly
eased the implementation and debugging of the system. The system was separated into several small modules. The interfaces between these modules were designed
at the beginning and maintained throughout the design process so that a change in
one module would not affect the others. These design principles also leave room for further improvements and extension modules in the future.
6.2 Ideas for the Future
Although the system functions properly, it can be further improved in the following
ways:
1. The size of the circuit board can be reduced. A number of components
such as resistors and bypass capacitors can be replaced with surface-mounted
ones. The unused space in the CPLDs can store the instructions
for the EZW chip, replacing the EPROMs.
2. To change the compression level, a user must flip a set of DIP switches on the board.
This method is rather inconvenient because the user has to change
the compression level in both the software and the hardware. An alternative
is to have the circuit read the compression level from the
software on the PC instead of from the DIP switches.
3. The system can be made wireless. The parallel port cable that connects
the board and the PC can be replaced by a pair of wireless modules,
one attached to the board and the other to the PC. The two modules
communicate with each other using IR or RF signals. The data can then be sent
from the board to the PC without any cable.
4. Because the current software decoder is extremely slow, many frames are thrown
away and the final video displays only two frames per second. If the decoder
is optimized further, the system can potentially display more frames per second.
The above ideas are just examples. Other ideas could also be applied to improve the system.
Bibliography
[1] Jan Axelson. Parallel Port Complete: Programming, Interfacing, & Using the PC's
Parallel Printer Port. Lakeview Research, 1997.
[2] Jerome M. Shapiro. Embedded image coding using zerotrees of wavelet coefficients.
IEEE Transactions on Signal Processing, 41:3445-3462, December 1993.
[3] Rex K. Min. Demonstration System for a Low-Power Video Coder and Decoder.
MS thesis, Massachusetts Institute of Technology, 1999.
[4] Thomas Simon. A Low Power Video Compression Chip for Portable Applications.
PhD thesis, Massachusetts Institute of Technology, 1999.
[5] Jon Bates, Tim Tompkins. Using Visual C++ 6. Que Corporation, 1998.
[6] Viktor Toth. Visual C++ 5 Unleashed. Sams Publishing Inc., 1997.
[7] Rockwell Semiconductor Systems. I2C-Bus Reference Guide. June 1997.
[8] Rockwell Semiconductor Systems. Bt829A/827A VideoStream II Decoders. Preliminary product datasheet, November 1997.
[9] Integrated Device Technology, Inc. IDT7008S/L High-Speed Dual-Port Static
RAM. Preliminary product datasheet, June 1998.
[10] Integrated Device Technology, Inc. IDT71124 CMOS Static RAM. Product
datasheet, September 1999.
[11] Fairchild Semiconductor. NM27C256 High Performance CMOS EPROM.
Datasheet, July 1998.
Appendix A
Schematic Diagrams
All the files in this thesis are in charatc/Thesis/ directory.
Figure A-1: Schematic Diagram of Analog to Digital Converter and Its Control Logic
Figure A-4: Schematic diagram of the image EPROM
Figure A-5: Schematic diagram of the SRAM buffer and its control logic modules
Figure A-6: Schematic Diagram of i2c programmer
Figure A-7: Schematic Diagram of Output Frame Buffer
Figure A-8: Schematic Diagram of Parallel Port Interface
Figure A-: Schematic Diagram
Appendix B
VHDL Code
B.1 VHDL Code for CPLD 1
B.1.1 i2c.vhd
LIBRARY ieee;
USE ieee.std_logic_1164.ALL;
USE work.std_arith.ALL;
entity i2c_drive is port (
  sclW, sdaW: out std_ulogic;
  sclR_raw, sdaR: in std_ulogic;
  clk3: in std_ulogic;
  reset: in std_ulogic
);
end entity i2c_drive;
architecture i2c_arch of i2c_drive is
  signal sclR: std_ulogic; -- synchronized version of sclR_raw
  type sclkStates is (low, high);
  signal sclkState, next_sclkState: sclkStates;
  constant maxPhase: integer := 31;
  -- SCK changes phase every "maxPhase" cycles of clk3
  constant midPhase: integer := 15;
  -- phase where data changes, start/stop conditions are sent, etc.
  signal phase, next_phase: std_logic_vector(4 downto 0);
  signal sendGo: std_ulogic;
  signal bitNum, next_bitNum: integer range 0 to 7;
  -- 0 to 7 addresses bits in a byte
  signal byteToSend: std_logic_vector(7 downto 0);
  type sdaStates is (idle, startCond, sendByte0, getAck0, ack0Passed,
    sendByte1, getAck1, ack1Passed, sendByte2, getAck2, success,
    ackFailed, stopCond);
  signal sdaState, next_sdaState: sdaStates;
  type sendStates is (sendCmd, waitForReply, done);
  signal sendState, next_sendState: sendStates;
  signal cmdNum, next_cmdNum: integer range 0 to 15;
  constant lastCmdNum: integer := 11;
  signal oldReset: std_ulogic;
begin
  sclkFsm: process(sclkState, phase, sclR)
  begin
    case sclkState is
      when low =>
        -- drive SCL low for "phase" cycles of clk3
        sclW <= '0';
        if (phase = maxPhase) then
          -- time to flip SCL?
          next_sclkState <= high;
          next_phase <= (others => '0');
        else
          next_sclkState <= low;
          next_phase <= phase + 1;
        end if;
      when high =>
        -- drive SCL high for "phase" cycles of clk3
        sclW <= '1';
        if (phase = maxPhase) then
          -- time to flip SCL?
          next_sclkState <= low;
          next_phase <= (others => '0');
        else
          next_sclkState <= high;
          if (sclR = '0') then
            -- is Brooktree holding SCL line LOW?
            next_phase <= (others => '0');
            -- wait for BT to release SCL
          else
            next_phase <= phase + 1;
          end if;
        end if;
    end case;
  end process;
  sdaFsm: process (sdaState, phase, sclR, sdaR, byteToSend,
                   sendGo, bitNum)
  begin
    case sdaState is
      when idle =>
        -- wait until we are ready to generate a start condition
        sdaW <= '1';
        next_bitNum <= 7;
        if (sendGo = '1')
            and (sclR = '1')
            and (phase = midPhase) then
          next_sdaState <= startCond;
          -- drop SDA to signal a start condition
        else
          next_sdaState <= idle;
        end if;
      when startCond =>
        -- drop SDA now to signal start condition.
        sdaW <= '0';
        next_bitNum <= 7;
        if (sclR = '0') and (phase = midPhase) then
          next_sdaState <= sendByte0;
        else
          next_sdaState <= startCond;
        end if;
      when sendByte0 =>
        if (bitNum = 7) or (bitNum = 3) then
          -- send "10001000" to device
          sdaW <= '1';
          -- this initiates a write to the BT
        else
          sdaW <= '0';
        end if;
        if (sclR = '0') and (phase = midPhase) then
          next_bitNum <= bitNum - 1;
          if (bitNum = 0) then
            next_sdaState <= getAck0;
            -- if that was the last bit,
            -- switch states
          else
            next_sdaState <= sendByte0;
          end if;
        else
          next_sdaState <= sendByte0;
          next_bitNum <= bitNum;
        end if;
      when getAck0 =>
        sdaW <= '1';
        -- drive high onto bus, so that BT can pull
        -- the line low to acknowledge
        next_bitNum <= 7;
        -- set to 8 so that sendByte1 will
        if (sclR = '1') and (phase = midPhase) then
          if (sdaR = '0') then
            next_sdaState <= ack0Passed;
            -- successful ACK, send another byte
          else
            next_sdaState <= ackFailed;
            -- no ACK. give up.
          end if;
        else
          next_sdaState <= getAck0;
        end if;
      when ack0Passed =>
        -- wait for middle of SCK low before sending next bit
        sdaW <= '1';
        next_bitNum <= 7;
        if (sclR = '0') and (phase = midPhase) then
          next_sdaState <= sendByte1;
        else
          next_sdaState <= ack0Passed;
        end if;
      when sendByte1 =>
        sdaW <= byteToSend(bitNum);
        -- drive bit "bitNum" onto the bus
        if (sclR = '0')
            and (phase = midPhase) then
          next_bitNum <= bitNum - 1;
          if (bitNum = 0) then
            next_sdaState <= getAck1;
            -- if that was the last bit,
            -- switch states
          else
            next_sdaState <= sendByte1;
          end if;
        else
          next_sdaState <= sendByte1;
          next_bitNum <= bitNum;
        end if;
      when getAck1 =>
        sdaW <= '1';
        -- drive high onto bus, so that BT can
        -- pull the line low to acknowledge
        next_bitNum <= 7;
        if (sclR = '1') and (phase = midPhase) then
          if (sdaR = '0') then
            next_sdaState <= ack1Passed;
            -- successful ACK, send another byte
          else
            next_sdaState <= ackFailed;
            -- no ACK. give up.
          end if;
        else
          next_sdaState <= getAck1;
        end if;
      when ack1Passed =>
        -- wait for middle of SCK low before sending next bit
        sdaW <= '1';
        next_bitNum <= 7;
        if (sclR = '0') and (phase = midPhase) then
          next_sdaState <= sendByte2;
        else
          next_sdaState <= ack1Passed;
        end if;
      when sendByte2 =>
        sdaW <= byteToSend(bitNum);
        -- drive bit "bitNum" onto the bus
        if (sclR = '0') and (phase = midPhase) then
          next_bitNum <= bitNum - 1;
          if (bitNum = 0) then
            next_sdaState <= getAck2;
            -- if that was the last bit,
            -- switch states
          else
            next_sdaState <= sendByte2;
          end if;
        else
          next_sdaState <= sendByte2;
          next_bitNum <= bitNum;
        end if;
      when getAck2 =>
        sdaW <= '1';
        -- drive high onto bus, so that BT can pull
        -- the line low to acknowledge
        next_bitNum <= 7;
        if (sclR = '1') and (phase = midPhase) then
          if (sdaR = '0') then
            next_sdaState <= success;
          else
            next_sdaState <= ackFailed;
          end if;
        else
          next_sdaState <= getAck2;
        end if;
      when success =>
        -- signal that we successfully completed a transaction
        sdaW <= '1';
        next_bitNum <= 7;
        if (sclR = '0') and (phase = midPhase) then
          next_sdaState <= stopCond;
        else
          next_sdaState <= success;
        end if;
      when ackFailed =>
        -- signal that we could not complete the transaction
        sdaW <= '1';
        next_bitNum <= 7;
        if (sclR = '0') and (phase = midPhase) then
          next_sdaState <= stopCond;
        else
          next_sdaState <= ackFailed;
        end if;
      when stopCond =>
        sdaW <= '0';
        -- pull line down during low SCK.
        -- returning to idle state during
        -- middle of SCK high pulls SDA up
        -- and generates stop condition.
        next_bitNum <= 7;
        if (sclR = '1') and (phase = midPhase) then
          next_sdaState <= idle;
        else
          next_sdaState <= stopCond;
        end if;
    end case;
  end process;
  sendCommands: process(sendState, sdaState, cmdNum)
  begin
    case sendState is
      when sendCmd =>
        -- tell process above that we wish to send data
        if (sdaState = startCond) then
          -- has it kicked off yet?
          next_sendState <= waitForReply;
          next_cmdNum <= cmdNum;
        else
          -- keep asserting our send request
          next_sendState <= sendCmd;
          next_cmdNum <= cmdNum;
        end if;
      when waitForReply =>
        if (sdaState = success) then
          if (cmdNum = lastCmdNum) then
            next_sendState <= done;
            next_cmdNum <= cmdNum;
          else
            next_sendState <= sendCmd;
            next_cmdNum <= cmdNum + 1;
          end if;
        elsif (sdaState = ackFailed) then
          next_sendState <= sendCmd;
          next_cmdNum <= cmdNum;
        else
          next_sendState <= waitForReply;
          next_cmdNum <= cmdNum;
        end if;
      when done =>
        -- all done! don't ever execute this code again
        -- (until next reset)
        next_sendState <= done;
        next_cmdNum <= cmdNum;
    end case;
  end process;
  sendGo <= '1' when (sendState = sendCmd) else '0';
  btProgram: process(sdaState, cmdNum)
  begin
    if (sdaState = sendByte1) then
      -- BT register addresses
      case cmdNum is
        when 0 => byteToSend <= "00011111";
          -- software reset (SRESET)
        when 1 => byteToSend <= "00000001";
          -- input format (IFORM)
        when 2 => byteToSend <= "00000010";
          -- temporal decimation (TDEC)
        when 3 => byteToSend <= "00000011";
          -- MSB cropping (CROP)
        when 4 => byteToSend <= "00000100";
          -- vertical delay (VDELAY_LO)
        when 5 => byteToSend <= "00000101";
          -- vertical active (VACTIVE_LO)
        when 6 => byteToSend <= "00000110";
          -- horizontal delay (HDELAY_LO)
        when 7 => byteToSend <= "00000111";
          -- horizontal active (HACTIVE_LO)
        when 8 => byteToSend <= "00001000";
          -- horizontal scale hi (HSCALE_HI)
        when 9 => byteToSend <= "00001001";
          -- horizontal scale lo (HSCALE_LO)
        when 10 => byteToSend <= "00010011";
          -- vertical scale hi (VSCALE_HI)
        when 11 => byteToSend <= "00010100";
          -- vertical scale lo (VSCALE_LO)
        when others => byteToSend <= "11110000";
      end case;
    elsif (sdaState = sendByte2) then
      -- command arguments
      case cmdNum is
        when 0 => byteToSend <= "00000000";
          -- don't care
        when 1 => byteToSend <= "01001001";
          -- force NTSC mode
        when 2 => byteToSend <= "00000000";
          -- no decimation
        when 3 => byteToSend <= "00010000";
          -- VD VA HD HA
        when 4 => byteToSend <= "01001000";
          -- VDELAY = 136
        when 5 => byteToSend <= "10000000";
          -- VACTIVE = 384 (high bit in R3)
        when 6 => byteToSend <= "10010000";
          -- HDELAY = 144
        when 7 => byteToSend <= "10000000";
          -- HACTIVE = 128 (high bit in R3)
        when 8 => byteToSend <= "00001110";
          -- HSCALE HI = 0x0E
        when 9 => byteToSend <= "11101110";
          -- HSCALE LO = 0xEE
        when 10 => byteToSend <= "01111111";
          -- VSCALE HI = 0x1F
        when 11 => byteToSend <= "00000000";
          -- VSCALE LO = 0x00
        when others => byteToSend <= "11110000";
      end case;
    else
      byteToSend <= (others => '0');
    end if;
  end process;
  --sclR <= sclW; -- enabled for simulation only
  clockUpdate: process(clk3, sclR_raw)
  begin
    sclR <= sclR_raw;
    -- synchronize the incoming SCLK signal
    if (clk3'event) and (clk3 = '1') then
      if (reset = '1') then
        phase <= (others => '0');
        sclkState <= low;
        sdaState <= idle;
        bitNum <= 7;
        sendState <= sendCmd;
        cmdNum <= 0;
      else
        phase <= next_phase;
        sclkState <= next_sclkState;
        sdaState <= next_sdaState;
        bitNum <= next_bitNum;
        sendState <= next_sendState;
        cmdNum <= next_cmdNum;
      end if;
    end if;
  end process;
end;
LIBRARY ieee;
USE ieee.std_logic_1164.ALL;
USE work.std_arith.ALL;
package i2c_pack is
  component i2c_drive port (
    sclW, sdaW: out std_ulogic;
    sclR_raw, sdaR: in std_ulogic;
    clk3: in std_ulogic;
    reset: in std_ulogic
  );
  end component;
end package i2c_pack;
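As a reading aid for the listing above, the following Python sketch (not part of the thesis code) tabulates the three-byte write transactions that the sdaFsm and btProgram processes appear to generate: the device byte "10001000", then a register address, then an argument. The register and argument values are copied from btProgram; the names `BT_COMMANDS`, `byte_bits_msb_first`, and `transaction_bytes` are hypothetical.

```python
BT829_WRITE_ADDR = 0b10001000  # device byte sent in state sendByte0

# (register address, argument) pairs in cmdNum order, copied from btProgram.
BT_COMMANDS = [
    (0b00011111, 0b00000000),  # SRESET
    (0b00000001, 0b01001001),  # IFORM
    (0b00000010, 0b00000000),  # TDEC
    (0b00000011, 0b00010000),  # CROP
    (0b00000100, 0b01001000),  # VDELAY_LO
    (0b00000101, 0b10000000),  # VACTIVE_LO
    (0b00000110, 0b10010000),  # HDELAY_LO
    (0b00000111, 0b10000000),  # HACTIVE_LO
    (0b00001000, 0b00001110),  # HSCALE_HI
    (0b00001001, 0b11101110),  # HSCALE_LO
    (0b00010011, 0b01111111),  # VSCALE_HI
    (0b00010100, 0b00000000),  # VSCALE_LO
]


def byte_bits_msb_first(value):
    """Bits in the order sdaFsm drives them: bitNum counts from 7 down to 0."""
    return [(value >> bit) & 1 for bit in range(7, -1, -1)]


def transaction_bytes(cmd_num):
    """The three bytes of one write: device address, register, argument."""
    register, argument = BT_COMMANDS[cmd_num]
    return [BT829_WRITE_ADDR, register, argument]


for cmd_num in range(len(BT_COMMANDS)):
    print(cmd_num, [f"0x{b:02X}" for b in transaction_bytes(cmd_num)])
```

The table has twelve entries, matching lastCmdNum = 11 in the listing. The model ignores acknowledge handling and retries, which the VHDL performs through the ackFailed state.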
B.1.2 ntsc.vhd
LIBRARY ieee;
USE ieee.std_logic_1164.ALL;
USE work.std_arith.ALL;
entity ntsc_sram_drive is port (
  -- SRAM left side addresses
  --whichRam: buffer std_ulogic;
  --whichRamNot: buffer std_ulogic;
  ntscAdr: buffer std_logic_vector(15 downto 0);
  ce1: out std_ulogic;
  -- Always tie to '1';
  cs: out std_ulogic;
  rw: out std_ulogic;
  -- Brooktree 829A
  dvalid, active, hreset, vreset, vactive: in std_ulogic;
  field, qclk, clkx1: in std_ulogic;
  oe, rst, i2ccs, pwrdn: out std_ulogic;
  -- misc. inputs
  reset: in std_ulogic
);
end entity ntsc_sram_drive;
architecture ntsc_arch of ntsc_sram_drive is
  type States is (waiting, loading);
  signal state, nextState: States;
  signal ntscAdrCount: std_logic_vector(13 downto 0);
  signal page: std_logic_vector(1 downto 0);
  signal aboutToLoad: std_ulogic;
  -- pulses high on wait->load state transition, for Mealy outputs
  signal useless: std_ulogic;
begin
  stateMachine: process (state, nextState, vreset, field, ntscAdrCount)
  begin
    case state is
      when waiting =>
        -- CHECK: is first field odd or even? (assumed odd)
        if (vreset = '0') and (field = '1') then
          nextState <= loading;
          aboutToLoad <= '1';
        else
          nextState <= waiting;
          aboutToLoad <= '0';
        end if;
      when loading =>
        aboutToLoad <= '0';
        if (ntscAdrCount = "11111111111111") then
          nextState <= waiting;
        else
          nextState <= loading;
        end if;
    end case;
  end process stateMachine;
  cs <= '0' when (dvalid = '1') and (active = '1') and (clkx1 = '0')
        else '1';
  ntscAdr <= page &
             ntscAdrCount(12 downto 7) &
             ntscAdrCount(13) &
             ntscAdrCount(6 downto 0);
  ce1 <= '1';
  i2ccs <= '0';
  pwrdn <= '0';
  rw <= '0';
  oe <= '0';
  -- remember FIELD should be routed as an address bit!
  clockUpdate: process (clkx1, reset)
  begin
    if (reset = '1') then
      state <= waiting;
      rst <= '0';
      page <= (others => '1');
    elsif (clkx1'event) and (clkx1 = '1') then
      state <= nextState;
      rst <= '1';
      if (aboutToLoad = '1') then
        page <= page + 1;
      end if;
      if (state = waiting) then
        ntscAdrCount <= (others => '0');
      elsif (dvalid = '1')
          and (active = '1') then
        ntscAdrCount <= ntscAdrCount + 1;
      end if;
    end if;
  end process clockUpdate;
end architecture ntsc_arch;
LIBRARY ieee;
USE ieee.std_logic_1164.ALL;
USE work.std_arith.ALL;
package ntsc_pack is
  component ntsc_sram_drive port (
    -- SRAM left side addresses
    ntscAdr: buffer std_logic_vector(15 downto 0);
    ce1: out std_ulogic; -- Always tie to '1';
    cs: out std_ulogic;
    rw: out std_ulogic;
    -- Brooktree 829A
    dvalid, active, hreset, vreset, vactive: in std_ulogic;
    field, qclk, clkx1: in std_ulogic;
    oe, rst, i2ccs, pwrdn: out std_ulogic;
    -- misc. inputs
    reset: in std_ulogic
  );
  end component;
end package ntsc_pack;
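The address rearrangement in the ntscAdr assignment above can be modelled in a few lines. This Python sketch is illustrative and not part of the thesis; it mirrors the concatenation page & count(12 downto 7) & count(13) & count(6 downto 0), and the function name `ntsc_address` is hypothetical.

```python
def ntsc_address(page, count):
    """Model of: ntscAdr <= page & count(12..7) & count(13) & count(6..0)."""
    assert 0 <= page < 4 and 0 <= count < (1 << 14)
    low7 = count & 0x7F          # count(6 downto 0)
    mid6 = (count >> 7) & 0x3F   # count(12 downto 7)
    top1 = (count >> 13) & 0x1   # count(13)
    return (page << 14) | (mid6 << 8) | (top1 << 7) | low7


# The mapping only permutes bits, so every count lands on a distinct address.
addresses = {ntsc_address(1, c) for c in range(1 << 14)}
print(len(addresses), "distinct addresses within one page")
```

If the low seven bits index a pixel within a 128-pixel line, which is an assumption based on the HACTIVE = 128 setting in i2c.vhd, then moving bit 13 below the line index interleaves the second half of the count with the first, line by line.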
B.1.3 ezw-sram-drive.vhd
LIBRARY ieee;
USE ieee.std_logic_1164.ALL;
USE work.std_arith.ALL;
entity ezw_sram_drive is port (
  -- Source of input (EPROM, or CAMERA)
  -- FROM EXTERNAL BUTTON
  source: in std_ulogic;
  -- input from left side SRAM
  page1, page0: in std_ulogic;
  -- EZW driver signals
  ezwClk: out std_ulogic;
  ezwReset: out std_ulogic;
  ezwStart: out std_ulogic;
  -- SRAM RIGHT SIDE and EEPROM control signals
  dctAdr: buffer std_logic_vector(15 downto 0);
  ce1, csSRAM, oeSRAM: out std_ulogic;
  csEPROM, oeEPROM: out std_ulogic;
  ezwClkIn: in std_ulogic;
  clk16: in std_ulogic;
  reset: in std_ulogic
);
end entity ezw_sram_drive;
architecture ezw_sram_drive_arch of ezw_sram_drive is
  type States is (waiting, loading, startSignal, resetState, delayStart);
  signal currentState, nextState: States;
  signal readPage, decReadPage, minus: std_logic_vector(1 downto 0);
  signal cntAdr: std_logic_vector(14 downto 0);
  signal page0Sync: std_ulogic;
  signal clkCnt: std_logic_vector(6 downto 0);
  signal delayVec: std_logic_vector(8 downto 0);
  signal delayEZWStart: std_logic_vector(2 downto 0);
begin
  ezwClkGenerator: process(clk16)
  begin
    if (clk16'event) and (clk16 = '1') then
      clkCnt <= clkCnt + 1;
    end if;
  end process ezwClkGenerator;
  ezwClk <= clkCnt(6);
  -- source = 1 is to select EPROM
  ce1 <= '1';
  csSRAM <= source;
  oeSRAM <= source;
  csEPROM <= not (source);
  oeEPROM <= not (source);
  readPage <= page1 & page0 when (source = '0') else (others => '0');
  minus <= readPage - decReadPage;
  -- Rearrange output address
  dctAdr <= decReadPage(1 downto 0) &
            cntAdr(9 downto 5) &
            cntAdr(13 downto 12) &
            cntAdr(4 downto 0) &
            cntAdr(11 downto 10);
  FSM: process(currentState, page0Sync, page0,
               cntAdr, source, delayVec, delayEZWStart)
  begin
    case currentState is
      when resetState =>
        ezwStart <= '0';
        nextState <= delayStart;
      when delayStart =>
        ezwStart <= '0';
        if (delayEZWStart = "100") then
          nextState <= waiting;
        else nextState <= delayStart;
        end if;
      when waiting =>
        ezwStart <= '0';
        -- if EPROM, no need to wait for BT
        if (page0Sync = not page0) or (source = '1') then
          nextState <= startSignal;
        end if;
      when startSignal =>
        ezwStart <= '1';
        nextState <= loading;
      when loading =>
        ezwStart <= '0';
        nextState <= loading;
    end case;
  end process FSM;
  clkUpdate: process(ezwClkIn, reset, cntAdr)
  begin
    if (rising_edge(ezwClkIn)) then
      if (reset = '1') then
        currentState <= resetState;
        ezwReset <= '1';
        cntAdr <= (others => '0');
        delayEZWStart <= (others => '0');
        decReadPage <= "10";
      else
        ezwReset <= '0';
        page0Sync <= page0;
        currentState <= nextState;
        if (currentState = loading) then
          if (cntAdr(13 downto 0) = "11111111111111") then
            cntAdr <= (others => '0');
            if (source = '0') then
              if (minus = "01") or
                 ((readPage = "00") and (decReadPage = "11")) then
                decReadPage <= decReadPage;
              else decReadPage <= decReadPage + 1; end if;
            else decReadPage <= not decReadPage; end if;
          else cntAdr <= cntAdr + 1; end if;
        elsif (currentState = waiting) then
          cntAdr <= (others => '0');
          delayVec <= (others => '0');
          delayEZWStart <= (others => '0');
        elsif (currentState = delayStart) then
          delayEZWStart <= delayEZWStart + 1;
        end if;
      end if;
    end if;
  end process clkUpdate;
end architecture ezw_sram_drive_arch;
LIBRARY ieee;
USE ieee.std_logic_1164.ALL;
USE work.std_arith.ALL;
package ezw_pack is
  component ezw_sram_drive port (
    -- Source of input (EPROM, or CAMERA)
    -- FROM EXTERNAL BUTTON
    source: in std_ulogic;
    -- input from left side SRAM
    page1, page0: in std_ulogic;
    -- EZW driver signals
    ezwClk: out std_ulogic;
    ezwReset: out std_ulogic;
    ezwStart: out std_ulogic;
    -- SRAM RIGHT SIDE and EEPROM control signals
    dctAdr: buffer std_logic_vector(15 downto 0);
    ce1, csSRAM, oeSRAM: out std_ulogic;
    csEPROM, oeEPROM: out std_ulogic;
    ezwClkIn: in std_ulogic;
    clk16: in std_ulogic;
    reset: in std_ulogic
  );
  end component;
end package ezw_pack;
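The page-selection rule applied in clkUpdate when cntAdr wraps can be restated as a small function. The Python sketch below is a transcription of that branch for illustration only; the function name `next_dec_read_page` is hypothetical, and the interpretation in the note that follows it is a reading of the listing rather than a statement from the thesis.

```python
def next_dec_read_page(read_page, dec_read_page, source=0):
    """2-bit page update applied when cntAdr wraps in the loading state."""
    if source == 0:                                 # camera input
        minus = (read_page - dec_read_page) & 0b11  # 2-bit subtraction
        if minus == 0b01 or (read_page == 0b00 and dec_read_page == 0b11):
            return dec_read_page                    # hold the current page
        return (dec_read_page + 1) & 0b11           # advance to the next page
    return (~dec_read_page) & 0b11                  # EPROM input: invert bits


for read_page in range(4):
    print(read_page, [next_dec_read_page(read_page, d) for d in range(4)])
```

Under this reading, the page counter holds whenever readPage is exactly one page ahead of it and advances otherwise. The second condition in the VHDL (readPage = "00" and decReadPage = "11") is already covered by the 2-bit subtraction, since 0 minus 3 wraps to 1.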
B.1.4 top-a.vhd
LIBRARY ieee;
USE ieee.std_logic_1164.ALL;
USE work.std_arith.ALL;
USE work.i2c_pack.ALL;
USE work.ntsc_pack.ALL;
USE work.ezw_pack.ALL;
entity toplev_a is port (
  -- Input from switches
  globalReset: in std_ulogic;
  source: in std_ulogic; -- select b/w camera and EPROM
  aux: in std_ulogic;
  -- I2C protocol
  i2c_scl_drive, i2c_sda_drive: out std_ulogic;
  i2c_scl_read, i2c_sda_read: in std_ulogic;
  -- Brooktree 829A
  bt_dvalid, bt_active, bt_hreset, bt_vreset,
  bt_vactive: in std_ulogic;
  bt_field, bt_qclk, bt_clkx1: in std_ulogic;
  bt_oe, bt_rst, bt_i2ccs, bt_pwrdn: out std_ulogic;
  -- LEFT NTSC SRAM
  bt_adr: buffer std_logic_vector(15 downto 0);
  bt_ce1: out std_ulogic;
  bt_cs: out std_ulogic;
  bt_rw: out std_ulogic;
  -- RIGHT NTSC SRAM
  dctAdr: buffer std_logic_vector(15 downto 0);
  ce1, csSRAM, oeSRAM: out std_ulogic;
  csEPROM, oeEPROM: out std_ulogic;
  -- EZW driver signals
  ezwClk: out std_ulogic;
  ezwReset: out std_ulogic;
  ezwStart: out std_ulogic;
  ezwClkIn: in std_ulogic;
  -- Some clock signals (I2C also uses this.. check?)
  clk3out: out std_ulogic;
  clk3, clk16: in std_ulogic
);
end entity toplev_a;
architecture toplev_a_arch of toplev_a is
  signal adrBufEZW: std_logic_vector(16 downto 0);
  signal endAdrEZW: std_logic_vector(16 downto 0);
  signal datBufEZW: std_logic_vector(7 downto 0);
  signal csEZW, oeEZW, weEZW: std_ulogic;
  signal adrBufPB: std_logic_vector(16 downto 0);
  signal datBufPB: std_logic_vector(7 downto 0);
  signal csPB, oePB, wePB: std_ulogic;
  signal globalResetSy: std_ulogic;
  signal doneEZW, donePB: std_ulogic;
  signal okToGo: std_ulogic;
  signal clk3Cnt: std_logic_vector(4 downto 0);
begin
  clk3outGen: process(clk16)
  begin
    if (clk16'event) and (clk16 = '1') then
      clk3Cnt <= clk3Cnt + 1;
    end if;
  end process;
  clk3out <= clk3Cnt(4);
  i2cPart: i2c_drive port map (
    sclW => i2c_scl_drive,
    sdaW => i2c_sda_drive,
    sclR_raw => i2c_scl_read,
    sdaR => i2c_sda_read,
    clk3 => clk3,
    reset => globalResetSy
  );
  ntscPart: ntsc_sram_drive port map (
    ntscAdr => bt_adr,
    ce1 => bt_ce1,
    cs => bt_cs,
    rw => bt_rw,
    -- Brooktree 829A
    dvalid => bt_dvalid,
    active => bt_active,
    hreset => bt_hreset,
    vreset => bt_vreset,
    vactive => bt_vactive,
    field => bt_field,
    qclk => bt_qclk,
    clkx1 => bt_clkx1,
    oe => bt_oe,
    rst => bt_rst,
    i2ccs => bt_i2ccs,
    pwrdn => bt_pwrdn,
    -- misc. inputs
    reset => globalResetSy
  );
  ezw_sram_drive_part: ezw_sram_drive port map (
    source => source,
    page1 => bt_adr(15),
    page0 => bt_adr(14),
    ezwClk => ezwClk,
    ezwReset => ezwReset,
    ezwStart => ezwStart,
    dctAdr => dctAdr,
    ce1 => ce1,
    csSRAM => csSRAM,
    oeSRAM => oeSRAM,
    csEPROM => csEPROM,
    oeEPROM => oeEPROM,
    clk16 => clk16,
    ezwClkIn => ezwClkIn,
    reset => globalResetSy
  );
  clkEdge: process(clk16)
  begin
    if (clk16'event) and (clk16 = '1') then
      globalResetSy <= not globalReset;
    end if;
  end process clkEdge;
end architecture toplev_a_arch;
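The CPLD 1 listings derive two slower clocks by tapping bits of free-running counters clocked by clk16: clk3out from clk3Cnt(4) and ezwClk from clkCnt(6). The Python sketch below works out the resulting division ratios. It assumes clk16 runs at a nominal 16 MHz, as the name and a comment in ezw-out.vhd suggest; the function names are hypothetical and the code is not part of the thesis.

```python
def divided_frequency(input_hz, tap_bit):
    """Frequency of bit `tap_bit` of a free-running binary up-counter."""
    return input_hz / (1 << (tap_bit + 1))


def counter_tap_waveform(width, tap_bit, cycles):
    """Simulate the counter and sample the tapped bit after each clock edge."""
    count, samples = 0, []
    for _ in range(cycles):
        count = (count + 1) % (1 << width)
        samples.append((count >> tap_bit) & 1)
    return samples


CLK16_HZ = 16_000_000  # assumed nominal clk16 frequency
print("clk3out:", divided_frequency(CLK16_HZ, 4), "Hz")  # tap clk3Cnt(4)
print("ezwClk: ", divided_frequency(CLK16_HZ, 6), "Hz")  # tap clkCnt(6)
```

Under that assumption clk3out is clk16 divided by 32 and ezwClk is clk16 divided by 128, both with a 50% duty cycle because the tapped bit is the most significant bit of its counter.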
B.2 VHDL Code for CPLD 2
B.2.1 sram-switch.vhd
-- Basically, this is a huge mux
LIBRARY ieee;
USE ieee.std_logic_1164.ALL;
USE work.std_arith.ALL;
entity sram_switch is port (
  -- Handshake signals
  doneEZW, donePB: in std_ulogic;
  okToGo: out std_ulogic;
  -- Output
  adrOutTop, adrOutBut: buffer std_logic_vector(16 downto 0);
  datOutTop, datOutBut: inout std_logic_vector(7 downto 0);
  csOutTop, oeOutTop, weOutTop: out std_ulogic;
  csOutBut, oeOutBut, weOutBut: out std_ulogic;
  -- Input from ezw_out
  adrBufEZW: in std_logic_vector(16 downto 0);
  datBufEZW: in std_logic_vector(7 downto 0);
  csEZW, oeEZW, weEZW: in std_ulogic;
  -- Input from parallelbuf
  adrBufPB: in std_logic_vector(16 downto 0);
  datBufPB: out std_logic_vector(7 downto 0);
  csPB, oePB, wePB: in std_ulogic;
  toggleOut: out std_ulogic;
  reset: in std_ulogic;
  clk16: in std_ulogic
);
end entity sram_switch;
architecture sram_switch_arch of sram_switch is
  type States is (waitForBothDone, waitForBothClear, resetState);
  signal currentState, nextState: States;
  signal toggle, changeToggle: std_ulogic;
begin
  toggleOut <= toggle;
  FSM: process(currentState, doneEZW, donePB)
  begin
    case currentState is
      when resetState =>
        nextState <= waitForBothDone;
        changeToggle <= '0';
      when waitForBothDone =>
        okToGo <= '0';
        if (doneEZW = '1') and (donePB = '1') then
          nextState <= waitForBothClear;
          changeToggle <= '1';
        else
          nextState <= waitForBothDone;
          changeToggle <= '0';
        end if;
      when waitForBothClear =>
        changeToggle <= '0';
        okToGo <= '1';
        if (doneEZW = '0') and (donePB = '0') then
          nextState <= waitForBothDone;
        else nextState <= waitForBothClear;
        end if;
    end case;
  end process FSM;
  switch: process(toggle, adrBufEZW, datBufEZW, csEZW,
                  oeEZW, weEZW, adrBufPB, datOutBut, csPB, oePB, wePB,
                  datOutTop)
  begin
    if (toggle = '0') then
      -- EZW side writes the top buffer; parallel port reads the bottom one
      adrOutTop <= adrBufEZW;
      datOutTop <= datBufEZW;
      csOutTop <= csEZW;
      oeOutTop <= oeEZW;
      weOutTop <= weEZW;
      adrOutBut <= adrBufPB;
      datOutBut <= (others => 'Z');
      datBufPB <= datOutBut;
      csOutBut <= csPB;
      oeOutBut <= oePB;
      weOutBut <= wePB;
    else
      -- roles swapped
      adrOutTop <= adrBufPB;
      datOutTop <= (others => 'Z');
      datBufPB <= datOutTop;
      csOutTop <= csPB;
      oeOutTop <= oePB;
      weOutTop <= wePB;
      adrOutBut <= adrBufEZW;
      datOutBut <= datBufEZW;
      csOutBut <= csEZW;
      oeOutBut <= oeEZW;
      weOutBut <= weEZW;
    end if;
  end process switch;
  clkEdge: process(clk16)
  begin
    if (rising_edge(clk16)) then
      if (reset = '1') then
        currentState <= resetState;
        toggle <= '0';
      else
        currentState <= nextState;
        if (changeToggle = '1') then
          toggle <= not (toggle);
        end if;
      end if;
    end if;
  end process clkEdge;
end architecture sram_switch_arch;
LIBRARY ieee;
USE ieee.std_logic_1164.ALL;
USE work.std_arith.ALL;
package sram_switch_pack is
  component sram_switch port (
    -- Handshake signals
    doneEZW, donePB: in std_ulogic;
    okToGo: out std_ulogic;
    -- Output
    adrOutTop, adrOutBut: buffer std_logic_vector(16 downto 0);
    datOutTop, datOutBut: inout std_logic_vector(7 downto 0);
    csOutTop, oeOutTop, weOutTop: out std_ulogic;
    csOutBut, oeOutBut, weOutBut: out std_ulogic;
    -- Input from ezw_out
    adrBufEZW: in std_logic_vector(16 downto 0);
    datBufEZW: in std_logic_vector(7 downto 0);
    csEZW, oeEZW, weEZW: in std_ulogic;
    -- Input from parallelbuf
    adrBufPB: in std_logic_vector(16 downto 0);
    datBufPB: out std_logic_vector(7 downto 0);
    csPB, oePB, wePB: in std_ulogic;
    toggleOut: out std_ulogic;
    reset: in std_ulogic;
    clk16: in std_ulogic
  );
  end component;
end package sram_switch_pack;
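The handshake in the sram-switch listing behaves like a double-buffer (ping-pong) swap: the two frame buffers exchange roles once both the EZW writer and the parallel-port reader report done. The Python sketch below steps that state machine one clk16 edge at a time. It is a simplified model, not thesis code: the class and method names are hypothetical, and okToGo is modelled as low in resetState even though the VHDL leaves it unassigned there.

```python
class SramSwitchModel:
    """Clock-by-clock model of the sram-switch handshake state machine."""

    def __init__(self):
        self.state = "resetState"
        self.toggle = 0

    def ok_to_go(self):
        # okToGo is driven high only while waiting for both sides to clear.
        return 1 if self.state == "waitForBothClear" else 0

    def clock(self, done_ezw, done_pb):
        """Apply one rising edge of clk16 with the given done inputs."""
        if self.state == "resetState":
            self.state = "waitForBothDone"
        elif self.state == "waitForBothDone":
            if done_ezw and done_pb:
                self.toggle ^= 1  # changeToggle = '1' on this edge
                self.state = "waitForBothClear"
        elif self.state == "waitForBothClear":
            if not done_ezw and not done_pb:
                self.state = "waitForBothDone"

    def top_buffer_owner(self):
        """Which side drives the top SRAM for the current toggle value."""
        return "EZW" if self.toggle == 0 else "PB"


model = SramSwitchModel()
for done_ezw, done_pb in [(0, 0), (1, 0), (1, 1), (1, 0), (0, 0)]:
    model.clock(done_ezw, done_pb)
    print(model.state, model.toggle, model.ok_to_go(), model.top_buffer_owner())
```

The swap happens only when both done inputs are high on the same clock edge, and a new round cannot begin until both have returned low.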
B.2.2 ezw-out.vhd
LIBRARY ieee;
USE ieee.std_logic_1164.ALL;
USE work.std_arith.ALL;
entity ezw_out is port(
  -- Output of ezw
  datBit: in std_ulogic;
  datClk: in std_ulogic;
  grpClk: in std_ulogic;
  -- SRAM Buffer
  adrBuf: buffer std_logic_vector(16 downto 0);
  datBuf: buffer std_logic_vector(7 downto 0);
  cs, oe, we: out std_ulogic;
  -- Handshake signal
  okToGo: in std_ulogic;
  done: out std_ulogic;
  -- Clk
  -- 16MHz because bit rate of ezw is 5MHz max
  clk16: in std_ulogic;
  --DEBUG
  datBitSyncOut, datClkSyncOut: out std_ulogic;
  reset: in std_ulogic
);
end entity ezw_out;
architecture ezw_out_arch of ezw_out is
  type States is (resetState,writeLastByte,waitForOkToGo,writeSRAM,
                  waitForOne,waitForZero,waitGrpClkZero,throwAwayGroup);
  signal currentState, nextState: States;
  signal datBitSync, datClkSync, grpClkSync: std_ulogic;
  signal bitCnt: std_logic_vector(3 downto 0);
  signal updateAdrBuf, updateBitCnt: std_ulogic;
begin
  datBitSyncOut <= datBitSync;
  datClkSyncOut <= datClkSync;
  oe <= '1';
  cs <= '0';
FSM: process(currentState,grpClkSync,datBitSync,datClkSync,bitCnt,
okToGo)
begin
case currentState is
when resetState =>
datBuf <= (others => '0');
we <= '1';
done <= '0';
updateAdrBuf <= '0';
updateBitCnt <= '0';
nextState <= waitForOne;
when waitForOne =>
updateAdrBuf <= '0';
we <= '1';
done <= '0';
if (grpClkSync = '1') then
nextState <= writeLastByte;
elsif (datClkSync = '1') then
if (bitCnt = "0000") then
datBuf(7) <= datBitSync;
elsif (bitCnt = "0001") then
datBuf(6) <= datBitSync;
elsif (bitCnt = "0010") then
datBuf(5) <= datBitSync;
elsif (bitCnt = "0011") then
datBuf(4) <= datBitSync;
elsif (bitCnt = "0100") then
datBuf(3) <= datBitSync;
elsif (bitCnt = "0101") then
datBuf(2) <= datBitSync;
elsif (bitCnt = "0110") then
datBuf(1) <= datBitSync;
elsif (bitCnt = "0111") then
datBuf(0) <= datBitSync;
end if;
if (bitCnt = "0111") then
updateBitCnt <= '0';
nextState <= writeSRAM;
else
updateBitCnt <= '1';
nextState <= waitForZero;
end if;
else
updateBitCnt <= '0';
nextState <= waitForOne;
end if;
when waitForZero =>
updateAdrBuf <= '0';
updateBitCnt <= '0';
we <= '1';
done <= '0';
if (datClkSync = '0') then
nextState <= waitForOne;
else nextState <= waitForZero; end if;
when writeSRAM =>
we <= '0';
done <= '0';
updateBitCnt <= '0';
updateAdrBuf <= '1';
if (datClkSync = '1') then
nextState <= waitForZero;
else nextState <= waitForOne; end if;
when waitGrpClkZero =>
we <= '1';
done <= '0';
if (grpClkSync = '0') then
nextState <= waitForOkToGo;
else nextState <= waitGrpClkZero; end if;
when writeLastByte =>
we <= '0';
done <= '0';
nextState <= waitGrpClkZero;
when waitForOkToGo =>
we <= '1';
done <= '1';
datBuf <= (others => '0');
if (okToGo = '1') and (grpClkSync = '0') then
nextState <= waitForOne;
elsif (datClkSync = '1') then
-- okToGo too slow, throw away until next grpClkSync
nextState <= throwAwayGroup;
else nextState <= waitForOkToGo; end if;
when throwAwayGroup =>
we <= '1';
done <= '1';
datBuf <= (others => '0');
if (grpClkSync = '0') then
nextState <= throwAwayGroup;
else
nextState <= waitForOkToGo;
end if;
end case;
end process FSM;
clkEdge: process(clk16)
begin
if (rising_edge(clk16)) then
if (reset = '1') then
bitCnt <= (others => '0');
adrBuf <= (others => '0');
currentState <= resetState;
else
currentState <= nextState;
datBitSync <= datBit;
datClkSync <= datClk;
grpClkSync <= grpClk;
if (updateAdrBuf = '1') then
adrBuf <= adrBuf + 1;
end if;
if (updateBitCnt = '1') then
bitCnt <= bitCnt + 1;
end if;
if (currentState = waitForOkToGo) then
bitCnt <= (others => '0');
adrBuf <= (others => '0');
elsif (currentState = writeSRAM) then
bitCnt <= (others => '0');
end if;
end if;
end if;
end process clkEdge;
end architecture ezw_out_arch;
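The `waitForOne` state above shifts each incoming bit into `datBuf` starting at bit 7, so the first bit of every group of eight ends up as the most significant bit of the stored byte. A decoder on the PC side has to unpack bytes in the same order. The sketch below shows the packing order in C. The function name is hypothetical, and the zero padding of a partial last byte is an assumption of this sketch, not a claim about what the hardware writes in `writeLastByte`.

```c
#include <stddef.h>

/* Pack single bits (each element 0 or 1) into bytes, first bit in bit 7.
 * A partial last byte is padded with zeros (an assumption of this sketch).
 * Returns the number of bytes written to out. */
size_t pack_bits_msb_first(const unsigned char *bits, size_t nbits,
                           unsigned char *out)
{
    size_t nbytes = 0;
    unsigned char cur = 0;
    size_t i;

    for (i = 0; i < nbits; i++) {
        if (bits[i])
            cur |= (unsigned char)(0x80u >> (i % 8));
        if (i % 8 == 7) {           /* eighth bit collected: emit the byte */
            out[nbytes++] = cur;
            cur = 0;
        }
    }
    if (nbits % 8 != 0)             /* flush the partial byte, if any */
        out[nbytes++] = cur;
    return nbytes;
}
```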
LIBRARY ieee;
USE ieee.std_logic_1164.ALL;
USE work.std_arith.ALL;
package ezw_out_pack is
  component ezw_out port(
    -- Output of ezw
    datBit: in std_ulogic;
    datClk: in std_ulogic;
    grpClk: in std_ulogic;
    -- SRAM Buffer
    adrBuf: buffer std_logic_vector(16 downto 0);
    datBuf: buffer std_logic_vector(7 downto 0);
    cs, oe, we: out std_ulogic;
    -- Handshake signal
    okToGo: in std_ulogic;
    done: out std_ulogic;
    -- Clk
    clk16: in std_ulogic;
    --DEBUG
    datBitSyncOut, datClkSyncOut: out std_ulogic;
    reset: in std_ulogic
  );
  end component;
end package ezw_out_pack;
B.2.3 parallel_sram.vhd
LIBRARY ieee;
USE ieee.std_logic_1164.ALL;
USE work.std_arith.ALL;
entity parallel_sram is port(
  reset: in std_ulogic;
  -- Output to sram buffer
  adrBuf: out std_logic_vector(16 downto 0);
  datBuf: in std_logic_vector(7 downto 0);
  cs, oe, we: out std_ulogic;
  -- Parallel port interface pins
  dout: inout std_logic_vector(7 downto 0);
  p1284Mode,nReverseReq,hostAck,hostClk: in std_ulogic;
  nAckReverseReq,dir,nAck,periphAck: out std_ulogic;
  periphClk: buffer std_ulogic;
  -- signal to indicate that the entire buffer has been transferred
  okToGo: in std_ulogic;
  done: out std_ulogic;
  clk16: in std_ulogic
);
end entity parallel_sram;
architecture parallel_sram_arch of parallel_sram is
  type States is (waitForNextDMA,updateAdr,resetState,waitForReq,
                  sendDat,waitForOk,waitForHostAckUp,waitForHostAckDown,
                  doneOneByte,sendDone);
  signal currentState, nextState: States;
  signal hostClkA: std_ulogic;
  signal hostAckA,nReverseReqA: std_ulogic;
  signal doutSync, dat, cmd, datBufTmp: std_logic_vector(7 downto 0);
  signal cntAdr: std_logic_vector(17 downto 0);
  signal blockCnt: std_logic_vector(15 downto 0);
  signal port2PC,port2PCA,pc2Port: std_ulogic;
  signal datOrNCmdOut: std_ulogic; -- 1 if sending data, 0 if sending cmd
  signal hostAckDown,hostAckUp,reverseReqUp: std_ulogic;
  signal periphClkDelay: std_ulogic;
begin
cs <= '0';
oe <= '0';
we <= '1';
port2PC <= (p1284Mode) and (not nReverseReqA);
pc2Port <= not port2PC;
adrBuf <= cntAdr(16 downto 0);
dir <= port2PCA or port2PC;
dout <= datBufTmp;
datBufTmp <= datBuf when (datOrNCmdOut = '1') and
((port2PCA and port2PC) = '1') else
"01010111" when (datOrNCmdOut = '0')
and ((port2PCA and port2PC) = '1') else (others => 'Z');
nAckReverseReq <= pc2Port;
periphAck <= datOrNCmdOut;
FSM: process(currentState,nReverseReqA,hostAckA,datOrNCmdOut,
okToGo,blockCnt,cntAdr)
begin
case currentState is
when resetState =>
datOrNCmdOut <= '0';
done <= '0';
periphClkDelay <= '1';
nextState <= waitForReq;
when waitForReq =>
datOrNCmdOut <= '0';
done <= '0';
periphClkDelay <= '1';
if (nReverseReqA = '0') and (hostAckA = '0') then
nextState <= sendDat;
else nextState <= waitForReq;
end if;
when waitForNextDMA =>
if (nReverseReqA = '0') then
periphClkDelay <= hostAckA;
nextState <= waitForNextDMA;
elsif (cntAdr = "100000000000000000") then
nextState <= waitForOk;
else nextState <= waitForReq; end if;
when sendDat =>
done <= '0';
datOrNCmdOut <= '1'; -- sending data
periphClkDelay <= '0';
nextState <= waitForHostAckUp;
when waitForHostAckUp =>
if (hostAckA = '0') then nextState <= waitForHostAckUp;
else nextState <= waitForHostAckDown; end if;
when waitForHostAckDown =>
periphClkDelay <= '1';
if (hostAckA = '1') then nextState <= waitForHostAckDown;
else nextState <= doneOneByte; end if;
when doneOneByte =>
-- Or done DMA
if (blockCnt(15 downto 0) = "1000000000000000") then
nextState <= waitForNextDMA;
else
nextState <= updateAdr;
end if;
when updateAdr =>
nextState <= waitForReq;
when sendDone => -- Have to send the ending byte
datOrNCmdOut <= '0';
periphClkDelay <= '0';
nextState <= waitForHostAckUp;
when waitForOk =>
datOrNCmdOut <= '0';
done <= '1';
if (okToGo = '1') then
nextState <= waitForReq;
else nextState <= waitForOk; end if;
end case;
end process;
clkEdge: process(clk16)
begin
if (rising_edge(clk16)) then
hostClkA <= hostClk;
hostAckA <= hostAck;
nReverseReqA <= nReverseReq;
doutSync <= dout;
port2PCA <= port2PC;
periphClk <= periphClkDelay;
nAck <= periphClkDelay;
if (reset = '1') then
currentState <= resetState;
cntAdr <= (others => '0');
blockCnt <= (others => '0');
else
currentState <= nextState;
end if;
if (currentState = updateAdr) then
cntAdr <= cntAdr + 1;
blockCnt <= blockCnt + 1;
elsif (currentState = waitForOk) then
cntAdr <= (others => '0');
elsif (currentState = waitForNextDMA) then
blockCnt <= (others => '0');
end if;
end if;
end process clkEdge;
end architecture parallel_sram_arch;
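The two counters in the state machine above fix the transfer geometry: `blockCnt` is compared with a 16-bit literal whose only set bit is bit 15, and `cntAdr` with an 18-bit literal whose only set bit is bit 17. Read that way, one handshake block is 32768 bytes and one SRAM bank is 131072 bytes, which agrees with `DMASIZE 32768` and the "4 blocks" remark in the decoder source of Appendix C. The sketch below only restates that arithmetic. The enum and function names are hypothetical.

```c
/* Values read off the VHDL comparisons; names are illustrative. */
enum {
    SRAM_ADDR_BITS = 17,        /* adrBuf is (16 downto 0) */
    BLOCK_BYTES    = 1 << 15    /* blockCnt reaches "1000000000000000" */
};

/* Number of bytes held by one SRAM bank. */
long sram_bytes(void) { return 1L << SRAM_ADDR_BITS; }

/* Number of blocks needed to drain one bank. */
long blocks_per_bank(void) { return sram_bytes() / BLOCK_BYTES; }
```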
LIBRARY ieee;
USE ieee.std_logic_1164.ALL;
USE work.std_arith.ALL;
package parallel_sram_pack is
  component parallel_sram port(
    reset: in std_ulogic;
    -- Output to sram buffer
    adrBuf: out std_logic_vector(16 downto 0);
    datBuf: in std_logic_vector(7 downto 0);
    cs, oe, we: out std_ulogic;
    -- Parallel port interface pins
    dout: inout std_logic_vector(7 downto 0);
    p1284Mode,nReverseReq,hostAck,hostClk: in std_ulogic;
    nAckReverseReq,dir,nAck,periphAck: out std_ulogic;
    periphClk: buffer std_ulogic;
    -- signal to indicate that the entire buffer has been transferred
    okToGo: in std_ulogic;
    done: out std_ulogic;
    clk16: in std_ulogic
  );
  end component;
end package parallel_sram_pack;
B.2.4 top_b.vhd
LIBRARY ieee;
USE ieee.std_logic_1164.ALL;
USE work.std_arith.ALL;
USE work.ezw_out_pack.ALL;
USE work.sram_switch_pack.ALL;
USE work.parallel_sram_pack.ALL;
entity toplev_b is port (
  -- Input from switches
  globalReset: in std_ulogic;
  aux: in std_ulogic;
  -- EZW output signals
  ezwDatBit: in std_ulogic;
  ezwDatClk: in std_ulogic;
  ezwGrpClk: in std_ulogic;
  -- EZW SRAM buffer signal
  adrOutTop, adrOutBut: buffer std_logic_vector(16 downto 0);
  datOutTop, datOutBut: inout std_logic_vector(7 downto 0);
  csOutTop,oeOutTop,weOutTop: out std_ulogic;
  csOutBut,oeOutBut,weOutBut: out std_ulogic;
  -- Parallel port interface pins
  dout: inout std_logic_vector(7 downto 0);
  p1284Mode,nReverseReq,hostAck,hostClk: in std_ulogic;
  nAckReverseReq,dir,nAck,periphAck: out std_ulogic;
  periphClk: buffer std_ulogic;
  -- Some clock signals (I2C also uses this.. check?)
  clk3,clk16: in std_ulogic;
  -- debug port
  debug0, debug1, debug2: out std_ulogic;
  oldDatOutBut7,oldDatOutBut4: in std_ulogic
);
end entity toplev_b;
architecture toplev_b_arch of toplev_b is
  signal tmp1, tmp2: std_ulogic;
  signal adrBufEZW: std_logic_vector(16 downto 0);
  signal datBufEZW: std_logic_vector(7 downto 0);
  signal doutTmp: std_logic_vector(7 downto 0);
  signal csEZW, oeEZW, weEZW: std_ulogic;
  signal adrBufPB: std_logic_vector(16 downto 0);
  signal datBufPB: std_logic_vector(7 downto 0);
  signal csPB, oePB, wePB: std_ulogic;
  signal globalResetSync: std_ulogic;
  signal doneEZW, donePB: std_ulogic;
  signal okToGo: std_ulogic;
  signal toggleOut: std_ulogic;
  signal tmp: std_ulogic;
  signal datBitSyncOut, datClkSyncOut: std_ulogic;
begin
  tmp1 <= oldDatOutBut7;
  tmp2 <= oldDatOutBut4;
  debug0 <= doneEZW;
  debug1 <= datClkSyncOut;
  debug2 <= datBitSyncOut;
  ezw_out_part: ezw_out port map(
    done => doneEZW,
    okToGo => okToGo,
    -- Output of ezw
    datBit => ezwDatBit,
    datClk => ezwDatClk,
    grpClk => ezwGrpClk,
    -- SRAM Buffer
    adrBuf => adrBufEZW,
    datBuf => datBufEZW,
    cs => csEZW,
    oe => oeEZW,
    we => weEZW,
    -- Clk
    clk16 => clk16,
    datClkSyncOut => datClkSyncOut,
    datBitSyncOut => datBitSyncOut,
    reset => globalResetSync
  );
  parallel_sram_part: parallel_sram port map(
    reset => globalResetSync,
    -- Input from ezw_out
    okToGo => okToGo,
    -- Output to sram buffer
    adrBuf => adrBufPB,
    datBuf => datBufPB,
    cs => csPB,
    oe => oePB,
    we => wePB,
    -- Parallel port interface pins
    dout => dout,
    p1284Mode => p1284Mode,
    nReverseReq => nReverseReq,
    hostAck => hostAck,
    hostClk => hostClk,
    nAckReverseReq => nAckReverseReq,
    dir => dir,
    nAck => nAck,
    periphAck => periphAck,
    periphClk => periphClk,
    -- signal to indicate that the entire buffer has been transferred
    done => donePB,
    clk16 => clk16
  );
  sram_switch_part: sram_switch port map(
    donePB => donePB,
    doneEZW => doneEZW,
    okToGo => okToGo,
    -- Output
    adrOutTop => adrOutTop,
    adrOutBut => adrOutBut,
    datOutTop => datOutTop,
    datOutBut => datOutBut,
    csOutTop => csOutTop,
    oeOutTop => oeOutTop,
    weOutTop => weOutTop,
    csOutBut => csOutBut,
    oeOutBut => oeOutBut,
    weOutBut => weOutBut,
    -- Input from ezw_out
    adrBufEZW => adrBufEZW,
    datBufEZW => datBufEZW,
    csEZW => csEZW,
    oeEZW => oeEZW,
    weEZW => weEZW,
    -- Input from parallelbuf
    adrBufPB => adrBufPB,
    csPB => csPB,
    oePB => oePB,
    wePB => wePB,
    -- Output to parallelbuf
    datBufPB => datBufPB,
    toggleOut => toggleOut,
    reset => globalResetSync,
    clk16 => clk16
  );
  clkUpdate: process(clk16)
  begin
    if (rising_edge(clk16)) then
      globalResetSync <= not globalReset;
    end if;
  end process clkUpdate;
end architecture toplev_b_arch;
Appendix C
Decoder Code
C.1 Decoder Code in C
The files included here come from a Microsoft Visual C++ 6.0 project. Some standard files are not included.
C.1.1 StdAfx.h
// stdafx.h : include file for standard system include files,
// or project specific include files that are used frequently, but
// are changed infrequently
//
#if !defined(AFX_STDAFX_H__A9DB83DB_A9FD_11D0_BFD1_444553540000__INCLUDED_)
#define AFX_STDAFX_H__A9DB83DB_A9FD_11D0_BFD1_444553540000__INCLUDED_
#if _MSC_VER > 1000
#pragma once
#endif // _MSC_VER > 1000
#define WIN32_LEAN_AND_MEAN // Exclude rarely-used stuff from Windows headers
// Windows Header Files:
#include <windows.h>
// C RunTime Header Files
#include <stdlib.h>
#include <malloc.h>
#include <memory.h>
#include <tchar.h>
// Local Header Files
// TODO: reference additional headers your program requires here
//{{AFX_INSERT_LOCATION}}
// Microsoft Visual C++ will insert additional declarations
// immediately before the previous line.
#endif // !defined(AFX_STDAFX_H__A9DB83DB_A9FD_11D0_BFD1_444553540000__INCLUDED_)
C.1.2 StdAfx.cpp
// stdafx.cpp : source file that includes just the standard includes
// vdodma3.pch will be the pre-compiled header
// stdafx.obj will contain the pre-compiled type information
#include "stdafx.h"
// TODO: reference any additional headers you need in STDAFX.H
// and not in this file
C.1.3 vdodma3.h
#if !defined(AFX_VDODMA3_H__0D3F1D66_43AE_11D3_B30A_9D05D47BE08A__INCLUDED_)
#define AFX_VDODMA3_H__0D3F1D66_43AE_11D3_B30A_9D05D47BE08A__INCLUDED_
#if _MSC_VER > 1000
#pragma once
#endif // _MSC_VER > 1000
#include "resource.h"
#endif // !defined(AFX_VDODMA3_H__0D3F1D66_43AE_11D3_B30A_9D05D47BE08A__INCLUDED_)
C.1.4 vdodma3.cpp
// vdodma3.cpp : Defines the entry point for the application.
//
#include "stdafx.h"
#include "resource.h"
#include "WinRTctl.h"
#include "ioaccess.h"
#include <winioctl.h>
#include <ddraw.h>
#include <windows.h>
#include <stdio.h>
#include <stdlib.h>
#include <conio.h>
#define NAME "EZW DEMO WITH DMA"
#define TITLE "EZW DEMO WITH DMA"
#define IMAGEWIDTH 256
#define FILESIZE 16384
#define FILESIZE0 65536
#define HEADERFILESIZE 15
#define TIMERID 1
#define TIMERRATE 33
#define BITSIN_A_FRAME 131072
#define MAXLOADSTRING 100
//***** DMA STUFF **********//
//Parallel port stuff
#define BLOCKCNT 3 // Or 4 blocks of DMA transfers
#define DMASIZE 32768 //16384
#define EcpAFifo 0x0378;
#define LptDsr 0x0379;
// bit 7 = inverted version of Busy
// bit 6 = nAck
// bit 5 = PError
// bit 4 = Select
// bit 3 = nFault
#define PeriphRequestMask 0x008;
#define LptDcr 0x037A;
// bit 5 = direction. 0 = out, 1 = in
// bit 4 = ackIntEn. 1 enables an interrupt on rising edge of nAck
// bit 3 = inverted nSelectIn
// bit 2 = nInit
// bit 1 = inverted nAutoFd
// bit 0 = inverted nStrobe
#define EcpDFifo 0x0778;
#define LptCnfgA 0x0778;
// bit 7: 1 = interrupts are level, 0 = interrupts are pulses
// bit 6-4: P-word size:
//   0x00 PWord = 2 bytes
//   0x01 PWord = 1 byte
//   0x02 PWord = 4 bytes
// bit 3: reserved
// bit 2: nByteInTransceiver (for recovery)
// bit 1-0: fractional Pword count (for recovery)
#define LptCnfgB 0x0779;
// bit 7: 1 = compression enabled
// bit 6: value of ISA iReq line (read only)
// bit 5-3: selects IRQ:
//   111 = 5, 110 = 15, 101 = 14, 100 = 11,
//   011 = 10, 010 = 9, 001 = 7 (default), 000 = jumpered
// bits 2-0: selects DMA channel:
//   111 = 7, 110 = 6, 101 = 5 (16-bit default), 100 = jumpered 16-bit,
//   011 = 3, 010 = 2, 001 = 1, 000 = jumpered 8-bit
#define LptEcr 0x077A;
// bits 7:5 = mode
//   000: standard parallel port mode
//   001: PS/2 parallel port mode (direction tri-states data lines)
//   010: parallel port FIFO mode (direction = 0 only)
//   011: ECP mode.
//   100: undefined
//   101: undefined
//   110: test mode. data not sent to port
//   111: configuration mode: cnfga and cnfgb regs are accessible
// bit 4: 0 enables interrupt pulse on falling edge of nFault
//        1 disables interrupts
// bit 3: dmaEn: 0 disables DMA, 1 enables DMA (when serviceIntr = 0)
// bit 2: serviceIntr: 1 disables DMA and service interrupts
//        0 enables service interrupts (which set serviceIntr to 1)
//        If dmaEn = 1, int when terminal count is reached
//        If dmaEn = 0, FIFO service int.
// bit 1: FIFO full
// bit 0: FIFO empty
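The comment table for `LptEcr` above describes a bit-field layout: mode in bits 7:5, an interrupt-disable bit in bit 4, `dmaEn` in bit 3 and `serviceIntr` in bit 2. As a worked example, the sketch below assembles a byte from those fields. The function name and parameter names are hypothetical, the layout is taken only from the comments in this listing, and bits 1:0 are left at zero because the table describes them as FIFO status.

```c
/* Build an ECR byte from the fields listed in the comment table.
 * mode:         3-bit mode number (e.g. 3 for ECP mode, binary 011)
 * irq_off:      nonzero sets bit 4 (interrupts disabled)
 * dma_en:       nonzero sets bit 3
 * service_intr: nonzero sets bit 2 */
unsigned char ecr_value(unsigned mode, int irq_off, int dma_en,
                        int service_intr)
{
    return (unsigned char)(((mode & 0x7u) << 5) |
                           ((irq_off ? 1u : 0u) << 4) |
                           ((dma_en ? 1u : 0u) << 3) |
                           ((service_intr ? 1u : 0u) << 2));
}
```

For instance, ECP mode with interrupts disabled, DMA enabled and `serviceIntr` cleared corresponds to the bit pattern 011 1 1 0 00.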
static char cCmd;
static char cStat=0;
static char cTemp;
static char cBit0,cBit1,cBit2,cBit3,cBit4,cBit5,cBit6,cBit7;
static char LowAdd,MidAdd,HighAdd;
static char DmaBuf[DMASIZE];
static char * LookHere = DmaBuf;
static char BlockBuf[64]; //temp
//WinRT variables
static HANDLE hWinRT;
static DWORD iWinRTlength; // length of returned buffers
// DMA buffer information returned from the driver
static WINRTDMABUFFERINFORMATION DmaInformation;
static ULONG Length, DmaLength; // length of buffers from API calls
#define DAS16DMAKNOWNBUFFERFILLER 'X'
// known information used to fill the buffer
//static USHORT NumberOfBytes = 16384;
//size of each actual DMA transfer
//static USHORT NumberOfBytes = 64;
#define NUMBEROFBYTES 32768
static USHORT NumberOfBytes = NUMBEROFBYTES;
static TCHAR szErrorMsg[128];
static TCHAR szErrorTitle[128];
static int j = 0;
static int block = 0;
//***** END OF DMA STUFF ***********//
//***** Display DDRAW STUFF ********//
static BOOL pause = FALSE;
//***** END OF Display DDRAW STUFF *******//
//***** EZW and DECODE Constants *******//
#define NUMGRP 1 // or 1 of 16 frames
#define GRPSIZE 16
int NUMFRM = 1;
//Number of frames to be displayed; 1 is 1(< than 16)
int EZWPASSES = 6; // default value
#define BITBUFSIZE (NUMGRP * NUMBEROFBYTES * (BLOCKCNT+1) * 8 + 256)
#define valuebits 11
#define top_value ((1 << valuebits) - 1) //2^11 - 1 = 2047
#define first_qtr ((top_value >> 2) + 1) //2047 >> 2 = 511 + 1 = 512
#define half (first_qtr << 1) //512 << 1 = 1024
#define third_qtr (first_qtr + half) //512 + 1024 = 1536
#define max_freq 255
#define tbls 3
#define root_sym 0
#define pos_sym 1
#define neg_sym 2
#define zero_sym 3
char filename_in[] = "C:\\Users\\charatc\\VC++\\RF";
char filename_out[] = "C:\\Users\\charatc\\VC++\\RF";
char filename_org[] = "C:\\Users\\charatc\\VC++\\org.000";
char filename_rpt[] = "C:\\Users\\charatc\\VC++\\diff.rpt";
//****** END of EZW and DECODE Constants ********//
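`BITBUFSIZE` in the constants above sizes the buffer that holds one decoded bit per array element: groups times bytes per DMA block times number of blocks times 8 bits, plus 256 spare entries. With the values in this listing that is 1 x 32768 x 4 x 8 + 256 = 1048832 entries. The sketch below reproduces the arithmetic and also illustrates why such a macro is safest when fully parenthesized. All `DEMO_` names and the three functions are hypothetical and exist only for this illustration.

```c
#define DEMO_NUMGRP 1
#define DEMO_NUMBEROFBYTES 32768
#define DEMO_BLOCKCNT 3

/* Same expression, without and with enclosing parentheses. */
#define DEMO_SIZE_RAW  DEMO_NUMGRP * DEMO_NUMBEROFBYTES * (DEMO_BLOCKCNT + 1) * 8 + 256
#define DEMO_SIZE_SAFE (DEMO_NUMGRP * DEMO_NUMBEROFBYTES * (DEMO_BLOCKCNT + 1) * 8 + 256)

/* Number of one-bit entries in the buffer. */
long bit_buffer_entries(void) { return DEMO_SIZE_SAFE; }

/* Without parentheses the multiplication binds before the "+ 256". */
long doubled_raw(void)  { return 2L * DEMO_SIZE_RAW; }

/* With parentheses the whole size is doubled, as intended. */
long doubled_safe(void) { return 2L * DEMO_SIZE_SAFE; }
```

Used only as an array bound, both forms give the same size; the difference appears as soon as the macro is embedded in a larger expression.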
static LPDIRECTDRAW lpDD; // DirectDraw object
static LPDIRECTDRAWSURFACE lpDDSPrimary; // DirectDraw primary surface
static LPDIRECTDRAWSURFACE lpDDSBack; // DirectDraw back surface
BOOL bActive; // is application active?
static HCURSOR hArrowCursor, hWaitCursor; // Mouse
// buttons to choose ezw passes
static HWND hBtnP2;
static HWND hBtnP3;
static HWND hBtnP4;
static HWND hBtnP5;
static HWND hBtnP6;
HINSTANCE hInst; // current instance
TCHAR szWindowClass[MAXLOADSTRING];
TCHAR szTitle[MAXLOADSTRING];
//static DDSURFACEDESC ddsd;
//HRESULT ddrval;
//DDSCAPS ddscaps;
short bitBuf[BITBUFSIZE];
static unsigned long int bitBufIndex, totalBits;
static long int ratio;
static int frame = 0;
static int framedisplay = 0;
static int BitsPerPixel = 8;
static int BitsPerPixel0 = 32;
static HBITMAP hbm;
static char ImageData[20][FILESIZE0];
static char ImageData0[FILESIZE];
char *framePtr;
char *erStr;
static HDC hdcImage = NULL;
char pHBuffer[HEADERFILESIZE];
char rptBuf[NUMGRP][256];
//********* EZW Routines ***********//
int imagecols, imagerows, sblvls, top_bit, image_mean;
int group_size, groups, bottom_bit, alu_bits, sign_bits_mask;
int zero_poss, ezw_passes;
int filter_args[5] = {0, 0, 0, 0, 0};
int id = 0;
int freq[tbls][2];
int places[tbls];
int code[tbls];
void check_precision(int val) {
  if (val < 0) {
    if (~val & sign_bits_mask)
      wsprintf(erStr,
               "ERROR = %d precision overflow val = %d\n", id, val);
  }
  else
    if (val & sign_bits_mask)
      wsprintf(erStr,
               "ERROR = %d precision overflow val = %d\n", id, val);
}
int fadd(int a,int b) {
  int ans = a + b;
  id = id * 10;
  check_precision(ans);
  id = id / 10;
  return ans;
}
int fsub(int a,int b) {
  int ans = a + 1 + (~b);
  id = id * 100;
  check_precision(ans);
  id = id / 100;
  return ans;
}
int faddh(int a,int b) {
  int ans = (a >> 1) + (b >> 1) + (a & b & 1);
  id = id * 10;
  check_precision(ans);
  id = id / 10;
  return ans;
}
int fsubh(int a,int b) {
  int ans = (a >> 1) + 1 + (~b >> 1);
  id = id * 100;
  check_precision(ans);
  id = id / 100;
  return ans;
}
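The helper routines above model fixed-width hardware arithmetic. `faddh` computes a half-sum without ever forming the full sum, and `fsub` subtracts by adding the two's complement. The sketch below isolates those two expressions so they can be checked on their own. The function names are hypothetical, the precision checks of the original are omitted, and the examples use non-negative operands for the shifts because right-shifting a negative `int` is implementation-defined in C.

```c
/* floor((a + b) / 2) without forming a + b: halve each operand, then add
 * back the 1 that is lost when both operands are odd. */
int avg_floor(int a, int b)
{
    return (a >> 1) + (b >> 1) + (a & b & 1);
}

/* a - b written as a + (~b + 1), the two's complement form used by fsub. */
int sub_twos(int a, int b)
{
    return a + 1 + (~b);
}
```

The half-sum form matters for large operands: averaging 2147483647 with itself through `avg_floor` stays in range, whereas computing the sum first would overflow a 32-bit `int`.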
/* 5 x 5 */
double dec_lo_pass5[] = {-0.0761, 0.3536, 0.8593, 0.3536, -0.0761};
double dec_hi_pass5[] = {-0.0761, -0.3536, 0.8593, -0.3536, -0.0761};
int enc_lo5() {
  int a;
  a = faddh(filter_args[0], filter_args[4]);
  a = fsub(a     , filter_args[2]);
  a = fsub(a >> 3, a);
  a = fadd(a >> 1, filter_args[1]);
  a = faddh(a    , filter_args[3]);
  a = faddh(a    , filter_args[2]);
  return a;
}
int enc_hi5() {
  int a;
  a = faddh(filter_args[0], filter_args[4]);
  a = fsubh(a    , filter_args[2]);
  a = fsub(a >> 3, a);
  a = fsubh(a    , filter_args[1]);
  a = fsub(a     , filter_args[3] >> 1);
  a = faddh(a    , filter_args[2]);
  return a;
}
int dec_filter_len = 5;
double *dec_lo_pass = dec_lo_pass5;
double *dec_hi_pass = dec_hi_pass5;
typedef int (*MYPROC)();
int enc_filter_len = 5;
int encrenorm[] = {3, 2, 1, 0};
MYPROC enc_lo_pass = enc_lo5;
MYPROC enc_hi_pass = enc_hi5;
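The 5-tap low-pass encoder filter above is built only from adds, subtracts and shifts. One way to see which filter it realizes is to replay the same steps on a unit impulse with real arithmetic, so that each shift becomes an exact division. The sketch below does that under the assumption that the statement order is faddh, fsub, fsub, fadd, faddh, faddh with operands x0/x4, x2, a, x1, x3, x2, which is a reading of the column-scrambled listing rather than a verified copy of the source. The function name is hypothetical and integer rounding is ignored.

```c
/* Real-valued replay of the low-pass shift-and-add sequence.
 * taps[k] receives the response to a unit impulse at input position k. */
void lo5_taps(double taps[5])
{
    int k;
    for (k = 0; k < 5; k++) {
        double x[5] = {0.0, 0.0, 0.0, 0.0, 0.0};
        double a;
        x[k] = 1.0;                 /* unit impulse at tap k   */
        a = (x[0] + x[4]) / 2.0;    /* faddh(x0, x4)           */
        a = a - x[2];               /* fsub(a, x2)             */
        a = a / 8.0 - a;            /* fsub(a >> 3, a)         */
        a = a / 2.0 + x[1];         /* fadd(a >> 1, x1)        */
        a = (a + x[3]) / 2.0;       /* faddh(a, x3)            */
        a = (a + x[2]) / 2.0;       /* faddh(a, x2)            */
        taps[k] = a;
    }
}
```

Under that reading the taps are -7/128, 1/4, 39/64, 1/4, -7/128. Multiplied by the square root of 2 they come to roughly -0.0773, 0.3536, 0.8618, which is close to the decoder coefficients -0.0761, 0.3536, 0.8593 listed with the filter.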
char *strappend(char *str1,char *str2) {
  char *result;
  result = (char *) calloc((strlen(str1) + strlen(str2) + 1), sizeof(char));
  strcpy(result, str1);
  strcat(result, str2);
  return result;
}
char *make_frm_name(char *name, int frnum) {
  char *ans,*ret;
  ans = (char *) calloc(5, sizeof(char));
  if (frnum < 10) wsprintf(ans, ".00%d", frnum);
  else if (frnum < 100) wsprintf(ans, ".0%d", frnum);
  else wsprintf(ans, ".%d", frnum);
  ret = strappend(name,ans);
  free(ans);
  return ret;
}
int **make2dintarray(int d1, int d2) {
  int **array;
  int index;
  array = (int **) calloc(d1, sizeof(int *));
  for (index = 0; index < d1; index++)
    array[index] = (int *) calloc(d2,sizeof(int));
  return array;
}
void destroy2dintarray(int **ptr, int d1, int d2) {
  int i;
  for (i=0;i<d1;i++) free(ptr[i]);
  free(ptr);
}
double **make2ddblarray(int d1, int d2) {
  double **array;
  int index;
  array = (double **) calloc(d1, sizeof(double *));
  for (index = 0; index < d1; index++)
    array[index] = (double *) calloc(d2,sizeof(double));
  return array;
}
void destroy2ddblarray(double **ptr, int d1, int d2) {
  int i;
  for (i=0;i<d1;i++) free(ptr[i]);
  free(ptr);
}
int ***make3dintarray(int d1, int d2, int d3) {
  int ***array;
  int index;
  array = (int ***) calloc(d1, sizeof(int **));
  for (index = 0; index < d1; index++)
    array[index] = make2dintarray(d2,d3);
  return array;
}
void destroy2dintarray(int ***ptr, int d1, int d2, int d3) {
  int i;
  for (i=0;i<d1;i++) destroy2dintarray(ptr[i],d2,d3);
  free(ptr);
}
double ***make3ddblarray(int d1, int d2, int d3) {
  double ***array;
  int index;
  array = (double ***) calloc(d1, sizeof(double **));
  for (index = 0; index < d1; index++)
    array[index] = make2ddblarray(d2,d3);
  return array;
}
void destroy3ddblarray(double ***ptr, int d1, int d2, int d3) {
  int i;
  for (i=0;i<d1;i++) destroy2ddblarray(ptr[i],d2,d3);
  free(ptr);
}
int mag(int arg) {
  if (arg < 0) arg = 0 - arg;
  return arg;
}
int reflect(int arg,int bound) {
arg = mag(arg);
if (arg >= bound) arg = (2 * (bound - 1)) - arg;
return arg;
}
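`reflect` above maps an out-of-range sample index back into the range 0 to bound - 1 by mirroring it about the first or the last sample, which is how the synthesis filter handles image borders. The standalone sketch below restates that rule with a few worked values. The name `reflect_index` is hypothetical, and like the original it assumes the index is at most one image width outside the valid range.

```c
/* Mirror an index about sample 0 (negative side) or about sample n - 1
 * (upper side), without repeating the border sample itself. */
int reflect_index(int i, int n)
{
    if (i < 0)
        i = -i;                 /* ... 2 1 | 0 1 2 ...                  */
    if (i >= n)
        i = 2 * (n - 1) - i;    /* ... n-2 n-1 | n-2 n-3 ...            */
    return i;
}
```

For an 8-sample row, index -2 maps to 2, index 8 maps to 6, and index 9 maps to 5, while in-range indices are returned unchanged.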
void check_p_sgn(int val, int p, int sgn) {
  if (sgn) {
    if (~val & (0 - (1 << (p - 1))))
      wsprintf(erStr,"ERROR = precision overflow val = %x\n", val); }
  else
    if (val & (0 - (1 << (p - 1))))
      wsprintf(erStr,"ERROR = precision overflow val = %x\n", val);
}
void update_model(int sym,int table) {
  int cums0, cums1, total;
  freq[table][sym]++;
  places[table] = freq[table][1] > freq[table][0];
  check_p_sgn(freq[table][0], 9, 0);
  check_p_sgn(freq[table][1], 9, 0);
  total = freq[table][0] + freq[table][1];
  check_p_sgn(total, 9, 0);
  for (cums0 = valuebits - 4; ((1 << cums0) & total) == 0; cums0--);
  for (cums1 = valuebits - 4;
       ((1 << cums1) & freq[table][1 - places[table]]) == 0;
       cums1--);
  check_p_sgn(cums0, 4, 0);
  check_p_sgn(cums1, 4, 0);
  code[table] = cums0 - cums1;
  if (code[table] < 1) code[table] = 1;
  check_p_sgn(code[table], 4, 0);
  if (total == max_freq) {
    freq[table][0] = (freq[table][0] >> 1) | 1;
    freq[table][1] = (freq[table][1] >> 1) | 1;
  }
}
void init_arith_model() {
  int j;
  for (j = 0; j < tbls; j++) {
    code[j] = 1;
    places[j] = 0;
    freq[j][0] = 1;
    freq[j][1] = 1;
  }
}
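The model update routine above reduces each probability to a shift count: it finds the position of the highest set bit of the total count and of the less frequent symbol's count, and stores their difference, clamped to at least 1, in `code[table]`. The probability of the less frequent symbol is therefore approximated by a power of two. The sketch below restates that computation in isolation. The function names are hypothetical, and it assumes both counts are at least 1, as they are after the model is initialized.

```c
/* Position of the highest set bit of v (v >= 1). */
static int floor_log2(int v)
{
    int n = 0;
    while (v > 1) {
        v >>= 1;
        n++;
    }
    return n;
}

/* Shift count s such that 2^-s approximates the probability of the less
 * frequent of the two symbols; never smaller than 1. */
int code_shift(int freq0, int freq1)
{
    int total = freq0 + freq1;
    int lps = (freq0 < freq1) ? freq0 : freq1;
    int s = floor_log2(total) - floor_log2(lps);
    return (s < 1) ? 1 : s;
}
```

With counts 100 and 3, for example, the total 103 has its highest bit at position 6 and the smaller count 3 at position 1, giving a shift of 5, that is, a probability estimate of 1/32.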
void flag_dscndnts(int col,int row,int lvl,int **part_of_tree) {
  if (lvl > 0) {
    part_of_tree[col][row] = 1;
    flag_dscndnts((col << 1)    ,(row << 1)    ,lvl - 1,part_of_tree);
    flag_dscndnts((col << 1) + 1,(row << 1)    ,lvl - 1,part_of_tree);
    flag_dscndnts((col << 1)    ,(row << 1) + 1,lvl - 1,part_of_tree);
    flag_dscndnts((col << 1) + 1,(row << 1) + 1,lvl - 1,part_of_tree);
  }
}
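The descendant-flagging routine above walks the quadtree that links a wavelet coefficient at (col, row) to its four children at (2col, 2row), (2col+1, 2row), (2col, 2row+1) and (2col+1, 2row+1). Each call marks the node itself and recurses with one level less, so a call with level L marks 1 + 4 + ... + 4^(L-1) positions. The sketch below repeats the recursion on a small fixed grid and counts the marked cells. The grid size and all names are hypothetical, and the caller is assumed to pick a start position whose descendants stay inside the grid.

```c
#include <string.h>

#define GRID 16
static int part_of_tree_demo[GRID][GRID];

/* Same recursion as the listing, on a fixed demo grid. */
static void flag_demo(int col, int row, int lvl)
{
    if (lvl > 0) {
        part_of_tree_demo[col][row] = 1;
        flag_demo((col << 1),     (row << 1),     lvl - 1);
        flag_demo((col << 1) + 1, (row << 1),     lvl - 1);
        flag_demo((col << 1),     (row << 1) + 1, lvl - 1);
        flag_demo((col << 1) + 1, (row << 1) + 1, lvl - 1);
    }
}

/* Clear the grid, flag one tree, and return how many cells were marked. */
int count_flagged(int col, int row, int lvl)
{
    int c, r, n = 0;
    memset(part_of_tree_demo, 0, sizeof part_of_tree_demo);
    flag_demo(col, row, lvl);
    for (c = 0; c < GRID; c++)
        for (r = 0; r < GRID; r++)
            n += part_of_tree_demo[c][r];
    return n;
}
```

Starting at (3, 1) with three levels marks 1 + 4 + 16 = 21 cells; with one level only the starting cell is marked.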
//************* END of EZW Routines ************//
//************* DECODE Routines ***************//
FILE *inFile,*outFile;
double ***image_syn;
int low, high;
int value;
int **found, **mags, **signs, **prev_frame;
int even(int arg) {
return 1 - (arg & 1);
}
void synthesize_image(int lvl) {
  int row, col, coef, index, cols, rows;
  cols = imagecols >> (lvl - 1);
  rows = imagerows >> (lvl - 1);
  /* Vertical pass: plane 0 holds the subbands, plane 1 receives the result */
  for (row = 0; row < rows; row++)
    for (col = 0; col < cols; col++) {
      image_syn[col][row][1] = 0;
      for (coef = 0; coef < dec_filter_len; coef++) {
        index = reflect(row + (dec_filter_len / 2) - coef, rows);
        if (even(index))
          image_syn[col][row][1] +=
            dec_lo_pass[coef] * image_syn[col][index / 2][0];
        index = reflect(row + (dec_filter_len / 2) - coef - 1, rows);
        if (even(index))
          image_syn[col][row][1] +=
            dec_hi_pass[coef] * image_syn[col][(index / 2) + (rows / 2)][0];
      }
    }
  /* Horizontal pass: plane 1 is filtered back into plane 0 */
  for (row = 0; row < rows; row++)
    for (col = 0; col < cols; col++) {
      image_syn[col][row][0] = 0;
      for (coef = 0; coef < dec_filter_len; coef++) {
        index = reflect(col + (dec_filter_len / 2) - coef, cols);
        if (even(index))
          image_syn[col][row][0] +=
            dec_lo_pass[coef] * image_syn[index / 2][row][1];
        index = reflect(col + (dec_filter_len / 2) - coef - 1, cols);
        if (even(index))
          image_syn[col][row][0] +=
            dec_hi_pass[coef] * image_syn[(index / 2) + (cols / 2)][row][1];
      }
    }
}
int input_bit() {
  short next_bit = bitBuf[bitBufIndex];
  bitBufIndex++;
  if (next_bit == 1) return 1; else return 0;
}
void init_arith_decode() {
  int i;
  value = 0;
  for (i = 0; i < valuebits; i++)
    value = (value << 1) + input_bit();
}
int decode_sym(int table) {
  int range = high - low + 1;
  int neg_size = (- range) >> code[table];
  int trunc = ((- range) & ((1 << code[table]) - 1)) > 0;
  int place = (1 << (8 - code[table])) >
              ((((value - low + !trunc) << 8) - 1) / range);
  int sym = places[table] ^ place;
  if (place) high = low + ~neg_size;
  else low = low + ~neg_size + 1;
  for (; 1;)
  {
    if (high < half) { }
    else if (low >= half) {
      value -= half;
      low -= half;
      high -= half;
    }
    else if ((low >= first_qtr) && (high < third_qtr)) {
      value -= first_qtr;
      low -= first_qtr;
      high -= first_qtr;
    }
    else break;
    low = 2 * low;
    high = (2 * high) + 1;
    value = (2 * value) + input_bit();
  }
  if (table) update_model(sym, table);
  return sym;
}
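In the symbol decoder above, the current interval from `low` to `high` is split using only a shift. Because `neg_size` is the negated range shifted right, `~neg_size` equals the range divided by 2^code[table], rounded up, minus 1. The lower part of the interval therefore has a size of ceil(range / 2^shift), and the remainder forms the upper part. The sketch below restates that split with non-negative arithmetic only, which avoids relying on how a compiler shifts negative values. The `Interval` type and the function name are hypothetical.

```c
typedef struct {
    int low;
    int high;
} Interval;

/* Narrow iv to its lower part (in_lower != 0) or its upper part.
 * The lower part has ceil(range / 2^shift) values. */
Interval narrow(Interval iv, int shift, int in_lower)
{
    int range = iv.high - iv.low + 1;
    int lower_size = (range + (1 << shift) - 1) >> shift;  /* ceiling division */

    if (in_lower)
        iv.high = iv.low + lower_size - 1;
    else
        iv.low = iv.low + lower_size;
    return iv;
}
```

For the full 11-bit interval 0 to 2047 and a shift of 1, the lower part is 0 to 1023. For a range of 5 values and a shift of 1, the lower part gets 3 values and the upper part 2.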
int decode_dom_sym() {
  int symi = 0;
  if (zero_poss) symi = decode_sym(1);
  if (symi)
    return zero_sym;
  if (decode_sym(2))
    if (decode_sym(0)) return neg_sym;
    else return pos_sym;
  return root_sym;
}
int decode_sub_bit() {
  return decode_sym(0);
}
void decode_pass(int bit) {
  int row, col, sym, x, y, rowy, colx;
  int **part_of_tree = make2dintarray(imagecols,imagerows);
  int rows = imagerows >> sblvls;
  int cols = imagecols >> sblvls;
  int sub_bit = bit >> 1;
  init_arith_model();
  zero_poss = 1;
  for (row = 0; row < rows; row++)
    for (col = 0; col < cols; col++) {
      if (found[col][row] == 0) {
        sym = decode_dom_sym();
        if (sym == pos_sym) {found[col][row] = 1;
          mags[col][row] = bit; }
        else if (sym == neg_sym) {found[col][row] = 1;
          signs[col][row] = 1; mags[col][row] = bit; }
        else if (sym == root_sym) {
          flag_dscndnts(col + cols,row       ,sblvls,part_of_tree);
          flag_dscndnts(col       ,row + rows,sblvls,part_of_tree);
          flag_dscndnts(col + cols,row + rows,sblvls,part_of_tree); }
      }
      if (sub_bit && found[col][row])
        if (decode_sub_bit()) mags[col][row] = mags[col][row] | sub_bit;
    }
  init_arith_model();
  if (bit == (1 << (alu_bits - 2 - encrenorm[2] - bottom_bit)))
    zero_poss = 0;
  for (row = 0; row < rows; row++) {
    for (col = 0; col < cols; col++) {
      if ((found[col + cols][row] == 0) &&
          (part_of_tree[col + cols][row] == 0)) {
        sym = decode_dom_sym();
        if (sym == pos_sym) {found[col + cols][row] = 1;
          mags[col + cols][row] = bit; }
        else if (sym == neg_sym) {found[col + cols][row] = 1;
          signs[col + cols][row] = 1;
          mags[col + cols][row] = bit; }
        else if (sym == root_sym)
          flag_dscndnts(col + cols,row,sblvls,part_of_tree);
      }
      if (sub_bit && found[col + cols][row])
        if (decode_sub_bit())
          mags[col + cols][row] = mags[col + cols][row] | sub_bit;
    }
    for (col = 0; col < cols; col++) {
      if ((found[col][row + rows] == 0) &&
          (part_of_tree[col][row + rows] == 0)) {
        sym = decode_dom_sym();
        if (sym == pos_sym) {found[col][row + rows] = 1;
          mags[col][row + rows] = bit; }
        else if (sym == neg_sym) {found[col][row + rows] = 1;
          signs[col][row + rows] = 1;
          mags[col][row + rows] = bit; }
        else if (sym == root_sym)
          flag_dscndnts(col,row + rows,sblvls,part_of_tree);
      }
      if (sub_bit && found[col][row + rows])
        if (decode_sub_bit())
          mags[col][row + rows] = mags[col][row + rows] | sub_bit;
      if ((found[col + cols][row + rows] == 0) &&
          (part_of_tree[col + cols][row + rows] == 0)) {
        sym = decode_dom_sym();
        if (sym == pos_sym) {found[col + cols][row + rows] = 1;
          mags[col + cols][row + rows] = bit; }
        else if (sym == neg_sym) {found[col + cols][row + rows] = 1;
          signs[col + cols][row + rows] = 1;
          mags[col + cols][row + rows] = bit; }
        else if (sym == root_sym)
          flag_dscndnts(col + cols,row + rows,sblvls,part_of_tree);
      }
      if (sub_bit && found[col + cols][row + rows])
        if (decode_sub_bit())
          mags[col + cols][row + rows] =
            mags[col + cols][row + rows] | sub_bit;
    }
  }
  init_arith_model();
  if (bit == (1 << (alu_bits - 2 - encrenorm[1] - bottom_bit)))
    zero_poss = 0;
  if (bit < (1 << (alu_bits - 1 - encrenorm[1] - bottom_bit))) {
    int rows = imagerows >> 2;
    int cols = imagecols >> 2;
    for (row = 0; row < rows; row++)
      for (col = cols; col < (cols << 1); col++) {
        if ((found[col][row] == 0) && (part_of_tree[col][row] == 0)) {
          sym = decode_dom_sym();
          if (sym == pos_sym) {found[col][row] = 1;
            mags[col][row] = bit; }
          else if (sym == neg_sym) {found[col][row] = 1;
            signs[col][row] = 1; mags[col][row] = bit; }
          else if (sym == root_sym)
            flag_dscndnts(col,row,2,part_of_tree);
        }
        if (sub_bit && found[col][row])
          if (decode_sub_bit())
            mags[col][row] = mags[col][row] | sub_bit;
      }
    for (row = rows; row < (rows << 1); row++)
      for (col = 0; col < cols; col++) {
        if ((found[col][row] == 0) && (part_of_tree[col][row] == 0)) {
          sym = decode_dom_sym();
          if (sym == pos_sym)
            {found[col][row] = 1; mags[col][row] = bit; }
          else if (sym == neg_sym) {found[col][row] = 1;
            signs[col][row] = 1; mags[col][row] = bit; }
          else if (sym == root_sym)
            flag_dscndnts(col,row,2,part_of_tree);
        }
        if (sub_bit && found[col][row])
          if (decode_sub_bit())
            mags[col][row] = mags[col][row] | sub_bit;
      }
    for (row = rows; row < (rows << 1); row++)
      for (col = cols; col < (cols << 1); col++) {
        if ((found[col][row] == 0) && (part_of_tree[col][row] == 0)) {
          sym = decode_dom_sym();
          if (sym == pos_sym) {found[col][row] = 1;
            mags[col][row] = bit; }
          else if (sym == neg_sym) {found[col][row] = 1;
            signs[col][row] = 1; mags[col][row] = bit; }
          else if (sym == root_sym)
            flag_dscndnts(col,row,2,part_of_tree);
        }
        if (sub_bit && found[col][row])
          if (decode_sub_bit())
            mags[col][row] = mags[col][row] | sub_bit;
      }
  }
  init_arith_model();
  zero_poss = 0;
  if (bit < (1 << (alu_bits - 1 - encrenorm[0] - bottom_bit))) {
    int rows = imagerows >> 1;
    int cols = imagecols >> 1;
    for (y = 0; y < 2; y++)
      for (x = 0; x < 2; x++)
        for (rowy = 0; rowy < rows; rowy += 2)
          for (colx = cols; colx < (cols << 1); colx += 2) {
            row = rowy + y;
            col = colx + x;
            if ((found[col][row] == 0) && (part_of_tree[col][row] == 0)) {
              sym = decode_dom_sym();
              if (sym == pos_sym)
                {found[col][row] = 1; mags[col][row] = bit; }
              else if (sym == neg_sym)
                {found[col][row] = 1;
                 signs[col][row] = 1;
                 mags[col][row] = bit; }
              else if (sym == root_sym)
                flag_dscndnts(col,row,1,part_of_tree);
            }
            if (sub_bit && found[col][row])
              if (decode_sub_bit())
                mags[col][row] = mags[col][row] | sub_bit;
          }
    for (y = 0; y < 2; y++)
      for (x = 0; x < 2; x++)
        for (rowy = rows; rowy < (rows << 1); rowy += 2)
          for (colx = 0; colx < cols; colx += 2) {
            row = rowy + y;
            col = colx + x;
            if ((found[col][row] == 0) && (part_of_tree[col][row] == 0)) {
              sym = decode_dom_sym();
              if (sym == pos_sym)
                {found[col][row] = 1; mags[col][row] = bit; }
              else if (sym == neg_sym)
                {found[col][row] = 1;
                 signs[col][row] = 1;
                 mags[col][row] = bit; }
              else if (sym == root_sym)
                flag_dscndnts(col,row,1,part_of_tree);
            }
            if (sub_bit && found[col][row])
              if (decode_sub_bit())
                mags[col][row] = mags[col][row] | sub_bit;
          }
    for (y = 0; y < 2; y++)
      for (x = 0; x < 2; x++)
        for (rowy = rows; rowy < (rows << 1); rowy += 2)
          for (colx = cols; colx < (cols << 1); colx += 2) {
            row = rowy + y;
            col = colx + x;
            if ((found[col][row] == 0) && (part_of_tree[col][row] == 0)) {
              sym = decode_dom_sym();
              if (sym == pos_sym)
                {found[col][row] = 1; mags[col][row] = bit; }
              else if (sym == neg_sym)
                {found[col][row] = 1;
                 signs[col][row] = 1; mags[col][row] = bit; }
              else if (sym == root_sym)
                flag_dscndnts(col,row,1,part_of_tree);
            }
            if (sub_bit && found[col][row])
              if (decode_sub_bit())
                mags[col][row] = mags[col][row] | sub_bit;
          }
  }
  destroy2dintarray(part_of_tree,imagecols,imagerows);
}
void dumpimage(int frm) {
  unsigned char temp_char;
  int row, col;
  for (row = 0; row < imagerows; row++)
    for (col = 0; col < imagecols; col++) {
      image_syn[col][row][0] += image_mean;
      temp_char = (char) image_syn[col][row][0];
      if (image_syn[col][row][0] < 0) temp_char = 0;
      if (image_syn[col][row][0] > 255) temp_char = 255;
      ImageData[frm][((row*imagecols)+col)*4  ] = temp_char;
      ImageData[frm][((row*imagecols)+col)*4+1] = temp_char;
      ImageData[frm][((row*imagecols)+col)*4+2] = temp_char;
    }
}
//************* END of DECODE Routines *********//
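The clamping and gray-level replication done by dumpimage can be looked at in isolation. The sketch below is not part of the thesis code: the helper names `clamp_to_byte` and `put_gray_pixel` are hypothetical, and it assumes, as the listing suggests, a 4-byte-per-pixel frame buffer in which only the first three bytes of each pixel are written.

```c
#include <stdint.h>

/* Clamp a reconstructed sample (frame mean already added) to 0..255.
 * Hypothetical helper; mirrors the two range checks in dumpimage. */
static uint8_t clamp_to_byte(double sample)
{
    if (sample < 0.0)   return 0;
    if (sample > 255.0) return 255;
    return (uint8_t)sample;          /* truncates toward zero */
}

/* Write one gray pixel into a 4-byte-per-pixel buffer.
 * The fourth byte of the pixel is left untouched, as in the listing. */
static void put_gray_pixel(uint8_t *frame, int cols, int row, int col,
                           uint8_t gray)
{
    uint8_t *p = frame + ((row * cols) + col) * 4;
    p[0] = gray;
    p[1] = gray;
    p[2] = gray;
}
```

Checking the range before the narrowing cast, as done here, avoids converting an out-of-range double to an 8-bit type at all.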
//************* DMA Routines *******************//
BOOL OpenWinRT(VOID)
{
  hWinRT = WinRTOpenDevice(0, FALSE); //open device 0, no sharing
  if (hWinRT == INVALID_HANDLE_VALUE)
  {
    wsprintf(szErrorTitle,"ERROR");
    wsprintf(szErrorMsg,"Can't Start HWinRT Driver");
    return(FALSE);
  }
  return(TRUE);
}
BOOL PointEcpOut(VOID)
{
//%%        // start up the preprocessor
//
//  #SetSize 8
//  #SetAbsolute On
//  #OnError pointout_trap
//
//  DimB cTemp;
//
//  cTemp = 0x034;           // mode 001, disable int, dma, serviceint
//  outp(LptEcr,cTemp);
//
//  cTemp = inp(LptDcr);     // get device control reg
//  cTemp = cTemp & 0x00C4;  // bring bits 5,4,3,1,0 low
//  cTemp = cTemp | 0x0004;  // and bit 2 high. This sets direction (bit 5) to OUT,
//                           // disables nAck int (bit 4),
//                           // brings nSelectin (1284mode) high (bit 3)
//                           // brings nInit (nReverseRequest) high (bit 2)
//                           // brings nAutoFd (nCmd/Data) high (bit 1)
//                           // brings nStrobe high (bit 0)
//                           // Note bits 0, 1, and 3 are inverted.
//  outp(LptDcr,cTemp);
//
//  cTemp = 0x0074;          // go to mode 011 (ECP)
//  outp(LptEcr,cTemp);
//%%
  {
    WINRT_CONTROL_ITEM _WinRTpp01[] =
    {// command param1 param2
      {DIM,0x10040001,0x0074},// 0x00 constant
      {DIM,0x10040001,0x0004},// 0x01 constant
      {DIM,0x10040001,0x00C4},// 0x02 constant
      {DIM,0x10040001,0x0034},// 0x03 constant
      {DIM,0x00010001, 0x0},// 0x04 cTemp
      {MATH,0x000D0004,0x00030000},
      {MATH,0x000D0004,0x00042000},
      {OUTPBA,0,0},
      {INPBA,0,0},
      {MATH,0x000D0004,0x00044000},
      {MATH,0x00080004,0x00040002},
      {MATH,0x00090004,0x00040001},
      {MATH,0x000D0004,0x00042000},
      {OUTPBA,0,0},
      {MATH,0x000D0004,0x00000000},
      {MATH,0x000D0004,0x00042000},
      {OUTPBA,0,0},
    };
    _WinRTpp01[ 4].value = (ULONG)cTemp;
    _WinRTpp01[ 7].port = LptEcr;
    _WinRTpp01[ 8].port = LptDcr;
    _WinRTpp01[13].port = LptDcr;
    _WinRTpp01[16].port = LptEcr;
    if (!WinRTProcessIoBufferDirect(hWinRT, _WinRTpp01,
                                    sizeof(_WinRTpp01), &iWinRTlength))
      goto pointout_trap;
    cTemp = (UCHAR)_WinRTpp01[ 4].value;
//
  }
  return(TRUE);
pointout_trap:
  wsprintf(szErrorTitle,"ERROR");
  wsprintf(szErrorMsg,"HWinRT error in PointEcpOut");
  WinRTCloseDevice(hWinRT);
  return(FALSE);
}
BOOL PointEcpIn(VOID)
{
// need to set direction to 0, strobe to 0, autoFD to 0,
// mode to 011 (ECP mode)
// switch directions by first switching to mode 001,
// negotiating for the channel, and setting
// mode back to 011
//%%        // start up the preprocessor
//
//  #SetSize 8
//  #SetAbsolute On
//  #OnError pointin_trap
//
//  DimB cStat;
//
//  cStat = 0x034;           // mode 001, disable int, dma, serviceint
//  outp(LptEcr,cStat);
//
//  cStat = inp(LptDcr);     // get device control reg
//  cStat = cStat & 0x00C4;  // bring bits 5,4,3,1,0 low
//  cStat = cStat | 0x0004;  // and bit 2 high. This sets direction (bit 5) to OUT,
//                           // disables nAck int (bit 4),
//                           // brings nSelectin (1284mode) high (bit 3)
//                           // brings nInit (nReverseRequest) high (bit 2)
//                           // brings nAutoFd (nCmd/Data) high (bit 1)
//                           // brings nStrobe high (bit 0)
//                           // Note bits 0, 1, and 3 are inverted.
//  outp(LptDcr,cStat);
//
//  // we're in the default mode. Now switch the direction to in
//
//  cStat = cStat | 0x0020;  // bring bit 5 (dir) high
//  outp(LptDcr,cStat);
//
//  cStat = cStat & 0x00FB;  // bring bit 2 (nInit, nReverseRequest) low
//  outp(LptDcr,cStat);
//
//  cStat = 0x0074;          // go to mode 011 (ECP)
//  outp(LptEcr,cStat);
//%%
  {
    WINRT_CONTROL_ITEM _WinRTpp02[] =
    {// command param1 param2
      {DIM,0x10040001,0x0074},// 0x00 constant
      {DIM,0x10040001,0x00FB},// 0x01 constant
      {DIM,0x10040001,0x0020},// 0x02 constant
      {DIM,0x10040001,0x0004},// 0x03 constant
      {DIM,0x10040001,0x00C4},// 0x04 constant
      {DIM,0x10040001,0x0034},// 0x05 constant
      {DIM,0x00010001, 0x0},// 0x06 cStat
      {MATH,0x000D0006,0x00050000},
      {MATH,0x000D0006,0x00062000},
      {OUTPBA,0,0},
      {INPBA,0,0},
      {MATH,0x000D0006,0x00064000},
      {MATH,0x00080006,0x00060004},
      {MATH,0x00090006,0x00060003},
      {MATH,0x000D0006,0x00062000},
      {OUTPBA,0,0},
      {MATH,0x00090006,0x00060002},
      {MATH,0x000D0006,0x00062000},
      {OUTPBA,0,0},
      {MATH,0x00080006,0x00060001},
      {MATH,0x000D0006,0x00062000},
      {OUTPBA,0,0},
      {MATH,0x000D0006,0x00000000},
      {MATH,0x000D0006,0x00062000},
      {OUTPBA,0,0},
    };
    _WinRTpp02[ 6].value = (ULONG)cStat;
    _WinRTpp02[ 9].port = LptEcr;
    _WinRTpp02[10].port = LptDcr;
    _WinRTpp02[15].port = LptDcr;
    _WinRTpp02[18].port = LptDcr;
    _WinRTpp02[21].port = LptDcr;
    _WinRTpp02[24].port = LptEcr;
    if (!WinRTProcessIoBufferDirect(hWinRT, _WinRTpp02,
                                    sizeof(_WinRTpp02), &iWinRTlength))
      goto pointin_trap;
    cStat = (UCHAR)_WinRTpp02[ 6].value;
//
  }
  return(TRUE);
pointin_trap:
  wsprintf(szErrorTitle,"ERROR");
  wsprintf(szErrorMsg,"HWinRT error in PointEcpIn");
  WinRTCloseDevice(hWinRT);
  return(FALSE);
}
BOOL StartDmaIn(VOID)
{
//  LowAdd=(BYTE)(SramAddress);
//  MidAdd=(BYTE)(SramAddress>>8);
//  HighAdd=(BYTE)(SramAddress>>16);
  if(OpenWinRT()==FALSE)
  {
    return(FALSE);
  }
//can I assume ECP is already pointing out?
//if(PointEcpOut()==FALSE)
//  return(FALSE);
//%%        // start up the preprocessor
//
//  #SetSize 8
//  #SetAbsolute On
//  #OnError StartDmaIn_trap
//
//  DimB LowAdd;
//  DimB MidAdd;
//  DimB HighAdd;
//
//  outp(EcpAFifo,0x030);    // write SRAM low address to HOSTCMD
//  outp(EcpDFifo,LowAdd);
//  outp(EcpAFifo,0x034);    // write SRAM mid address to HOSTCMD
//  outp(EcpDFifo,MidAdd);
//  outp(EcpAFifo,0x038);    // write SRAM high address to HOSTCMD
//  outp(EcpDFifo,HighAdd);
//  outp(EcpAFifo,0x040);    // init HOSTCMD with SRAMWR
//
  if(PointEcpIn()==FALSE)
  {
    return(FALSE);
  }
  // prepare the WinRT DMA common buffer and
  // get the DMA buffer information
  if (!WinRTSetupDmaBuffer(hWinRT, &DmaInformation, &Length))
  {
    wsprintf(szErrorTitle,"ERROR");
    wsprintf(szErrorMsg,"WinRTSetupDmaBuffer failed");
    WinRTCloseDevice(hWinRT);
    return(FALSE);
  }
  if (DmaInformation.Length < DMASIZE)
  {
    wsprintf(szErrorTitle,"ERROR");
    wsprintf(szErrorMsg,
             "WinRT can't allocate enough buffer memory");
    WinRTFreeDmaBuffer(hWinRT, &DmaInformation, &Length);
    WinRTCloseDevice(hWinRT);
    return(FALSE);
  }
//%%        // start up the preprocessor
//  #SetSize 8
//  #SetAbsolute On
//  #OnError StartDmaIn_trap
//
//  // start the printer port dma
//
//  DimB cStat;
//  DimW NumberOfBytes;
//
//  cStat = 0x07C;           // set bit 3 to enable DMA
//  outp(LptEcr,cStat);
//
//  cStat = 0x078;           // clear bit 2 to start DMA
//  outp(LptEcr,cStat);
//
//  // start the DMA
//  DmaStart(FALSE,NumberOfBytes);   // start DMA in
//%%
  {
    WINRT_CONTROL_ITEM _WinRTpp03[] =
    {// command param1 param2
      {DIM,0x10040001,0x0078},// 0x00 constant
      {DIM,0x10040001,0x007C},// 0x01 constant
      {DIM,0x00010001, 0x0},// 0x02 cStat
      {DIM,0x00020001, 0x0},// 0x03 NumberOfBytes
      {MATH,0x000D0002,0x00010000},
      {MATH,0x000D0002,0x00022000},
      {OUTPBA,0,0},
      {MATH,0x000D0002,0x00000000},
      {MATH,0x000D0002,0x00022000},
      {OUTPBA,0,0},
      {MATH,0x000D0003,0x00032000},
      {DMASTART,0x0,0x0},
    };
    _WinRTpp03[ 2].value = (ULONG)cStat;
    _WinRTpp03[ 3].value = (ULONG)NumberOfBytes;
    _WinRTpp03[ 6].port = LptEcr;
    _WinRTpp03[ 9].port = LptEcr;
    if (!WinRTProcessDmaBufferDirect(hWinRT, _WinRTpp03,
                                     sizeof(_WinRTpp03), &iWinRTlength))
      goto StartDmaIn_trap;
    cStat = (UCHAR)_WinRTpp03[ 2].value;
    NumberOfBytes = (USHORT)_WinRTpp03[ 3].value;
//
  }
  WinRTCloseDevice(hWinRT);
  return (TRUE);
StartDmaIn_trap:
  wsprintf(szErrorTitle,"ERROR");
  wsprintf(szErrorMsg,"WinRT driver failure in StartDmaIn");
  WinRTCloseDevice(hWinRT);
  return(FALSE);
}
BOOL FinishDmaIn (LPSTR lpDmaDest,int a)
{
  BOOL ReturnValue = TRUE;
  ULONG TimeoutTime;
  if(OpenWinRT()==FALSE)
  {
    return (FALSE);
  }
  TimeoutTime=GetTickCount()+1000;   // 1 second DMA timeout
  //wait for LptEcr bit 2 to go high again, signalling DMA done
  while((cStat & 0x0004) == 0)
  {
    if(GetTickCount()>TimeoutTime)
    {
      wsprintf(szErrorTitle,"ERROR");
      wsprintf(szErrorMsg,"DMA timed out:%d",a);
      ReturnValue=FALSE;
      goto PastDmaInWait;
    }
//%%        // start up the preprocessor
//  #SetSize 8
//  #SetAbsolute On
//  DimB cStat;
//
//  cStat = inp(LptEcr);
//%%
    {
      WINRT_CONTROL_ITEM _WinRTpp04[] =
      {// command param1 param2
        {DIM,0x00010001, 0x0},// 0x00 cStat
        {INPBA,0,0},
        {MATH,0x000D0000,0x00004000},
      };
      _WinRTpp04[ 0].value = (ULONG)cStat;
      _WinRTpp04[ 1].port = LptEcr;
      (void) WinRTProcessIoBuffer(hWinRT, _WinRTpp04,
                                  sizeof(_WinRTpp04),
                                  &iWinRTlength);
      cStat = (UCHAR)_WinRTpp04[ 0].value;
//
    }
  }
PastDmaInWait:
//%%        // start up the preprocessor
//  #SetSize 8
//  #SetAbsolute On
//
//  DmaFlush();
  {
    WINRT_CONTROL_ITEM _WinRTpp05[] =
    {// command param1 param2
      {DMAFLUSH,0x0,0x0},
    };
    (void) WinRTProcessIoBuffer(hWinRT, _WinRTpp05,
                                sizeof(_WinRTpp05), &iWinRTlength);
//
  }
  //take a look at the buffer memory.
  memcpy(lpDmaDest,DmaInformation.pVirtualAddress,
         NumberOfBytes);
  // release the DMA buffer back to the system
  if (!WinRTFreeDmaBuffer(hWinRT, &DmaInformation, &Length))
  {
    wsprintf(szErrorTitle,"ERROR");
    wsprintf(szErrorMsg,"WinRTFreeDmaBuffer failed");
    WinRTCloseDevice(hWinRT);
    return(FALSE);
  }
  WinRTCloseDevice(hWinRT);
  return(ReturnValue);
}
//************* END OF DMA Routines ************//
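The register arithmetic that PointEcpOut and PointEcpIn describe in their comments can be checked without touching hardware. The following is only a sketch under stated assumptions: the function and macro names are hypothetical, the masks 0xC4, 0x04, 0x20 and 0xFB and the ECR values 0x34 and 0x74 are taken from the listing, and the mode is assumed to occupy the top three bits of the extended control register, as the "mode 001" and "mode 011" comments imply.

```c
#include <stdint.h>

#define DCR_KEEP_MASK 0xC4u  /* keeps bits 7,6,2; clears bits 5,4,3,1,0 */
#define DCR_NINIT     0x04u  /* bit 2: nInit (nReverseRequest)          */
#define DCR_DIRECTION 0x20u  /* bit 5: 0 = output, 1 = input            */

/* Device control register value for the forward (host-to-board) direction. */
static uint8_t dcr_forward(uint8_t dcr)
{
    return (uint8_t)((dcr & DCR_KEEP_MASK) | DCR_NINIT);
}

/* Device control register value for the reverse (board-to-host) direction:
 * start from the forward state, raise the direction bit, then drop nInit. */
static uint8_t dcr_reverse(uint8_t dcr)
{
    uint8_t v = dcr_forward(dcr);
    v |= DCR_DIRECTION;
    v &= (uint8_t)~DCR_NINIT;   /* same effect as "& 0xFB" in the listing */
    return v;
}

/* Mode field of the extended control register (bits 7..5). */
static unsigned ecr_mode(uint8_t ecr)
{
    return (unsigned)(ecr >> 5);
}
```

With these helpers, 0x34 decodes to mode 1 and 0x74 to mode 3, which matches the two mode switches that bracket every direction change in the listing.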
/*
 * finiObjects
 *
 * finished with all objects we use; release them
 */
static void finiObjects( void )
{
  if( lpDD != NULL )
  {
    if( lpDDSPrimary != NULL )
    {
      lpDDSPrimary->Release();
      lpDDSPrimary = NULL;
    }
    lpDD->Release();
    lpDD = NULL;
  }
} /* finiObjects */
//***This routine compares the source
//***file and the output file from the decoder***
static BOOL bitDiff(HDC hdc) {
  int grp,found;
  FILE *ORG,*DEST,*RPT;
  char o,d;
  char *tmpStr;
  char buf[256];
  unsigned long i=0;
  int rptPos = 0;
  ZeroMemory(rptBuf,sizeof(rptBuf));
  //wsprintf(buf,"Bit diff .....");
  //TextOut(hdc,local_x+50,local_y+50,buf,lstrlen(buf));
  if ( (ORG = fopen(filename_org,"r")) == NULL ) {
    //wsprintf(buf,"Cannot open org file to compare");
    //TextOut(hdc,local_x+400,local_y+50,buf,lstrlen(buf));
    return FALSE;
  }
  //if ((RPT = fopen(filename_rpt,"a+")) == NULL) {
  //  wsprintf(buf,"Cannot open report file");
  //  TextOut(hdc,400,0,buf,lstrlen(buf));
  //  return FALSE;
  //}
  for (grp=0;grp<groups;grp++) {
    tmpStr = make_frm_name(filename_out,grp);
    if ((DEST = fopen(tmpStr,"r")) == NULL) {
      free(tmpStr);
      return FALSE;
    }
    //****** Compare routine *******//
    free(tmpStr);
    i = 0;
    fseek(ORG,0L,SEEK_SET);
    found = 0;
    while (feof(ORG) == 0) {
      o = fgetc(ORG);
      d = fgetc(DEST);
      if ((o != '0') && (o != '1'))
        break;
      //wsprintf(buf,"o:%c d:%c",o,d);
      //TextOut(hdc,400,rptPos,buf,lstrlen(buf));
      if (o != d) {
        found = 1;
        wsprintf(rptBuf[grp],
                 "Diff at:%d",i);
        TextOut(hdc,400,rptPos,rptBuf[grp],
                lstrlen(rptBuf[grp]));
        rptPos = rptPos + 20;
        break;
      }
      i++;
    }
    if (found == 0) {
      wsprintf(rptBuf[grp],
               "No diff(%d bits)",i);
      TextOut(hdc,400,rptPos,rptBuf[grp],
              lstrlen(rptBuf[grp]));
      rptPos = rptPos + 20;
    }
  }
  fclose(DEST);
  fclose(ORG);
  return TRUE;
}
//This routine reads 2 blocks of data group(2 of 4 DMA transfers)
//or 2 of 16 frames and decode it
static BOOL readDMAandDecode(HWND hwnd,HDC hdc) {
  HANDLE hSrc;
  DWORD dwRead;
  char buf[256];
  char cBuf[1];
  short ch;
  int i,j,grp,row,col,bit,lvl;
  long local_x, local_y;
  imagecols = 128;
  imagerows = 128;
  alu_bits = 12;
  sblvls = 3;
  group_size = GRPSIZE;
  groups = NUMGRP;
  ezw_passes = EZWPASSES;
  bottom_bit = 8 - ezw_passes;
  image_syn = make3ddblarray(imagecols,imagerows,2);
  signs = make2dintarray(imagecols,imagerows);
  mags = make2dintarray(imagecols,imagerows);
  found = make2dintarray(imagecols,imagerows);
  prev_frame = make2dintarray(imagecols,imagerows);
  ZeroMemory (ImageData, sizeof (ImageData));
  ZeroMemory (ImageDataO, sizeof (ImageDataO));
  ZeroMemory (bitBuf, sizeof(bitBuf));
  SetBkColor( hdc, RGB( 255, 255, 255 ) );
  SetTextColor( hdc, RGB( 0, 0, 0 ) );
  bitBufIndex = 0;
  totalBits = 0;
  RECT rt;
  GetClientRect(hwnd,&rt);
  local_x = rt.top;
  local_y = rt.left;
  //Initiate DMA transfer
  for (grp=0;grp<groups;grp++) { // 2 groups of data (2 of 16 frames)
    wsprintf(buf,"DMA Transfer group:%d   ",grp);
    TextOut(hdc,local_x+50,local_y+50,buf,lstrlen(buf));
    for (block=0;block<=BLOCKCNT;block++) {
      // 4 blocks of DMA transfers
      if (StartDmaIn() == FALSE) {
        TextOut(hdc,local_x+50,local_y+50,
                szErrorMsg,lstrlen(szErrorMsg));
        MessageBox(hwnd,szErrorMsg,szErrorTitle,
                   MB_OK);
        return(FALSE);
      }
      if (FinishDmaIn(DmaBuf,block) == FALSE) {
        TextOut(hdc,local_x+50,local_y+50,
                szErrorMsg,lstrlen(szErrorMsg));
        MessageBox(hwnd,szErrorMsg,szErrorTitle,
                   MB_OK);
        return(FALSE);
      }
      //Write buffer to file
      for (i=0;i<NumberOfBytes;i++) {
        for (j=7;j>=0;j--) {
          ch = DmaBuf[i] & (1<<j) ? 1 : 0;
          bitBuf[bitBufIndex] = ch;
          bitBufIndex++;
        }
      }
      if (block == BLOCKCNT)
      {
        if (OpenWinRT()==FALSE) {
          return FALSE;
        }
        if (PointEcpOut()==FALSE) {
          return FALSE;
        }
        WinRTCloseDevice(hWinRT);
      }
    }
  }
  bitBufIndex = 0;
  for (grp = 0; grp < groups; grp++) {
    low = 0;
    high = top_value; //2047
    initarithdecode();
    initarithmodel();
    for (row = 0; row < imagerows; row++)
      for (col = 0; col < imagecols; col++)
        prev_frame[col][row] = 0;
    image_mean = 0;
    for (bit = 0; bit < 8; bit++)
      image_mean = (image_mean << 1) + decode_sym(0);
    for (frame = 0; frame < group_size; frame++) {
      //*** Display stuff
      wsprintf(buf,
               "Decoding group:%d frame:%d",grp,(grp*group_size)+frame);
      TextOut(hdc,local_x+50,local_y+50,buf,lstrlen(buf));
      for (row = 0; row < imagerows; row++)
        for (col = 0; col < imagecols; col++) {
          mags[col][row] = 0;
          signs[col][row] = 0;
          found[col][row] = 0;
        }
      for (top_bit = 1 << (9 - bottom_bit); top_bit
           && !decode_sym(0); top_bit = top_bit >> 1);
      for (bit = top_bit; bit; bit = (bit >> 1)) decode_pass(bit);
      if (top_bit && frame) {
        image_mean = 0;
        for (bit = 0; bit < 8; bit++)
          image_mean = (image_mean << 1) + decode_sym(0);
      }
      for (row = 0; row < imagerows; row++)
        for (col = 0; col < imagecols; col++) {
          if (found[col][row]) {
            if (signs[col][row])
              prev_frame[col][row]
                -= mags[col][row] << (bottom_bit + 1);
            else prev_frame[col][row]
                += mags[col][row] << (bottom_bit + 1);
          }
          image_syn[col][row][0] = (double) prev_frame[col][row];
        }
      for (lvl = sblvls; lvl > 0; lvl--) synthesize_image(lvl);
      // Instead of dumping it to a file, dump it to an array.
      dumpimage((grp*group_size)+frame);
      totalBits = bitBufIndex;
      if ( ((grp*group_size)+frame) == NUMFRM-1 ) {
        destroy3ddblarray(image_syn,imagecols,imagerows,2);
        destroy2dintarray(signs,imagecols,imagerows);
        destroy2dintarray(mags,imagecols,imagerows);
        destroy2dintarray(found,imagecols,imagerows);
        destroy2dintarray(prev_frame,imagecols,imagerows);
        return TRUE;
      }
    }
  }
  //****** Done decoding
  destroy3ddblarray(image_syn,imagecols,imagerows,2);
  destroy2dintarray(signs,imagecols,imagerows);
  destroy2dintarray(mags,imagecols,imagerows);
  destroy2dintarray(found,imagecols,imagerows);
  destroy2dintarray(prev_frame,imagecols,imagerows);
  return TRUE;
}
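The inner loop of readDMAandDecode that spreads each received byte into eight one-bit entries of bitBuf, most significant bit first, can be written as a standalone helper. This is a minimal sketch, not the thesis code: the name `unpack_bits_msb_first` and its signature are hypothetical, and the caller is assumed to supply an output array with room for eight entries per input byte.

```c
#include <stddef.h>

/* Expand nbytes bytes into one 0/1 entry per bit, MSB first.
 * Returns the number of entries written (8 * nbytes). */
static size_t unpack_bits_msb_first(const unsigned char *bytes,
                                    size_t nbytes, short *bits)
{
    size_t n = 0;
    size_t i;
    int j;

    for (i = 0; i < nbytes; i++)
        for (j = 7; j >= 0; j--)
            bits[n++] = (bytes[i] & (1 << j)) ? 1 : 0;
    return n;
}
```

Storing one bit per array entry costs memory but lets the arithmetic decoder consume the stream with a plain index, which is how bitBufIndex is used in the listing.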
static BOOL boardInit(HWND hwnd) {
  if(OpenWinRT()==FALSE) {
    return(FALSE);
  }
  if (PointEcpOut()==FALSE) {
    return FALSE;
  }
  WinRTCloseDevice(hWinRT);
  return TRUE;
}
static BOOL readGarbage(HWND hwnd) {
  int i,loc_block;
  //Read to clear the SRAM on the board. (2 DMA reads)
  for (i=0;i<2;i++) {
    for (loc_block=0;loc_block<=BLOCKCNT;loc_block++) {
      if (StartDmaIn() == FALSE) {
        MessageBox(hwnd,szErrorMsg,szErrorTitle,MB_OK);
        return(FALSE);
      }
      if (FinishDmaIn(DmaBuf,loc_block) == FALSE) {
        MessageBox(hwnd,szErrorMsg,szErrorTitle,MB_OK);
        return(FALSE);
      }
      if (loc_block == BLOCKCNT) {
        if (OpenWinRT()==FALSE) {
          return FALSE;
        }
        if (PointEcpOut()==FALSE)
        {
          return FALSE;
        }
        WinRTCloseDevice(hWinRT);
      }
    }
  }
  return TRUE;
}
LRESULT CALLBACK WindowProc( HWND hWnd, UINT message,
                             WPARAM wParam, LPARAM lParam )
{
  PAINTSTRUCT ps;
  RECT        rc;
  SIZE        size;
  TCHAR szHello[MAX_LOADSTRING];
  LoadString(hInst, IDS_HELLO, szHello, MAX_LOADSTRING);
  static HWND hPickedButton = NULL;
  static BYTE phase = 0;
  static BYTE error = 0;
  static char buf[256];
  int wmId, wmEvent;
  HDC         hdc;
  long local_x, local_y;
  switch( message )
  {
  case WM_ACTIVATEAPP:
    bActive = wParam;
    break;
  case WM_CREATE:
    // Set mouse pointer..
    hArrowCursor = LoadCursor (NULL, IDC_ARROW);
    hWaitCursor = LoadCursor (NULL, IDC_WAIT);
    SetCursor (hArrowCursor);
    hBtnP2 = CreateWindow ("BUTTON", "Level 2 (0100)",
                           WS_CHILD | WS_VISIBLE | BS_RADIOBUTTON,
                           300,50,130,30,
                           hWnd,
                           (HMENU)BTN_P2,
                           hInst, NULL);
    hBtnP3 = CreateWindow ("BUTTON", "Level 3 (0011)",
                           WS_CHILD | WS_VISIBLE | BS_RADIOBUTTON,
                           300,80,130,30,
                           hWnd,
                           (HMENU)BTN_P3,
                           hInst, NULL);
    hBtnP4 = CreateWindow ("BUTTON", "Level 4 (0010)",
                           WS_CHILD | WS_VISIBLE | BS_RADIOBUTTON,
                           300,110,130,30,
                           hWnd,
                           (HMENU)BTN_P4,
                           hInst, NULL);
    hBtnP5 = CreateWindow ("BUTTON", "Level 5 (0001)",
                           WS_CHILD | WS_VISIBLE | BS_RADIOBUTTON,
                           300,140,130,30,
                           hWnd,
                           (HMENU)BTN_P5,
                           hInst, NULL);
    hBtnP6 = CreateWindow ("BUTTON", "Level 6 (0000)",
                           WS_CHILD | WS_VISIBLE | BS_RADIOBUTTON,
                           300,170,130,30,
                           hWnd,
                           (HMENU)BTN_P6,
                           hInst, NULL);
    EZWPASSES = 6;
    SendMessage(hBtnP6, BM_SETCHECK, 1, 0L);
    break;
  //case WM_SETCURSOR:
  //  SetCursor(NULL);
  //  return TRUE;
  case WM_TIMER:
    // ZeroMemory (ImageDataO, sizeof(ImageDataO));
    RECT rt;
    GetClientRect(hWnd,&rt);
    local_x = rt.top;
    local_y = rt.left;
    // Flip surfaces
    if( bActive ) {
      if (lpDDSPrimary->GetDC(&hdc) == DD_OK) {
        wsprintf(buf, "Frame:%d%d%d   ",
                 (frame_display%1000)/100,
                 (frame_display%100)/10,(frame_display%10));
        TextOut( hdc, local_x + 50, local_y
                 + 50, buf, lstrlen(buf) );
        if (totalBits != 0)
          ratio = BITS_IN_A_FRAME*GRPSIZE/totalBits;
        else ratio = 1;
        wsprintf(buf,
                 "Total bits used:%d. Ratio %d : 1   ",totalBits,ratio);
        TextOut(hdc, local_x + 50, local_y + 70,
                buf, lstrlen(buf) );
        for (int k=0;k<groups;k++) {
          TextOut(hdc,400,k*20,rptBuf[k],lstrlen(rptBuf[k]));
        }
        framePtr = ImageData[frame_display];
        if(SetBitmapBits(hbm,sizeof(ImageData[frame_display]),
                         framePtr) == 0)
        {
          PostMessage(hWnd, WM_CLOSE, 0, 0);
        }
        if ( StretchBlt( hdc, 100, 100, 128, 128,
                         hdcImage, 0, 0, 128, 128, SRCCOPY ) == FALSE ) {
          PostMessage(hWnd, WM_CLOSE, 0, 0);
        }
        lpDDSPrimary->ReleaseDC (hdc);
      }
    }
    //Move to next frame_display
    if (!pause) {
      if (frame_display >= (NUMFRM-1)) {
        frame_display = 0;
        if (lpDDSPrimary->GetDC(&hdc) == DD_OK) {
          if (!readDMAandDecode(hWnd,hdc)) {
            MessageBox(hWnd,
                       "ERROR(2):readDMAandDecode!",
                       "ERROR",MB_OK);
            exit(0);
          }
          lpDDSPrimary->ReleaseDC(hdc);
        }
      } else { frame_display++; }
    }
    break;
  case WM_KEYDOWN:
    switch( wParam ) {
    case VK_ESCAPE:
    case VK_F12:
      PostMessage(hWnd, WM_CLOSE, 0, 0);
      break;
    case VK_SPACE:
      pause = !pause;
      break;
    case VK_RIGHT:
      if (pause) {
        if (frame_display >= (NUMFRM-1)) {
          frame_display = 0;
        } else { frame_display++; }
      }
      break;
    case VK_LEFT:
      if (pause) {
        if (frame_display <= 0) {
          frame_display = NUMFRM-1;
        } else {
          frame_display--; }
      }
      break;
    }
    break;
  case WM_COMMAND:
    wmId = LOWORD(wParam);
    wmEvent = HIWORD(wParam);
    switch (wmId) {
    case BTN_P2:
      EZWPASSES = 2;
      goto ChangePasses;
      break;
    case BTN_P3:
      EZWPASSES = 3;
      goto ChangePasses;
      break;
    case BTN_P4:
      EZWPASSES = 4;
      goto ChangePasses;
      break;
    case BTN_P5:
      EZWPASSES = 5;
      goto ChangePasses;
      break;
    case BTN_P6:
      EZWPASSES = 6;
      goto ChangePasses;
    ChangePasses:
      hPickedButton = (HWND)lParam;
      SendMessage(hBtnP2, BM_SETCHECK, 0, 0L);
      SendMessage(hBtnP3, BM_SETCHECK, 0, 0L);
      SendMessage(hBtnP4, BM_SETCHECK, 0, 0L);
      SendMessage(hBtnP5, BM_SETCHECK, 0, 0L);
      SendMessage(hBtnP6, BM_SETCHECK, 0, 0L);
      SendMessage(hPickedButton, BM_SETCHECK, 1, 0L);
      break;
    default:
      return
        DefWindowProc(hWnd, message, wParam, lParam);
    }
    break;
  case WM_DESTROY:
    finiObjects();
    PostQuitMessage( 0 );
    break;
  }
  return DefWindowProc(hWnd, message, wParam, lParam);
} /* WindowProc */
static BOOL doInit( HINSTANCE hInstance, int nCmdShow ) {
  HWND          hwnd;
  WNDCLASSEX    wc;
  DDSURFACEDESC ddsd;
  DDSCAPS       ddscaps;
  HRESULT       ddrval;
  HDC           hdc;
  char          buf[256];
  /*
   * set up and register window class
   */
  wc.cbSize = sizeof(WNDCLASSEX);
  wc.style = CS_HREDRAW | CS_VREDRAW;
  wc.lpfnWndProc = (WNDPROC)WindowProc;
  wc.cbClsExtra = 0;
  wc.cbWndExtra = 0;
  wc.hInstance = hInstance;
  wc.hIcon = LoadIcon( hInstance, IDI_APPLICATION );
  wc.hCursor = LoadCursor( NULL, IDC_ARROW );
  wc.hbrBackground = (HBRUSH)(COLOR_WINDOW+1);
  wc.lpszMenuName = (LPCSTR)IDC_VDODMA3;
  wc.lpszClassName = szWindowClass;
  wc.hIconSm = LoadIcon(wc.hInstance, (LPCTSTR)IDI_SMALL);
  RegisterClassEx( &wc );
  /*
   * create a window
   */
  hwnd = CreateWindow(
    szWindowClass,
    szTitle,
    WS_OVERLAPPEDWINDOW,
    0, 0,
    GetSystemMetrics( SM_CXSCREEN ),
    GetSystemMetrics( SM_CYSCREEN ),
    NULL,
    NULL,
    hInstance,
    NULL );
  if( !hwnd )
  {
    return FALSE;
  }
  ShowWindow( hwnd, nCmdShow );
  UpdateWindow( hwnd );
  if (!boardInit(hwnd))
  {
    MessageBox(hwnd,
               "Something wrong in boardInit","ERROR",MB_OK);
    return FALSE;
  }
  MessageBox(hwnd,
             "Initialization Completed.\nReset the board.",
             "Initialization",MB_OK);
  if (!readGarbage(hwnd))
  {
    return FALSE;
  }
  //**** Creating BITMAP *******//
  hbm = CreateBitmap (128,128,1,32,framePtr);
  if ( hbm == NULL ) {
    MessageBox(hwnd,"ERROR Creating Bitmap!", "ERROR", MB_OK);
    return FALSE;
  }
  hdcImage = CreateCompatibleDC( NULL );
  SelectObject( hdcImage, hbm );
  //**** DONE Creating BITMAP *******//
  /*
   * create the main DirectDraw object
   */
  //Set frame display to last frame to initiate DMA transfer
  //right away
  frame_display = NUMFRM-1;
  ddrval = DirectDrawCreate( NULL, &lpDD, NULL );
  if( ddrval == DD_OK ) {
    // Get exclusive mode
    ddrval = lpDD->SetCooperativeLevel( hwnd,
                                        DDSCL_NORMAL );
                                        //DDSCL_EXCLUSIVE | DDSCL_FULLSCREEN );
    if( ddrval == DD_OK ) {
      //ddrval = lpDD->SetDisplayMode( 640, 480, 8 );
      //if( ddrval == DD_OK ) {
      // Create the primary
      ddsd.dwSize = sizeof( ddsd );
      ddsd.dwFlags = DDSD_CAPS;
      ddsd.ddsCaps.dwCaps = DDSCAPS_PRIMARYSURFACE;
      ddrval = lpDD->CreateSurface( &ddsd, &lpDDSPrimary, NULL );
      if( ddrval == DD_OK ) {
        ZeroMemory( &ddsd, sizeof( ddsd ) );
        ddsd.dwSize = sizeof( ddsd );
        ddrval = lpDDSPrimary->GetSurfaceDesc( &ddsd );
        if( ddrval == DD_OK )
        {
          //lpDDSPrimary->ReleaseDC(hdc);
          //if (lpDDSBack->GetDC(&hdc) == DD_OK)
          //SetBkColor( hdc, RGB( 0, 0, 255 ) );
          //SetTextColor( hdc, RGB( 255, 255, 0 ) );
          //lpDDSBack->ReleaseDC(hdc);
          // Create a timer to flip the pages
          if( SetTimer( hwnd, TIMER_ID, TIMER_RATE, NULL)) {
            //if (!readFiles(hwnd)) {
            //  return FALSE;
            //}
            return TRUE;
          }
        }
      }
    }
  }
  wsprintf(buf, "Direct Draw Init Failed (%08lx)\n", ddrval );
  MessageBox( hwnd, buf, "ERROR", MB_OK );
  finiObjects();
  DestroyWindow( hwnd );
  return FALSE;
}
int APIENTRY WinMain(HINSTANCE hInstance,
                     HINSTANCE hPrevInstance,
                     LPSTR     lpCmdLine,
                     int       nCmdShow)
{
  // TODO: Place code here.
  MSG msg;
  LoadString(hInstance, IDS_APP_TITLE, szTitle, MAX_LOADSTRING);
  LoadString(hInstance, IDC_VDODMA3, szWindowClass, MAX_LOADSTRING);
  if( !doInit( hInstance, nCmdShow ) ) {
    return FALSE;
  }
  while (GetMessage(&msg, NULL, 0, 0)) {
    TranslateMessage(&msg);
    DispatchMessage(&msg);
  }
  return msg.wParam;
}
C.1.5 resource.h
//{{NO_DEPENDENCIES}}
// Microsoft Developer Studio generated include file.
// Used by vdodma3.rc
//
#define IDC_MYICON              2
#define IDD_VDODMA3_DIALOG      102
#define IDD_ABOUTBOX            103
#define IDS_APP_TITLE           103
#define IDM_ABOUT               104
#define IDM_EXIT                105
#define IDS_HELLO               106
#define IDI_VDODMA3             107
#define IDI_SMALL               108
#define IDC_VDODMA3             109
#define IDR_MAINFRAME           128
#define IDC_STATIC              -1
//Define button resources
#define BTN_P2 202
#define BTN_P3 203
#define BTN_P4 204
#define BTN_P5 205
#define BTN_P6 206
//
// Next default values for new objects
//
#ifdef APSTUDIO_INVOKED
#ifndef APSTUDIO_READONLY_SYMBOLS
#define _APS_NEXT_RESOURCE_VALUE        132
#define _APS_NEXT_COMMAND_VALUE         32772
#define _APS_NEXT_CONTROL_VALUE         1000
#define _APS_NEXT_SYMED_VALUE           110
#endif
#endif
C.1.6 makefrm.c
/* This program is used to extract a PGM image file to get
   raw digital pixels and EPROM data in Intel 83 format. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
FILE *inFile;
FILE *outFile;
FILE *out2File;
FILE *epromFile;
char *filename_in;
int **image;
int imagecols, imagerows;
int pe_x, pe_y;
int addr;
char *strappend(char *str1,char *str2)
{
  char *result;
  result = calloc((strlen(str1) + strlen(str2) + 1),
                  sizeof(char));
  strcpy(result, str1);
  strcat(result, str2);
  return result;
}
char *make_frm_name(char *name, int frnum) {
  char *ans;
  ans = calloc(5, sizeof(char));
  if (frnum < 10) sprintf(ans, ".00%d", frnum);
  else if (frnum < 100) sprintf(ans, ".0%d", frnum);
  else sprintf(ans, ".%d", frnum);
  return strappend(name, ans);
}
int **make2dintarray(int d1, int d2) {
  int **array;
  int index;
  array = calloc(d1, sizeof(int *));
  for (index = 0; index < d1; index++) array[index] =
    (int *) calloc(d2,sizeof(int));
  return array;
}
void loadimage() {
  unsigned char temp_char;
  int row, col;
  fscanf(inFile, "P5 %d %d 255", &imagecols, &imagerows);
  fread(&temp_char,1,1,inFile);
  for (row = 0; row < imagerows; row++)
    for (col = 0; col < imagecols; col++) {
      fread(&temp_char,1,1,inFile);
      if (feof(inFile)) {printf("ERROR: too few pixels\n"); exit(0); }
      image[col][row] = (int) temp_char;
    }
  fread(&temp_char,1,1,inFile);
  if (feof(inFile) == 0) {printf("ERROR: too many pixels\n"); exit(0); }
}
int max(int a, int b) {
  if (a > b) return a;
  else return b;
}
int posonly(int a) {
  if (a < 0) return 0;
  else return a;
}
void dumpimage() {
  unsigned char temp_char;
  int row, col, x, y, pix;
  int x0 = posonly(imagecols - (4 * pe_x)) >> 1;
  int y0 = posonly(imagerows - (4 * pe_y)) >> 1;
  fprintf(out2File,"P5 %d %d 255\n", 4 * pe_x, 4 * pe_y);
  for (y = 0; y < 4; y++)
    for (x = 0; x < 4; x++)
      for (row = 0; row < pe_y; row++)
        for (col = 0; col < pe_x; col++) {
          pix = image[x0 + x + (4 * col)][y0 + y + (4 * row)];
          /* pix = 0; */
          /* fprintf(outFile, "%.2X\n", pix); */
          fprintf(epromFile, ":01%.4X00%.2X%.2X\n", addr, pix,
                  ((- (pix + 1 + (addr >> 8) + (addr & 255))) & 255));
          addr++;
        }
  for (row = 0; row < (pe_y * 4); row++)
    for (col = 0; col < (pe_x * 4); col++) {
      /* image[x0 + col][y0+row] = 0; */
      temp_char = (char) image[x0 + col][y0 + row];
      fwrite(&temp_char,1,1,out2File);
    }
}
int main (int argc,char *argv[]) {
  int offset, frame, frms, col, row;
  argc--;
  argv++;
  if(argc != 5) {
    printf("ERROR: Invalid number of arguments.\n");
    printf(
      "USAGE: makefrm offset frms pe_x pe_y. pe_x=32 pe_y=32 for 128x128\n");
    exit(0);
  }
  sscanf(*argv++, "%d", &offset);
  sscanf(*argv++, "%d", &frms);
  sscanf(*argv++, "%d", &pe_x);
  sscanf(*argv++, "%d", &pe_y);
  filename_in = *argv;
  inFile = fopen(strappend(filename_in, ".000"), "r");
  fscanf(inFile, "P5 %d %d 255", &imagecols, &imagerows);
  fclose(inFile);
  image = make2dintarray(max(imagecols, pe_x * 4),
                         max(imagerows, pe_y * 4));
  for (row = 0; row < max(imagerows, pe_y * 4); row++)
    for (col = 0; col < max(imagecols, pe_x * 4); col++) {
      image[col][row] = 127;
    }
  addr = 0;
  epromFile = fopen("pix.83", "w");
  for (frame = 0; frame < frms; frame++) {
    inFile = fopen(make_frm_name(filename_in,
                                 (frame + offset)), "r");
    loadimage();
    fclose(inFile);
    /* outFile = fopen(make_frm_name("frmdat", frame), "w"); */
    out2File = fopen(make_frm_name("infrm", frame), "w");
    dumpimage();
    /* fclose(outFile); */
    fclose(out2File);
  }
  fprintf(epromFile, ":00000001FF\n");
  fclose(epromFile);
  return 0;
}
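The EPROM records that makefrm.c writes follow the usual Intel HEX layout: a colon, a byte count, a 16-bit address, a record type, the data, and a checksum that makes the sum of all record bytes zero modulo 256. The sketch below restates the one-byte data record from dumpimage as two helpers. The names `ihex_checksum` and `ihex_format_byte` are hypothetical, and the code only covers the single-data-byte, type-00 case used in the listing.

```c
#include <stdio.h>

/* Checksum of a one-byte data record (count 01, type 00):
 * two's complement of count + address high + address low + type + data. */
static unsigned ihex_checksum(unsigned addr, unsigned data)
{
    unsigned sum = 1u + ((addr >> 8) & 0xFFu) + (addr & 0xFFu)
                 + 0u + (data & 0xFFu);
    return (0u - sum) & 0xFFu;
}

/* Format one record such as ":01001000AB44" into out (no newline).
 * Returns the value snprintf returns, i.e. 13 when out is large enough. */
static int ihex_format_byte(char *out, size_t outsz,
                            unsigned addr, unsigned data)
{
    return snprintf(out, outsz, ":01%04X00%02X%02X",
                    addr & 0xFFFFu, data & 0xFFu,
                    ihex_checksum(addr, data));
}
```

The same rule explains the fixed end-of-file record in the listing: with a zero byte count, zero address and record type 01, the byte sum is 1, so the checksum is FF.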