Session F1G
Incorporating Simulation and Implementation into
Teaching Computer Organization and Architecture
Bo Hatfield1, Mike Rieker2, Lan Jin3
Abstract - This paper describes the motivation, the
realization, and the experience of incorporating simulation
and hardware implementation into teaching computer
organization and architecture to computer science
students. It demonstrates that learning by doing has
helped students to truly understand how a computer is
constructed and how it really works in practice.
Correlated with textbook material, a set of simulation and
implementation projects was created on the basis of the
work that students had done in previous homework and
laboratory activities. Students can thus use these designs
as building blocks for completing more complex projects
at a later time. The projects cover a wide range of topics
from simple adders up to ALUs and CPUs. These
processors operate in a virtual manner on certain short
assembly-language programs. Specifically, this paper
shares the experience of using simulation tools (Altera®
Quartus II) and reconfigurable hardware prototyping
platforms (Altera® UP2 development boards).
Index Terms – Teaching Methods, Computer Organization,
Computer Architecture, Simulation, Hardware Prototyping,
Altera FPGA Development Kit.
INTRODUCTION AND MOTIVATION
Traditionally, teaching computer organization and architecture
to computer science students can be inefficient if the teaching
focus is solely on textbook materials. Students often have to
rely on their imaginations to understand the underlying
hardware-related concepts. In a college- or university-level
teaching environment, it is still not feasible to build a
laboratory that provides a variety of computer systems and
architectures for teaching computer organization and
architecture. Yet keeping computer science education up to date
requires keeping pace with the rapid evolution of the
computing industry. Searching for a more efficient way of
teaching computer organization and architecture is an ongoing
task.
One common solution is to provide an active learning
environment for teaching computer organization and
architecture by taking advantage of advanced software
technology [2]. In other words, through simulating key
elements of a processor, students can obtain a better
understanding of the internal operation of a processor.
Simulation can also help students study and appreciate some
of the design trade-offs and performance issues.
Although using such a software simulation tool in
teaching is very popular, it still has its limitations. We agree
with the school of thought that believes that building an
original processor from scratch is the only way to ensure that
the acquired knowledge persists longer and the interface
between hardware and software becomes absolutely clear.
However, this approach is useful and feasible only if a
processor of a reasonable size is used.
Many educators have shared their experiences with similar
approaches to teaching computer organization and
architecture [6]. While we feel that
in some areas our experiences are similar to the previously
reported ones, we also gained experience that is
directly related to the teaching model and the software and
hardware tools we used.
In this paper, we will introduce and discuss the building
of a simple MIPS-like RISC computer in teaching computer
organization and architecture. We were motivated to introduce
the students to a RISC-based architecture, yet keep the
processor simple enough for the student to build from scratch.
For example, the instruction set contains no more than sixteen
instructions to minimize the implementation complexity. A
more detailed description of the proposed simple computer is
given in our textbook, which we use in our teaching [1]. It is
worth noting that working on a non-MIPS project in the lab
portion of a class can provide an in-depth understanding of
processor design; however, it can become overwhelming in a
one-semester undergraduate computer organization and
architecture course. A non-MIPS project can be used in a
graduate-level computer architecture course.
This paper is structured as follows. We first briefly
introduce the proposed computer. Then we share our thoughts
on the simulation tools and the hardware prototyping
platforms used. We then discuss the design issues and the
implementation results, which are organized according to the
logical modules in our project sequence. Student learning
experiences and usage of the software and hardware tools will
be reported. Finally, we present our conclusions and future
work.
A SIMPLE RISC COMPUTER
Many high-performance processors have been developed
based on the RISC (Reduced Instruction Set Computer) design
philosophy. Some of the key concepts of the instruction-set
architecture of a RISC machine include:
1 Bo Hatfield, Salem State College, bo.hatfield@salemstate.edu
2 Mike Rieker, Salem State College, wmrieker@nii.net
3 Lan Jin, California State University, Fresno, lanj@csufresno.edu
0-7803-9077-6/05/$20.00 © 2005 IEEE
October 19 – 22, 2005, Indianapolis, IN
35th ASEE/IEEE Frontiers in Education Conference
F1G-18
• Only operations and addressing modes that are frequently
used are included in the instruction set for direct
implementation by hardware.
• Basic arithmetic-logic operations use a three-address
register-register instruction format.
• The instruction format has fixed length and regular fields,
so as to make instruction decoding easier and faster.
• Only load/store instructions access memory for data.
The common features of RISC computers described
above have enabled the formulation of the specification of a
typical educational RISC machine. One example of such a
machine is the DLX computer derived from the MIPS
architecture, with a subset of the MIPS64 instruction set and a
32-bit word length [5].
Our goal is to design a processor that will introduce the
RISC features to the students, but is simple enough for the
students to complete the hardware implementation within a
limited time frame, e.g., one semester. Without loss of
generality, we propose a computer that is like the MIPS, but
has a smaller and much simpler instruction set than the MIPS
[1]. Simplification is necessary for easier implementation. The
simplification consists of decreasing word length, instruction
length, size of the instruction set, and memory capacity. The
above-mentioned RISC design features remain unchanged.
Table I lists all of the operations defined for this computer.
TABLE I
THE INSTRUCTION SET OF A SIMPLE MIPS-LIKE RISC COMPUTER
OPcodes  Mnemonics  Operation
0000     ADD        Rd ← (Rs1)+(Rs2)
0001     ADDI       Rd ← (Rs1)+#imm
0010     SUB        Rd ← (Rs1)-(Rs2)
0011     MUL        Rd, Rd+1 ← (Rs1)*(Rs2)
0100     AND        Rd ← (Rs1) ∧ (Rs2)
0101     OR         Rd ← (Rs1) ∨ (Rs2)
0110     XOR        Rd ← (Rs1) ⊕ (Rs2)
0111     LSH        Rd ← (Rs1)<<imm if imm < 0
                    Rd ← (Rs1)>>imm if imm > 0
1000     LW         Rd ← (Mem[(Rs)+imm])
1001     SW         Mem[(Rs)+imm] ← (Rd)
1010     SLT        if (Rs1) < (Rs2), Rd ← 1, else Rd ← 0
1011     SGT        if (Rs1) > (Rs2), Rd ← 1, else Rd ← 0
1100     BEG        if (Rs1) = 0, PC ← (PC)+imm
1101     BNE        if (Rs1) ≠ 0, PC ← (PC)+imm
1110     JR         PC ← (Rs1)
1111     JAL        Rd ← (PC)+1; PC ← (Rs1)
The specification summary of our simple computer is as
follows:
• Word, instruction, and address length: 16 bits.
• General-purpose register set: 16x16 bits, (R0)=0.
• Memory capacity: 64K words, word-addressable.
• Instruction format: 4-bit OPcode and 3 address fields
of 4 bits each.
• Load-store format: OPcode, Rs, Rd, imm (used as a
displacement in displacement addressing mode).
• ALU format: OPcode, Rs1, Rs2, Rd.
• ALUimm format: OPcode, Rs1, imm, Rd.
• Branch format: OPcode, Rs1, 8-bit imm (sign-extended
in PC-relative addressing mode).
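The formats listed above can be illustrated with a minimal Python sketch. The bit placement is an assumption: the fields are packed most significant nibble first, in the order in which each format lists them. The helper names (`encode_alu`, `encode_branch`, `decode`, `sign_extend`) are hypothetical and are not taken from the textbook or the lab materials.

```python
# Sketch of the 16-bit instruction formats. Assumption: fields are packed from
# the most significant nibble down, in the order each format lists them.

OPCODES = {"ADD": 0b0000, "ADDI": 0b0001, "LW": 0b1000, "BNE": 0b1101}  # subset of Table I


def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    value &= (1 << bits) - 1
    if value & (1 << (bits - 1)):
        return value - (1 << bits)
    return value


def encode_alu(op, rs1, rs2, rd):
    """ALU format: OPcode, Rs1, Rs2, Rd (4 bits each)."""
    return (OPCODES[op] << 12) | ((rs1 & 0xF) << 8) | ((rs2 & 0xF) << 4) | (rd & 0xF)


def encode_branch(op, rs1, imm):
    """Branch format: OPcode, Rs1, 8-bit imm."""
    return (OPCODES[op] << 12) | ((rs1 & 0xF) << 8) | (imm & 0xFF)


def decode(word):
    """Split a 16-bit word into the opcode, three 4-bit fields, and the branch offset."""
    return {
        "opcode": (word >> 12) & 0xF,
        "f1": (word >> 8) & 0xF,
        "f2": (word >> 4) & 0xF,
        "f3": word & 0xF,
        "imm8": sign_extend(word & 0xFF, 8),  # only meaningful for branches
    }
```

For example, under these assumptions `encode_alu("ADD", 1, 2, 3)` yields the word 0x0123, and decoding a BNE word recovers a negative branch offset through sign extension.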
The MUL instruction multiplies two 16-bit operands (Rs1
and Rs2) to obtain a 32-bit product in two consecutive
registers Rd and Rd+1, where Rd must be an even-numbered
register so that Rd+1 can be obtained by simply XORing the
LSB of the register number of Rd.
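The register pairing used by MUL can be sketched as follows. This is only an illustration: the paper does not say which half of the product goes to which register, nor whether the multiplication is signed, so the sketch assumes an unsigned multiply with the high word in Rd and the low word in Rd+1. The function name `mul_writeback` is hypothetical.

```python
def mul_writeback(regs, rd, a, b):
    """Write the 32-bit product of two 16-bit operands into Rd and Rd+1."""
    if rd & 1:
        raise ValueError("Rd must be an even-numbered register")
    product = (a & 0xFFFF) * (b & 0xFFFF)  # unsigned multiply (assumption)
    regs[rd] = (product >> 16) & 0xFFFF    # high word in Rd (assumption)
    regs[rd ^ 1] = product & 0xFFFF        # Rd+1 found by flipping the LSB of Rd
```

Because Rd is even, `rd ^ 1` and `rd + 1` name the same register, which is why no adder is needed to compute the second register number.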
The LSH instruction uses the ALUimm format with imm
as the shift amount. It logically shifts the operand (Rs1) left
when imm is negative and right when imm is positive.
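The LSH behavior described above can be modeled in a few lines of Python. The sketch assumes that the magnitude of imm is the shift distance and that imm = 0 leaves the operand unchanged; Table I does not define the zero case. The function name `lsh` is hypothetical.

```python
def lsh(value, imm):
    """Logical shift of a 16-bit value: left if imm < 0, right if imm > 0."""
    value &= 0xFFFF
    if imm < 0:
        return (value << -imm) & 0xFFFF  # bits shifted past bit 15 are lost
    return value >> imm                  # zeros enter from the left; imm == 0 is a no-op
```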
The SLT and SGT instructions use the ALU format as two compare
instructions. They write the result flag, ‘1’ or ‘0’, into Rd rather than into a
condition code. This is typical for a RISC-style machine.
The JAL and JR use the ALU format with some fields not
used. In addition to their standard usage, they are also used for
the special cases of jump to subroutine and return from
subroutine, respectively.
SOFTWARE TOOLS AND HARDWARE PLATFORMS
Our main objective in searching for an appropriate tool was to
design, develop, simulate and prototype the proposed simple
computer system using the recent advances in Computer-Aided
Design (CAD) tools for logic synthesis and simulation.
With respect to a hardware platform, we are aware of the
fact that Field Programmable Gate Arrays (FPGAs) have
grown in both size and performance as the cost has gone
down. Moreover, the number of gates on an FPGA chip has
grown so large that many complex applications can now be
implemented with them. CAD tools have evolved from a
simple hardware description language and elementary
functional block diagrams to sophisticated hierarchical design
layout and high-level hardware design languages such as
Verilog and VHDL with pre-made blocks of code called
“cores”, which can be customized for the desired function. In
addition, ease of modification and enhancement is inherent
in the process of design using an FPGA. These are the
motivations for taking advantage of FPGAs in a system for
the hardware implementation of our teaching projects.
In searching for an appropriate platform and associated
tools, we settled on using Altera’s University Program
Education Kit, which incorporates 70,000 gates on an FPGA
device. The major factor that helped us to make our final
decision was Altera’s dedicated support for education
programs.
The package provides all of the necessary tools for
creating and implementing digital logic designs [4]. Major
features include the Quartus II development software, a 240-pin
FPGA device, and a parallel port download cable. The
software supports schematic capture and text-based HDL
(Hardware Description Language) design entry [3] [4]. It also
provides design programming, compilation, and verification
support for all devices on the UP2 board (EPM7128S and
EPF10K70 devices). The UP2 Education Board is a stand-alone
experimental board based on a FLEX10K device that
offers 70,000 gates [4]. When used with the Quartus II
software, the board provides a superior platform for
prototyping intermediate to advanced digital design projects.
The feedback we received from the students concerning
the use of this package was very positive. Ease of use of the
software and availability of on-line support documentation are
among the primary merits mentioned in the student feedback.
PROJECTS
To keep students from being overwhelmed by the
complexity of a complete computer system design, we took
a modular approach. We first implemented the
entire system (from software simulation to hardware
implementation) as a top-level system linked to a collection of
sub-circuits. These sub-circuits directly represent logical
modules in the computer, namely the ALU (Arithmetic Logic
Unit) module, the datapath module, and the CU (Control Unit)
module. Each module can be removed from the system and
replaced by a student’s design. In this way, a student gets the
opportunity to see how his/her design fits into the entire
system and gains early experience in integration testing at the
system level. Figure 1 shows the complete top-level system
design using Altera’s Quartus II CAD tool. This schematic
diagram of the design clearly shows the modularization
characteristics of the system. In the following subsections, we
will briefly discuss the three most important modules.
I. The ALU Module
After completing a few tutorials on how to use Altera’s
Quartus II design software and a preliminary project on the
design and implementation of a 16-bit adder, students were ready to
start the project on the design and implementation of the first
logical module, the ALU.
As a first step, students can design an ALU module based on a
standard 16-bit adder with a carry look-ahead structure. Since the
ALU usually serves as the center of the CPU at the
register-transfer level, it should provide the convenience of
transferring either operand OP1 or operand OP2 directly to its
output. Actually, to satisfy the limited functionality of the
operation repertoire of the instruction set, we need to include
no more than 8 (1 transfer, 3 arithmetic, and 4 logic)
operations in the function table of the ALU as shown in Table
II.
TABLE II
FUNCTION TABLE OF THE ALU
f2 f1 f0   Function
0  0  0    OP1
0  0  1    OP2 - OP1
0  1  0    OP1 - OP2
0  1  1    OP1 + OP2
1  0  0    OP1 OR OP2
1  0  1    OP1 AND OP2
1  1  0    OP1 XOR OP2
1  1  1    OP1 XNOR OP2
The simplest way of designing this ALU is to use a set of
16 multiplexers for collecting data from (1) OP1 and OP2, (2)
a simple 16-bit adder for the 3 arithmetic operations, and (3) a
set of XOR-, AND-, and OR-gates for logic operations. A
circuit diagram for the ALU module can be found in [1].
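The multiplexer-based organization can be mirrored in a short behavioral model, given here as a sketch rather than as the circuit from [1]. All eight candidate results are formed and the function code f2 f1 f0 selects one of them, as a multiplexer would. The name `alu` and the 16-bit wrap-around of the arithmetic results are assumptions of this sketch.

```python
MASK = 0xFFFF  # 16-bit word


def alu(f, op1, op2):
    """Behavioral model of Table II: f is the 3-bit function code f2 f1 f0."""
    candidates = (
        op1,            # 000: transfer OP1
        op2 - op1,      # 001
        op1 - op2,      # 010
        op1 + op2,      # 011
        op1 | op2,      # 100
        op1 & op2,      # 101
        op1 ^ op2,      # 110
        ~(op1 ^ op2),   # 111: XNOR
    )
    # The tuple index plays the role of the multiplexer select lines.
    return candidates[f & 0b111] & MASK
```

A model like this can serve as a reference when checking simulation waveforms: for a given function code and operand pair, the simulated ALU output should match the value the model returns.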
We subdivided the ALU design project into two parts: (1)
logic design and (2) simulation and hardware implementation.
The work involved in the first part was assigned in a
homework assignment, serving the role of pre-lab preparation.
Students received complete functional descriptions of the
ALU module, including a detailed description of its interface
with the rest of the system. In this way, the complete design
could be easily brought into the whole system.
The second part of the project was conducted in a lab
project, in which students took their designs completed in the
first part and conducted a module-level full-function
simulation in the Quartus II environment. Then, the top-level
design files, excluding the pre-designed ALU module, were
given to the students so that they could integrate the ALU
module into the system. After the integration, students
conducted a system-level simulation to make sure that the
modules work together as a complete system. Finally, the integrated
system was programmed into the FPGA device before the
actual physical test was conducted.
II. The Datapath Module
After designing the ALU, the students design the second
module of the CPU, the general-purpose register set. First,
they should determine the interface between the register set
and the other components of the CPU. Since any arithmetic-logic
operation in the instruction set must be executed in just
one cycle, the general-purpose register set must have two read
ports and one write port that can be active at the same time. Also,
the general-purpose register set must receive two source-register
numbers and one destination-register number to select its
source operands and destination operand, respectively. An
auxiliary register, A, is used in the execution of the instructions
that accumulate the sum of partial products (MUL) and
perform shift operations (LSH).
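The register-set interface described above (two read ports, one write port, and R0 fixed at 0) can be sketched as a small behavioral model. The class and method names are hypothetical, and the `cycle` method is only an assumption about how one single-cycle arithmetic-logic operation uses the three ports together.

```python
class RegisterFile:
    """16 x 16-bit general-purpose register set with (R0) = 0."""

    def __init__(self):
        self.regs = [0] * 16

    def read(self, rs1, rs2):
        """Two read ports: return both source operands at once."""
        return self.regs[rs1 & 0xF], self.regs[rs2 & 0xF]

    def write(self, rd, value):
        """One write port; writes to R0 are ignored so that (R0) stays 0."""
        if rd & 0xF:
            self.regs[rd & 0xF] = value & 0xFFFF

    def cycle(self, rs1, rs2, rd, operation):
        """One single-cycle operation: read two operands, write one result."""
        a, b = self.read(rs1, rs2)
        self.write(rd, operation(a, b))
```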
III. Microprogrammed Control Unit Module
Given that the CPU is composed of two parts, the datapath and
the control unit, the next project should be the design of the
microprogrammed control unit. The design of a
microprogrammed control unit consists of the microinstruction
format and the microprogram flowchart.
It is important to note that the designs of the CPU
datapath, the microinstruction format and the microprogram
flowchart are closely interrelated. In particular, during the
CPU datapath design, the microinstruction format should have
already been taken into consideration, except for certain fields.
These fields are those used for the sequencing control of the
microprogram itself, such as the loop counter, a branch
condition, and the next microinstruction.
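The sequencing fields mentioned above can be illustrated with a toy microsequencer. This sketch does not reproduce the microinstruction format from [1]: the field names, the single "loop" branch condition, and the use of -1 as a halt address are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class MicroInstruction:
    control: str                # stand-in for the datapath control fields
    next_addr: int              # address of the next microinstruction
    branch_cond: str = "none"   # "none" or "loop" (hypothetical condition names)
    branch_addr: int = 0        # target used when the branch condition holds


def run(microprogram, loop_count, start=0, halt=-1):
    """Step through a microprogram and return the control words issued."""
    trace = []
    addr = start
    counter = loop_count
    while addr != halt:
        mi = microprogram[addr]
        trace.append(mi.control)
        if mi.branch_cond == "loop" and counter > 1:
            counter -= 1            # the loop counter controls the repetition
            addr = mi.branch_addr
        else:
            addr = mi.next_addr
    return trace


# Hypothetical three-step microprogram with one repeated step.
example = [
    MicroInstruction("init", 1),
    MicroInstruction("add_shift", 2, "loop", 1),
    MicroInstruction("writeback", -1),
]
```

With a loop count of 3, `run(example, 3)` issues "init", then "add_shift" three times, then "writeback", which is the kind of repeated step a multi-cycle instruction such as MUL might need.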
FIGURE 1
THE TOP-LEVEL VIEW OF THE IMPLEMENTATION IN THE QUARTUS II DESIGN ENVIRONMENT
REPORT AND ANALYSIS OF STUDENT EXPERIENCES
Over a period of 20 years, the authors have taught computer
organization and architecture at both the undergraduate level
and the graduate level. Over the years, the authors have gained
first-hand teaching experiences and observed student learning
experiences while trying out different textbooks and designing
various lab experiments. Some problems were reported in
student use of the labs that were associated with those texts. The
two most critical issues reported by the students were the
unmanageable size of the projects and the lack of hardware
implementation.
This experience allowed the authors to co-author and
publish a textbook [1] in which a simple RISC processor is
proposed and studied. A set of labs associated with the
teaching of the processor was developed. One of the
most important motivations for this work was to ensure that
the processor and the associated labs are feasible for a student
with little or no logic design experience, taking a one-semester
undergraduate computer organization and architecture course,
to study, design, and implement. The authors’
teaching experience shows that, considering the complexity of
any commercial processor, this processor is more appropriate
for teaching and learning.
Between 1992 and 2005, a total of 800 students at
California State University-Fresno and Salem State College
took the computer organization and architecture course and
completed the associated lab assignments. These two institutions have
typical undergraduate computer science programs in which
either (1) no prior logic design experience is required or (2) passing
a one-semester logic design course is required as the
prerequisite for the computer organization and architecture course.
The course is offered as a junior/senior undergraduate course.
Table III shows the lab sequence that was used by both
institutions. The only difference is that in case (1) all six labs were
conducted, while in case (2) the first three labs were conducted in
the earlier logic design course and the last three were used in
the computer organization and architecture course. Note that
each lab consists of three parts:
Part I. Design and/or Analysis
Part II. Implementation using simulation software
Part III. Test
As stated in the first section, another key motivation was
to accomplish not only a fully functional simulation of the
simple RISC processor, but also a physical
implementation of it. The authors worked hard to
find an appropriate tool for achieving this goal. After
discovering the University Program supported by Altera, the
authors decided to use Altera’s Quartus II CAD software tool
and the FPGA device on the UP2/UP3 board. After two
semesters of using Altera’s CAD tools, the student
experiences and feedback were very positive. Students were
surveyed regarding their use of the lab materials and the
Altera design/implementation tools. The survey results
coincide with the findings of the faculty’s interaction with lab
materials during teaching and lab supervision. These can be
summarized as follows.
• Altera has a very good university program
• Altera’s tool support is very good
• Quartus II is a very good tool to use
• The tutorials are easy to follow and understand
• The free Quartus II Web Edition makes it easy for
students to work on their own computers
TABLE III
LABORATORY DESIGN
Lab 1. Getting Started & Basic Logic Gates
  Part I: Use the clock devices in the simulation software to study
  waveform diagrams to simulate various timing sequences: binary,
  reverse binary, Gray code, etc.
  Part II: Study static and dynamic behavior of basic logic gates.
  Part III: Use Boolean algebra, especially DeMorgan’s Theorem, to
  study equivalent gates and equivalent circuits.

Lab 2. Combinational Logic Circuits
  Part I: Study various full adder circuits: XOR-based, XNOR-based,
  all-NAND, all-NOR, etc. Create the waveform diagram of a full-adder
  circuit to study the phenomena of gate delay and hazards.
  Part II: Study the ripple-carry parallel adder, including performing
  add/subtract operations to implement two’s complement or
  sign-magnitude algorithms.
  Part III: Implement combinational circuits using multiplexers, e.g.,
  code conversion (binary code to Gray code or vice versa),
  seven-segment displays, etc.

Lab 3. Sequential Logic Circuits
  Part I: Study basic flip-flop circuits and/or edge-triggered
  flip-flop circuits, including static and dynamic behaviors.
  Part II: Design and simulate counters and shift registers as well as
  study various shift operations. Implement and test multi-functional
  counters and shift register circuits.
  Part III: Study a bit-serial adder to understand the general model of
  sequential circuits and the timing control of bit-serial add/subtract
  operations.

Lab 4. ALU
  Part I: Use simulation software to create sub-circuits. The circuits
  of SN74181 and SN74182 industrial chips are good examples of creating
  the building blocks.
  Part II: Test the circuits of a 16-bit carry look-ahead adder in two
  different structures: serial-parallel vs. fully-parallel. Compare
  their carry-propagation delays.
  Part III: Use specified methods to design, implement, and test the
  circuits of a customized ALU according to a given function table.

Lab 5. CPU Datapath
  Part I: Construct buses on the tri-state devices in bus-driver chips.
  Part II: Design and implement a bus-based CPU datapath, and create a
  memory module using simulation software.
  Part III: Design a timing sequence of control words to test the
  created CPU datapath for executing a few bus communication operations.

Lab 6. CPU Control
  Part I: Design, construct, and test a general-purpose register-based
  CPU.
  Part II: Design and implement a hardwired or a microprogrammed
  control unit.
  Part III: Test the CPU.
We quote some of the student feedback verbatim as follows.
• “Quartus II is a good program with nice interface.”
• “I though it (Quartus II CAD tool) was pretty slick in
the way it handles all the interconnects, very easy.
The only thing I got a bit confused about was when
starting a mega-function (built-in), but after it was
shown in class, even that was not bad at all.”
• “UP2 board was very good. Being able to assign pins,
doing timing diagrams. Everything seemed to work
as theoretically thought of.”
• “UP2 board helped me in learning. It was nice to see
the actual physical implementation of our designs. It
is more than just theory this way, and makes the use
of the design tool more interesting.”
• “I really thought that Quartus II was useful for
computer organization and architecture. It was easy
to learn. It has many tools to streamline the building
process, and very understandable output. I like to use
Quartus II to build circuit!”
• “Quartus II helped me to further my understanding in
computer organization and architecture by helping
me visualize a working circuit while actually building
it. With its available tools and customizability, my
learning increased greatly.”
• “The UP2 board is a great learning addition.”
• “Also, the ability to download this software (i.e.,
Quartus II web edition) and use it at home with ease
helped greatly.”
• “The tool is free and available for home use.
Tutorials are excellent.”
• “The simulation and hardware implementation tool
helped me in learning Computer Organization and
Architecture”
• “Using the UP2 board helps to see the creation comes
to ‘life’”.
• “The tools are very effective in bringing the ideas in
the textbook alive”
CONCLUSION AND FUTURE PLAN
The proposed computer system makes it possible to utilize
each logical module in the design of a sequence of simulation
and implementation projects for teaching computer
organization and architecture. A prototype of the system has
been implemented in an FPGA device on an Altera UP2 board.
Its top-level diagram clearly shows the modularization of the
system. Students were given an opportunity to perform hands-on
design, simulation, implementation, and testing of an
individual module while developing a complete understanding
of the entire system. Moreover, each module can be
redesigned following different requirements or guidelines as
long as it maintains the same interface with the system (e.g.,
our students designed and implemented a different
microprogrammed control unit module that achieved better
system-level performance). The student feedback relating to
the labs and to the usage of Altera’s CAD tools was very
positive. In the near future, our plan is to introduce Altera
tools into teaching graduate-level courses. A typical project
would be the design, implementation, and test of a pipelined
processor. In the meantime, developing an assembler for the
processor greatly supports the teaching. Students can write
assembly language programs that can be loaded onto the
processor and executed directly by the hardware.
ACKNOWLEDGMENT
The authors would like to acknowledge the initial support of
Altera, in particular its University Program.
REFERENCES
[1] Jin, L. and Hatfield, B. J., Computer Organization: Principles, Analysis
& Design, 1st Edition, 2004.
[2] Yurcik, W., Wolffe, G.S., Holliday, M.A., "A Survey of Simulators
Used in Computer Organization/Architecture Courses", Proceedings of
the 2001 Summer Computer Simulation Conference (SCSC), July 2001.
[3] Hamblen, J.O., Furman, M.D., Holliday, M.A., Rapid Prototyping of
Digital Systems, 2nd Edition, 2001.
[4] Altera University Program UP2 Education Kit User Guide, Altera
Corporation, December 2004. http://www.altera.com.
[5] Patterson, D.A. and Hennessy, J.L., Computer Architecture: A
Quantitative Approach, 3rd Edition, 2003.
[6] Calazans, N. L. V., Moraes, F. G., Marcon, C. A. M., “Teaching
Computer Organization and Architecture with Hands-on Experience”,
Proceedings of the 32nd ASEE/IEEE Frontiers in Education Conference,
November 2002.