History of 64-bit Computing: AMD64 and Intel Itanium Processors
CS-350-1: Computer Organization
Spring 2004
Brett Casbeer, Adam Kenny, Chris Kopek, Nick Snead
TABLE OF CONTENTS
Introduction
History
AMD64
Intel Itanium
Intel Itanium 2
Conclusion
Glossary
Introduction
Currently, the need for 64-bit computing is driven by applications that require large
amounts of virtual and physical memory. AMD and Intel are nevertheless developing 64-bit
processors for personal computers as well as servers, because home users and developers
continue to demand higher performance. AMD64 is the name of the 64-bit architecture that AMD
uses in its line of processors. This paper focuses mainly on the AMD64 architecture rather
than on a specific processor. Where a specific AMD64 processor is mentioned, it is the AMD
Athlon 64, the processor for home and personal use. IA-64 is the name of Intel’s new 64-bit
architecture. Intel has developed two IA-64 server processors: the Itanium and the Itanium 2.
Both are referred to throughout this document. The purpose of this paper is to describe the
features and architectures of the AMD64 and Intel IA-64 processors.
History
Today, nearly all personal computers use a 32-bit processor that runs a 32-bit operating
system. In 1965, Gordon Moore, who later co-founded Intel, predicted that the number of
transistors on an integrated circuit would keep doubling at a regular pace, a rate now usually
quoted as every couple of years. This prediction is known as Moore’s Law, and it still holds
true today. 32-bit processors have existed since the mid-1980s, and they can no longer gain
substantial speed efficiently without a major architectural change. To keep pace with the
growth that Moore’s Law describes, a new, faster 64-bit processor was needed. A remark often
attributed to Bill Gates in 1981 is, “640K ought to be enough for anybody” (Gates has denied
saying it). Whatever its origin, 640K of memory is not enough for anybody today, and 32-bit
processors are slowly meeting the same fate. 64-bit processors came from a need for faster
computing and a desire for new technology. In 1991 Intel began to formulate plans for its first
64-bit processor. In 1992 DEC (Digital Equipment Corporation) released its 64-bit Alpha
processor. Today, Intel has developed the IA-64 architecture and AMD has developed the AMD64
architecture.
An average person might expect a 64-bit processor to run twice as fast as a 32-bit
processor, but this is not necessarily true. A 64-bit processor can handle numbers that are 64
bits long in a single operation, while a 32-bit processor can only handle numbers up to 32 bits
long. Few modern applications require numbers larger than 32 bits. The exceptions include
digital video, graphics, engineering, and scientific programs, because of their purpose and
complex algorithms. If a 32-bit processor encounters a number larger than 32 bits, it has to
split the number and process it in two sections, which costs valuable processing time. A 64-bit
processor can handle that number without splitting it, roughly halving the processing time, so
on numbers larger than 32 bits it is nearly twice as fast as a 32-bit processor. Now assume two
processors run at the same clock speed, but one is 64-bit and the other is 32-bit. If they
encounter a 30-bit number, both handle it at exactly the same speed. This is why a 64-bit
processor does not necessarily run twice as fast as a 32-bit processor.
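The splitting described above can be sketched in C. This is only an illustration, not code from any cited source: the type u64_pair and the function names are hypothetical, and the sketch simply models a 64-bit addition carried out with 32-bit operations next to the single native addition a 64-bit processor can use.

```c
#include <stdint.h>

/* A 64-bit value held as two 32-bit halves, the way a 32-bit
   processor has to keep it in its registers (hypothetical type). */
typedef struct {
    uint32_t lo;  /* bits 0..31  */
    uint32_t hi;  /* bits 32..63 */
} u64_pair;

/* Add two 64-bit values using only 32-bit arithmetic: two adds plus
   carry propagation, where a 64-bit processor needs a single add. */
u64_pair add64_on_32(u64_pair a, u64_pair b)
{
    u64_pair sum;
    sum.lo = a.lo + b.lo;               /* first 32-bit add, wraps on overflow */
    uint32_t carry = (sum.lo < a.lo);   /* 1 if the low add overflowed */
    sum.hi = a.hi + b.hi + carry;       /* second 32-bit add, plus the carry */
    return sum;
}

/* The same operation on a 64-bit machine: one native add. */
uint64_t add64_native(uint64_t a, uint64_t b)
{
    return a + b;
}
```

For operands that fit in 32 bits the high halves are zero and the extra work buys nothing, which matches the 30-bit example in the text.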
One of the main benefits of 64-bit processors is their ability to access more memory. 32-bit
processors can normally access only 4 gigabytes of memory. A few tricks and workarounds allow
32-bit processors to address more than 4 gigabytes, but they are not necessarily stable and
offer little room for future growth. 64-bit processors can theoretically address about 18
million terabytes, or 18 billion gigabytes. While no one can currently imagine a need for that
much memory, the potential is there, which suggests that 64-bit processors have a long future.
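The figures above follow from the number of address bits: n bits can address 2^n bytes. The helper below is a minimal sketch with a hypothetical name; it counts in binary gigabytes (2^30 bytes) so that the 64-bit case does not overflow a 64-bit integer.

```c
#include <stdint.h>

/* Number of binary gigabytes (2^30 bytes each) addressable with the
   given number of address bits. Valid for 30 <= address_bits <= 64. */
uint64_t addressable_gib(unsigned address_bits)
{
    return (uint64_t)1 << (address_bits - 30);
}
```

With 32 address bits the result is 4. With 64 address bits it is 2^34, roughly 17.2 billion binary gigabytes, which is the same amount as about 18.4 billion decimal gigabytes (2^64 bytes).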
Currently the only systems that really need 64-bit processors are servers and
supercomputers. A server needs 64-bit computing because it needs to support more memory.
While a personal computer handles only one user, a server may need to handle thousands of users
simultaneously, and handling that many users takes a lot of memory. Supercomputers need 64-bit
technology because they use large floating-point numbers. Many supercomputers process huge
engineering calculations. “Seattle-based Cray Inc. is building a
massively parallel processing supercomputer, nicknamed Thor's Hammer, for weapons research
by the National Nuclear Security Administration at the Sandia National Laboratory in
Albuquerque” (Kay, 2004, pg. 1). That supercomputer needs a 64-bit processor to handle the
complex calculations.
Even though there is currently no pressing need for 64-bit processors in personal computers,
AMD and Intel have created such processors and continue to develop them. One reason is the
advancement of technology and the pursuit of speed. Without 64-bit processors, there would be
little reason to develop new software and equipment that goes beyond 32 bits, and computers
would eventually reach a point where no further speed increase was possible. Andy Yaschenko, a
writer for X-bit labs, said, “As for
the common PCs, 64bit will find no real application” (Yaschenko, 2003, pg. 3). Unfortunately,
Yaschenko bases his view of future technology on past and current technologies. Even though a
need cannot be foreseen now, the past has shown that a need will eventually appear and help
continue 64-bit advancements.
AMD64
AMD64 reached personal computers in September 2003 with the Athlon 64, the first 64-bit x86
processor designed specifically for personal computing. The architecture extends the x86
architecture that was first used in Intel’s 8086. Because AMD64 builds on the x86 architecture,
it is backwards compatible with 32-bit applications and operating systems. To achieve this
backwards compatibility, the processor uses two main operating modes, which are divided into
five sub-modes. Below is a table that shows the operating modes and their features (AMD, 2003).
The two main operating modes are long mode and legacy mode. Long mode consists of
two sub-modes: 64-bit mode and compatibility mode. For long mode to be active, the computer
must be running a 64-bit operating system. For 64-bit mode to be active, the computer must also
be running a 64-bit application. The operating system enables this mode on an individual
code-segment basis. 64-bit mode supports all the new features and register extensions of the
AMD64 architecture. Compatibility mode allows 64-bit operating systems to run 16-bit and 32-bit
applications without recompilation. In compatibility mode, the application behaves as if it
were running on a 32-bit operating system, while the operating system continues to use its
64-bit mechanisms to manage the application. One limitation of compatibility mode is that
16-bit and 32-bit applications are still limited to a maximum of 4GB of memory space (AMD,
2003).
The other main operating mode is legacy mode. Legacy mode is separated into three
sub-modes: protected mode, virtual-8086 mode, and real mode. Legacy mode is active when a
32-bit or 16-bit operating system is running. Protected mode supports 16-bit and 32-bit
applications, and an application can access up to 4GB of memory space. Virtual-8086 mode
supports 16-bit programs running under a 32-bit operating system; these programs can access
only 1MB of memory space. Real mode supports 16-bit programs running under a 16-bit operating
system. Just as in virtual-8086 mode, programs are limited to 1MB of memory space (AMD, 2003).
One new feature of the AMD64 is its register extensions. “64-bit mode implements
register extensions through a new group of instruction prefixes, called REX prefixes” (AMD,
2003, pg. 8). The extensions add eight new GPRs (General Purpose Registers), and all GPRs are
widened to 64 bits. Eight new 128-bit XMM media registers are added as well. Another extension
is the 64-bit RIP (instruction pointer), which also makes a new instruction-pointer-relative
form of data addressing possible. The opcodes for this processor are likewise extended to
support 64-bit addressing and the register extensions. Together, these are the register
extensions that have been added to the AMD64 (AMD, 2003).
Which General Purpose Registers are available depends on the operating mode that the system
is in. In legacy mode or compatibility mode, a total of 24 different variations of the 8 GPRs
can be used: eight 8-bit registers, eight 16-bit registers, and eight 32-bit registers. The
system determines which register to use based on the type of instruction, opcode, address
size, or stack size. In 64-bit mode a total of 68 different variations can be used: sixteen
8-bit low-byte registers, four 8-bit high-byte registers, sixteen 16-bit registers, sixteen
32-bit registers, and sixteen 64-bit registers. The AMD64 architecture thus gives 8 more GPRs,
and 44 more variations in total, to 64-bit mode only (AMD, 2003).
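The register "variations" are different-width views of the same physical registers. The sketch below models this in portable C with shifts and casts. The function names are hypothetical, and the register names RAX, EAX, AX, AL, and AH are the standard x86 names for the first GPR; they are used here as an example and do not appear in the text above.

```c
#include <stdint.h>

/* Variation counts taken from the text above. */
enum {
    LEGACY_VIEWS = 8 + 8 + 8,             /* 8-bit, 16-bit, 32-bit            */
    LONG64_VIEWS = 16 + 4 + 16 + 16 + 16  /* low byte, high byte, 16/32/64-bit */
};

/* Views of one 64-bit general-purpose register, using RAX as the example.
   The narrower names alias the low-order bits of the same register. */
uint32_t view_eax(uint64_t rax) { return (uint32_t)rax; }        /* bits 0..31 */
uint16_t view_ax(uint64_t rax)  { return (uint16_t)rax; }        /* bits 0..15 */
uint8_t  view_al(uint64_t rax)  { return (uint8_t)rax; }         /* bits 0..7  */
uint8_t  view_ah(uint64_t rax)  { return (uint8_t)(rax >> 8); }  /* bits 8..15 */
```

The two totals come out to 24 and 68, a difference of 44, which agrees with the counts given in the paragraph.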
Two special registers in the AMD64 architecture are the flags register and the
instruction pointer. Like the GPRs, these two registers are treated differently depending on
which operating mode the system is in. In legacy real mode or virtual-8086 mode the system has
access only to a 16-bit FLAGS register, but in legacy protected mode or compatibility mode it
has access to a 32-bit EFLAGS register. Finally, in 64-bit mode it has access to a 64-bit
RFLAGS register. The FLAGS, EFLAGS, and RFLAGS registers all function the same way, except that
some have fewer bits. The purpose of these registers is to hold status and control bits that
the application can read and use. The other special register is the instruction pointer. In
legacy or compatibility mode the instruction pointer is either a 16-bit IP or a 32-bit EIP. In
64-bit mode the instruction pointer is extended to 64 bits and is called the RIP register. The
contents of the instruction pointer are not directly readable by software, but they are pushed
onto the stack when a procedure is called. The purpose of the instruction pointer is to hold
the address of the next instruction to be executed. Both of these special registers operate
the way they traditionally have; the only difference is that AMD extended them to support 64
bits (AMD, 2003).
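A short sketch of how software might inspect such a register follows. The bit positions for the carry flag (bit 0) and zero flag (bit 6) are the standard x86 ones and are an addition to the text above; the macro and function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Positions of two status flags; the same in FLAGS, EFLAGS, and RFLAGS. */
#define FLAG_CF (1u << 0)   /* carry flag */
#define FLAG_ZF (1u << 6)   /* zero flag  */

/* Test whether a given flag bit is set in a flags register value. */
bool flag_is_set(uint64_t rflags, uint32_t flag)
{
    return (rflags & flag) != 0;
}

/* FLAGS is the low 16 bits of EFLAGS, which is the low 32 bits of RFLAGS. */
uint32_t eflags_from_rflags(uint64_t rflags) { return (uint32_t)rflags; }
uint16_t flags_from_eflags(uint32_t eflags)  { return (uint16_t)eflags; }
```

Because the narrower registers are simply the low-order bits of the wider ones, code that tests a flag behaves the same in every operating mode.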
Another set of special registers is the 128-bit XMM media registers. “These registers
perform integer and floating point operations primarily on vector operands” (AMD, 2003,
pg. 127). The XMM registers are available in all operating modes, but how many are available
depends on the mode. In 32-bit or 16-bit modes applications work normally but have access to
only eight XMM registers. To benefit from the additional registers in 64-bit mode, an older
application must be recompiled, unless it was already compiled for 64-bit mode. Recompiling
brings several benefits: the application gains eight more XMM registers, for a total of
sixteen 128-bit XMM registers; it can use all 16 GPRs, each extended to 64 bits; and it gains
64-bit virtual addressing and the RIP instruction pointer. The extra XMM registers are valuable
because they let an application keep more vectors in registers and work on them in parallel.
Applications likely to make use of all 16 XMM registers include speech recognition programs,
2D and 3D graphics programs, professional CAD programs, and HDTV (High Definition Television)
streaming media programs. The XMM registers can process data in parallel because of SIMD
(Single Instruction Multiple Data) instructions, which access, manipulate, and change several
data elements with one instruction. Because they do not need multiple instructions, they save
valuable processing time. Overall, the addition of eight XMM registers allows 64-bit
applications to run more smoothly and faster (AMD, 2003).
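The SIMD idea can be modeled in portable C. The sketch below is not real SIMD code: the type vec128f and the function packed_add are hypothetical, and the loop only stands in for four hardware lanes that a single packed instruction would process at once.

```c
#include <stddef.h>

/* Portable model of one 128-bit media register holding four 32-bit floats. */
typedef struct {
    float lane[4];
} vec128f;

/* Model of a packed single-precision add: conceptually one instruction
   produces all four sums. The loop stands in for the parallel lanes. */
vec128f packed_add(vec128f a, vec128f b)
{
    vec128f r;
    for (size_t i = 0; i < 4; i++) {
        r.lane[i] = a.lane[i] + b.lane[i];
    }
    return r;
}
```

A scalar processor would need four separate add instructions for the same work, which is the saving the paragraph describes.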
The AMD64 architecture handles four different data types: signed integers, unsigned
integers, BCD (Binary Coded Decimal) digits, and packed BCD digits. Signed and unsigned
integers come in five sizes: byte, word, doubleword, quadword, and double quadword. The
diagram above shows the capacity of each type of integer. The sign bit of a signed integer is
stored in the most significant bit. The architecture addresses memory using little-endian byte
order, which means that the least significant byte is stored at the lowest byte address. BCD
digits have binary values ranging from 0 to 9, so the lowest BCD digit is 0000 and the highest
is 1001. However, because a byte holds 8 bits, storing a single 4-bit BCD digit in a byte
wastes half of the byte. Packed BCD digits are used to hold two BCD digits in one byte: one
digit from 0000 to 1001 in the upper four bits and another from 0000 to 1001 in the lower four
bits. The maximum value a packed BCD byte can hold is 99, or 10011001 (AMD, 2003).
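Packing and unpacking BCD digits takes only a few shifts and masks. The following is a minimal sketch with hypothetical function names; bcd_pack assumes its argument is between 0 and 99.

```c
#include <stdint.h>

/* Pack a value from 0 to 99 into one byte: tens digit in the upper
   four bits, ones digit in the lower four bits. */
uint8_t bcd_pack(unsigned value)
{
    return (uint8_t)(((value / 10) << 4) | (value % 10));
}

/* Recover the decimal value from a packed BCD byte. */
unsigned bcd_unpack(uint8_t packed)
{
    return (unsigned)(packed >> 4) * 10 + (packed & 0x0F);
}
```

Packing 99 yields the byte 10011001 (0x99 in hexadecimal), the maximum given in the text.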
The entire address space that a program can use is called virtual memory. Virtual
memory is translated by hardware and operating system software into a smaller physical memory
space, which is located in main memory or on the hard disk. Virtual memory is treated
differently depending on which AMD64 operating mode the system is in. In the legacy modes,
virtual memory is treated as it would be on a 32-bit processor with a 32-bit operating system.
64-bit mode uses a flat model of virtual memory: the 64-bit virtual memory space is treated as
a single, flat address space, and programs can address locations anywhere in it. Compatibility
mode uses a protected, multi-segment model of virtual memory. Legacy protected mode uses the
same virtual memory model as compatibility mode (AMD, 2003).
The AMD64 architecture dispenses with most of the legacy segmentation functions in 64-bit
mode. AMD believes that most modern operating systems do not use the segmentation features
available in the x86 architecture and instead handle the equivalent functions in software.
Leaving the segmentation hardware out of 64-bit mode allows new 64-bit operating systems to be
coded more simply, and it supports more efficient management of multi-programming environments
than is possible in the legacy x86 architecture (AMD, 2003).
AMD has designed new technology to help reduce bottlenecks within the computer system.
In most systems, data going to the video card or main memory has to pass through a motherboard
chip, and data from USB, PCI, or hard drives usually has to pass through two motherboard chips.
Because the data has to squeeze through all these chips, the processor ends up waiting for it,
and this waiting creates bottlenecks within the system. AMD’s first advancement was to build
the DDR memory controller directly into the processor; typically the memory controller is
built into the motherboard. This lets data move directly between the processor and main memory,
which significantly reduces data access time because it removes the “middleman” that was
slowing the process down. The other advancement that AMD built into its processors is HT
(HyperTransport) technology. “HT is a high-speed data carrying method
designed to replace or supplement many of the traditional input/output methods that can cause
bottlenecks on modern motherboards” (Dowler, 2003, pg. 4). HT provides point-to-point links
between components, and the link speed varies with the component at the other end. HT uses
these high-bandwidth links to send data to the memory and to the two chips on the motherboard.
The two chips each have a built-in HT bridge so that they can support the high transfer speeds.
HyperTransport links differ from traditional data links: they carry packets of data, much as
Ethernet does, sending the address of the data and the data itself over the same lines.
Traditional buses use separate address lines and data lines, which adds complexity and takes up
space on the motherboard and the processor. HT can transfer data at up to 800 MHz DDR, which
allows it to move as much or more data over fewer lines than traditional buses. The overall
speedup comes when slower buses hand their data to the high-speed HT links: instead of moving
through a series of slow buses, the data is transferred to the HT “highway”. Together, these
two design advancements allow the processor and motherboard to transfer data much faster and
reduce bottlenecks within the system (Dowler, 2003).
Intel Itanium
The Itanium is the first processor to implement the IA-64 instruction set. One of the
major improvements of the IA-64 architecture over the IA-32 architecture is that IA-64 employs
predication. Predication uses a predicate bit to determine whether an instruction takes
effect, without disturbing program flow. For example, if the predicate bit attached to an
instruction is set, the instruction is executed; if it is not set, the instruction is ignored.
In neither case does a break occur in the program flow.
Here is a C-source code example courtesy of ChipGeek:
if (x == 4)
{
z = 9;
}
else z = 0;
Using the IA-32 architecture, the instructions follow this scheme:
1. Compare x to 4
2. If not equal goto line 5
3. z = 9
4. goto line 6
5. z = 0
6. // Program continues from here
No matter what the value of x, there is at least one break in the instruction flow (at
either line 2 or line 4). IA-64’s use of predicate bits allows the machine to avoid these
breaks. Using the same C source code as above, the IA-64 scheme would be the following:
1. Compare x to 4 and store result in a predicate bit (we'll call it P)
2. If P==1; z = 9
3. If P==0; z = 0
If an instruction’s predicate condition matches the value of P, its result is kept;
otherwise it is ignored. All three lines are performed sequentially without an interruption in
program flow. When x equals 4, for example, only the result from line 2 is kept, because that
is the only predicate condition that matches the result of the compare in line 1 (Hodgin,
2001). In this way IA-64 removes one of the biggest bottlenecks of the IA-32 architecture.
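The two schemes can be imitated in C. This is a sketch only: the function names are hypothetical, and the second function uses a bit mask to mimic what predicate bits do in hardware. A real compiler decides for itself whether to emit branches or predicated instructions for either version.

```c
/* Branching form: control flow depends on x, as in the IA-32 scheme. */
int assign_with_branch(int x)
{
    int z;
    if (x == 4) {
        z = 9;
    } else {
        z = 0;
    }
    return z;
}

/* Predicated form: both candidate results are produced in straight-line
   order, and the predicate p selects which one is kept. */
int assign_with_predicate(int x)
{
    unsigned p = (x == 4);        /* step 1: compare, store result in p     */
    unsigned mask = 0u - p;       /* all ones when p == 1, zero when p == 0 */
    unsigned z_if_true = 9u;      /* step 2: result kept when p == 1        */
    unsigned z_if_false = 0u;     /* step 3: result kept when p == 0        */
    return (int)((z_if_true & mask) | (z_if_false & ~mask));
}
```

Both functions return 9 when x is 4 and 0 otherwise; the second contains no conditional jump in its source form, which is the property predication provides.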
The Itanium processor was designed for high performance Internet servers and
workstations. It supports 64-bit addressing, full IA-32 instruction set compatibility, and
scalability across a wide range of operating systems.
The processor is currently offered at speeds of 733 MHz and 800 MHz, with 32KB of L1
cache, 96KB of L2 cache, and either 2MB or 4MB of L3 cache that is four-way set associative
on two or four 1MB chips. The 4MB L3 cache uses 294.8 million transistors and gives
12.8GBps of bandwidth at 800MHz. With this much cache, there is a good chance that the
required data or instructions are already held in cache. As a result, bus traffic is reduced
and overall performance improves (Simon, 2000).
Since the Itanium was designed for workstations and servers with anywhere from 1 to 4000
processors, the different levels of cache and the buses had to be optimized. The Level 3 bus
offers fast communication between multiple CPUs. The large L2 cache reduces traffic in the CPU
by keeping data close to the CPU that is using it. The Itanium also supports page sizes from
4KB to 256MB. This gives the Itanium the flexibility to access small amounts of memory in small
chunks and large amounts of memory in large chunks (Simon, 2000).
Intel also uses data speculation and cache hints in the Itanium processor. “Data
speculation is caching and calling for data that may be needed or may be changed before it is
needed, so that, in the case that the data is needed and it has not changed, the CPU does not have
to take a latency impact from calling for the data. The processor, with the help of compiled
instructions, looks ahead, anticipates what info it may need, and then brings it to cache or into
the processor” (Simon, 2000, pg. 7). Doing this helps hide memory latency. Cache hints help the
CPU find data in cache by setting two-bit markers on memory loads, so that the processor can
quickly find the data that it needs in cache (Simon, 2000).
The Itanium processor uses the EPIC (Explicitly Parallel Instruction Computing)
architecture, which allows the processor to run instructions in parallel with one another. The
compiler packs EPIC instructions into a structure called a “bundle”: 128 bits that hold three
instructions and a template field. Bundles in turn form instruction groups, and there is no
maximum size limit on a group. The instructions within a group do not affect each other, which
allows multiple instructions to be handled concurrently without getting in each other’s way.
The EPIC architecture relies heavily on compiler technology, because it is at the compiler
stage that instructions are “bundled”. Therefore, any change in compiler technology has a
direct effect on the performance of the Itanium processor (Simon, 2000).
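The glossary defines a bundle as 128 bits holding three instructions and a template field. The sketch below shows how such a bundle could be packed and unpacked. The field widths (a 5-bit template in the lowest bits followed by three 41-bit instruction slots) come from the IA-64 architecture definition rather than from the sources cited in this paper, and the type and function names are hypothetical.

```c
#include <stdint.h>

#define SLOT_MASK ((UINT64_C(1) << 41) - 1)   /* one 41-bit instruction slot */

/* A 128-bit bundle stored as two 64-bit halves (lo holds bits 0..63). */
typedef struct {
    uint64_t lo;
    uint64_t hi;
} bundle;

/* Build a bundle from a 5-bit template and three 41-bit slots.
   Slot 1 straddles the two halves: 18 bits in lo, 23 bits in hi. */
bundle bundle_pack(uint8_t tmpl, uint64_t s0, uint64_t s1, uint64_t s2)
{
    bundle b;
    b.lo = (uint64_t)(tmpl & 0x1F)
         | ((s0 & SLOT_MASK) << 5)
         | ((s1 & SLOT_MASK) << 46);
    b.hi = ((s1 & SLOT_MASK) >> 18)
         | ((s2 & SLOT_MASK) << 23);
    return b;
}

/* Extract each field again. */
uint8_t  bundle_template(bundle b) { return (uint8_t)(b.lo & 0x1F); }
uint64_t bundle_slot0(bundle b)    { return (b.lo >> 5) & SLOT_MASK; }
uint64_t bundle_slot1(bundle b)    { return ((b.lo >> 46) | (b.hi << 18)) & SLOT_MASK; }
uint64_t bundle_slot2(bundle b)    { return (b.hi >> 23) & SLOT_MASK; }
```

The template field tells the hardware which kind of execution unit each slot needs, which is how the compiler's scheduling decisions reach the processor.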
The Itanium processor has four pipelined Arithmetic Logic Units. Each ALU can
process one integer calculation per cycle. The Itanium also has 128 floating-point registers
along with 128 integer registers. Not only does the Itanium have an abundance of registers, but
the registers also have the ability to rotate. This enhances CPU performance by allowing the
CPU to operate on multiple registers and process large amounts of data (Intel, 2003c).
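Register rotation, as the glossary describes it, means that the content of register X appears in register X+1 after one rotation. One way to model this is to leave the data where it is and change only the mapping from register numbers to storage. The sketch below takes that approach; the names are hypothetical, and the size of 96 registers is an assumption chosen for illustration.

```c
#define ROTATING_REGS 96   /* assumed size of the rotating region */

typedef struct {
    long value[ROTATING_REGS];   /* physical storage */
    unsigned rotations;          /* rotations performed so far */
} rotating_file;

/* Map a logical register number (0..ROTATING_REGS-1) to its physical slot. */
static unsigned physical_index(const rotating_file *rf, unsigned logical)
{
    return (logical + ROTATING_REGS - (rf->rotations % ROTATING_REGS)) % ROTATING_REGS;
}

long reg_read(const rotating_file *rf, unsigned logical)
{
    return rf->value[physical_index(rf, logical)];
}

void reg_write(rotating_file *rf, unsigned logical, long v)
{
    rf->value[physical_index(rf, logical)] = v;
}

/* Rotate by one position: no data moves, only the mapping changes. */
void reg_rotate(rotating_file *rf)
{
    rf->rotations++;
}
```

In a loop, each iteration can then use the same register numbers while working on fresh storage, and a value written in one iteration is visible one register higher in the next.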
The Itanium uses a 2.1 GBps multi-drop system bus, so the flow of instructions to the
processor is plentiful. The first generation systems used dual-memory ported SDRAM, providing
4.2 GBps of memory bandwidth. Later generations used DDR and SDRAM. With these speeds, combined
with the large amount of cache and cache bandwidth, the Itanium can process many terabytes of
information (Simon, 2000).
Intel says that the Itanium has significant error checking capabilities. Itanium
processors employ Enhanced Machine Check Architecture (MCA) with extensive Error
Correcting Code (ECC) and parity error checking on most processor caches and buses (Intel,
2004a). These capabilities give Itanium-based machines the ability to recognize errors,
attempt to fix them, or flag the affected data as corrupted (Simon, 2000).
Intel Itanium 2
The Itanium 2 processor takes the Itanium to an even higher level. There are three types
of Itanium 2 processor. The first is the Itanium 2 with up to 6MB of L3 cache for
multiprocessor (MP) and dual-processor (DP) servers and workstations. It runs at 1.50 GHz,
1.40 GHz, or 1.30 GHz and has 32 KB of Level 1 cache, 256 KB of Level 2 cache, and an
integrated Level 3 cache of 6MB, 4MB, or 3MB. The second is the 1.40 GHz Itanium 2 with 3MB of
L3 cache, optimized for DP servers and workstations. Its features are much like those of the
first, except that it has an integrated 3MB or 1.5 MB L3 cache and is optimized for
dual-processor systems. The third is the low-voltage Itanium 2, optimized for higher-density DP
servers and workstations. Its features are much like those of the other two: it runs at 1 GHz,
has an integrated 1.5 MB L3 cache, is optimized for dual-processor systems, and consumes at
most 62 watts. All three processors are based on the EPIC architecture, have enhanced machine
check architecture with extensive error correcting code, and support the HP-UX, Linux, and
Windows 2003 operating systems (Intel, 2004b).
Using the E8870 chipset, the Itanium 2 provides a peak memory bandwidth of 6.4
GB/sec. This chipset also allows up to four DDR SDRAM DIMMs per channel, for a total of up to
128 GB of memory when thirty-two 4 GB DDR SDRAM DIMMs are used. Two scalability ports provide
12.8 GB/sec of maximum bandwidth for future expansion. Intel has also balanced the system
bus, memory, and I/O, giving greater performance for the entire platform (Intel, 2003b). The
pipeline inside the Itanium 2 is also shorter than the original Itanium’s.
Overall, the Itanium 2 far exceeds the Itanium. Multiple cores with a larger cache take
performance to a new level, and power consumption is reduced. Multithreading has been enabled
on the Itanium family, which increases performance by up to 30% for multithreaded applications.
An even larger cache further reduces the chance of having to go to memory to fetch a needed
instruction. The EPIC architecture has been further optimized so that up to 6 instructions can
be performed simultaneously, instead of the 3 in the original Itanium processor (Intel,
2003a).
In conclusion, the Itanium and Itanium 2 processors meet the demands of a wide range of
enterprise workloads. Through the use of EPIC technology, the processor shifts the balance of
responsibilities between software and hardware. With its large amount of cache and high
bandwidth, the time needed to fetch data and instructions is brought to an all-time low, and
terabytes of memory can be served over the web quickly. The Itanium family also provides
support for 64 bits of addressing, full IA-32 instruction set compatibility, and scalability
across a wide range of operating systems and multiprocessor platforms. Currently Intel
dominates the microprocessor market. However, most applications still operate at the 32-bit
level, and organizations are reluctant to experiment with new technology unless they believe
the benefits outweigh the risks. Therefore, widespread use of the 64-bit architecture within
organizations will be a gradual evolution. The Itanium family of processors must prove that
it is reliable, efficient, and capable of improving performance.
Conclusion
Overall there are many similarities between the AMD64 architecture and the Intel IA-64
architecture. Both can run x86 instructions, although IA-64 needs a decoder to do so, which
slows down the processing of those instructions. Intel has also announced 64-bit extensions for
its own x86 processors, and according to ExtremeTech these resemble AMD64 so closely because
Intel reverse engineered AMD’s 64-bit architecture (Hackman, 2004). While this may be true, it
only helps to simplify future problems, because those extensions are largely compatible with
the AMD64 instruction set. Both Intel and AMD can foresee the future of 64-bit computing, and
they are still rapidly developing advancements within their processors. This document has shown
the specific architectures and features of the AMD64 and Intel Itanium processors.
Bibliography
Advanced Micro Devices (2003). “AMD64 Architecture Programmer’s Manual.” AMD64 Technology,
1, 1-200.
Dowler, Mike (2003). “Athlon 64 and AMD's 64-bit technology.” URL:
http://www.pcstats.com/articleview.cfm?articleid=1466
Hackman, Mark (2004). “Analyst: Intel Reverse Engineered AMD64.” URL:
http://www.extremetech.com/article2/0,1558,1562294,00.asp
Hodgin, Rick (2001). “Intel’s Itanium.” URL:
http://www.geek.com/procspec/features/itanium/index.htm
Intel (2003a). “EPIC Technology Moves Forward.” URL:
http://www.intel.com/business/bss/products/server/itanium2/epic_technology.pdf
Intel (2003b). “Intel E8870 Chipset.” URL:
ftp://download.intel.com/design/chipset/e8870/e8870_prodbrief.pdf
Intel (2003c). “The Intel Itanium Architecture Comes of Age.” URL:
http://www.intel.com/business/bss/products/server/itanium2/ecosystem.pdf
Intel (2004a). “Intel Itanium Processor.” URL:
http://www.intel.com/products/server/processors/server/itanium/index.htm?iid=ipp_srvr_proc+itaniumwrkstn&
Intel (2004b). “Intel Itanium2 Processor.” URL:
http://www.intel.com/products/server/processors/server/itanium2/index.htm?iid=ipp_srvr_proc+itaniumwrkstn&
Kanellos, Michael (2002a). “Itanium 2 on the way, but will it sell?” URL:
http://msncnet.com.com/2100-1001-941924.html
Kay, Russell (2004). “Quickstudy: 64-bit CPUs.” URL:
http://www.computerworld.com/hardwaretopics/hardware/story/0,10801,92083,00.html
Microsoft (2003). “Introduction to Developing Applications for the 64-bit Version of Windows.”
URL: http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dnnetserv/html/ws03-64-bitwindevover.asp
Simon, Jon (2000). “Itanium Technology Guide.” URL:
http://www.sharkyextreme.com/hardware/guides/itanium/index.shtml
Yaschenko, Andy (2003). “CPUs with 64bit Architecture: Evolution or Revolution?” URL:
http://www.xbitlabs.com/articles/cpu/display/64bit.html
GLOSSARY
AMD - (Advanced Micro Devices) The main competitor to Intel for personal computer processors.
AMD develops and produces mainstream server and PC processors.
AMD64 - AMD’s new processor architecture, which supports 64-bit technology as well as the
earlier x86 instruction set and architecture.
BCD - (Binary Coded Decimal) A binary encoding of the decimal digits 0 through 9.
Bundle - 128 bits that include three instructions and a template field.
Cache - An easily accessed memory used to store a subset of a larger pool of data that is
expensive or slow to either fetch or compute.
DDR - (Double Data Rate) Memory in which data is sent on both the rising and falling edges of
the clock signal.
ECC - (Error Correcting Code) A code in which each data signal conforms to specific rules of
construction, so that departures from this construction in the received signal can generally
be automatically detected and corrected.
EFLAGS - The 32-bit FLAGS register.
EIP - The 32-bit instruction pointer.
EPIC - (Explicitly Parallel Instruction Computing) The architecture that the Itanium family of
processors uses.
FLAGS - A 16-bit register that provides status and control bits to the running application.
GPR - (General Purpose Register) Registers for general use within the processor. They are used
to add, subtract, and perform other basic operations.
HT - (HyperTransport) Technology developed by AMD to allow high-speed data transfer from the
processor to the onboard chips.
IA-64 - Intel Corporation’s newest 64-bit architecture. It is a new architecture that supports
the x86 instruction set, but only through a decoder. Currently only server processors are
available with IA-64 technology.
Intel - The world’s largest commercial processor developer, best known for the Pentium
processors found in most PCs today.
IP - The instruction pointer, which points to the next instruction to be executed. The basic IP
is 16 bits wide.
Multithreading - Running multiple threads of execution, with a context switch taking place each
time the processor moves from one thread to another.
Predication - Use of a predicate bit to determine whether an instruction takes effect, without
disturbing program flow.
REX - The instruction prefixes through which 64-bit mode accesses the AMD64 register
extensions.
RFLAGS - The 64-bit FLAGS register.
RIP - The 64-bit instruction pointer of the AMD64.
Rotating Register - Registers that are rotated by one register position on each loop
execution, so that the content of register X is in register X+1 after one rotation.
SIMD - (Single Instruction Multiple Data) Instructions that operate on several data elements,
such as the elements of a floating-point vector, at once.
x86 - The most common architecture in modern PCs. It was first introduced in Intel’s 8086
processor.
XMM - 128-bit media registers that hold integer and floating-point vectors. They are used
mainly for intensive media instructions.