Quark SOC and Galileo Architecture (ESP * Fall 2014) - Real

advertisement

Quark SOC and Galileo Architecture

(ESP – Fall 2014)

Computer Science & Engineering

Department

Arizona State University

Tempe, AZ 85287

Dr. Yann-Hang Lee yhlee@asu.edu

(480) 727-7507

Real-time Systems Lab, Computer Science and Engineering, ASU

Current Processor Design

Moore’s law continues to hold true, transistor counts doubling every 18 months

 But can no longer rely upon increasing clock rates and instruction-level parallelism to meet computing performance demands

Semiconductor device fabrication process

 65 nm – 2006, 45 nm – 2008, 32 nm – 2010, and 22 nm – 2012

How to best exploit ever-increasing on-chip transistor counts?

 Multi- & many-core (MC) devices are new technology wave

 exploiting explicit parallelism in the new devices

Size and Power constraints

Real-time Systems Lab, Computer Science and Engineering, ASU 1

Intel Processors

 X86 32/64 architecture

 486 – first pipelined x86 design

 Pentium – the first x86 superscalar CPU

 Processors for

 Server (Xeon), desktop (Core i3/i5/i7), mobile (Core i3/i5/i7), and embedded (Atom)

 All of them support hypervisor (VM)

 Differences

CPUs, memory, and interconnection bandwidth reliability (quality of dies) and form factor power and thermal requirements

 Uses available clock cycles and power, not to push up higher clock speeds and energy needs

Real-time Systems Lab, Computer Science and Engineering, ASU 2

Galileo Board

400MHz Quark SoC

256MB DDR3

Ethernet

USB Host Port

MicroSD Support

I2C, SPI Support

PCI Express Mini Cards

Serial Connectivity

GPIO

Linux on Board

Source: http://www.intel.com/content/www/us/en/intelligent-systems/galileo/galileo-overview.html

Real-time Systems Lab, Computer Science and Engineering, ASU 3

Intel Quark SoC X1000.

 SOC –

CPU core (x86) cache, internal memory (flash, SRAM)

 IO interfaces and external buses

 interconnection or switches

 misc (clock, JTAG)

 Chip size, power and pins

 32nm process in 1 st Quark

 one-fifth the size and

 one-tenth the power of low-end Atom chip

393 solder balls on 15mm 2

5 power rails (3.3V, 1.8V,

1.5V, 1.05V, 1.0V)

Real-time Systems Lab, Computer Science and Engineering, ASU 4

Pins in Quark

 Example: High Speed UART Interface, SIU1_RDX SIU1_TXD

Signal Name Dir

SIU0_RXD I

SIU0_TXD O

Term Power Type S4/S5

20k(H) 3.3V CMOS3.3 Off

3.3V CMOS3.3 Off

Default Buffer State

S3 Reset Enter S0

Off Pull-up Pull-up

Off VOH VOH

 Six different power states

S0 – the system is completely powered ON and fully operational

S5 – the system is completely powered OFF

S1, S2, S3 and S4 – sleeping states, the system appears OFF because of low power consumption and retains enough of the hardware context to return to the working state

 In Galileo schematics

Real-time Systems Lab, Computer Science and Engineering, ASU 5

Quark Core Internal Architecture

32-bit RISC integer core

Single cycle execution

Instruction pipelining

Floating-point unit

Cache with cache consistency support (16-Kbyte for both data and instructions)

Memory management unit

Real-time Systems Lab, Computer Science and Engineering, ASU 6

486 Pipeline

Real-time Systems Lab, Computer Science and Engineering, ASU 7

IO Expander and GPIO Multiplexing

 CY8C9540A – I2C interfaced expander

 with 40 I/O data pins (ports 0-5) independently configurable as inputs, outputs, bi-directional input/outputs, or PWM outputs

 To configure a pin

 an I2C control message to the chip which includes a register address

Real-time Systems Lab, Computer Science and Engineering, ASU 8

X86 ISA Data Representations

Little-endian byte ordering in memory

Words, doublewords, and quadwords do not need to be aligned in memory on natural boundaries.

 2 memory accesses for an unaligned memory access

 aligned accesses require only one

Unsigned integer, signed (two's complement)

FP, string of bits, bytes, .. etc.

SIMD packed data

Pointer

Near

Far (logical)

Real-time Systems Lab, Computer Science and Engineering, ASU 9

Memory Model

Flat memory model – a single, continuous linear address space of 2 32 bytes

Segmented model – a logical address consisting of a segment selector and an offset

Real-address mode – for 8086,

 16 segments of 64K

Linear address space 

(paging) physical space

Real-time Systems Lab, Computer Science and Engineering, ASU 10

Modes of Operation

Protected mode (32 bits address)

 native mode (Windows, Linux), full features, separate memory

 virtual-8086 mode

Real-address mode (20 bits address)

 the programming environment of the Intel 8086 processor with extensions

 native MS-DOS

System management mode

 power management, system security, diagnostics

IA-32e (Intel 64 architecture)

 Compatibility mode – similar to 32-bit protected mode

 64-bit mode –

16 64-bit general purpose registers default address size is 64 bits and its default operand size is 32 bits.

Real-time Systems Lab, Computer Science and Engineering, ASU 11

Programmer’s model

Real-time Systems Lab, Computer Science and Engineering, ASU 12

Protected Mode Memory Management

Use segment descriptor to protect memory accesses

Each program has a descriptor table to map segments

 allow shared segments

 Memory access checks

Limit, type, privilege level checks.

Restrictions of addressable domain, procedure entry-points, and instruction set.

Logical addresses

SS

0018

ESP

0000002A

Local Descriptor Table

Linear address space unused

DRAM

DS offset

0010 000001B6

0008

IP

00002CD3

(index)

18

10

08

00

0001A000

00002A00

00001A00

00003000

LDTR register

Real-time Systems Lab, Computer Science and Engineering, ASU 13

Virtual Memory and Paging

Virtual memory

 uses disk as part of the memory, thus allowing sum of all programs can be larger than physical memory

 Only part of a program must be kept in memory, while the remaining parts are kept on disk.

The memory used by the program is divided into small units called pages (4096-byte).

 OS maintains page directory and page tables

 Page translation: CPU converts the linear address into a physical address

 Page fault: occurs when a needed page is not in memory, and the CPU interrupts the program

Virtual memory manager (VMM) – OS utility that manages the loading and unloading of pages

Real-time Systems Lab, Computer Science and Engineering, ASU 14

Page Translation

A linear address is divided into a page directory field, page table field, and page frame offset.

The CPU uses all three to calculate the physical address.

Real-time Systems Lab, Computer Science and Engineering, ASU 15

Interrupt and Exception

Interrupt

 an asynchronous event that is typically triggered by an I/O device.

Exception

 a synchronous event that is generated when the processor detects one or more predefined conditions while executing an instruction.

 three classes of exceptions: faults, traps, and aborts.

18 predefined interrupts and exceptions and 224 user defined interrupts

Access handler procedures through entries in the interrupt descriptor table (IDT)

 A call to a handler procedure is similar to a procedure call to another protection level

Real-time Systems Lab, Computer Science and Engineering, ASU 16

Interrupt and Exception

Interrupt vector references

 an interrupt gate (interrupt enable (IF) flag in the

EFLAGS register is cleared)

 a trap gate

Gate contains

 access rights information

 segment selector for the code segment of the handler procedure

 an offset into the code segment to entry point of the handler procedure

Real-time Systems Lab, Computer Science and Engineering, ASU 17

Interrupt and APIC

 Interrupt in 8086

 Two pins: NMI and INTR

 Interrupt Acknowledge Cycle to fetch the interrupt vector number from 8259

 APIC

In Pentium and P6 processors

Receives interrupts and send to core for handling

APIC bus: bi-directional data signals (APICD[1:0]) and clock (APICCLK)

Inter-processor interrupt messages for multi-processor systems static and dynamic (based on the priority of executing tasks) distribution

Real-time Systems Lab, Computer Science and Engineering, ASU 18

Interrupt Handling

IO APIC delivers interrupt message to local APIC

 Programmable vector number for each interrupt source

Implied priority based on vector number

 local APIC determines when to service the interrupt relative to the other activities of the processor

 priority = vector / 16

Locate gate from IDT

 Far call to the handler

 (SS, ESP), EFLAGS, CS, EIP, and Error code are saved in stack

Real-time Systems Lab, Computer Science and Engineering, ASU 19

Hardware Initialization and Reset

 Reset processor state

 EIP=0000FFF0H, CS=F000H(segment) and FFFF0000H (base)

 Disable paging, cache, and in real-address mode

 Execute the first instruction at physical address FFFFFFF0H .

 The EPROM containing the software initialization code or BIOS should be located at the upper memory space (including this address)

 Run in real-mode, invalidate the TLBs, set up a GDT for selector

0x08 (code) and 0x10 (data), switch to protected mode

 Start other components on motherboard (FPU, APIC, southbridge, etc.)

Real-time Systems Lab, Computer Science and Engineering, ASU 20

Typical x86 System Architecture

Host Bus (PSB)

100/133/200MHz

64-bit

Processor

AGP Bus

North

Bridge

(MCH)

System

Memory

Clock

Gen

Host Clock

PCI Clock

USB Clock

Hublink Clock

HubLink Bus

SM Bus

CNR

South

Bridge

(ICH)

LAN

Audio

USB

IDE

FWH

Parallel

SIO

PCI Bus

33 MHz 32-bit

LPC Bus

Mouse

Keybrd

Floppy

Serial

Chipset

 North Bridge

South Bridge

Firmware Hub

 Various chipsets available from Intel to meet performance requirements

FSB, DMI/Hub interface

System control hub

(SCH) –

GMCH and ICH are merged into one chip

Real-time Systems Lab, Computer Science and Engineering, ASU 21

Host Bridge in Quark

A central hub that routes transactions to and from Quark

CPU core, DRAM controller, and other functional blocks.

CPU core  PCI devices

 via MMIO and IO accesses

Real-time Systems Lab, Computer Science and Engineering, ASU 22

Download