ELC_Multiprocessor__FPGA_Linux

Building Multi-Processor FPGA Systems
Hands-on Tutorial to Using FPGAs and Linux
Chris Martin <cmartin@altera.com>
Member Technical Staff Embedded Applications
Agenda
Introduction
Problem: How to Integrate Multi-Processor Subsystems
Why…
– Why would you do this?
– Why use FPGAs?
Lab 1: Getting Started - Booting Linux and Boot-strapping NIOS
Lab 2: Inter-Processor Communication and Shared Peripherals
Lab 3: Locking and Tetris
Building Hardware: FPGA Hardware Tools & Build Flow
Building/Debugging Software: Software Tools & Build Flow
References
Q&A – Throughout
The Problem – Integrating Multi-Processor Subsystems
Given a system with multiple processor subsystems, these architecture decisions must be considered:
– Inter-processor communication
– Partitioning/sharing peripherals (locking required)
– Bandwidth & latency requirements
[Figure: Processor Subsystem 1 and Processor Subsystem 2, each connected to peripherals Periph 1, Periph 2 and Periph 3]
Why Do We Need to Integrate Multi-Processor Subsystems?
May have inherited a processor subsystem from another development team or a 3rd party
– Risk mitigation by reducing change
Fulfill latency and bandwidth requirements
– Real-time considerations
– If the main processor is not real-time enabled, can add a real-time processor subsystem
Design partition / sandboxing
– Break the system into smaller subsystems to service tasks
– Smaller tasks can be designed easily
Leverage software resources
– Sometimes a problem is resolved in less time by processor/software rather than hardware design
– Sequencers, state machines
Why do we want to integrate with FPGA?
(or rather, HOW can FPGAs help?)
Bandwidth & latency can be tailored
– Addresses real-time aspects of system solution
– FPGA logic has flexible interconnect
– Trade data width with clock frequency with latency
Experimentation
– Many processor subsystems can be implemented
– Allows you to experiment with changing microprocessor subsystem hardware designs
– Altera FPGA under the hood
– However: generic Linux interfaces are used and can be applied in any Linux system.
[Figure: Simple Multiprocessor System: ARM with Peripheral A and NIOS with Peripheral N, linked by a shared peripheral and a mailbox]
Why is Altera Involved with Embedded Linux?
[Chart: FPGA design starts (0 to 120,000), with embedded processor vs. without processor; about 50% with CPU. Source: Gartner, September 2010]
More than 50% of FPGA designs include an embedded processor, and growing.
Many embedded designs using Linux
Open-source re-use.
– Altera Linux Development Team actively contributes to Linux Kernel
SoCKit Board Architecture Overview
[Figure: SoCKit board; lab focus: UART, DDR3, LEDs, buttons]
SoC/FPGA Hardware Architecture Overview
ARM-to-FPGA bridges
– Data width configurable
FPGA
– 42K logic macros
– Using no more than 14%
[Figure: two Cortex-A9 cores (each with I$ and D$), L2, EMIF to DDR, DMA, ROM, UART, RAM, SD/MMC; AXI bridges HPS2FPGA (32/64/128), LWHPS2FPGA (32) and FPGA2HPS (32/64/128); FPGA fabric ("soft logic") with SYS ID, RAM, GPIO (32) and NIOS]
Lab 1: Getting Started
Booting Linux and Boot-strapping NIOS
Topics Covered:
– Configuring FPGA from SD/MMC and U-Boot
– Booting Linux on ARM Cortex-A9
– Configuring Device Tree
– Resetting and Booting NIOS Processor
– Building and compiling simple Linux Application
Key Example Code Provided:
– C code for downloading NIOS code and resetting NIOS from ARM
– Using U-boot to set ARM peripheral security bits
Full step-by-step instructions are included in lab manual.
Lab 1: Hardware Design Overview
NIOS Subsystem
– 1 NIOS Gen 2 processor
– 64k combined instruction/data RAM (On-Chip RAM)
– GPIO peripheral
ARM Subsystem
– 2 Cortex-A9 (only using 1)
– DDR3 External Memory
– SD/MMC Peripheral
– UART Peripheral
[Figure: Subsystem 1 (Cortex-A9 with SD/MMC, EMIF, UART) and Subsystem 2 (NIOS 0 with RAM, GPIO); legend: shared peripherals, dedicated peripherals]
Lab 1: Programmer View - Processor Address Maps
NIOS                            ARM Cortex-A9
Address Base   Peripheral       Address Base   Peripheral
0xFFC0_2000    ARM UART         0xFFC0_2000    UART
0x0003_0000    GPIO (LEDs)      0xC003_0000    GPIO (LEDs)
0x0002_0000    System ID        0xC002_0000    System ID
0x0000_0000    On-chip RAM      0xC000_0000    On-chip RAM
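In the Lab 1 maps, each FPGA-side peripheral appears on the ARM side at its NIOS address plus 0xC000_0000 (for example 0x0003_0000 becomes 0xC003_0000). A minimal sketch of that translation follows; the macro and function names are illustrative and not taken from the lab sources, and the fixed offset is assumed to hold only for the FPGA peripherals listed in this table.

```c
#include <assert.h>
#include <stdint.h>

/* Offset between the NIOS and ARM views, read off the Lab 1 address maps. */
#define ARM_FPGA_WINDOW_BASE 0xC0000000u

/* Hypothetical helper: ARM physical address of an FPGA-side peripheral,
 * given its address in the NIOS map. */
static uint32_t nios_to_arm_addr(uint32_t nios_addr)
{
    /* Only the FPGA peripherals in the table (below 0x0004_0000) are covered;
     * the ARM UART at 0xFFC0_2000 is not translated. */
    assert(nios_addr < 0x00040000u);
    return ARM_FPGA_WINDOW_BASE + nios_addr;
}
```

With this helper, the GPIO (LEDs) block at NIOS address 0x0003_0000 maps to the ARM address shown in the table.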
Lab 1: Peripheral Registers
Peripheral  Address Offset  Access  Bit Definitions
Sys ID      0x0             RO      [31:0] – System ID. Lab default = 0x00001ab1
GPIO        0x0             R/W     [31:0] – Drive GPIO output. Lab uses for LED control, push button
                                    status and NIOS processor resets (from ARM).
                                    [3:0] – LED 0-3 control. '0' = LED off, '1' = LED on
                                    [4] – NIOS 0 reset
                                    [5] – NIOS 1 reset
                                    [1:0] – Push button status
UART        0x14            RO      Line Status Register
                                    [5] – TX FIFO empty
                                    [0] – Data ready (RX FIFO not empty)
UART        0x30            R/W     Shadow Receive Buffer Register
                                    [7:0] – RX character from serial input
UART        0x34            R/W     Shadow Transmit Register
                                    [7:0] – TX character to serial output
Lab 1: Processor Resets Via Standard Linux GPIO Interface
– NIOS resets connected to GPIO
– GPIO driver uses /sys/class/gpio interface
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

#define MAX_BUF 64   /* assumed size; the slide does not show the definition */

int main(int argc, char** argv)
{
    int fd, gpio = 168;
    char buf[MAX_BUF];

    /* Export: echo ### > /sys/class/gpio/export */
    fd = open("/sys/class/gpio/export", O_WRONLY);
    sprintf(buf, "%d", gpio);
    write(fd, buf, strlen(buf));
    close(fd);

    /* Set direction to Out: */
    /* echo "out" > /sys/class/gpio/gpio###/direction */
    sprintf(buf, "/sys/class/gpio/gpio%d/direction", gpio);
    fd = open(buf, O_WRONLY);
    write(fd, "out", 3); /* write(fd, "in", 2); */
    close(fd);

    /* Set GPIO Output High or Low */
    /* echo 1 > /sys/class/gpio/gpio###/value */
    sprintf(buf, "/sys/class/gpio/gpio%d/value", gpio);
    fd = open(buf, O_WRONLY);
    write(fd, "1", 1); /* write(fd, "0", 1); */
    close(fd);

    /* Unexport: echo ### > /sys/class/gpio/unexport */
    fd = open("/sys/class/gpio/unexport", O_WRONLY);
    sprintf(buf, "%d", gpio);
    write(fd, buf, strlen(buf));
    close(fd);

    return 0;
}
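Each step of the slide code repeats the same open/write/close pattern and ignores return values. The sketch below factors that pattern into one helper with basic error checking. The name sysfs_write is hypothetical and not part of the lab code or of any Linux API.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: write a string to a sysfs-style file.
 * Returns 0 on success, -1 if the open or the write fails. */
static int sysfs_write(const char *path, const char *value)
{
    assert(path != NULL && value != NULL);

    int fd = open(path, O_WRONLY);
    if (fd < 0)
        return -1;

    size_t len = strlen(value);
    ssize_t written = write(fd, value, len);
    close(fd);

    return (written == (ssize_t)len) ? 0 : -1;
}
```

On the lab board, the export step would then read sysfs_write("/sys/class/gpio/export", "168"), followed by writes to the direction and value files. Checking the return value makes a missing export or a permission problem visible immediately.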
Lab 1: Loading External Processor Code
Via Standard Linux shared memory (mmap)
– NIOS RAM address accessed via mmap()
– Can be shared with other processes
– R/W during load
– Read-only protection after load
/* Map Physical address of NIOS RAM
   to virtual address segment
   with Read/Write Access */
fd = open("/dev/mem", O_RDWR);
load_address = mmap(NULL, 0x10000,
    PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0xc0000000);
/* Set size of code to load */
load_size = sizeof(nios_code)/sizeof(nios_code[0]);
/* Load NIOS Code */
for(i=0; i < load_size; i++)
{
    *(load_address+i) = nios_code[i];
}
/* Set load address segment to Read-Only */
mprotect(load_address, 0x10000, PROT_READ);
/* Un-map load address segment */
munmap(load_address, 0x10000);
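The slide snippet omits declarations and error handling. The following is a minimal self-contained sketch of the same map, copy, protect and unmap sequence. The function name load_words and its parameters are assumptions made for illustration; the path is a parameter so the routine can be tried on an ordinary file before pointing it at /dev/mem.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: map `span` bytes of `path` at `offset`, copy `count`
 * 32-bit words into the window, make it read-only, then unmap it.
 * Returns 0 on success, -1 on any failure. */
static int load_words(const char *path, off_t offset, size_t span,
                      const uint32_t *code, size_t count)
{
    assert(path != NULL && code != NULL);
    if (count > span / sizeof(uint32_t))
        return -1;                    /* image does not fit in the window */

    int fd = open(path, O_RDWR | O_SYNC);
    if (fd < 0)
        return -1;

    void *map = mmap(NULL, span, PROT_READ | PROT_WRITE, MAP_SHARED, fd, offset);
    close(fd);                        /* the mapping stays valid after close */
    if (map == MAP_FAILED)
        return -1;

    volatile uint32_t *dst = map;     /* volatile: every store must be issued */
    for (size_t i = 0; i < count; i++)
        dst[i] = code[i];

    int rc = mprotect(map, span, PROT_READ);  /* read-only after load */
    if (munmap(map, span) != 0)
        rc = -1;
    return rc;
}
```

For the lab hardware the call would correspond to load_words("/dev/mem", 0xc0000000, 0x10000, nios_code, load_size), using the address and size from the slide. Note that on a 32-bit build the offset may need 64-bit file offsets enabled to be representable.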
Lab 2: Mailboxes
NIOS/ARM Communication
Topics Covered:
– Altera Mailbox Hardware IP
Key Example Code Provided:
– C code for sending/receiving messages via hardware Mailbox IP
NIOS & ARM C Code
– Simple message protocol
– Simple Command parser
Full step-by-step instructions are included in lab manual.
– User to add second NIOS processor mailbox control.
Lab 2: Hardware Design Overview
NIOS 0 & 1 Subsystems
– NIOS Gen 2 processor
– 64k combined instruction/data RAM
– GPIO (4 out, LED)
– GPIO (2 in, Buttons)
– Mailbox
ARM Subsystem
– 2 Cortex-A9 (only using 1)
– DDR3 External Memory
– SD/MMC Peripheral
– UART Peripheral
[Figure: Subsystem 1 (Cortex-A9 with SD/MMC, EMIF, UART, GPIO), Subsystem 2 (NIOS 0 with RAM, GPIO, MBox) and Subsystem 3 (NIOS 1 with RAM, GPIO, MBox); legend: shared peripherals, dedicated peripherals]
Lab 2: Programmer View - Processor Address Maps
NIOS 0 & 1                             ARM Cortex-A9
Address Base   Peripheral              Address Base   Peripheral
0xFFC0_2000    ARM UART                0xFFC0_2000    UART
0x0007_8000    Mailbox (from ARM)      0x0007_8000    Mailbox (to NIOS 1)
0x0007_0000    Mailbox (to ARM)        0x0007_0000    Mailbox (from NIOS 1)
0x0005_0000    GPIO (In Buttons)       0x0006_8000    Mailbox (to NIOS 0)
0x0003_0000    GPIO (Out LEDs)         0x0006_0000    Mailbox (from NIOS 0)
0x0002_0000    System ID               0xC003_0000    GPIO (LEDs)
0x0000_0000    On-chip RAM             0xC002_0000    System ID
                                       0xC001_0000    NIOS 1 RAM
                                       0xC000_0000    NIOS 0 RAM
Lab 2: Additional Peripheral (Mailbox) Registers
Peripheral  Address Offset  Access  Bit Definitions
Mailbox     0x0             R/W     [31:0] – RX/TX Data
Mailbox     0x8             R/W     [1] – RX Message Queue Has Data
                                    [0] – TX Message Queue Empty
Lab 2: Designing a Simple Message Protocol
Design Decisions:
– Short length: a single 32-bit word
– Human readable
– Message transactions are closed-loop. Includes ACK/NACK
Format:
– Message length: four bytes
– First byte is an ASCII character denoting message type.
– Second byte is an ASCII char from 0-9 denoting processor number.
– Third byte is an ASCII char from 0-9 denoting message data, except for ACK/NACK.
– Fourth byte is always the null character '\0' to terminate the string (human readable).
Byte 0   Byte 1   Byte 2   Byte 3
'L'      '0'      '0'      '\0'
'A'      '0'      '0'      '\0'
Message Types:
– "G00": Give Access to UART (Push)
– "A1A": ACK
– "N1A": NACK
Can be Extended:
– "L00": LED Set/Ready
– "B00": Button Pressed
– "R00": Request UART Access (Pull)
[Figure: Cortex-A9 sends "G00" to NIOS 0; NIOS 0 replies "A0A" or "N0N"]
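The four-byte format above fits exactly into one 32-bit mailbox word. A minimal sketch of packing and unpacking such a message follows. The function names msg_pack and msg_unpack are illustrative assumptions, not names from the lab sources; memcpy is used so that the byte order in memory matches the string form on either processor.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Build a 4-byte message: type, processor digit, data character, '\0'.
 * The bytes are copied into a 32-bit word, so byte 0 of the word in memory
 * is the message type regardless of how the word is later printed. */
static uint32_t msg_pack(char type, int cpu, char data)
{
    assert(cpu >= 0 && cpu <= 9);
    const char bytes[4] = { type, (char)('0' + cpu), data, '\0' };
    uint32_t word;
    memcpy(&word, bytes, sizeof word);
    return word;
}

/* Recover the string form of a received word; out must hold 4 bytes. */
static void msg_unpack(uint32_t word, char out[4])
{
    memcpy(out, &word, 4);
    out[3] = '\0';   /* enforce termination even if the sender did not */
}
```

For example, msg_pack('G', 0, '0') produces the word for "G00", and msg_pack('A', 1, 'A') the word for the "A1A" acknowledgement.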
Lab 2: Inter-Processor Communication with Mailbox HW
Via Standard Linux Shared Memory (mmap)
– Wait for Mailbox Hardware message empty flag
– Send message (4 bytes)
– Disable ARM/Linux Access to UART
– Wait for RX message received flag
– Re-enable ARM/Linux UART Access
/* Map Physical address of Mailbox
   to virtual address segment with Read/Write Access */
fd = open("/dev/mem", O_RDWR);
mbox0_address = mmap(NULL, 0x10000,
    PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0xff260000);
<snip>
/* Waiting for Message Queue to empty */
while((*(volatile int*)(mbox0_address+0x2000+2) & 1) != 0 ) {}
/* Send Granted/Go message to NIOS */
send_message = "G00";
*(mbox0_address+0x2000) = *(int *)send_message;
/* Disable ARM/Linux Access to UART (be careful here) */
config.c_cflag &= ~CREAD;
if(tcsetattr(fd, TCSAFLUSH, &config) < 0) { }
/* Wait for Received Message */
while((*(volatile int*)(mbox0_address+2) & 2) == 0 ) {}
/* Re-enable UART Access */
config.c_cflag |= CREAD;
tcsetattr(fd, TCSAFLUSH, &config);
/* Read Received Message */
printf(" - Message Received. DATA = '%s'.\n",
    (char*)(mbox0_address));
Lab 3: Putting It All Together – Tetris!
Combining Locking and Communication
Topics Covered:
– Linux Mutex
Key Example Code Provided:
– C code showcasing using Mutexes for locking shared peripheral access
– C code for multiple processor subsystem bringup and shutdown
Full step-by-step instructions are included in lab manual.
– User to add code for second NIOS processor bringup, shutdown and
locking/control.
Lab 3: Hardware Design Overview (Same As Lab 2)
NIOS 0 & 1 Subsystems
– NIOS Gen 2 processor
– 64k combined instruction/data RAM
– GPIO (4 out, LED)
– GPIO (2 in, Buttons)
– Mailbox
ARM Subsystem
– 2 Cortex-A9 (only using 1)
– DDR3 External Memory
– SD/MMC Peripheral
– UART Peripheral
[Figure: Subsystem 1 (Cortex-A9 with SD/MMC, EMIF, UART, GPIO), Subsystem 2 (NIOS 0 with RAM, GPIO, MBox) and Subsystem 3 (NIOS 1 with RAM, GPIO, MBox); legend: shared peripherals, dedicated peripherals]
Lab 3: Programmer View - Processor Address Maps
NIOS 0 & 1                             ARM Cortex-A9
Address Base   Peripheral              Address Base   Peripheral
0xFFC0_2000    ARM UART                0xFFC0_2000    UART
0x0007_8000    Mailbox (from ARM)      0x0007_8000    Mailbox (to NIOS 1)
0x0007_0000    Mailbox (to ARM)        0x0007_0000    Mailbox (from NIOS 1)
0x0005_0000    GPIO (In Buttons)       0x0006_8000    Mailbox (to NIOS 0)
0x0003_0000    GPIO (Out LEDs)         0x0006_0000    Mailbox (from NIOS 0)
0x0002_0000    System ID               0xC003_0000    GPIO (LEDs)
0x0000_0000    On-chip RAM             0xC002_0000    System ID
                                       0xC001_0000    NIOS 1 RAM
                                       0xC000_0000    NIOS 0 RAM
Available Linux Locking/Synchronization Mechanisms
Need to share peripherals
– Choose a Locking Mechanism
Available in Linux
– Mutex <- Chosen for this Lab
– Completions
– Spinlocks
– Semaphores
– Read-copy-update (decent for multiple readers, single writer)
– Seqlocks (decent for multiple readers, single writer)
Available for Linux
– MCAPI - openmcapi.org
Tetris Message Protocol – Extended from Lab 2
NIOS Control Flow:
– Wait for button press
– Send button press message
– Wait for ACK (free to write to LED GPIO)
– Write to LED GPIO
– Send LED ready msg
– Wait for ACK
ARM Control Flow:
– Wait for button press message
– Lock LED GPIO peripheral
– Send ACK (free to write to LED GPIO)
– Wait for LED ready msg
– Send ACK
– Read LED value
– Release lock/mutex
[Figure: message sequence with the Cortex-A9: NIOS 0 exchanges "B00", "A0A", "L00", "A0A"; NIOS 1 exchanges "B10", "A1A", "L10", "A1A"]
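The ARM control flow above is a two-state sequence per NIOS: idle until a button message arrives, then holding the LED lock until the LED-ready message arrives. A minimal sketch of that sequence as a message handler follows. The names are hypothetical, and replying with a NACK to an unexpected message is an assumption of this sketch rather than something the slides specify.

```c
#include <assert.h>
#include <stddef.h>

enum arm_state { ARM_IDLE, ARM_LED_LOCKED };

/* Hypothetical ARM-side handler for one NIOS: given the current state and a
 * received 3-character message, update the state and return the reply type
 * ('A' for ACK, 'N' for NACK). Taking and releasing the real mutex is left
 * to the caller, keyed on the state change. */
static char arm_handle(enum arm_state *state, const char *msg)
{
    assert(state != NULL && msg != NULL);

    if (*state == ARM_IDLE && msg[0] == 'B') {
        *state = ARM_LED_LOCKED;   /* caller locks the LED GPIO here */
        return 'A';
    }
    if (*state == ARM_LED_LOCKED && msg[0] == 'L') {
        *state = ARM_IDLE;         /* caller reads the LEDs, then unlocks */
        return 'A';
    }
    return 'N';                    /* message not expected in this state */
}
```

Keeping the protocol decision separate from the locking call makes the sequence easy to check without hardware; one handler instance per NIOS thread would be used in practice.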
Lab 3: Locking Hardware Peripheral Access
Via Linux Mutex
– In this example, LED GPIO is accessed by multiple processors
– Wrap LED critical section (LED status reads) with pthread_mutex_lock() and pthread_mutex_unlock()
– Also need Mutex init/destroy: pthread_mutex_init() and pthread_mutex_destroy()
pthread_mutex_t lock;
<snip – Initialize/create/start>
/* Initialize Mutex */
err = pthread_mutex_init(&lock, NULL);
/* Create 2 Threads (loop bound matches the two pthread_join calls below) */
i=0;
while(i < 2)
{
    err = pthread_create(&(tid[i]), NULL,
        &nios_buttons_get, &(nios_num[i]));
    i++;
}
<snip – Critical Section>
pthread_mutex_lock(&lock);
/* Critical Section */
pthread_mutex_unlock(&lock);
<snip Stop/Destroy>
/* Wait for threads to complete */
pthread_join(tid[0], NULL);
pthread_join(tid[1], NULL);
/* Destroy/remove lock */
pthread_mutex_destroy(&lock);
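The slide elides the thread body and setup. The following is a minimal complete sketch of the same init, create, lock, join and destroy sequence, with a shared counter standing in for the LED GPIO. All names here (led_lock, led_accesses, worker, run_workers) are illustrative and not from the lab sources.

```c
#include <assert.h>
#include <pthread.h>

#define NUM_THREADS 2
#define ITERATIONS  10000

static pthread_mutex_t led_lock;
static long led_accesses;          /* stands in for the shared LED GPIO */

static void *worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < ITERATIONS; i++) {
        pthread_mutex_lock(&led_lock);
        led_accesses++;            /* critical section */
        pthread_mutex_unlock(&led_lock);
    }
    return NULL;
}

/* Runs NUM_THREADS workers to completion.
 * Returns the final counter value, or -1 if the mutex cannot be created. */
static long run_workers(void)
{
    pthread_t tid[NUM_THREADS];

    led_accesses = 0;
    if (pthread_mutex_init(&led_lock, NULL) != 0)
        return -1;

    for (int i = 0; i < NUM_THREADS; i++) {
        int err = pthread_create(&tid[i], NULL, worker, NULL);
        assert(err == 0);
        (void)err;
    }
    for (int i = 0; i < NUM_THREADS; i++)
        pthread_join(tid[i], NULL);

    pthread_mutex_destroy(&led_lock);
    return led_accesses;
}
```

With the lock in place every increment is counted; removing the lock/unlock pair would allow lost updates, which is the same hazard the lab avoids on the LED GPIO. On older toolchains the program may need to be linked with -pthread.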
References
Altera References
System Design Tutorials:
– http://www.alterawiki.com/wiki/Designing_with_AXI_for_Altera_SoC_ARM_Devices_Workshop_Lab__Creating_Your_AXI3_Component
– Designing_with_AXI_for_Altera_SoC_ARM_Devices_Workshop_Lab
– Simple_HPS_to_FPGA_Comunication_for_Altera_SoC_ARM_Devices_Workshop
– http://www.alterawiki.com/wiki/Simple_HPS_to_FPGA_Comunication_for_Altera_SoC_ARM_Devices_Workshop_-_LAB2
Multiprocessor NIOS-only Tutorial:
– http://www.altera.com/literature/tt/tt_nios2_multiprocessor_tutorial.pdf
Quartus Handbook:
– https://www.altera.com/en_US/pdfs/literature/hb/qts/quartusii_handbook.pdf
Qsys:
– System Design with Qsys (PDF) section in the Handbook
– Qsys Tutorial: Step-by-step procedures and design example files to create and verify a system in Qsys
– Qsys 2-day instructor-led class: System Integration with Qsys
– Qsys webcasts and demonstration videos
SoC Embedded Design Suite User Guide:
– https://www.altera.com/en_US/pdfs/literature/ug/ug_soc_eds.pdf
Related Articles
Performance Analysis of Inter-Processor Communication Methods
– http://www.design-reuse.com/articles/24254/inter-processor-communicationmulti-core-processors-reconfigurable-device.html
Communicating Efficiently between QorIQ Cores in Medical Applications
– https://cache.freescale.com/files/32bit/doc/brochure/PWRARBYNDBITSCE.pdf
Linux Inter-Process Communication:
– http://www.tldp.org/LDP/tlk/ipc/ipc.html
Linux locking mechanisms (from ARM):
– http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dai0425/ch04s07s03.html
OpenMCAPI:
– https://bitbucket.org/hollisb/openmcapi/wiki/Home
Mutex Examples:
– http://www.thegeekstuff.com/2012/05/c-mutex-examples/
Thank You

Full Tutorial Resources Online
– Project Wiki Page: http://rocketboards.org/foswiki/Projects/BuildingMultiProcessorSystems
Includes:
– Source code
– Hardware source
– Hardware Quartus Projects
– Software Eclipse Projects
BACKUP SLIDES
Post-Lab 1 Additional Topics
Hardware Design Flow and FPGA Boot with U-boot and SD/MMC
Building Hardware:
Qsys (Hardware System Design Tool) User Interface
[Figure: Qsys screenshot; callouts: interfaces exported in/out of system, connections between cores]
Hardware and Software Work Flow Overview
[Figure: work flow linking Quartus & Qsys, Preloader & U-Boot, Device Tree, RBF, Eclipse, and DS-5 & debug tools]
Inputs:
– Hardware Design (Qsys or RTL or both)
Outputs (to load on boot media):
– Preloader and U-Boot images
– FPGA programming file: Raw Binary File (RBF)
– Device Tree Blob
SD Card Layout
Partition 1: FAT
– U-Boot scripts
– FPGA HW Designs (RBF)
– Device Tree Blobs
– zImage
– Lab material
Partition 2: EXT3 – Rootfs
Partition 3: Raw
– U-Boot/preloader
Partition 4: EXT3 – Kernel src
Updating SD Cards
File: zImage, soc_system.rbf, soc_system.dtb, u-boot.scr
Update procedure: mount DOS SD card partition 1 and replace the file with the new one:
$ sudo mkdir sdcard
$ sudo mount /dev/sdx1 sdcard/
$ sudo cp <file_name> sdcard/
$ sudo umount sdcard
File: preloader-mkpimage.bin
$ sudo dd if=preloader-mkpimage.bin of=/dev/sdx3 bs=64k seek=0
File: u-boot-socfpga_cyclone5.img
$ sudo dd if=u-boot-socfpga_cyclone5.img of=/dev/sdx3 bs=64k seek=4
File: root filesystem
$ sudo dd if=altera-gsrd-imagesocfpga_cyclone5.ext3 of=/dev/sdx2
More info found on Rocketboards.org
– http://www.rocketboards.org/foswiki/Documentation/GSRD141SdCard
Automated Python script to build SD cards:
– make_sdimage.py
Post-Lab 2 Additional Topic
Using Eclipse to Debug: NIOS Software Build Tools
Altera NIOS Software Design and Debug Tools
Nios II SBT for Eclipse key
features:
– New project wizards and
software templates
– Compiler for C and C++
(GNU)
– Source navigator, editor, and
debugger
– Eclipse project-based tools
– Download code to hardware
Key Multi-Processor System Design Points
Startup/Shutdown
– Processor
– Peripheral
– Covered in Lab 1
Communication between processors
– What is the physical link?
– What is the protocol & messaging method?
– Message bandwidth & latency
– Covered in Lab 2
Partitioning peripherals
– Declare dedicated peripherals: only connected/controlled by one processor
– Declare shared peripherals: connected/controlled by multiple processors
– Decide upon locking mechanism
– Covered in Lab 3
Post Lab 3 Additional Topic
Altera SoC Embedded Design Suite
Altera Software Development Tools
Eclipse
– For ARM Cortex-A9 (ARM Development Studio 5 – Altera Edition)
– For NIOS
Pre-loader/U-Boot Generator
Device Tree Generator
Bare-metal Libraries
Compilers
– GCC (for ARM and NIOS)
– ARMCC (for ARM with license)
Linux Specific
– Kernel Sources
– Yocto & Angstrom recipes:
http://rocketboards.org/foswiki/Documentation/AngstromOnSoCFPGA_1
– Buildroot:
http://rocketboards.org/foswiki/Documentation/BuildrootForSoCFPGA
System Development Flow
Stages in both flows: Design, Simulate, Debug, Release
FPGA Design Flow (Hardware Development)
Design
• Quartus II design software
• Qsys system integration tool
• Standard RTL flow
• Altera and partner IP
Simulate
• ModelSim, VCS, NCSim, etc.
• AMBA-AXI and Avalon bus functional models (BFMs)
Debug / Release
• SignalTap™ II logic analyzer
• System Console
• Quartus II Programmer
• In-system Update
Software Design Flow (Software Development)
Design
• Eclipse
• GNU toolchain
• OS/BSP: Linux, VxWorks
• Hardware Libraries
• Design Examples
Debug / Release
• GDB, Lauterbach, Eclipse
• Flash Programmer
Inside the Golden System Reference Design
Complete system example design
with Linux software support
Target Boards:
– Altera SoC Development Kits
– Arrow SoC Development Kits
– Macnica SoC Development Kits
Hardware Design:
– Simple custom logic design in
FPGA
– All source code and Quartus II /
Qsys design files for reference
Software Design:
– Includes Linux Kernel and
Application Source code
– Includes all compiled binaries
Topics – Back Up
Introductions: Altera and SoC FPGAs
Development Tools
– How to Build Hardware: FPGA Hardware Tools & Build Flow
– How to Build Software: Software Tools & Build Flow
– How to Debug all-of-the-above: Debug Tools
Key Multi-processor System Design Points
Hardware design
– Shared peripherals
– Available Hardware IP
Software design
– Message Protocols
– Linux tools/mechanism available today
Quartus – Hardware Development Tool
Quartus II User Interface
Quartus II main window provides a high level of visibility to each stage of the design flow
– Project navigator provides direct visual access to most of the key project information
– Tasks window allows you to use the tools and features of the Quartus II software and monitor their progress from a flow-based layout
– Tool View window shows various tools and design files
– Messages window outputs messages from each process of the run
[Figure: Quartus II screenshot; callouts: Project Navigator, Tool View window, Tasks window, Messages window]
Typical Hardware Design Flow
Design entry/RTL coding and early pin planning
• Behavioral or structural description of design
• Early pin planning allows board development in parallel
Functional verification
• Verify design behavior
Synthesis (mapping)
• Translate design into device-specific primitives
• Optimization to meet required area and performance constraints
Placement and routing (fitting)
• Place design in specific device resources with reference to area and performance constraints
• Connect resources with routing lines
Timing analysis
• Verify performance specifications were met
• Static timing analysis
Functional verification
• Verify design will work in target technology
PC board simulation and test
• Simulate board design
• Program and test device on board
In-system debug
• On-chip tools for debugging
[Figure: flow diagram; side labels: project definition, project creation, design creation, functional verification, design compilation; device resources: logic, memory, I/O]
Quartus II Feature Overview
Fully integrated development tool
– Multiple design entry methods
  Includes intellectual property (IP) based system design
– Up-front I/O assignment and validation
  Enables printed circuit board (PCB) layout early in the design process
– Incremental compilation
  Reduces design compilation time and improves timing closure
– Logic synthesis
  Includes comprehensive integrated synthesis solution
  Advanced integration with third-party EDA synthesis software
– Timing-driven placement and routing
– Physical synthesis
  Improves performance without user intervention
– Verification solution
  TimeQuest timing analyzer
  PowerPlay power analysis and optimization
  Functional simulation
– On-chip debug and verification suite
[Figure: design flow stages: project definition, project creation, design creation, functional verification, design compilation, in-system debug; device resources: memory, logic, I/O]
Quartus II Feature Overview (1/2)
Feature: Quartus II Software
Project creation:
– New project wizard
Design entry:
– HDL editor
– Schematic editor
– State machine editor
– MegaWizard™ Plug-In Manager (customization and generation of IP)
– Qsys system integration tool
Design constraint assignments:
– Assignment editor
– Pin planner
– Synopsys Design Constraint (SDC) editor
Synthesis:
– Quartus II Integrated Synthesis (QIS)
– Third-party EDA synthesis
– Design assistant
Fitting and placing design into FPGA to meet user requirements:
– Fitter (including physical synthesis)
Design analysis and debug:
– Netlist viewers
– Advisors
Power analysis:
– PowerPlay power analyzer
Quartus II Feature Overview (2/2)
Feature: Quartus II Software
Static timing analysis on post-fitted design:
– TimeQuest timing analyzer
Viewing and editing design placement:
– Chip Planner
Functional verification:
– ModelSim®-Altera edition
– Third-party EDA simulation tools
Generation of device programming file:
– Assembler
On-chip debug and verification:
– SignalTap™ II (embedded logic analyzer)
– In-system memory content editor
– Logic analyzer interface editor
– In-system sources and probes editor
– SignalProbe pins
– Transceiver Toolkit
– External memory interface toolkit
Technique to optimize design and improve productivity:
– Quartus II incremental compilation
– Physical synthesis optimization
– Design Space Explorer (DSE)
Quartus II Subscription Edition vs. Web Edition
Devices supported
– Subscription Edition: MAX series devices: All (excluding MAX7000 / 3000); Cyclone III/IV/V FPGAs: All; Arria II/V FPGAs: All; Stratix III, IV, V FPGAs: All; Cyclone V SoCs: All
– Web Edition: MAX series devices: All (excluding MAX7000 / 3000); Cyclone V FPGAs: All (excluding 5CEA9, 5CGXC9, and 5CGTD9); Cyclone III/IV FPGAs: All; Arria II GX FPGA: EP2AGX45; Cyclone V SoCs: All
Software features: incremental compilation and team-based design, SSN Analyzer, Transceiver Toolkit
– Subscription Edition: Yes
– Web Edition: No
SignalTap II, SignalProbe
– Subscription Edition: Yes
– Web Edition: If TalkBack feature is enabled
Multi-processor support
– Subscription Edition: Yes
– Web Edition: If TalkBack feature is enabled
IP Base Suite MegaCore® functions
– Subscription Edition: Yes
– Web Edition: No license required for OpenCore Plus hardware evaluation; license fee required for production use
Platform support
– Subscription Edition: Windows 32/64-bit, Linux 32/64-bit
– Web Edition: Windows 32/64-bit, Linux 32/64-bit
License and maintenance terms
– Subscription Edition: Perpetual (continues to work after expiration)
– Web Edition: No license required except for IP core
Price
– Subscription Edition: $
– Web Edition: Free
How to Get Started Using Quartus II Software
Download Quartus II software today and start designing with
Altera programmable logic devices
Quartus II Handbook - http://www.altera.com/literature/lit-qts.jsp
– Guides you through the programmable logic design cycle from design to
verification
– Also covers third-party EDA vendor tool interfaces
Online demonstrations - http://www.altera.com/quartusdemos
– Easiest way to learn about the latest Quartus II software features and
design flows
Training classes - https://mysupport.altera.com/etraining
– Offers online training classes and live presentation coupled with hands-on
exercises to learn about Quartus II features and design flows
Qsys – System Integration Platform
Qsys System Integration Platform
[Figure: Qsys capabilities: high-performance interconnect based on Network-on-a-Chip (NoC) architecture; design reuse (design system, package as IP, add to library); hierarchy; automated testbench generation; industry-standard interfaces (Avalon® interfaces; AMBA® AXI, APB, AHB); real-time system debug]
Qsys is Altera’s design environment for
– Deployment of IP, with hierarchical support
– Development platform for Altera custom solutions
– Design platform for customers to quickly create system designs
Qsys User Interface
[Figure: Qsys screenshot; callouts: interfaces exported for hierarchy, toolbar, improved validation display]
Qsys Benefits
Raises the level of design abstraction
– System-level design and system visualization
Simplifies complex hierarchical system development
– Automated interconnect generation
Provides a standard platform
– IP integration, custom IP authoring, IP verification
Enables design re-use
Reduces time to market
– System-level design reduces development time
– Facilitates verification
Qsys improves productivity
Network-on-Chip Architecture
Transaction Layer (master side)
– Converts transactions to command packets and response packets to responses
Transport Layer
– Transfers packets to destination
Transaction Layer (slave side)
– Converts command packets to transactions and responses to response packets
[Figure: master interface (Avalon-MM or AXI-MM) to master network interface, Avalon-ST network (command), slave network interface, slave interface (Avalon-MM or AXI-MM); responses return over a separate Avalon-ST network (response)]
Benefits of Network-On-Chip Approach
See white paper: Applying the Benefits of NoC Architecture to
FPGA System Design
Independent implementation of transaction/transport layers
– Different transport layer network topologies can be implemented without
transaction layer modification
e.g. High performance components on a wide high-frequency crossbar network
Supports standard interface interoperability
– Mix and match interface types on transaction layer without transport layer
modification
Scalability
– Segment network into sub-networks using
Bridges
Clock crossing logic
Industry-Standard Interfaces
Standard interface protocols (by developer):
– Avalon® interfaces
– AMBA® AXI, AMBA APB, and AMBA AHB
Qsys supports mixing of different interfaces
Target Qsys Applications
Qsys can be used in almost every FPGA design
Designs fall into two categories
– Control plane
Memory mapped
Reading and writing to control and status registers
– Data plane
Streaming
Data switching (muxing, demuxing), aggregation, bridges
“Packets………I care about Latency!”
Qsys packet format is wide
– Packet format contains a complete transaction in a single clock cycle
– Supports:
Writes with 0 cycles of latency
Reads with a round-trip latency of 1 cycle
– You can control latency via Qsys configuration
Separate command and response network
– Increases concurrency
Command traffic and Response traffic don’t compete for resources
Qsys: Wide Range of Compliant IP
Wide range of plug-and-play intellectual
property (IP):
– Interface protocol IP
E.g. PCIe, Ethernet 10/100/1000 Mbps (TripleSpeed Ethernet), Interlaken, JTAG, UART,
SPI
– External memory interface IP
E.g. DDR/DDR2/DDR3
– Video and imaging processing (VIP) IP
E.g. VIP Suite including scaler, switch,
deinterlacer, and alpha blending mixer
– Embedded processor IP
E.g. Hardened ARM processor system, Nios II
processor
– Verification IP
E.g. Avalon-MM/-ST, AXI4, APB
>100 Qsys compliant IP available
Qsys as a Platform for System Integration
Library of available IP
– Interface protocols
– Memory
– DSP
– Embedded
– Bridges
– PLL
– Custom systems
Connect IP and systems
Accelerate development
Simplify integration
Automate error-prone integration tasks
[Figure: IP 1, IP 2, IP 3, Custom 1 and Custom 2 blocks connected in Qsys, producing HDL]
Additional Resources
Watch online demos (3-5 min)
www.altera.com/qsys
Complete the Qsys tutorial (2-3 hrs)
www.altera.com/qsys
Watch free webcasts (10-15 mins)
www.altera.com/qsys
Sign up for Qsys training
www.altera.com/training
In-system Verification
Debug Challenges
Accessing and viewing internal signals
Not enough pins to use as test points
Capabilities in creating trigger conditions that correctly
capture data
Verification of standard or proprietary protocol interfaces
Overall design process bottleneck
Debug Can Be Costly
On-chip Debug
Access and view internal signals
Store captured data in FPGA embedded memory
Use JTAG interface as debug ports
Incrementally add internal signals to view
Reduce
Debug Cycles by
Using On-chip Debug Tools
On-chip Debug Technology
Debug tools communicate with the FPGA via standard JTAG interface
Multiple debug functions can share the JTAG interface simultaneously
– Altera’s system-level debugging (SLD) hub technology makes this possible
– All Altera tools and some third-party tools support the SLD hub JTAG interface
[Figure: download cable to JTAG TAP controller to SLD hub inside the FPGA; the hub connects to Node 1, Node 2, ..., Node N-1, Node N in the user's design (core logic)]
On-chip Debug Tools in Quartus II Software
SignalTap II logic analyzer
– Captures and displays hardware events, fast turnaround times
– Incrementally creates trigger conditions and adds signals to view
– Uses captured data stored in on-chip RAM and JTAG interface for communication
In-system memory content editor
– Displays content of on-chip memory
– Enables modification of memory content in a running system
External logic analyzer interface
– Uses external logic analyzer to view internal signals
– Dynamically switches internal signals to output
In-system sources and probes
– Stimulate and monitor internal signals without using on-chip RAM
Exception: SignalProbe incremental routing feature does not use JTAG
interface (i.e. SLD hub technology)
– Quickly routes an internal node to a pin for observation
SignalTap II Logic Analyzer
Provides the most advanced triggering capabilities available in an
FPGA-embedded logic analyzer
Proven to be invaluable in the lab
– Captures bugs that would take weeks of simulation to uncover
Has broad customer adoption
Features and benefits
– An embedded logic analyzer
Uses available internal memory
– Probes state of internal signals without using external equipment or
extra I/O pins
– Incremental compilation support
Fast turnaround time when adding signals to view
– Advanced triggering for capturing difficult events/transactions
– Power-up trigger support
Debug the initialization code
– Megafunction support
Optionally, instantiate in HDL
In-system Memory Content Editor
Enables FPGA memory content and design constants to be updated in-system, via JTAG interface, without recompiling a design or reconfiguring the rest of the FPGA
– Fault injection into system
– Update memory while system is running
– Change value of coefficients in DSP applications
– Easily perform “what if?” type experiments in-system in just seconds
Supports MIF and HEX formats for data interchange
Megafunctions supported
– LPM_CONSTANT, LPM_ROM, LPM_RAM_DQ, ALTSYNCRAM (ROM and single-port
RAM mode)
Enable memory
content editor
In-system Memory Content Editor
Under Tools menu > In-system Memory Content Editor
Altera SoC Embedded Design Suite
Included in SoC Embedded Design Suite (EDS)
Development Studio 5 Altera Edition
– Awesome debugger, especially when combined with
USB Blaster II
Altera SoC FPGA System Trace Macrocells
– Application development environment
– Streamline system analyzer
Hardware Libraries
GNU-based bare-metal (EABI) compiler tools
U-Boot
Root file system to jump start software development
Pre-built Linux kernel
– http://www.rocketboards.org for source trees and community access
System Development Flow
FPGA Design Flow (Hardware Development)
– Design: Quartus II design software; Qsys system integration tool; Standard RTL flow; Altera and partner IP
– Simulate: ModelSim, VCS, NCSim, etc.; AMBA-AXI and Avalon bus functional models (BFMs)
– Debug: SignalTap™ II logic analyzer; System Console
– Release: Quartus II Programmer; In-system Update
Software Design Flow (Software Development)
– Design: ARM Development Studio 5; GNU toolchain; OS/BSP: Linux, VxWorks; Hardware Libraries; Design Examples
– Simulate
– Debug: GNU, Lauterbach, DS5
– Release: Flash Programmer
Altera SoC Embedded Design Suite
FPGA Design Flow (Hardware Development)
– Design: Quartus II design software; Qsys system integration tool; Standard RTL flow; Altera and partner IP
– Simulate: ModelSim, VCS, NCSim, etc.; AMBA-AXI and Avalon bus functional models (BFMs)
– Debug: SignalTap™ II logic analyzer; System Console
– Release: Quartus II Programmer; In-system Update
HW/SW Handoff
Software Design Flow (Software Development)
– Design: ARM Development Studio 5; GNU toolchain; OS/BSP: Linux, VxWorks; Hardware Libraries; Design Examples
– Simulate: Virtual Target
– Debug: GNU, Lauterbach, DS5
– Release: Flash Programmer
FPGA-Adaptive Debugging
Altera SoC Embedded Design Suite
Comprehensive Suite SW Dev Tools
Hardware / software handoff tools
Linux application development
– Yocto Linux build environment
– Pre-built binaries for Linux / U-Boot
– Work in conjunction with the Community Portal
Bare-metal application development
– SoC Hardware Libraries
– Bare-metal compiler tools
FPGA-adaptive debugging
– ARM DS-5 Altera Edition Toolkit
Design examples
Editions: Free Web Edition, Subscription Edition, Free 30-day Eval
[Diagram: Hardware-to-Software Handoff, Firmware Development, Linux Application Development, FPGA-Adaptive Debugging]
Hardware-to-Software Handoff
Hardware
– Handoff contents: Qsys system info, SDRAM calibration files, ID / timestamp, HPS IOCSR data
– system.iswinfo → Preloader Generator → .c & .h source files
– system.sopcinfo → Device Tree Generator → Linux Device Tree
Software
Hardware / Software Handoff Tools
Allow hardware and software teams to work
independently and follow their familiar design flows
Take Altera Quartus® II / Qsys output files and
generate handoff files for the software design flow
Device Tree standard specifies hardware connectivity
so that Linux kernel can boot up correctly
Linux Application Development
Yocto build support for Linux
– Yocto standard enables open, versatile, and
cost-effective embedded software development
– Allows a smooth transition to commercial Linux distributions
Pre-built Linux kernel, U-Boot, and root file system to jump
start software development
– Link to community portal for source trees and community access
Bare-metal Application Development
Hardware Libraries
– Software interface to all system
registers
– Functions to configure some basic
system operations
(e.g. clock speed settings, cache
settings, FPGA configuration, etc.)
– Support board bring-up and
diagnostics development
– Can be used by bare-metal
application, device drivers, or
RTOS
GNU-based bare-metal
(EABI) compiler tools
[Diagram: software stack on the SoC FPGA: Application, Operating System, BSP, Hardware Libraries (BMAL, HAL, PAL), Baremetal App]
Golden System Reference Design
Complete system design with
Linux software support
– Simple custom logic design in
FPGA
– All source code and Quartus II /
Qsys design files for reference
– Includes all compiled binaries; the example can run on an Altera
SoC Development Kit to
jumpstart development
DS-5 Altera Edition - One Tool, Three Usages
1. JTAG-Based Debugging
– Board Bring-up
– OS porting, Drivers Dev
– Kernel Debug
2. Application Debugging
– Linux User Space Code
– RTOS App Code
3. FPGA-Adaptive Debugging
– System Integration
– System Debug
One Device, Two Debugging Tools?
ARM® DS-5™ Toolkit with DSTREAM™
– Dedicated JTAG connection
– Visualize & control CPU subsystem
Altera Quartus™ II Software
– Dedicated JTAG connection
– Visualize & control FPGA
Industry First: FPGA-Adaptive Debugging
Altera USB-Blaster™ II Connection
ARM® Development Studio 5 (DS-5™) Altera® Edition Toolkit
Removes debugging barrier between CPUs and FPGA
Exclusive OEM agreement between Altera and ARM
Result of innovation in silicon, software, and business model
FPGA-Adaptive Debugging Features
Single USB-Blaster II cable for
simultaneous SW and HW debug
Automatic discovery of FPGA peripherals
and creation of register views
Hardware cross-triggering between the CPU and FPGA
domains
Correlation of CPU software instructions and FPGA hardware
events
Simultaneous debug and trace for Cortex-A9 cores and
CoreSight™-compliant cores in FPGA
Statistical analysis of software load and bus traffic spanning the
CPUs and FPGA
DS-5 Altera Edition
Productivity-Boosting Features
Industry’s most advanced multicore
debugger for ARM
JTAG based system-level
debugging, gdbserver-based
application debugging
in one package
Yocto plugin to enable
Linux based application
development
Integrated OS-aware analysis
and debug capability
Visualization of SoC Peripherals
Register views assist the
debug of
FPGA peripherals
– File generated by FPGA tool
flow
– Automatically imported in
DS-5 Debugger
Debug views for debug of
software drivers
– Self-documenting
– Grouped by peripheral,
register and bit-field
CMSIS
Peripheral register
descriptions
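A register view grouped by peripheral, register, and bit-field comes down to a table of shifts and masks applied to a raw register value. The sketch below shows that decoding in Python under stated assumptions: the `CTRL` layout with `ENABLE`, `MODE`, and `IRQ_EN` fields is invented for illustration and does not describe any real peripheral or any DS-5 file format.

```python
from typing import Dict, NamedTuple


class BitField(NamedTuple):
    lsb: int     # position of the least significant bit
    width: int   # number of bits in the field


# Hypothetical control register layout, for illustration only.
CTRL_FIELDS: Dict[str, BitField] = {
    "ENABLE": BitField(lsb=0, width=1),
    "MODE": BitField(lsb=1, width=3),
    "IRQ_EN": BitField(lsb=8, width=1),
}


def decode(value: int, fields: Dict[str, BitField]) -> Dict[str, int]:
    """Split a raw register value into its named bit-fields."""
    return {
        name: (value >> f.lsb) & ((1 << f.width) - 1)
        for name, f in fields.items()
    }


def encode(values: Dict[str, int], fields: Dict[str, BitField]) -> int:
    """Pack named field values back into a raw register value."""
    raw = 0
    for name, v in values.items():
        f = fields[name]
        if not 0 <= v < (1 << f.width):
            raise ValueError(f"{name} does not fit in {f.width} bits")
        raw |= v << f.lsb
    return raw
```

Keeping the layout in a data table rather than in hand-written masks is what makes such a view self-documenting: the same description drives both the display and any driver-side packing.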
FPGA-Adaptive, Unified Debugging
FPGA connected to debug and trace buses for nonintrusive capture and visualization of signal events
Simultaneous debug
and trace connection to CPU cores
and compatible IP
Correlate FPGA signal events with software events and CPU
instruction trace using triggers and timestamps
Cross-Domain Debug 1
Trigger from software world to FPGA world
SOFTWARE TRIGGER
HARDWARE TRIGGER!
Cross-Domain Debug 2
Trigger from FPGA world to software world
HARDWARE TRIGGER
EXECUTION STOP
OR
HW TRACE TRIGGER
EXECUTION STOP
OR
SW TRACE TRIGGER
Correlate HW and SW Events
Debug event trigger point set from either:
– SignalTap™ II Logic Analyzer
– DS-5 debugger
Captured trace can then be analyzed using timestamp-correlated events
ARM® DS-5™ Toolkit
Timestamp Correlated
SignalTap II Logic Analyzer
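Timestamp correlation can be pictured as merging two independently captured, time-ordered event streams onto one timeline. The following is a minimal sketch in Python, assuming both captures have already been exported as (timestamp, description) pairs on a common time base. The `correlate` and `events_near` helpers and the event names are hypothetical; they are not DS-5 or SignalTap II APIs.

```python
import heapq
from typing import Iterable, List, Tuple

Event = Tuple[int, str]  # (timestamp in ticks, description)


def correlate(sw: Iterable[Event], hw: Iterable[Event]) -> List[Tuple[int, str, str]]:
    """Merge two time-sorted event streams into one tagged timeline."""
    tagged_sw = ((t, "SW", d) for t, d in sw)
    tagged_hw = ((t, "HW", d) for t, d in hw)
    # heapq.merge keeps the result sorted as long as each input is sorted.
    return list(heapq.merge(tagged_sw, tagged_hw, key=lambda e: e[0]))


def events_near(timeline, trigger_time: int, window: int):
    """Return the events within `window` ticks of a trigger timestamp."""
    return [e for e in timeline if abs(e[0] - trigger_time) <= window]


if __name__ == "__main__":
    # Made-up sample data: software trace and FPGA signal capture.
    sw_trace = [(100, "write CTRL"), (250, "irq handler entry")]
    hw_trace = [(120, "bus write seen"), (240, "irq asserted")]
    for event in correlate(sw_trace, hw_trace):
        print(event)
```

Filtering the merged timeline around a trigger timestamp, as `events_near` does, gives the "what happened on both sides just before and after the trigger" view that the slide describes.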
System-Level Performance Analysis
Performance bottlenecks in SoCs often come from the CPU interaction with the rest of the SoC
Streamline visualizes software activity with performance counters from the SoC and FPGA to enable full system-level analysis
Streamline only requires a TCP/IP connection to the SoC
ARM® DS-5™ Streamline
Linux OS Counters
Processor Counters,
Aggregated, or Per Core
Power Consumption
FPGA Block Counters
Process/Thread Heat Map
Application Events
Altera SoC EDS - Key Benefits
One-stop shop from Altera
All the tools and examples for rapid starts
Familiar tools interface, easy to use
Share tools and knowledge to increase team productivity
Best multicore debugger tools for ARM architecture
Unprecedented visibility and control across
processor cores and across CPU, FPGA domains
Faster time to market, lower development costs!
Target Users and Usages
Usage                      Web Edition   Subscription Edition
Board Bring-up             -             Yes
Device Drivers Dev         -             Yes
OS Porting                 -             Yes
Baremetal Programming      -             Yes
RTOS Based App Dev         -             Yes
Linux Based App Dev        Yes           Yes
Multicore App Debugging    -             Yes
System Debugging           -             Yes
SoC EDS Editions Summary
Component                         Key Feature                           Web Edition   Subscription Edition   30-Day Evaluation
Hardware/Software Handoff Tools   Preloader Image Generator             x             x                      x
                                  Flash Image Creator                   x             x                      x
                                  Device Tree Generator (Linux)         x             x                      x
ARM DS-5 Altera Edition           Eclipse IDE                           x             x                      x
                                  ARM Compiler*                         -             -                      -
                                  Debugging over Ethernet (Linux)       x             x                      x
                                  Debugging over USB-Blaster II JTAG    -             x                      x
                                  Automatic FPGA Register Views         -             x                      x
                                  Hardware Cross-triggering             -             x                      x
                                  CPU/FPGA Event Correlation            -             x                      x
Compiler Tool Chains              Linaro Tool Chain (Linux)             x             x                      x
                                  CodeBench Lite EABI (Bare-metal)      x             x                      x
Hardware Libraries                Bare-metal programming Support        x             x                      x
SoC Programming Examples          Golden System Reference Design        x             x                      x
*ARM Compiler is available in DS-5 Professional Edition, available directly from ARM
Coordinated Multi-Channel Delivery
Altera.com
Quartus II
Programmer
SignalTap II
Altera.com
RocketBoards.org
Pre-built Binaries
• Kernel
• U-Boot
• Yocto
• Minimal RFS
• Tool chains
• Handoff tools
• HW Libraries
• Examples
• Documentation
Frequent Updates
• Kernel source
• U-Boot source
• Yocto source
• RFS source
• Toolchain source
• Public git
• Wiki
• Mailman
Partners
BSPs
Middleware
3rd Party Tools
Altera NIOS Software Design Tools
Nios II SBT for Eclipse key
features:
– New project wizards and
software templates
– Compiler for C and C++
(GNU)
– Source navigator, editor, and
debugger
– Eclipse project-based tools