Access to video memory
We create a Linux device-driver
that gives applications access to
our graphics frame-buffer
The role of a device-driver
A device-driver is a software module
that controls a hardware device
in response to OS kernel requests
relayed, often, from an application
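[Diagram: in user space, a user application makes a ‘call’ into the standard “runtime” libraries (and gets a ‘ret’ back); the libraries enter the Operating System kernel with ‘syscall’ and come back with ‘sysret’; in kernel space, the kernel makes a ‘call’ into the device-driver module (and gets a ‘ret’ back); the driver module performs ‘in’ and ‘out’ operations on the hardware device; RAM and i/o memory are shown alongside]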
Raster Display Technology
The graphics screen is a two-dimensional array of picture elements (‘pixels’)
These pixels are redrawn sequentially, left-to-right, by rows from top to bottom
Each pixel’s color is an individually programmable mix of red, green, and blue
Special “dual-ported” memory
[Diagram: the CPU is connected to 2048-MB of RAM and to 16-MB of VRAM; the VRAM is also connected to the CRT]
How much VRAM is needed?
• This depends on (1) the total number of pixels,
and on (2) the number of bits-per-pixel
• The total number of pixels is determined by the
screen’s width and height (measured in pixels)
• Example: when our “screen-resolution” is set to
1280-by-960, we are seeing 1,228,800 pixels
• The number of bits-per-pixel (“color depth”) is a
programmable parameter (varies from 1 to 32)
• Certain types of applications also need to use
extra VRAM (for multiple displays, or for “special
effects” like computer game animations)
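The arithmetic above can be sketched in a few lines of C. The helper name ‘vram_bytes_needed()’ is hypothetical (it is not part of ‘vram.c’), and it counts only one full screen image, not any extra VRAM for multiple displays or special effects:

```c
#include <assert.h>
#include <stdint.h>

/* Bytes of VRAM needed to hold one full screen image.
   Hypothetical helper: partial bytes are rounded up so that
   color depths below 8 bits-per-pixel are handled too. */
uint64_t vram_bytes_needed(uint32_t width, uint32_t height,
                           uint32_t bits_per_pixel)
{
    assert(bits_per_pixel >= 1 && bits_per_pixel <= 32);
    uint64_t total_pixels = (uint64_t)width * height;
    uint64_t total_bits = total_pixels * bits_per_pixel;
    return (total_bits + 7) / 8;
}
```

For the 1280-by-960 example at 32 bits-per-pixel this gives 4,915,200 bytes, which is under 5 MB.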
How ‘truecolor’ works
[Diagram: one pixel is a 32-bit longword: alpha in bits 31-24, red (R) in bits 23-16, green (G) in bits 15-8, blue (B) in bits 7-0]
The intensity of each color-component within a pixel is an 8-bit value
x86 uses “little-endian” order
“truecolor” graphics-modes use 4-bytes per picture-element
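[Diagram: VRAM byte-offsets 0, 1, 2 hold B, G, R for the first pixel on the Video Screen; offsets 4, 5, 6 hold B, G, R for the second pixel; offsets 8, 9, 10 hold B, G, R for the third pixel; offsets 3, 7, … hold each pixel’s fourth byte]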
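A minimal sketch of the pixel layout, in C. All three helper names are hypothetical. ‘store_little_endian()’ writes the bytes out explicitly, so it shows the x86 byte-order on any host, and ‘pixel_offset()’ assumes the screen-rows are stored back-to-back with no padding between them (an assumption that real hardware may not satisfy):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Pack four 8-bit components into one 32-bit truecolor longword:
   alpha in bits 31-24, red in 23-16, green in 15-8, blue in 7-0. */
uint32_t pack_pixel(uint8_t alpha, uint8_t red, uint8_t green, uint8_t blue)
{
    return ((uint32_t)alpha << 24) | ((uint32_t)red << 16)
         | ((uint32_t)green << 8) | (uint32_t)blue;
}

/* Store a longword least-significant byte first, as an x86 CPU does. */
void store_little_endian(uint8_t *dst, uint32_t pixel)
{
    assert(dst != NULL);
    dst[0] = (uint8_t)(pixel & 0xFF);          /* blue  */
    dst[1] = (uint8_t)((pixel >> 8) & 0xFF);   /* green */
    dst[2] = (uint8_t)((pixel >> 16) & 0xFF);  /* red   */
    dst[3] = (uint8_t)((pixel >> 24) & 0xFF);  /* alpha */
}

/* Byte-offset within VRAM of the pixel at column x, row y,
   assuming 4 bytes per pixel and no padding between rows. */
size_t pixel_offset(uint32_t x, uint32_t y, uint32_t screen_width)
{
    assert(x < screen_width);
    return ((size_t)y * screen_width + x) * 4;
}
```

With a 1280-pixel screen-width, the second pixel of the top row starts at byte-offset 4 and the first pixel of the second row starts at byte-offset 5120.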
Some operating system issues
• Linux is a “protected-mode” operating system
• I/O devices normally are not directly accessible
• Linux on x86 platforms uses “virtual memory”
• Privileged software must “map” the VRAM
• A device-driver module is needed: ‘vram.c’
• We can compile it using: $ mmake vram
• Device-node: # mknod /dev/vram c 98 0
• Make it ‘writable’: # chmod a+w /dev/vram
Our ‘vram.c’ module
• It’s a character-mode Linux device-driver
• It implements four device-file ‘methods’:
– ‘read()’: lets a program read from video memory
– ‘write()’: lets a program write to video memory
– ‘llseek()’: lets a program ‘move’ the file’s pointer
– ‘mmap()’: lets a program ‘map’ vram to user-space
• It also implements a pseudo-file that lets users
view the RADEON X300 graphics controller’s
PCI Configuration Space parameter-values:
$ cat /proc/vram
What is PCI?
• It’s an acronym for “Peripheral Component
Interconnect” and refers to a collection of
industry standards for devices used in PCs
• An Intel-sponsored initiative (from 1992-9)
having several ambitious goals:
• Reduce diversity inherent in legacy PC devices
• Improve speed and efficiency of data-transfers
• Eliminate (or reduce) platform dependencies
• Simplify adding/removing peripheral adapters
• Lower PC’s total consumption of electrical power
PCI Configuration Space
A non-volatile parameter-storage area for each PCI device-function
64 doublewords in all:
PCI Configuration Space Header (16 doublewords, fixed format)
PCI Configuration Space Body (48 doublewords, variable format)
Example: Header Type 0
16 doublewords; each row lists its fields from bit 31 (left) down to bit 0 (right)

Dword   Fields
  0     Device ID | Vendor ID
  1     Status Register | Command Register
  2     Class Code (Class/SubClass/ProgIF) | Revision ID
  3     BIST | Header Type | Latency Timer | Cache Line Size
  4     Base Address 0
  5     Base Address 1
  6     Base Address 2
  7     Base Address 3
  8     Base Address 4
  9     Base Address 5
 10     CardBus CIS Pointer
 11     Subsystem Device ID | Subsystem Vendor ID
 12     Expansion ROM Base Address
 13     reserved | capabilities pointer
 14     reserved
 15     Maximum Latency | Minimum Grant | Interrupt Pin | Interrupt Line
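One way to picture the Type 0 header is as a C structure. This is only a sketch: the structure name and field names are hypothetical (the Linux kernel reads these fields through its own accessor functions instead), and the layout assumes a little-endian machine, so the low-order field of each doubleword comes first:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical picture of the 16-doubleword Type 0 header. */
struct pci_type0_header {
    uint16_t vendor_id;             /* dword 0, bits 15-0   */
    uint16_t device_id;             /* dword 0, bits 31-16  */
    uint16_t command;               /* dword 1, bits 15-0   */
    uint16_t status;                /* dword 1, bits 31-16  */
    uint8_t  revision_id;           /* dword 2, bits 7-0    */
    uint8_t  prog_if;               /* dword 2: class code  */
    uint8_t  sub_class;
    uint8_t  base_class;
    uint8_t  cache_line_size;       /* dword 3              */
    uint8_t  latency_timer;
    uint8_t  header_type;
    uint8_t  bist;
    uint32_t base_address[6];       /* dwords 4-9           */
    uint32_t cardbus_cis_pointer;   /* dword 10             */
    uint16_t subsystem_vendor_id;   /* dword 11             */
    uint16_t subsystem_device_id;
    uint32_t expansion_rom_base;    /* dword 12             */
    uint8_t  capabilities_pointer;  /* dword 13             */
    uint8_t  reserved1[3];
    uint32_t reserved2;             /* dword 14             */
    uint8_t  interrupt_line;        /* dword 15             */
    uint8_t  interrupt_pin;
    uint8_t  min_grant;
    uint8_t  max_latency;
};

/* The header must occupy exactly 16 doublewords (64 bytes). */
static_assert(sizeof(struct pci_type0_header) == 64,
              "Type 0 header must be 64 bytes");
```

In this picture Base Address 0 begins at byte-offset 0x10, which is doubleword 4.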
Examples of VENDOR-IDs
• 0x8086 – Intel Corporation
• 0x1022 – Advanced Micro Devices, Inc
• 0x1002 – ATI Technologies, Inc
• 0x10EC – RealTek, Incorporated
• 0x10DE – Nvidia Corporation
• 0x10B7 – 3Com Corporation
• 0x101C – Western Digital, Inc
• 0x1014 – IBM Corporation
• 0x0E11 – Compaq Corporation
• 0x1057 – Motorola Corporation
• 0x106B – Apple Computer, Inc
• 0x5333 – S3 Incorporated
Examples of DEVICE-IDs
• 0x5347: ATI RAGE128 SG
• 0x4C58: ATI RADEON LX
• 0x5950: ATI RS480
• 0x436E: ATI IXP300 SATA
• 0x438C: ATI IXP600 IDE
• 0x5B60: ATI Radeon X300
See this Linux header-file for lots more examples:
</usr/src/linux/include/linux/pci_ids.h>
Defined PCI Class Codes
• 0x00: Legacy Device (i.e., built before class-codes were defined)
• 0x01: Mass Storage controller
• 0x02: Network controller
• 0x03: Display controller
• 0x04: Multimedia device
• 0x05: Memory Controller
• 0x06: Bridge device
• 0x07: Simple Communications controller
• 0x08: Base System peripherals
• 0x09: Input device
• 0x0A: Docking stations
• 0x0B: Processors
• 0x0C: Serial Bus controllers
• 0x0D: Wireless controllers
• 0x0E: Intelligent I/O controllers
• 0x0F: Encryption/Decryption controllers
• 0x10: Satellite Communications controllers
• 0x11: Data Acquisition and Signal Processing controllers
Example of Sub-Class Codes
• Class Code 0x01: Mass Storage controller
– 0x00: SCSI controller
– 0x01: IDE controller
– 0x02: Floppy Disk controller
– 0x03: IPI controller
– 0x04: RAID controller
– 0x80: Other Mass Storage controller
Example of Sub-Class Codes
• Class Code 0x02: Network controller
– 0x00: Ethernet controller
– 0x01: Token Ring controller
– 0x02: FDDI controller
– 0x03: ATM controller
– 0x04: ISDN controller
– 0x80: Other Network controller
Example of Sub-Class Codes
• Class Code 0x03: Display Controller
– 0x00: VGA-compatible controller
– 0x01: XGA controller
– 0x02: 3D controller
– 0x80: Other display controller
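The 24-bit Class Code field (Class/SubClass/ProgIF) can be taken apart with shifts and masks. A minimal sketch in C, using hypothetical names and covering only the Display Controller sub-classes listed above:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* The three bytes of a 24-bit PCI class code (hypothetical type). */
struct pci_class {
    uint8_t base_class;   /* bits 23-16 */
    uint8_t sub_class;    /* bits 15-8  */
    uint8_t prog_if;      /* bits 7-0   */
};

/* Split a 24-bit class code into its three component bytes. */
struct pci_class decode_class_code(uint32_t class_code)
{
    assert(class_code <= 0xFFFFFFu);
    struct pci_class c;
    c.base_class = (uint8_t)((class_code >> 16) & 0xFF);
    c.sub_class  = (uint8_t)((class_code >> 8) & 0xFF);
    c.prog_if    = (uint8_t)(class_code & 0xFF);
    return c;
}

/* Name a sub-class of Class Code 0x03 (Display Controller). */
const char *display_subclass_name(uint8_t sub_class)
{
    switch (sub_class) {
    case 0x00: return "VGA-compatible controller";
    case 0x01: return "XGA controller";
    case 0x02: return "3D controller";
    case 0x80: return "Other display controller";
    default:   return "unknown";
    }
}
```

For instance, the class code 0x030000 decodes to class 0x03, sub-class 0x00: a VGA-compatible display controller.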
Hardware details may differ
• Graphics controllers use vendor-specific
mechanisms to perform similar operations
• There’s a common core of compatibility
with IBM’s VGA (Video Graphics Array)
developed in the mid-1980s, but since
IBM’s loss of market dominance, each
manufacturer has added enhancements
which employ incompatible programming
interfaces – you need a vendor’s manual!
The ‘frame-buffer’
• Today’s PCI graphics systems all provide
a dedicated amount of display memory to
control the screen-image’s pixel-coloring
• But how much memory will vary with price
• And its location within the CPU’s physical
address-space can’t be predicted because
it depends upon what other PCI devices
are installed (and mapped) during startup
The ‘base address’ fields
• The PCI Configuration Header has several
so-called Base Address fields, and vendors
use one of these to hold the frame-buffer’s
starting address and to indicate how much
vram the video controller can actually use
• The Linux kernel provides driver-writers
with some convenient functions for getting
the location and size of the frame-buffer
Radeon uses Base Address 0
• Our ‘vram.c’ module’s initialization routine
employs these kernel helper-functions:
#include <linux/pci.h>

struct pci_dev *devp;    // will point to the kernel’s structure for our device

// get a pointer to the PCI device’s Linux data-structure
devp = pci_get_device( VENDOR_ID, DEVICE_ID, NULL );
if ( !devp ) return -ENODEV;    // device is not present

// get starting address and length for memory-resource 0
vram_base = pci_resource_start( devp, 0 );
vram_size = pci_resource_len( devp, 0 );
Reading from ‘vram’
• You can use our ‘fileview’ utility to see the
current contents of the video frame-buffer
$ fileview /dev/vram
• Our ‘vram.c’ driver’s ‘read()’ method gets
invoked when an application-program attempts
to ‘read’ from the ‘/dev/vram’ device-file
• The read-method is implemented by our driver
using ‘ioremap()’ (and ‘iounmap()’) to temporarily
map a 4KB-page of physical vram to the kernel’s
virtual address-space
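From the application’s side, reading a piece of the frame-buffer is ordinary file i/o. The sketch below uses only the standard ‘lseek()’ and ‘read()’ calls; the helper name ‘read_at()’ is hypothetical, and the same code works on any seekable file, so it can be tried out without the driver installed:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Read exactly 'count' bytes starting at byte 'offset' of an open file.
   Returns 0 on success, or -1 on an error or a short read. */
int read_at(int fd, off_t offset, void *buf, size_t count)
{
    assert(buf != NULL);
    uint8_t *dst = buf;

    /* move the file's pointer to the requested position */
    if (lseek(fd, offset, SEEK_SET) == (off_t)-1)
        return -1;

    /* 'read()' may return fewer bytes than requested, so loop */
    while (count > 0) {
        ssize_t n = read(fd, dst, count);
        if (n <= 0)
            return -1;
        dst += n;
        count -= (size_t)n;
    }
    return 0;
}
```

An application would open ‘/dev/vram’ with ‘open()’ and pass the resulting file-descriptor, e.g. to fetch the four bytes of one truecolor pixel.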
I/O ‘memcpy()’ functions
• Linux provides a ‘platform-independent’
way to do copying from an i/o-device’s
memory into an application’s buffer (or
vice-versa):
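– A ‘read’ copies from vram to a user’s buffer
memcpy_fromio( buf, vaddr, len );
– A ‘write’ copies to vram from a user’s buffer
memcpy_toio( vaddr, buf, len );
– Strictly, ‘buf’ is expected to be a kernel-space buffer; portable drivers move the data across the user/kernel boundary with ‘copy_to_user()’ and ‘copy_from_user()’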
‘mmap()’
• This is a standard UNIX system-call that
lets an application ‘map’ a file into its
virtual address-space, where it can then
treat the file as if it were an ordinary array
• See the man-page: $ man mmap
• This same system-call can also work on a
device-file if that device’s driver provided
‘mmap()’ among its file-operations
The user-role
• In the application-program, six arguments
get passed to the ‘mmap()’ library-function
void *mmap( void *baseaddress,
            size_t memorysize,
            int accessattributes,
            int flags,
            int filehandle,
            off_t offset );
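A minimal sketch of the user-role, in C. The wrapper name ‘map_region()’ is hypothetical; it asks for a shared, readable and writable mapping and lets the kernel choose the base-address. The offset must be a multiple of the page-size:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Map 'length' bytes of an open file, starting at the page-aligned
   'offset', for shared reading and writing.
   Returns the mapping's address, or NULL on failure. */
void *map_region(int fd, size_t length, off_t offset)
{
    assert(length > 0);
    void *addr = mmap(NULL,                    /* kernel picks the address */
                      length,                  /* memory size              */
                      PROT_READ | PROT_WRITE,  /* access attributes        */
                      MAP_SHARED,              /* flags                    */
                      fd,                      /* file handle              */
                      offset);                 /* offset within the file   */
    return (addr == MAP_FAILED) ? NULL : addr;
}
```

With the file-descriptor from opening ‘/dev/vram’, the returned pointer can be treated as an array of pixels; ‘munmap()’ releases the mapping when the program is done with it.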
The driver-role
• In the kernel, those six arguments will get
validated and processed, then the driver’s
‘mmap()’ callback-function will be invoked
to supply missing information and perform
further sanity-checks and do appropriate
page-mapping actions:
int mmap(
struct file *file,
struct vm_area_struct *vma );
Our driver’s code
int mmap( struct file *file, struct vm_area_struct *vma )
{
    // extract the parameters we will need from the ‘vm_area_struct’
    unsigned long region_length = vma->vm_end - vma->vm_start;
    unsigned long region_origin = vma->vm_pgoff * PAGE_SIZE;
    unsigned long physical_addr = vram_base + region_origin;
    unsigned long user_virtaddr = vma->vm_start;

    // sanity check: mapped region cannot extend past end of vram
    if ( region_origin + region_length > vram_size ) return -EINVAL;

    // tell the kernel not to try ‘swapping out’ this region to the disk
    // (VM_RESERVED was removed in Linux 3.7; newer kernels use
    // VM_DONTEXPAND | VM_DONTDUMP instead)
    vma->vm_flags |= VM_RESERVED;
    // tell the kernel to exclude this region from any core dumps
    vma->vm_flags |= VM_IO;
Driver’s code continued
    // invoke a helper-function that will set up the page-table entries
    if ( remap_pfn_range( vma, user_virtaddr, physical_addr >> PAGE_SHIFT,
                    region_length, vma->vm_page_prot ) ) return -EAGAIN;

    return 0;    // SUCCESS
}
Demo: ‘rotation.cpp’
• This application-program will demonstrate use of
our ‘vram.c’ device-driver’s ‘read()’, ‘write()’ and
‘llseek()’ methods (i.e., device-file operations)
• It will perform a rotation of the color-components
(R,G,B) in every displayed ‘truecolor’ pixel:
RG
GB
BR
• After 3 times the screen will look normal again
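The per-pixel step can be sketched in C as follows. This is not the code of ‘rotation.cpp’; the function names are hypothetical, and the sketch assumes the rotation moves the old red value into green, old green into blue, and old blue into red, leaving the alpha byte alone:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Rotate the color-components of one truecolor pixel:
   new green = old red, new blue = old green, new red = old blue. */
uint32_t rotate_components(uint32_t pixel)
{
    uint32_t alpha = pixel & 0xFF000000u;
    uint32_t red   = (pixel >> 16) & 0xFF;
    uint32_t green = (pixel >> 8) & 0xFF;
    uint32_t blue  = pixel & 0xFF;
    return alpha | (blue << 16) | (red << 8) | green;
}

/* Apply the rotation to every pixel in a buffer. */
void rotate_buffer(uint32_t *pixels, size_t count)
{
    assert(pixels != NULL || count == 0);
    for (size_t i = 0; i < count; i++)
        pixels[i] = rotate_components(pixels[i]);
}
```

Applying ‘rotate_buffer()’ three times returns every pixel to its original value, which is why the screen looks normal again after the third run.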
Demo: ‘inherit.cpp’
• This application-program will demonstrate
use of the ‘mmap()’ method in our driver,
and the fact that memory-mappings which
a parent-process creates will be ‘inherited’
by a ‘child-process’
• You will see a rectangular purple border
drawn on your display -- provided the
program-parameters match your screen
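The inheritance idea can be sketched in C with ‘fork()’. This is not the code of ‘inherit.cpp’: the function name is hypothetical, and the file-descriptor may refer to any file at least one page long, so an ordinary file can stand in for ‘/dev/vram’. The child stores a pixel-value through the mapping it inherited, and the parent then reads that value back through its own view:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Map the first page of 'fd' as shared memory, fork a child that
   stores 'value' in pixel 0, then report in '*seen' what the parent
   reads there. Returns 0 on success, or -1 on failure. */
int child_write_seen_by_parent(int fd, uint32_t value, uint32_t *seen)
{
    assert(seen != NULL);
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    volatile uint32_t *pixels = mmap(NULL, page, PROT_READ | PROT_WRITE,
                                     MAP_SHARED, fd, 0);
    if (pixels == MAP_FAILED)
        return -1;
    pixels[0] = 0;

    pid_t pid = fork();
    if (pid < 0) {
        munmap((void *)pixels, page);
        return -1;
    }
    if (pid == 0) {
        /* child: the mapping made by the parent is still valid here */
        pixels[0] = value;
        _exit(0);
    }

    /* parent: wait for the child, then look at the shared pixel */
    int status = 0;
    int ok = (waitpid(pid, &status, 0) == pid)
          && WIFEXITED(status) && WEXITSTATUS(status) == 0;
    *seen = pixels[0];
    munmap((void *)pixels, page);
    return ok ? 0 : -1;
}
```

The value 0x00FF00FF (full red plus full blue) would be a purple truecolor pixel.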
In-class exercise
• Can you adapt the ideas in ‘inherit.cpp’ to
create a program (named ‘backward.cpp’)
that will reverse the ordering of the pixels
in each screen-row?