Prelim 2 Topics • Older Stuff:

advertisement
Prelim 2 Topics
• Older Stuff:
–
–
–
–
Transistors, gates, combinatorial circuits
Flip flops, latches, state machines
CPU design, pipelining & hazards
MIPS assembly, calling conventions
• Newer stuff:
–
–
–
–
–
–
–
–
Physical and virtual memory, page tables, TLBs
Caches, cache-conscious programming, caching issues
Privilege levels, syscalls, traps, interrupts, exceptions
Busses, programmed I/O, memory-mapped I/O
DMA, disks, RAID
Synchronization: deadlocks, race conditions, fairness, correctness
Synchronization primitives (locks, spinlocks, hardware support)
Synchronization abstractions (mutexes, semaphores, monitors & condition
variables)
Critical Sections
• Properties:
correctness
at most one thread inside critical section
liveness
if no thread inside, then a waiting thread should eventually get in
fairness
a waiting thread can get cheated only a bounded number of
times (or, equal probabilites among waiters) (or, fifo ordering
among waiters) (or, …)
• Suppose we have two threads (with thread_id 0 and 1)
using a critical section. Which of the three properties are
achieved by each of the folowing implementations?
CritSec Attempt #1
int turn = 0;
void enter_cs()
{
while (turn == thread_id);
}
void leave_cs() {
turn = thread_id;
}
CritSec Attempt #2
int in_crit_sect[2] = {0, 0};
enter_cs()
{
while (in_crit_sect[other_thread_id])
;
in_crit_sect[thread_id] = 1;
}
leave_cs()
{
in_crit_sect[thread_id] = 0;
}
CritSec Attempt #3
int want_crit_sect[2] = {0, 0};
enter_cs()
{
want_crit_sect[thread_id] = 1;
while (want_crit_sect[other_thread_id])
;
}
leave_cs()
{
want_crit_sect[thread_id] = 0;
}
CritSec Attempt #4
int want_crit_sect[2] = {0, 0};
int turn = 0;
enter_cs()
{
want_crit_sect[thread_id] = 1;
turn = thread_id;
while (want_crit_sect[other_thread_id] &&
turn == thread_id)
;
}
leave_cs()
{
want_crit_sect[thread_id] = 0;
}
Semaphores
The following code is meant to exchange the items
at the top of two stacks.
- it must appear to be a single atomic operation
- if either stack is empty, it should do nothing
- if the stacks are one and the same, it should do
nothing
- multiple swaps should proceed simultaneously so
long as they are using different stacks
How is this code broken? Fix it using semaphores.
void atomic_swap(Stack *q1, Stack *q2) {
Item *item1;
Item *item2;
P(q1->lock);
item1 = pop(q1);
if(item1 != NULL) {
P(q2->lock);
item2 = pop(q2);
if(item2 != NULL) {
push(q2, item1);
push(q1, item2);
V(q2->lock);
V(q1->lock);
}
}
}
Synchronization
There are two printers. Write a function called
print_anywhere(char *doc) that prints a
document on either one of the two printers. Your solution
must guarantee:
safety – each printer has at most one process trying to
print to it at a time
liveness – a process trying to print will not wait if a
printer is free
fairness – a process trying to print will eventually be
allowed to print
Monitors
Does this work? Can we fix it?
struct cond {
struct sema *s;
};
void wait(struct cond *c) { c->s->P(); }
void signal(struct cond *c) { c->s->V(); }
Synchronization Primitives
Suppose we have a kernel that does not provide
any direct support for synchronization primitives.
What are the consequences of this limitation?
Disks & DMA
A disk controller with enough memory can
perform read-ahead, reading blocks on the current
track into its memory before the CPU can ask for
them.
Should it also do write behind?
Disks
Give a nearly worst-case algorithm for erasing all
data on a hard disk with 3 platters, 1000 cylinders,
and 20 sectors per track.
Give a best-case algorithm.
Memory & MMU #1
If a machine has 4GB of actual RAM, does it still make
sense to implement virtual memory?
What impact, if any, does it have on the implementation
(simplicity, performance, features, etc.)?
* assume 32-bit physical addresses
Memory & MMU #2
If a machine has no hard disk, does it still make sense to
implement virtual memory?
What impact, if any, does it have on the implementation
(simplicity, performance, features, etc.)?
* a so-called "thin client" that fetches all programs and
data from the network
Memory & MMU #3
If a machine has no support for asynchronous traps, does it still
make sense to implement virtual memory?
What impact, if any, does it have on the implementation (simplicity,
performance, features, etc.)?
* Motorola M68000 was just such a processor
but the M68010 added asynchronous trap handling.
Memory & MMU #4
If a machine's MMU has no support for page
reference bits or page modification bits, how can
we simulate the functionality?
Memory Layout
It is nice to have applications be loadable at
arbitrary places in memory (i.e., libraries, or your
corewars programs).
How can we achieve this?
Privilege Levels
Normally the system call interface is accessed by
means of executing a special instruction (i.e.
syscall). We could instead make the system
automatically transition to privileged mode on any
jump to a kernel-only memory page.
Is this feasible?
Is this a good idea?
MMU-conscious programming
Assume the virtual memory system uses twenty 1KB
pages, with LRU replacement
int A[4096][4096], B[4096][4096], C[4096][4096];
for (int i=0; i< 4096; i++)
for (int j=0; j< 4096; j++) {
int sum = 0;
for (int k = 0; k < 4096; k++)
sum = sum + A[i, k]*B[k,j];
C[i,j] = sum;
}
Approx. how many page faults will there be during a
single iteration of the j loop? Can we improve this?
Interrupts & Polling
Devices can be managed using interrupts or
polling.
Under what circumstances would it be better to
use interrupts? polling?
Caches
(a) For a 1MB, 8-way associative cache, with 128 byte blocks, find the
number of bits needed for the cache tag, index, and offset.
(b) Consider a direct mapped cache, with 32 lines, and a 21 bit tag.
What is the block or line size (in bytes), and the capacity of the
cache?
(c) For a 256 byte cache with a 28 bit tag and 4 word blocks, find the
number bits in the index and then compute the associativity.
(d) In a 32-bit address space, 8 MB of data map to a single set/index in
a cache. If the cache can hold a total of 8192 data words (not bytes)
and has 64 byte blocks, what is the capacity, the number of sets, and
the associativity?
Cache Misses (according to Wikipedia)
• Compulsory misses are those misses caused
by the first reference to a datum. Cache size
and associativity make no difference to the
number of compulsory misses. …
• Capacity misses are those misses that occur
regardless of associativity or block size,
solely due to the finite size of the cache….
• Conflict misses are those misses that could
have been avoided, had the cache not evicted
an entry earlier.
Download