12.4 Memory Organization in Multiprocessor Systems

advertisement
12.4 Memory Organization in
Multiprocessor Systems
By: Melissa Jamili
CS 147, Section 1
December 2, 2003
Overview
 Shared Memory
 Usage
 Organization
 Cache Coherence
 Cache coherence problem
 Solutions
 Protocols for marking and manipulating data
Shared Memory
 Two purposes
1. Message passing
2. Semaphores
Message Passing
 Direct message passing without shared
memory
 One processor sends a message directly to
another processor
 Requires synchronization between processors
or a buffer
Message Passing (cont.)
 Message passing with shared memory
 First processor writes a message to the shared
memory and signals the second processor that it has
a waiting message
 Second processor reads the message from shared
memory, possibly returning an acknowledge signal to
the sender.
 Location of the message in shared memory is
known beforehand or sent with the waiting
message signal
Semaphores
 Stores information about current state
 Information on protection and availability of
different portions of memory
 Can be accessed by any processor that
needs the information
Organization of Shared Memory
 Not organized into a single shared
memory module
 Partitioned into several memory modules
Four-processor UMA architecture
with Benes network
Interleaving
 Process used to divide the shared
memory address space among the
memory modules
 Two types of interleaving
1. High-order
2. Low-order
High-order Interleaving
 Shared address space is divided into
contiguous blocks of equal size.
 Two high-order bits of an address
determine the module in which the location
of the address resides.
 Hence the name
Example of 64 Mb shared memory
with four modules
Low-order Interleaving
 Low-order bits of a memory address
determine its module
Example of 64 Mb shared memory
with four modules
Low-order Interleaving (cont.)
 Low-order interleaving originally used to reduce
delay in accessing memory
 CPU could output an address and read request to
one memory module
 Memory module can decode and access its data
 CPU could output another request to a different
memory module
 Results in pipelining its memory requests.
 Low-order interleaving not commonly used in
modern computers since cache memory
Low-order vs. High-order
Interleaving
 In a low-order interleaving system,
consecutive memory locations reside in
different memory modules
 Processor executing a program stored in a
contiguous block of memory would need to
access different modules simultaneously
 Simultaneous access possible but difficult to
avoid memory conflicts
Low-order vs. High-order
Interleaving (cont.)
 In a high-order interleaving system,
memory conflicts are easily avoided
 Each processor executes a different program
 Programs stored in separate memory
modules
 Interconnection network is set to connect
each processor to its proper memory
module
Cache Coherence
 Retain consistency
 Like cache memory in uniprocessors,
cache memory in multiprocessors improve
performance by reducing the time needed
to access data from memory
 Unlike uniprocessors, multiprocessors
have individual caches for each processor
Cache Coherence Problem
 Occurs when two or more caches hold the value
of the same memory location simultaneously
 One processor stores a value to that location in its
cache
 Other cache will have an invalid value in its location
 Write-through cache will not resolve this problem
 Updates main memory but not other caches
Cache coherence problem with four
processors using a write-back
cache
Solutions to the Cache Coherence
Problem
 Mark all shared data as non-cacheable
 Use a cache directory
 Use cache snooping
Non-Cacheable
 Mark all shared data as non-cacheable
 Forces accesses of data to be from shared
memory
 Lowers cache hit ratio and reduces overall
system performance
Cache Directory
 Use a cache directory
 Directory controller is integrated with the main
memory controller to maintain the cache
directory
 Cache directory located in main memory
 Contains information on the contents of local
caches
 Cache writes sent to directory controller to
update cache directory
 Controller invalidates other caches with same data
Cache Snooping
 Each cache (snoopy cache) monitors
memory activity on the system bus
 Appropriate action is taken when a
memory request is encountered
Protocols for marking and
manipulating data
 MESI protocol most common
 Each cache entry can be in one of the following
states:
1. Modified: Cache contains memory value, which is
different from value in shared memory
2. Exclusive: Only one cache contains memory value,
which is same value in shared memory
3. Shared: Cache contains memory value
corresponding to shared memory, other caches can
hold this memory location
4. Invalid: Cache does not contain memory location
How the MESI Protocol Works
 Four possible memory access scenarios:
1.
2.
3.
4.
Read hit
Read miss
Write hit
Write miss
MESI Protocol (cont.)
 Read hit


Processor reads data
State unchanged
MESI Protocol (cont.)
 Read miss

Processor sends read request to shared memory via
system bus
1. No cache contains data


MMU loads data from main memory into processor’s cache
Cache marked as E (exclusive)
2. One cache contains data, marked as E


Data loaded into cache, marked as S (shared)
Other cache changes from state E to S
3. More than one cache contains the data, marked as S


Data loaded into cache, marked as S
Other cache states with data remain unchanged
4. One cache contains data, marked as M (modified)


Cache with modified data temporarily blocks memory read
request and updates main memory
Read request continues, both caches mark data as S
MESI Protocol (cont.)
 Write hit
1. Cache contains data in state M or E
 Processor writes data to cache
 State becomes M
2. Cache contains data in state S
 Processor writes data, marked as M
 All other caches mark this data as I (invalid)
MESI Protocol (cont.)

Write miss

Begins by issuing a read with intent to modify
(RWITM)
1. No cache holds data, one cache holds data marked as E,
or one or more caches hold data marked S



Data loaded from main memory into cache, marked as M
Processor writes new data to cache
Caches holding this data change states to I
2. One other cache holds data as M



Cache temporarily blocks request and writes its value back to
main memory, marks data as I
Original cache loads data, marked as M
Processor writes new value to cache
Four-processor system using
cache snooping and the MESI
protocol
Conclusion
 Shared memory
 Message passing
 Semaphores
 Interleaving
 Cache coherence
 Cache coherence problem
 Solutions
 Non-cacheable
 Cache directory
 Cache snooping
 MESI protocol
Download