Construction of Guaranteed Error-Free Operating Systems

Theory of Memory

W. Paul
Saarland University and DFKI
bmb+f project Verisoft-XT
Joint work with Ulan Degebaev and Norbert Schirmer, Saarland University

Why might this be important?
• Unites theories of
  – store buffers
  – interlocking
  – caches
  – cache coherence
  – out of order execution
  – X64 instruction set
  – address translation
  – optimized compilation
  – structured parallel C semantics
• Explains why hypervisor might run structured parallel C
• VCC is supposed to mirror structured parallel C semantics
• thus VCC might be(come) sound

Specifying Memory
[Diagram: x, M(x)]

Store Buffer
[Diagram: memory M, sbuf(y), r(j), w(i)]

Caches
[Diagram: M, ca]

Many Caches: Snooping
[Diagram: M, ca(1), ca(p)]

Many Caches
[Diagram: M, ca(1), ca(p), x.la, x.off]

Overlapping Transactions
[Diagram: a, b, c, public (a)]

Sequentially Consistent Memory (lemma 5)
[Diagram: a, b, c, public (a)]

Tomasulo Schedulers for OOO
[Diagram: IF, issue, reservation stations, funct. units, CDB, ROB, WB]

Two Memory Units
[Diagram: m, RS, MMU, RS, sbuf, funct. units, LS, CDB, ROB]

Single Processor OOO correctness (lemma 6)
[Diagram: m, RS, MMU, RS, sbuf, funct. units, LS, CDB, ROB]

Multi Processor OOO implementation
[Diagram: m, RS, MMU, RS, sbuf, funct. units, LS, CDB, data(i,j), ROB]

Multi Processor OOO correctness (lemma 7)
[Diagram: m, RS, MMU, RS, sbuf, funct. units, LS, CDB, data(i,j), ROB]

X64 architecture
• CPU core
  – R: user registers
  – SR: system registers
    • CR3
  – acc: access
  – segmentation
• mmu: memory management unit
  – tlb: translation lookaside buffer
• memory system
  – mm: main memory
  – ca: cache
  – sbuf: store buffer
[Diagram: mm, ca, sbuf, acc, mmu, tlb, CR3, segmentation, core, R]

Segmentation off (lemma 8)
• 1 segment
• large as entire address space
• segmentation invisible
[Diagram: mm, ca, sbuf, acc, mmu, tlb, CR3, segmentation, core, R]

Bad news: cache state is visible
• CPU core or devices
  – acc: access
    • acc.adr: address
    • acc.r: rights (user, write, exe)
    • acc.data
    • acc.mmode: memory mode
      – WB: write back
      – WT: write through
      ...
      – NC: no cache
[Diagram: mm, ca, sbuf, acc, mmu, tlb, CR3, core, R]

Good news: no device, no NC mode
• acc.mmode: memory mode
  – WB: write back
  – WT: write through
  ...
  – NC: no cache (not used)
[Diagram: mm, ca, sbuf, acc, mmu, tlb, CR3, core, R]

Sequentially Consistent Physical Memory (lemma 9)
• acc.mmode: memory mode
  – WB: write back
  – WT: write through
  ... mix on same address
• PM: sequentially consistent physical memory abstraction
  – Proof: MOESI invariants are maintained
[Diagram: PM, sbuf, acc, mmu, tlb, CR3, core, R]

Initialize page tables
• 1 processor
  – sbuf invisible
• operating mode: paging disabled
  – mmu invisible
• set up page table tree in PM
[Diagram: page tables, PM, sbuf, acc, mmu, tlb, CR3, core, R]

Translated Linear Memory
• many processors
• operating mode: paging enabled
• keep tlb consistent
[Diagram: page tables, PM, sbuf, acc, mmu, tlb, CR3, core, R]

Translated Consistent Linear Memory + sbufs (lemma 10)
• many processors
• operating mode: paging enabled
• keep tlb consistent
[Diagram: LM, page tables, sbuf, acc, core, CR3, R]

C0: Pascal with C syntax
Configurations
• c = (pr, rd, lms, hm, gm)
  – pr: program rest
  – rd: recursion depth
  – lms: [0 : recursion depth] → {local memories}
  – hm: heap memory
  – gm: global memory
• subvariables
  – (m,i)[17].gpr[3]
• value of pointers: subvariables
[Diagram: memory m, va(c,(m,i)), ba(m,i), size(m,i)]

Parallel C
• c = (pr, rd, lms, hm, gm)
  – pr: program rest
  – rd: recursion depth
  – lms: [0 : recursion depth] → {local memories}
  – hm: heap memory
  – gm: global memory
• Share
  – gm
  – hm
• Interleave at small-step semantics steps
• Problem:
  – Processor interleaves instructions of compiled programs code(p)
[Diagram: memory m, va(c,(m,i)), ba(m,i), size(m,i)]

Simulation relation consis(c, alloc, d)
[Diagram: LM, alloc(c,y), y, alloc(c,p), p]
• Non-optimizing compiler: step-by-step simulation
• Optimizing compiler: simulation between IO-steps
• IO-steps (1): volatile accesses

Volatiles Sequentially Consistent (lemma 11)

Structured Parallel C
• Implement locks using volatiles
• IO-steps (2): lock release
• Run processors alone on locked portions of linear memory
• Lemma 1: sbufs invisible
• Lemma 10: Ordinary C code in linear memory

Summary
• Implement locks using volatiles
• IO-steps (2): lock release
• Run processors alone on locked portions of linear memory
• Lemma 1: sbufs invisible
• Lemma 10: Ordinary C code in linear memory
• Outlined correctness proof for implementation of structured parallel C
  – Initialisation
  – compilation
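
Illustration (not part of the original slides): the store buffer slides and the "sbufs invisible" lemma can be made concrete with a toy model. The sketch below is an assumption-laden simplification, not the formal model of the talk: each thread has a FIFO store buffer in front of a shared memory M, reads forward from the thread's own buffer, and a hypothetical `fence` instruction may only execute once that buffer is empty. The names `outcomes`, `relaxed`, and `fenced` are invented for this example. It enumerates every interleaving of the classic two-thread test (thread 0: write x, read y; thread 1: write y, read x).

```python
def outcomes(programs):
    """Explore all interleavings of a toy store-buffer machine.

    programs: one instruction list per thread. Instructions are
    ("w", addr, val), ("r", addr) or ("fence",). Each thread has a single
    register holding the result of its last read. Returns the set of
    final register tuples.
    """
    n = len(programs)
    init = (tuple(0 for _ in range(n)),     # program counters
            tuple(() for _ in range(n)),    # FIFO store buffers
            (),                             # memory as sorted (addr, val) pairs
            tuple(None for _ in range(n)))  # one register per thread
    results, seen, stack = set(), set(), [init]
    while stack:
        state = stack.pop()
        if state in seen:
            continue
        seen.add(state)
        pcs, bufs, mem, regs = state
        memory = dict(mem)
        finished = True
        for t in range(n):
            if bufs[t]:
                # Step kind 1: the oldest buffered write reaches memory.
                finished = False
                (addr, val), rest = bufs[t][0], bufs[t][1:]
                new_mem = dict(memory)
                new_mem[addr] = val
                stack.append((pcs,
                              bufs[:t] + (rest,) + bufs[t + 1:],
                              tuple(sorted(new_mem.items())),
                              regs))
            if pcs[t] < len(programs[t]):
                # Step kind 2: thread t executes its next instruction.
                finished = False
                op = programs[t][pcs[t]]
                next_pcs = pcs[:t] + (pcs[t] + 1,) + pcs[t + 1:]
                if op[0] == "w":
                    entry = (op[1], op[2])
                    new_bufs = bufs[:t] + (bufs[t] + (entry,),) + bufs[t + 1:]
                    stack.append((next_pcs, new_bufs, mem, regs))
                elif op[0] == "r":
                    val = memory.get(op[1], 0)   # unwritten memory reads as 0
                    for addr, buffered in bufs[t]:
                        if addr == op[1]:
                            val = buffered       # newest own write wins
                    new_regs = regs[:t] + (val,) + regs[t + 1:]
                    stack.append((next_pcs, bufs, mem, new_regs))
                elif op[0] == "fence" and not bufs[t]:
                    # A fence waits until the thread's buffer has drained.
                    stack.append((next_pcs, bufs, mem, regs))
        if finished:
            results.add(regs)
    return results


relaxed = outcomes([[("w", "x", 1), ("r", "y")],
                    [("w", "y", 1), ("r", "x")]])
fenced = outcomes([[("w", "x", 1), ("fence",), ("r", "y")],
                   [("w", "y", 1), ("fence",), ("r", "x")]])
print("with store buffers:", sorted(relaxed))
print("with fences:       ", sorted(fenced))
```

In this toy model the result (0, 0) is reachable only in the unfenced run; no interleaving of a memory without store buffers produces it. Draining the buffer before each read removes that result, which is the sense in which a store buffer can be made invisible under a suitable programming discipline.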
