Lecture 51: Process Address Space
Ch 14 Process Address Space

Address Space of an Example Process

My Program

    #include <stdio.h>

    int i = 1;

    int main(int argc, char *argv[])
    {
        printf("%d", i);
        return 0;
    }

Memory areas of the running program (sizes as labeled on the slide):

    Object            Section   Address range        Permissions
    my code (92KB)    text      08048000-08049000    r-x-p
                      data      08049000-0804a000    rw--p
    ld (92KB)         text      40000000-40013000    r-x-p
                      data      40013000-40014000    rw--p
                      bss       40014000-40016000    rw--p
    libc (1232KB)     text      4001c000-40109000    r-x-p
                      data      40109000-4010d000    rw--p
                      bss       4010d000-40111000    rw--p
    stack (8KB)       stack     bfffe000-c0000000    rwx-p

    r: readable   w: writable   x: executable
    s: shared     p: private (copy on write)

• Each interval has its own permissions and purpose
• Intervals of legal addresses are called "memory areas" (VMA)

Example of Address Mapping

    $ ./a.out >null &
    [1] 673

The address map of the running process is in /proc/<pid>/maps.

    $ cat /proc/673/maps
    08048000-08049000 r-xp 00000000 08:21 6160562 /home/trinite/a.out
    08049000-0804a000 rw-p 00000000 08:21 6160562 /home/trinite/a.out
    40000000-40013000 r-xp 00000000 08:01 917     /lib/ld-2.1.3.so
    40013000-40014000 rw-p 00012000 08:01 917     /lib/ld-2.1.3.so
    40014000-40016000 rw-p 00000000 00:00 0
    4001c000-40109000 r-xp 00000000 08:01 923     /lib/libc-2.1.3.so
    40109000-4010d000 rw-p 000ec000 08:01 923     /lib/libc-2.1.3.so
    4010d000-40111000 rw-p 00000000 00:00 0
    bfffe000-c0000000 rwxp fffff000 00:00 0

The first two lines are my code, the next three are the loader (ld), the next three are the library (libc), and the last line is the stack.

struct task_struct

    struct task_struct {
        volatile long state;              /* -1 unrunnable, 0 runnable, >0 stopped */
        struct thread_info *thread_info;
        unsigned long flags;              /* per process flags, defined below */
        int prio, static_prio;
        struct list_head tasks;
        struct mm_struct *mm, *active_mm;
        struct task_struct *parent;       /* parent process */
        struct list_head children;        /* list of my children */
        struct list_head sibling;         /* linkage in my parent's children list */
        struct tty_struct *tty;           /* NULL if no tty */
        /* ipc stuff */
        struct sysv_sem sysvsem;
        /* CPU-specific state of this task */
        struct thread_struct thread;
        /* filesystem information */
        struct fs_struct *fs;
        /* open file information */
        struct files_struct *files;
        /* namespace */
        struct namespace *namespace;
        /* signal handlers */
        struct signal_struct *signal;
        struct sighand_struct *sighand;
    };

[Figure: thread_info sits on the kernel stack and its task field points to task_struct. task_struct holds the pointers tty, files, fs and mm. The mm field points to mm_struct (the address space), whose mmap field heads a list of vm_area_structs linked by vm_next.]

struct mm_struct

    struct mm_struct {                    /* memory descriptor of a process */
        struct vm_area_struct *mmap;      /* singly linked list of VMAs */
        struct rb_root mm_rb;             /* balanced binary tree of VMAs */
        atomic_t mm_users;                /* How many users with user space? */
        atomic_t mm_count;                /* How many references to "mm_struct" */
        int map_count;                    /* number of VMAs */
        struct rw_semaphore mmap_sem;
        spinlock_t page_table_lock;       /* Protects task page tables and .. */
        struct list_head mmlist;          /* List of all active mm's. */
        unsigned long start_code, end_code, start_data, end_data;
        unsigned long start_brk, brk, start_stack;
        unsigned long arg_start, arg_end, env_start, env_end;
        unsigned long rss, total_vm, locked_vm;
        unsigned long def_flags;
        ...
    };

Memory Descriptor

• mm_struct represents the process's address space
• Each process receives a unique mm_struct
• It consists of (has pointers to) several VMAs (memory areas)
• A process may share its address space with its children
  – clone() with the CLONE_VM flag
  – the child is then called a "thread" (LWP)
  – in that case no new mm_struct is allocated

Reaching Memory Areas

mm_struct (one per process) gives two ways to reach the vm_area_structs:

• mmap field: singly linked list of vm_area_structs, linked by vm_next (to visit every node)
• mm_rb field: balanced binary tree of vm_area_structs (to visit a specific node)

[Figure: task_struct.mm points to mm_struct. The mmap list and the mm_rb tree both reference the same VMAs (my part, ld, libc, and others).]

VMA (Memory Area)

[Figure: task_struct.mm points to mm_struct, whose mmap field leads to one vm_area_struct per area (text, data, stack). Each vm_area_struct records start_address, end_address, permission, file, and operations (page fault, add, remove).]

Memory Area

    struct vm_area_struct {
        unsigned long vm_start;
        unsigned long vm_end;
        struct vm_operations_struct *vm_ops;
        struct mm_struct *vm_mm;
        struct vm_area_struct *vm_next;
        struct file *vm_file;
        ...
    };

• vm_start: the initial address in the interval
• vm_end: the first address past the end of the interval (so the interval is [vm_start, vm_end))
• vm_ops: operations associated with this VMA
• vm_mm: points back to this VMA's associated mm_struct
• vm_next: next VMA in the list of VMAs
• vm_file: the file we map to

VMA (memory area)

• Definition
  – intervals of legal memory addresses
  – where the process has permission/purpose to access
• Content
  – text, data, bss, stack, memory mapped files, ...
• The kernel can dynamically add/delete memory areas
  – e.g. "add memory mapped file", "remove shared memory", etc.
• If two VMAs
  – have adjacent addresses and
  – have the same permissions
  then merge them

Other Fields

• mm_struct (per process), reached through the mm field of struct task_struct
  – pgd: the page mapping table
• vm_ops field of a VMA points to a vm_operations_struct
  – nopage: used by the page fault handler when no page is found
  – open: called when the memory area is added to an address space
  – close: called when the memory area is removed from an address space
  – ...
Kernel thread - Memory Descriptor

• A kernel thread does not have a process address space (no user context)
• Its mm field == NULL
• But kernel threads still need some data, such as page tables
• To provide it, kernel threads use the memory descriptor of the task that ran previously (through active_mm)

Paging

Address Space & Page Table Size

• Size of address space
  – Assume 12 bits for displacement (4 KB page)
• 16 bit machine
  – 4 bits for page address
  – Page table per process: 2^4 entries
• 32 bit machine
  – 20 bits for page address
  – Page table per process: 2^20 entries
• 64 bit machine
  – 52 bits for page address
  – Page table per process: 2^52 entries
• The mapping table is too big and too sparse

Address Space per Process

[Figure: address spaces of a 16 bit, a 32 bit and a 64 bit process side by side. The same few regions (T, D, L, S) occupy a shrinking fraction of the space as the address width grows.]

• Assuming 4 KB pages (12 bits for offset)
  – a 32 bit machine needs 2^20 entries for the page table
  – a 64 bit machine needs 2^52 entries for the page table
• Too large a space for each process
• Too sparse
• Too much memory wasted for (unused) page tables

One-Level vs Two-Level Page Table

• One level: Page_no(20) | Offset(12)
  – a single PTE table with 1024 x 1024 entries
• Two levels: Dir_no(10) | Page_no(10) | Offset(12)
  – a directory with 1024 entries
  – each directory entry points to a PTE table with 1024 entries
  – fully populated, this is still 1024 x 1024 PTEs
  – but only PTE tables that are actually used are allocated: (4 x 1024) entries for the four regions in the figure (T, D, L, S)

Two-Level Address Layout (32 bit)

    bits 31-22   Dir_no(10)    index into the page directory (1024 entries)
    bits 21-12   Page_no(10)   index into the page table (1024 entries)
    bits 11-0    Offset(12)    offset within the page itself (4 KB)

• If a directory entry is NULL, no page table exists for that range

Paging in Linux

• For 64 bit addresses, one more directory level is used (4 parts)
  – page global directory
  – page middle directory
  – page table (PTE)
  – offset
• The size of each part depends on the architecture
• For 32 bit, Linux eliminates the page middle directory
• The same code can work on 32 bit and 64 bit machines

    32 bit:  Dir_no(10) | Page_no(10) | Offset(12)
    64 bit:  global_directory | middle_directory | Page_no | Offset(12)

Page Mapping Table

[Figure: task_struct.mm points to mm_struct (per process). Its pgd field points to the page directory, whose entries point to PTE tables, whose entries point to pages. The same mm_struct also heads the vm_area_struct list (vm_next), where each VMA records start_address, end_address, permission, file, and operations (page fault(), add_vma, remove_vma) reached through vm_ops.]

Functions for Process Address Space

Allocating Memory Descriptor

• During fork(), the memory descriptor is allocated
  – do_fork() -> copy_process() -> copy_mm()
  – copy_mm():
    • If normal process,
      – the mm_struct structure is allocated
      – from the mm_cachep slab cache via allocate_mm()
    • If thread (CLONE_VM),
      – allocate_mm() is not called
      – the mm field is set to point to the parent's memory descriptor

Destroying Memory Descriptor

• exit_mm() -> mmput()
  – mmput():
    • decrease mm_users
    • if mm_users is zero, mmdrop() is called
  – mmdrop():
    • decrease mm_count
    • if mm_count is zero, free_mm() is invoked to return the mm_struct to the mm_cachep slab cache via kmem_cache_free()

Manipulating Memory Areas

• Creating a VMA
  – do_mmap()
    • is used by the kernel to create a new VMA
    • Is the new interval adjacent to an existing interval?
      – if they share the same permissions, the two intervals are merged into one
    • otherwise, a new VMA is created
  – mmap() system call
    • do_mmap() is exported to user space via the mmap() system call
    • actually the real name of the system call is mmap2()

Manipulating Memory Areas

• Removing a VMA
  – do_munmap()
    • is used by the kernel to remove a VMA
  – munmap()
    • do_munmap() is exported to user space via the munmap() system call

Manipulating Memory Areas

• find_vma()
  – looks up the first VMA that satisfies (addr < vm_end)
  – so it finds the first VMA that either contains addr
  – or begins at an address greater than addr
• find_vma_prev()
  – same as find_vma(), but also returns a pointer to the previous VMA
• find_vma_intersection()
  – returns the first VMA that overlaps a given address interval