Practicum 1

Linux Project
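National Central University, Department of Computer Science and Information Engineering, second-year master's student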
江瑞敏
Outline
• How to compile the Linux kernel
• How to add a new system call
• Some example projects and ways to solve them
– System Call Hooking by Module
– Project about Memory
– Project about Process
Download Link
• wget https://www.kernel.org/pub/linux/kernel/v2.6/linux-2.6.18.tar.bz2
• tar xvf linux-2.6.18.tar.bz2
The Beginning of everything
Compile Linux Kernel
Is It Hard?
No, if you understand the concept
The Basic Process
0. make mrproper
1. make oldconfig
2. make -j[n]
3. make modules_install
4. make install
5. reboot
Do You Know What It Means?
make mrproper
• Clean up the environment
• Will Remove almost everything, except….
make clean
• Similar to make mrproper, but less thorough: it keeps the configuration file (.config).
make oldconfig
• Reuse an existing configuration file (.config), typically a copy of the one the current kernel is using.
• Some other alternative options.
– make menuconfig
–…
Is the Config File Important?
Config file
• Determines which kind of kernel you are compiling
• Determines which modules you want the kernel to compile
• Misconfiguration can lead to a kernel crash
make -j[n]
• Compile the whole source code according to
your configuration
make modules_install
• Install the modules into the necessary folder.
– /lib/modules/`uname -r`/
make install
• Install the image into the boot directory.
• Sometimes updating the GRUB configuration is also necessary.
What Is a System Call
It’s a bridge between the user and the devices.
Why System Call
Pop Quiz :
Write A Program To Print “Hello World”
What You May Write
What Actually Happened ….
User Application → printf → libc.so → System Call → Kernel Code → Device Driver → IO Device
What If There Is No System Call
Everything would have to be done with the x86 in and out instructions
Let’s Focus On …
User Application → printf → libc.so → System Call → Kernel Code → Device Driver → IO Device
Magic int 0x80
Before We Talk Further,
Let’s Talk About X86 Architecture
X86 Architecture Is Interrupt Driven
[Figure: several devices connect through the 8259 PIC to the CPU; the kernel and its device drivers sit between the CPU and the user application.]
How Does the CPU Find the Address of the Device Driver Code?
Callback Mechanism
[Figure: a physical device signals the 8259 PIC, which interrupts the CPU; the CPU looks up the Interrupt Descriptor Table, whose entries point to device driver code in the kernel.]
How About System Call
Magic int 0x80
[Figure: int 0x80 reaches the CPU in the same way as a device interrupt coming from the 8259 PIC; entry 0x80 of the Interrupt Descriptor Table points to the system call handler, which uses syscall_table to select one of the system call handlers.]
[Figure, animation in four steps: what the CPU does on int 0x80]
1. The user application executes int 0x80; the CPU holds cs, ds, ss, esp and eip of the user program.
2. Get TSS: the CPU finds the TSS through the GDT to locate the kernel stack.
3. Get IDT: entry 0x80 of the IDT leads to ENTRY(system_call), which uses sys_call_table.
4. The kernel stack now holds ss, esp, eflags, cs and eip.
How To Add A System Call
Add a System Call
1. cd $kernel_src
2. Edit arch/i386/kernel/syscall_table.S
3. Append the new entry at the end of the table:
….
.long sys_tee /* 315 */
.long sys_vmsplice
.long sys_move_pages
.long sys_project /* 318 */
• Kernel.org/pub/linux/kernel
Add a System Call
• Edit linux/include/asm-i386/unistd.h
• #define __NR_vmsplice 316
#define __NR_move_pages 317
#define __NR_project 318
#ifdef __KERNEL__
#define NR_syscalls 319
Add a System Call
• Edit linux/include/linux/syscalls.h
• asmlinkage long sys_set_robust_list(struct robust_list_head __user *head, size_t len);
asmlinkage long sys_project( int i );
#endif
Add a System Call
• cd linux/kernel
• touch project.c
• Edit Makefile and add project.o to obj-y:
obj-y = project.o sched.o fork.o \
        exec_domain.o panic.o printk.o profile.o
Add a System Call
• project.c
• #include <linux/linkage.h>
#include <linux/kernel.h>
asmlinkage long sys_project( int i ){
    printk( "Success!! -- %d\n", i );
    return 0;
}
Add a System Call
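• Recompile the Linux kernel
• Reboot
• Create a new file “test.c”
• #include <stdio.h>
#include <unistd.h>
#include <sys/syscall.h>
int main(){
    long ret = syscall( 318, 2 ); /* 318 = __NR_project */
    printf( "ret = %ld\n", ret ); /* 0 on success; the printk output shows up in dmesg */
    return 0;
}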
Add a System call
• http://in1.csie.ncu.edu.tw/~hsufh/COURSES/FALL2007/syscall.html
About 64 bits
• The idea is the same
• There are many online references
• Therefore, I will not cover it in these slides.
System Call Hooking by Module
System Call Hooking
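[Figure, frame 1: a user-mode program invokes a system call; the NR_execve entry of sys_call_table points to the normal execve code.]
System Call Hooking
[Figure, frame 2: the NR_execve entry of sys_call_table now points to the hooking code instead of the normal execve code.]
System Call Hooking
[Figure, frame 3: same as frame 2, with a "Modified execve" box added alongside the hooking code and the normal execve code.]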
Source code links
• http://pastebin.com/rShUxvB5
• http://pastebin.com/KEJxgLGq
Project about Memory
Level 1:
Dump the virtual address of a process
Some Questions You May Ask
Where to Start?
Maybe Add a New System Call
1. How to find the process you want?
Process List
• task_struct
• for_each_process()
• If you paid attention in class, these two should not be strangers :)
2. What about the virtual addresses that are being used by the current process?
The Data Structure
• mm_struct
• vm_area_struct
lxr.linux.no
What It Looks Like
The rest is just basic programming skill :)
Too easy,
Let’s make it a little bit harder
Level 2:
Dump the physical frame that is
associated with the virtual address.
New Problem, New question
How to Translate a Virtual Address to a
Physical Address?
Some Reminders and Hints
Where is CR3?
Now We Have CR3,
Then?
Calculate By Yourself
or
Something Smarter
follow_page()
Push Yourself More
Level 3:
Log this information to a file
Ok,
let’s type
dmesg | grep "myproject" >> log.txt
Dude
Are you…
…. From Kernel of course
Can We Do That???
How to Write a File in User Mode
• fd = open(filename, O_WRONLY | O_CREAT | O_APPEND, 0644);
• write(fd, string, strlen(string));
• close(fd);
How about Kernel Mode
open -> do_sys_open()
write -> sys_write()
close -> sys_close()
Is that all?
The magic __user
It tells the kernel that the parameter should
be passed from user mode
It’s a protection mechanism
Final Step About this Project
Level 4:
Modify the PTE r/w flag
from read/write to read-only
http://in1.csie.ncu.edu.tw/~hsufh/COURSES/FALL2012/linux_project1.html
Structures of Page Directories And
Page Tables Entries
Wow, Looks Simple :D
Basic Idea
• 1. Walk the translation tables of the process for each virtual address.
• 2. After finding the PTE, change the read/write flag.
• 3. Done
Code Implementation
• pte_wrprotect()
Code Implementation (Cont.)
for (loop_count = addr; loop_count < end; loop_count += PAGE_SIZE) {
    pgd = pgd_offset(mm, loop_count);
    if (pgd_none(*pgd)) {
        printk("pgd none happened\n");
        continue;
    }
    pud = pud_offset(pgd, loop_count);
    pmd = pmd_offset(pud, loop_count);
    pte = pte_offset_map_lock(mm, pmd, loop_count, &ptl);
    if (operation == 1) {
        *pte = pte_mkwrite(*pte);    /* set the r/w flag */
    } else {
        *pte = pte_wrprotect(*pte);  /* clear the r/w flag */
    }
    pte_unmap_unlock(pte, ptl);      /* release the lock taken by pte_offset_map_lock */
}
Result
What!?
Use Printk to Verify
Printk Tell Us Two Things
1. We did change the PTE r/w flag.
2. In most cases only one entry was changed
back; the others were not.
Did Magic Happen?
Now,
Imagine you are CPU
What will happen when
a process tries to write to a read-only
area?
A Page Fault Happens
The Question Becomes,
How Does Linux Handle a Page Fault?
You Might Ask,
What Is a Page Fault?
From the CPU's Point of View
1. The present flag of the pgd or pte is clear.
2. Code running in user mode attempts to write to a
read-only page.
– For more details, check the Intel programmer's manual.
From the Kernel's Point of View
1. The present flag is clear:
• A. The page is being accessed for the first time.
• B. The page has been swapped out.
2. A write to a read-only page:
• A. The process really is writing to a read-only page.
• B. It is a page-fault optimization such as copy on write.
How Does the Linux Kernel Tell
These Cases Apart?
Well, First….
And This
Then This
What The FxxK…….
This Time Let’s Look Closer
Now We Know An Important Thing
The Linux Kernel Will Check the
vm_flags
Some Useful Knowledge
How Linux Implements COW
Cow??
Moo ?
• 1. COW refers to copy on write
• 2. Google and Wikipedia are your friends
• 3. How Linux implements copy on write:
– A. the PTE r/w flag is disabled
– B. (vm_flags & VM_WRITE) != 0
Our project accidentally matches
the above conditions!
• 1. The same page table entries of the parent and child
processes point to the same pfn.
• 2. The r/w flag of both PTEs is set to read-only.
• 3. When a page fault happens, the page fault handler
checks the vm_flags of the faulting virtual
address.
• 4. If vm_flags has VM_WRITE, the page fault handler
treats this situation as a COW condition.
• 5. It assigns a new pfn with the r/w flag enabled if
two PTEs point to the old one.
Copy on Write: Linux Implementation
[Figure: the task_struct of the parent and the task_struct of the child each have their own pgd and pte; the PTEs point into physical memory, shown as Pfn N, Pfn (N+1) and Pfn (N+2).]
A New Idea of The Project
1. Change PTE r/w flag as we just did
2. Change the vm_flag as well
Code Implementation
down_write(&current->mm->mmap_sem);
vma = find_vma(mm, addr);
vm_start = vma->vm_start;
vm_end = vma->vm_end;
mask = VM_READ|VM_WRITE|VM_EXEC|VM_SHARED;
new_flags = VM_READ;
old_flags = vma->vm_flags;
if (old_flags & VM_WRITE) {
    old_flags &= ~(VM_WRITE);
    new_flags |= old_flags;
} else {
    new_flags |= old_flags;
}
prot = protection_map[new_flags & mask];
vma->vm_flags = new_flags;
vma->vm_page_prot = prot;
up_write(&current->mm->mmap_sem);
addr &= PAGE_MASK;
change_pte(addr, end, operation);
Result
Where is the “press enter
to continue” ?
It’s time to use GDB
Set a breakpoint before the syscall happens
It seems that this time printf causes the error
Here is the problem.
Think Slowly
Calling printf requires pushing some
parameters onto the stack
Recall From The Last Code
• We changed vm_flags for the whole
vm_area_struct, which means the entire block
of linear addresses.
• The address of the array is not always aligned to 4 KB.
Consider the Following Cases
Start address aligned, end address aligned
Start address aligned, end address not aligned
[Figure: stack from low to high; test_array lies between start addr and end addr and needs 3 pages in total; a problem area may occur in the partly used page.]
Start address not aligned, end address aligned
[Figure: stack from low to high; test_array lies between start addr and end addr and needs 3 pages in total; a problem area may occur in the partly used page.]
Start address not aligned, end address not aligned
[Figure: stack from low to high; test_array lies between start addr and end addr and needs 4 pages in total; problem areas may occur in both partly used pages.]
Our case
[Figure: stack from low to high]
Assembly code:
…..
call syscall
push $string
call printf
The parameter is pushed right here, onto a page that is now read-only (RO).
Verify Our Thoughts (Test case 1)
• Rewrite the user-mode program. This time use
malloc instead of a local variable (heap instead
of stack).
• char *test_array;
• test_array = (char *)malloc(ARRAY_SIZE);
Test Case 1 Result
Verify Our Thoughts (Test case 2)
• char test1[0x2000];
• char test_array[ARRAY_SIZE];
• char test2[0x2000];
• This can also avoid the conditions that I just
mentioned.
Test Case 2 Result
It also works~~
How About mprotect.c
• 1. Basically, the idea is the same.
– A. change vm_flags
– B. change the PTE r/w flag
• 2. Some hints:
– A. Strongly recommend reading the textbook
• Chapter 8: Memory Management
• Chapter 9: Process Address Space
– B. The code that changes vm_flags is in mprotect_fixup().
– C. The code that loops through the translation tables starts from
change_protection(….)
-> change_pud_range(….)
-> change_pmd_range(…..)
-> change_pte_range(…..)
Full Source
• Level 1 and 2 :
http://pastebin.com/wEVLaQyg
• Level 3:
http://pastebin.com/HFW8WTN5