SCALABILITY OF EXT2
Yancan Huang, Guoliang Jin
May 13, 2008

MOTIVATION
  [Graph for create]

MOTIVATION
  [Graph for open]

MOTIVATION
  Same method, different graphs.
  Code for create:
      asmlinkage long sys_creat(const char __user *pathname, int mode)
      {
          return sys_open(pathname, O_CREAT | O_WRONLY | O_TRUNC, mode);
      }
  Code for open:
      sys_open(pathname, O_RDWR);
  Why?
  Create: where does the time go?

OVERVIEW
  Motivation
  What is scalability of file system
  Experiment environment
      Setup
      Techniques
      Benchmark
  Create: where does the time go
      The file create process
      What does ext2_lookup do
      What does ext2_create do
  The file open process
  Conclusion & Future work

WHAT IS SCALABILITY OF FILE SYSTEM
  Large File Systems
  Large, Sparse Files
  Large, Contiguous Files
  Large Directories
  Large Numbers of Files

EXPERIMENT ENVIRONMENT
  Setup: UML (User-Mode Linux)
      Mount our own ext2 file system, called ext2k
      With a 1GB empty virtual disk
  Measuring techniques
      gettimeofday
      long long c; __asm__ __volatile__ ("rdtsc" : "=A" (c));
  Output techniques
      printk
      write to log on host
  Benchmark
      Sequential create and open on a 1GB disk

CREATE: WHERE DOES THE TIME GO
      asmlinkage long sys_creat(const char __user *pathname, int mode)
      {
          return sys_open(pathname, O_CREAT | O_WRONLY | O_TRUNC, mode);
      }

THE FILE CREATE PROCESS
  The process of sys_open
      sys_open {                          // fs/open.c
          do_sys_open {                   // fs/open.c
              getname
              get_unused_fd_flags
              do_filp_open {              // fs/open.c
                  open_namei              // fs/namei.c
                  nameidata_to_filp
              }
              fsnotify_open
              fd_install
              putname
          }
          prevent_tail_call
      }

THE FILE CREATE PROCESS
  When open_namei meets file create
      open_namei {
          ……
          path_lookup_create
          lookup_hash
          open_namei_create
          if (!error)
              return 0;
          ……
      }
  Time shares annotated on the slide: 45%, 54%

THE FILE CREATE PROCESS
  The process of lookup_hash
      lookup_hash {
          permission
          __lookup_hash {
              cached_lookup                   // always fails for create
              struct dentry *new = d_alloc
              dentry = inode->i_op->lookup
              if (!dentry)                    // always true for create
                  dentry = new
              return dentry
          }
      }

THE FILE CREATE PROCESS
  The process of open_namei_create
      open_namei_create {
          vfs_create {
              may_create
              security_inode_create
              dir->i_op->create               // dir is an inode
              fsnotify_create
          }
          may_open
      }

WE ARE NOW IN EXT2
  Thanks to inode->i_op->lookup: going to ext2_lookup
  Thanks to inode->i_op->create: going to ext2_create

WHAT DOES EXT2_LOOKUP DO
  The process of ext2_lookup
      ext2_lookup {
          ext2_inode_by_name {
              ext2_find_entry {
                  ……
              }
          }
          iget
          d_splice_alias
      }

WHAT DOES EXT2_FIND_ENTRY DO
  The process of ext2_find_entry
      ext2_find_entry {
          do {
              ext2_get_page
              ext2_last_byte
              while () {
                  ext2_match
                  ext2_next_entry
              }
          } while ()
      }

WHERE WE ARE NOW
  When open_namei meets file create
      open_namei {
          ……
          path_lookup_create
          lookup_hash
          open_namei_create
          if (!error)
              return 0;
          ……
      }

WHAT DOES EXT2_CREATE DO
  The process of ext2_create
      ext2_create {
          ext2_new_inode
          mark_inode_dirty
          ext2_add_nondir {
              ext2_add_link
              d_instantiate
          }
      }

WHAT DOES EXT2_ADD_LINK DO
  The process of ext2_add_link
      ext2_add_link {
          for () {
              ext2_get_page
              ext2_last_byte
              while () {
                  ext2_match
                  ext2_rec_len_from_disk
              }
          }
          __ext2_write_begin
          ext2_commit_chunk
      }

HOW OFTEN ARE THESE WHILE LOOPS EXECUTED

REVISIT THE CREATE GRAPH

THE FILE OPEN PROCESS
  The process of sys_open
      sys_open {                          // fs/open.c
          do_sys_open {                   // fs/open.c
              getname
              get_unused_fd_flags
              do_filp_open {              // fs/open.c
                  open_namei              // fs/namei.c
                  nameidata_to_filp
              }
              fsnotify_open
              fd_install
              putname
          }
          prevent_tail_call
      }

THE FILE OPEN PROCESS
  When open_namei meets file open
      open_namei {
          ……
          if (!(flag & O_CREAT)) {
              error = path_lookup_open(dfd, pathname, lookup_flags(flag), nd, flag);
              if (error)
                  return error;
              goto ok;
          }
          ……
      }

THE FILE OPEN PROCESS
  The process of path_lookup_open
      path_lookup_open {
          __path_lookup_intent_open {
              get_empty_filp
              do_path_lookup {
                  path_walk {
                      link_path_walk {
                          ……
                      }
                  }
              }
          }
      }

THE FILE OPEN PROCESS
  The process of link_path_walk
      link_path_walk {
          __link_path_walk                // in the dcache
          if (fail) {
              dget
              mntget
              __link_path_walk            // force real lookup requests
          }
      }

REVISIT THE OPEN GRAPH

CONCLUSION
  In create, ext2_lookup makes sure there won't be two files with the same name, and ext2_add_link then performs a similar scan again
  The directory entry (dentry) structure of ext2 is linear
  Using a B-tree to manage this structure in memory would likely give better performance
  Another scalability issue in ext2 is that the number of inodes is fixed when the disk is formatted

FUTURE WORK
  Try using a B-tree to manage the dentry structure and test the performance
      But the B-tree itself is complex
  Try more workloads

QUESTIONS?