SEARCHING DATA STRUCTURES
•Consider a set of data with N data items stored in some data structure
•We must be able to insert, delete & search for items
•What are possible ways to do this? What is the complexity of each structure & method?

DATA STRUCTURES
•Unsorted Array
•Sorted Array
•Linked List
•Binary Search Tree
•Heap
•What are the advantages & disadvantages of each?
•Time complexities of each?
•What about memory requirements?

SOFTWARE DEVELOPMENT
•Ask THESE questions – always!
•What if a data structure already exists & you MUST use it?
  • You must be able to use it as efficiently as possible.

HASHING
•Technique for data storage & retrieval in constant time
•Do you believe it???
•Perfect Hashing fits this definition, but not hashing in general

HASHING
•A storage & retrieval technique in which a data item (key) is converted to the address at which it will be stored. The same conversion is used to retrieve the data.
•Example – MSU M-number – 8 digits – index into an array
  • Phone numbers?
  • Problem?

HASH FUNCTION
•Mathematical operation which converts a search key into a hash table address
  • A modulo function is OFTEN used as part of the hash function
•Examples:
  • M-number ~ Table size

COLLISION
•A collision occurs when 2 different search keys hash to the same table address.
•Collision Resolution Policy (CRP) – strategy for selecting an alternate location for a hashed item that cannot be placed at its computed table address
•CRP – affects the complexity of the hashing process
  • Examples

OPEN ADDRESSING - CRP
•Select an alternate location in the table
•Linear Probing – beginning at the original hash location, sequentially search the table for an available location {+1}
  • Incremental probing – use a value other than 1
•Double Hashing – use a second function to determine the probe increment {+f(n)}

OTHER CRP
•Bucket Hashing – each hash address is actually a set of table locations
•Chaining – a linked list at each hash address contains all keys that hash there
•Table format for each?

TABLE SIZE??
•How big should a hash table be?
•How full should the table get?
•Implications of table size?

MEASURING HASH PERFORMANCE
•Hash Function Complexity?
•Probes: number of hash table locations “probed” (checked) before finding an empty location (CRP)
  • Consider the AVERAGE for a large data set
•Table size: want the smallest table that still requires few probes

OUR SEMESTER PROJECT – HASHING
•Analysis
•Empirical Studies
  • Table sizes
  • CRPs
  • Functions
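
The collision resolution policies and the probe count described in the slides above can be tried out with a small sketch like the one below. It is only an illustration under stated assumptions, not the course's reference implementation: every function name is hypothetical, the primary hash is plain key modulo table size, the second hash `1 + key % (size - 1)` is one common textbook choice, and the table size (11, a prime) and the keys are made up for the demo. Deletion is left out on purpose, since open addressing needs extra care (for example tombstone markers) to delete safely.

```python
def hash_key(key, table_size):
    """Primary hash (assumed): key modulo table size."""
    return key % table_size


def linear_step(key, table_size):
    """Linear probing: always move ahead by 1."""
    return 1


def double_hash_step(key, table_size):
    """Double hashing: a second function picks the increment (never 0)."""
    return 1 + (key % (table_size - 1))


def insert_open(table, key, step_fn):
    """Insert key with open addressing and return the number of probes.

    step_fn(key, size) supplies the probe increment, so the same loop
    covers linear, incremental, and double-hash probing.
    """
    size = len(table)
    index = hash_key(key, size)
    step = step_fn(key, size)
    for probes in range(1, size + 1):
        if table[index] is None:      # empty slot found
            table[index] = key
            return probes
        index = (index + step) % size  # collision: try the next location
    raise RuntimeError("no free slot found along the probe sequence")


def search_open(table, key, step_fn):
    """Return the index of key, or -1. Valid only if nothing was deleted."""
    size = len(table)
    index = hash_key(key, size)
    step = step_fn(key, size)
    for _ in range(size):
        if table[index] is None:      # an empty slot ends the probe sequence
            return -1
        if table[index] == key:
            return index
        index = (index + step) % size
    return -1


def insert_chained(table, key):
    """Chaining: each address holds a list of all keys that hash there."""
    table[hash_key(key, len(table))].append(key)


def average_probes(keys, table_size, step_fn):
    """Insert all keys into a fresh table; return (average probes, table)."""
    table = [None] * table_size
    counts = [insert_open(table, k, step_fn) for k in keys]
    return sum(counts) / len(counts), table


# Made-up keys chosen so that several collide at addresses 0 and 1.
KEYS = [22, 33, 44, 12, 23]
SIZE = 11

linear_avg, linear_table = average_probes(KEYS, SIZE, linear_step)
double_avg, double_table = average_probes(KEYS, SIZE, double_hash_step)

chained_table = [[] for _ in range(SIZE)]
for k in KEYS:
    insert_chained(chained_table, k)

print("linear probing, average probes:", linear_avg)   # 2.6
print("double hashing, average probes:", double_avg)   # 1.8
print("chain at address 0:", chained_table[0])         # [22, 33, 44]
```

With these particular keys, linear probing piles the colliding items into consecutive slots and needs more probes on average than double hashing. Repeating the run with other table sizes and larger key sets is one simple way to explore the table size and load questions the slides raise.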