SEARCHING DATA STRUCTURES
•Consider a set of data with N data items stored in some data
structure
•We must be able to insert, delete & search for items
•What are the possible ways to do this? What is the complexity
of each structure & method?
DATA STRUCTURES
•Unsorted Array
•Sorted Array
•Linked List
•Binary Search Tree
•Heap
•What are advantages &
disadvantages of each?
•Time complexities of each?
•What about memory
requirements?
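
To make the time-complexity question concrete, here is a minimal sketch comparing search in an unsorted array with search in a sorted array. The function names and sample data are illustrative assumptions, not part of the slides.

```python
from bisect import bisect_left

def linear_search(items, target):
    """Unsorted array: check each element in turn, O(N) in the worst case."""
    for index, value in enumerate(items):
        if value == target:
            return index
    return -1

def binary_search(sorted_items, target):
    """Sorted array: halve the search range each step, O(log N)."""
    index = bisect_left(sorted_items, target)
    if index < len(sorted_items) and sorted_items[index] == target:
        return index
    return -1

unsorted = [42, 7, 19, 3, 88]                 # hypothetical sample data
print(linear_search(unsorted, 19))            # prints 2
print(binary_search(sorted(unsorted), 19))    # prints 2 (sorted: 3, 7, 19, 42, 88)
```

The trade-off is that the sorted array searches faster but must shift elements on insert and delete, while the unsorted array inserts cheaply but searches slowly.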
SOFTWARE DEVELOPMENT
•Ask THESE questions – always!
•What if a data structure already exists & you MUST use it?
• You must be able to use it as efficiently as possible.
HASHING
•Technique for data storage & retrieval in
constant time, O(1)
•Do you believe it???
•Perfect hashing fits this definition, but hashing in
general does not
HASHING
•A storage & retrieval technique in which a data item (key) is
converted to the address at which it will be stored. The
same conversion is used to retrieve the data.
•Example – MSU M-number – 8 digits – index into an array
• Phone numbers?
• Problem?
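
One way to picture the M-number example is a table with one slot for every possible key, so the key itself is the address. The sketch below assumes 4-digit keys instead of 8 to keep the demo small; `KEY_DIGITS`, `store`, `retrieve`, and the stored names are hypothetical.

```python
KEY_DIGITS = 4                          # assumption: shortened from 8 digits for the demo
table = [None] * (10 ** KEY_DIGITS)     # one slot per possible key

def store(key, record):
    """The key is used directly as the array index."""
    table[key] = record

def retrieve(key):
    """The same conversion (none at all) finds the record again."""
    return table[key]

store(1234, "Ada")
store(42, "Grace")

used = sum(slot is not None for slot in table)
print(retrieve(1234), used, len(table))   # prints: Ada 2 10000
```

One likely answer to the "Problem?" question shows up in the last line: only 2 of 10,000 slots are in use. With real 8-digit keys the array would need 10^8 slots, almost all of them empty.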
HASH FUNCTION
•Mathematical operation which converts a search key into a
hash table address
• The modulo function is OFTEN used as part of the hash function
•Examples:
• M-number mod table size
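
Reading the example above as "M-number modulo table size", a hash function of that kind might look like the following sketch. The table size of 101 and the sample key are assumptions chosen for illustration.

```python
TABLE_SIZE = 101   # assumption: a small prime table size

def hash_address(m_number, table_size=TABLE_SIZE):
    """Map a key to a table address in the range 0 .. table_size - 1."""
    return m_number % table_size

print(hash_address(12345678))   # prints 44
```

Whatever the key, the result is always a valid index into a table of `table_size` slots, which is why modulo so often appears as the last step of a hash function.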
COLLISION
•A collision occurs when 2 different search keys hash to the
same table address.
•Collision Resolution Policy (CRP) – strategy for selecting an
alternate location for the hashed item that cannot be placed in
the computed table address
•The CRP affects the complexity of the hashing process
• Examples
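
As one possible example, the sketch below shows two different keys hashing to the same address under a modulo hash. The table size, the keys, and the `try_insert` helper are hypothetical.

```python
TABLE_SIZE = 101                     # assumption: small prime table size
table = [None] * TABLE_SIZE

def hash_address(key):
    return key % TABLE_SIZE

def try_insert(key):
    """Place key at its computed address; return False if a different key is already there."""
    address = hash_address(key)
    if table[address] is None or table[address] == key:
        table[address] = key
        return True
    return False                     # collision: a CRP would now choose another location

print(hash_address(12345678), hash_address(12345779))   # prints: 44 44
print(try_insert(12345678))                              # prints True
print(try_insert(12345779))                              # prints False
```

The two keys differ by exactly 101, so they leave the same remainder. Without a collision resolution policy the second key has nowhere to go.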
OPEN ADDRESSING - CRP
•Select an alternate location in the table
•Linear Probing – beginning at the original hash location,
sequentially search the table for an available location {+1}
• Incremental probing – use a fixed increment other than 1
•Double Hashing – use a second function to determine the
probe increment {+f(n)}
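
The open-addressing policies above differ only in the probe increment, which the following sketch makes explicit. The table size, the second hash function `double_hash_step`, and the keys are assumptions for illustration.

```python
TABLE_SIZE = 11   # assumption: prime, so any increment 1..10 eventually visits every slot

def home(key):
    return key % TABLE_SIZE

def linear_step(key):
    return 1                               # linear probing: always +1

def double_hash_step(key):
    return 1 + (key % (TABLE_SIZE - 1))    # second hash: always in 1..10, never 0

def insert(table, key, step_fn):
    """Insert key using open addressing; return the number of probes used."""
    address = home(key)
    step = step_fn(key)
    for probes in range(1, TABLE_SIZE + 1):
        if table[address] is None:
            table[address] = key
            return probes
        address = (address + step) % TABLE_SIZE
    raise RuntimeError("table is full")

keys = [22, 33, 44, 55]                    # all four keys hash to address 0
linear_table = [None] * TABLE_SIZE
double_table = [None] * TABLE_SIZE
print([insert(linear_table, k, linear_step) for k in keys])        # prints [1, 2, 3, 4]
print([insert(double_table, k, double_hash_step) for k in keys])   # prints [1, 2, 2, 2]
```

With linear probing the colliding keys pile up next to each other, so each insert probes further. With double hashing each key gets its own increment, and here every collision is resolved on the second probe. Incremental probing would be a `step_fn` that returns a fixed value other than 1.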
OTHER CRP
•Bucket Hashing – Each hash address is actually a set of
table locations
•Chaining – a linked list at each hash address contains all
keys that hash there
•Table format for each?
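
For chaining, one possible table format is an array of chains, one per hash address. In this sketch a Python list stands in for each linked list, and the table size and keys are assumptions.

```python
TABLE_SIZE = 7                                  # assumption: small table for the demo
table = [[] for _ in range(TABLE_SIZE)]         # one (initially empty) chain per address

def insert(key):
    """Append the key to the chain at its hash address, skipping duplicates."""
    chain = table[key % TABLE_SIZE]
    if key not in chain:
        chain.append(key)

def search(key):
    """Only the one chain at the key's hash address has to be scanned."""
    return key in table[key % TABLE_SIZE]

for key in (10, 17, 24, 5):
    insert(key)

print(table[3])      # prints [10, 17, 24]  (all three keys hash to address 3)
print(search(17))    # prints True
```

The table never fills up under chaining, but search time grows with the length of the chain, so the hash function still needs to spread keys evenly.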
TABLE SIZE??
•How big should a hash table be?
•How full should the table get?
•Implications of table size?
MEASURING HASH PERFORMANCE
•Hash Function Complexity?
•Probes: number of hash table locations “probed” (checked)
before finding an empty location (CRP)
• Consider the AVERAGE over a large data set
•Table size: want the smallest table that still needs few probes
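
A small experiment can show how the average probe count depends on how full the table is. The sketch below assumes linear probing and a modulo hash; `average_probes`, the table size of 1009, the load levels, and the random keys are all illustrative choices.

```python
import random

def average_probes(keys, table_size):
    """Insert keys with linear probing; return the mean number of probes per insert."""
    if not keys or len(keys) > table_size:
        raise ValueError("need between 1 and table_size keys")
    table = [None] * table_size
    total = 0
    for key in keys:
        address = key % table_size
        probes = 1
        while table[address] is not None:       # occupied: probe the next location
            address = (address + 1) % table_size
            probes += 1
        table[address] = key
        total += probes
    return total / len(keys)

random.seed(1)                                  # fixed seed so runs are repeatable
TABLE_SIZE = 1009
for load in (0.25, 0.5, 0.75, 0.9):
    keys = random.sample(range(10 ** 8), int(load * TABLE_SIZE))
    print(f"load {load:.2f}: {average_probes(keys, TABLE_SIZE):.2f} probes on average")
```

Running the loop for several table sizes and load levels gives the kind of data needed to judge the smallest table that still keeps the probe count low.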
OUR SEMESTER PROJECT – HASHING
•Analysis
•Empirical Studies
• Table Sizes
• CRPs
• Functions