AVL TREES: Balanced BST
- Self-balancing tree: restores balance with O(1) rotations per insert/remove
- For every node, left and right subtrees can differ in height by at most 1
- Each node must know the height of children
- Update heights on insert/remove
Checking for balance:
- Only nodes on path to insertion/removal can be out of balance
- A node can only be out of balance by 2
AVL cases insertion
- Outside (LL/RR)
- Akimbo (LR/RL)
SINGLE ROTATION:
- Pick up heavy child
- Node falls to the left or right
- Existing left or right child is reattached
DOUBLE ROTATION:
- LR: left rotate left child, then right rotate node
- RL: right rotate right child, then left rotate node
- You move the heavy child to the other side so you can then do a single rotation.
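The rotations above can be sketched in C++. This is a minimal illustration, not a full AVL implementation; the `Node` layout and function names are assumptions:

```cpp
#include <algorithm>

// Hypothetical minimal AVL node for illustration.
struct Node {
    int key;
    int height = 1;
    Node* left = nullptr;
    Node* right = nullptr;
};

int heightOf(Node* n) { return n ? n->height : 0; }

void updateHeight(Node* n) {
    n->height = 1 + std::max(heightOf(n->left), heightOf(n->right));
}

// Single right rotation for the LL (outside) case: pick up the heavy
// left child, the node falls to the right, and the child's existing
// right subtree is reattached as the node's left child.
Node* rotateRight(Node* node) {
    Node* child = node->left;
    node->left = child->right;
    child->right = node;
    updateHeight(node);
    updateHeight(child);
    return child;  // new root of this subtree
}

// Mirror image, used for the RR case and inside double rotations.
Node* rotateLeft(Node* node) {
    Node* child = node->right;
    node->right = child->left;
    child->left = node;
    updateHeight(node);
    updateHeight(child);
    return child;
}

// Double rotation for the LR (akimbo) case: left rotate the left child
// to move the heavy grandchild outside, then right rotate the node.
Node* rotateLeftRight(Node* node) {
    node->left = rotateLeft(node->left);
    return rotateRight(node);
}
```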
https://www.cs.usfca.edu/~galles/visualization/AVLtree.html
https://www.sanfoundry.com/cpp-program-implement-avl-trees/
SORTING:
https://www.cs.usfca.edu/~galles/visualization/ComparisonSort.html
Issues:
- Time complexity: the best comparison-based sorts achieve O(nlogn)
- Space complexity: the best are in-place sorts: constant extra space
- Handling duplicates: stable sorts preserve original order, unstable sorts don’t
Selection Sort:
- Divide array into 2: sorted and unsorted, sorted is empty
- Swap smallest unsorted element with first unsorted element
- Sorted subarray increases, unsorted subarray decreases
Pseudocode:
- For i in [0…(size-1)]
o imin = index of min element in a[i] through a[size-1]
o swap elements a[i] and a[imin] (can skip the swap if imin = i)
Complexity:
- Time complexity: O(n^2)
- Space complexity: in-place sort: O(n) for array, O(1) for swapping
- Unstable
Insertion Sort:
- Not in-place: start with empty result list, insert each item in order
- In-place: divide array into sorted and unsorted part, first element is sorted
o for each unsorted element, slide it left into sorted position
Complexity:
- Time complexity:
o O(n) best case: already sorted
o O(n^2) worst case: reverse sorted
o O(n^2) average case
- Space complexity: in-place sort: O(n) for array, O(1) for swapping
- Stable
Bubble Sort:
- Swap adjacent out of order elements
- After each pass, largest element “bubbles” into position
Complexity:
- Time complexity:
o O(n) best case: already sorted
o O(n^2) worst case: reverse sorted
- Space complexity: in-place sort: O(n) for array, O(1) for swapping
- Stable
Merge Sort / Quick sort
- If list greater than 1
o Divide list into 2 sublists
o Recursively sort sublists
o Combine sorted sublists into sorted list
Merge Sort:
- Not in-place
- Merge algorithm:
o If L1 is empty, return L2
o If L2 is empty, return L1
o ……
Complexity:
- Time complexity: O(nlogn)
o O(n) at each level
o O(logn) levels
- Space complexity: in place difficult, not in place O(2n)
Quick Sort:
- Pick pivot element from list
- Re-order list so elements less than the pivot are to the left, elements greater are to
the right
- Concatenate lists
Complexity:
- Time complexity:
o Worst case O(n^2): bad pivot
o Best case O(nlogn): good pivot
- Space complexity: in place difficult, not in place O(2n)
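A not-in-place quicksort matching the pick/partition/concatenate description above (the middle-element pivot is just one illustrative choice; a poor pivot degrades this to O(n^2)):

```cpp
#include <vector>

// Quicksort sketch: partition into less/equal/greater around a pivot,
// recursively sort the outer lists, then concatenate.
std::vector<int> quickSort(const std::vector<int>& a) {
    if (a.size() <= 1) return a;
    int pivot = a[a.size() / 2];
    std::vector<int> less, equal, greater;
    for (int x : a) {
        if (x < pivot) less.push_back(x);
        else if (x > pivot) greater.push_back(x);
        else equal.push_back(x);
    }
    std::vector<int> out = quickSort(less);
    out.insert(out.end(), equal.begin(), equal.end());
    std::vector<int> g = quickSort(greater);
    out.insert(out.end(), g.begin(), g.end());
    return out;
}
```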
Non-comparison sorts:
So far the best sorts are heap sort, merge sort, quick sort: O(nlogn)
This is the best possible for comparison-based sorting
Bucket sort:
- If you know range of values, create bucket for each possible value
- Assign each item to bucket in constant time
- Read items out of buckets from least to greatest
Complexity:
- Time complexity: O(n + range_size)
o O(n) to put items into buckets
o O(range_size + n) to read the result out
- Space complexity: O(range_size) extra space: O(range_size +n)
- Can be stable or not
Radix sort:
- Sort digit by digit, from the rightmost (least significant) digit to the leftmost
Complexity:
- Time complexity: O(dn) (d digits)
- Space complexity: O(BASE + n) : need BASE buckets, store n numbers
- Can be stable or not
Algorithm:
- Input is interpreted numerically as in the range [0, BASE^num_digits).
- Example above was in base 10
- We have an auxiliary array, bins, of BASE elements
- pass = 0
- while pass < num_digits
o for each element in the array
 location = ((value of element) / (BASE^pass)) mod BASE
 insert element into list at bins[location]
o concatenate all the lists in bins
o increment pass
[Note: don't actually calculate BASE^pass! It's BASE * previous value each time through
the loop.]
SORT TYPE | BEST CASE | WORST CASE | SPACE | STABILITY | WHEN TO USE
Selection | O(n^2) | O(n^2) | constant | unstable | List is small, space is limited
Insertion | O(n) | O(n^2) | constant | stable | List is nearly sorted
Bubble | O(n) | O(n^2) | constant | stable | List is nearly sorted/small. Stops when list is sorted
Heap | O(nlogn) | O(nlogn) | constant | unstable | Fast, not too much space
Merge | O(nlogn) | O(nlogn) | O(2n) | stable | Fast for large data sets
Quick | O(nlogn) (fastest) | O(n^2) | constant or O(2n) | unstable | Fastest; not good for large data sets; bad when pivot is bad; bad for perfectly sorted array
Bucket | O(n) | O(n) | O(range_size + n) | can be both | Use when you know the range
Radix | O(n) if digits are bounded | O(dn) | O(base + n) | stable | When you know the range but it’s very large
Priority Queues:
- Arrival order but based on priority
- Keys and values:
o Not necessarily unique keys
o Keys can be complicated
o Keys must be immutable
- Need comparison function to compare keys
Heaps:
- Min heap or max heap
- Operations:
o Insert(key, value)
o Min/maxElement()
o removeMin/Max()
o isEmpty()
o size()
o min/maxKey()
Binary heap invariants:
- Shape property:
o each level is full, except maybe the last
o Lowest level fills from left to right
- Heap property:
o Parents are more important than children
o Order between children doesn’t matter
A heap can be stored using an array list:
- O(1) access to elements
- Put root at position 1 instead of 0
- Data member for current size
- For an element at position i:
o Left child is at: 2i
o Right child is at: 2i+1
o Parent is at: i/2
https://www.cs.usfca.edu/~galles/visualization/Heap.html
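The index arithmetic above can be written directly (1-based indexing, a sketch):

```cpp
// 1-based index math for an array-backed binary heap (root at index 1).
inline int leftChild(int i)  { return 2 * i; }
inline int rightChild(int i) { return 2 * i + 1; }
inline int parent(int i)     { return i / 2; }  // integer division
```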
Algorithm: after inserting/removing
- Reestablish shape property first
- Reestablish heap property, without disrupting shape
- Update heap size
Insert: worst case O(logn), average O(1)
- Insert item at element [heapSize + 1]
- Increment heap size
- Up-heap until you get to root
o Compare added element with parent: if in correct order, stop
o If not, swap and upheap from parent
RemoveMin:
- Remove root (save for returning)
- Replace root with last element in array (to maintain shape)
- Down-heap the new root
o Compare root with children, if in correct order, stop
o If not, swap with smallest child and down-heap child
o NOTE: check if children exist. If right exists, left must too
Build heap: O(n)
- Create array with elements
- Starting from the lowest completely filled level, at the first node with children
(index n/2), down-heap each element
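The three heap algorithms above (up-heap insert, down-heap removeMin, O(n) build) can be combined into one minimal C++ sketch; the class name and layout are illustrative:

```cpp
#include <vector>
#include <utility>

// Minimal array-backed min-heap: root at index 1, index 0 unused.
class MinHeap {
public:
    MinHeap() : data(1) {}

    // Build heap in O(n): down-heap every node with children (n/2 .. 1).
    explicit MinHeap(const std::vector<int>& items) : data(1) {
        data.insert(data.end(), items.begin(), items.end());
        for (int i = size() / 2; i >= 1; --i) downHeap(i);
    }

    void insert(int key) {
        data.push_back(key);                      // place at heapSize + 1
        int i = static_cast<int>(data.size()) - 1;
        while (i > 1 && data[i / 2] > data[i]) {  // up-heap toward root
            std::swap(data[i / 2], data[i]);
            i /= 2;
        }
    }

    int removeMin() {
        int root = data[1];      // save for returning
        data[1] = data.back();   // last element replaces root: keeps shape
        data.pop_back();
        downHeap(1);             // restore heap property
        return root;
    }

    int size() const { return static_cast<int>(data.size()) - 1; }

private:
    std::vector<int> data;

    void downHeap(int i) {
        int n = size();
        while (2 * i <= n) {                   // left child exists
            int child = 2 * i;
            if (child + 1 <= n && data[child + 1] < data[child])
                child = child + 1;             // pick the smaller child
            if (data[i] <= data[child]) break; // correct order: stop
            std::swap(data[i], data[child]);
            i = child;
        }
    }
};
```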
Graphs:
- Ultimate linked data structure
- Vertices/nodes connected by arcs/edges
- Represent arbitrary connections
- Edges can have labels or weights/costs
- Can be directed or undirected
Uses:
- Find route from A to B
- Shortest route
- Cheapest route
Representation:
- Adjacency matrix:
o O(1) time to find edge from A to B
o O(n) time to list all vertices adjacent to vertex
o O(n^2) space
- Adjacency list:
o O(E+V) space
o O(V) time to find edge from A to B
- Other options: hash table, sets
Graph Traversals: in unweighted graphs
Complexity of both: O(V+E)
- BFS:
o queue<Artist> q
o enqueue starting artist
o while queue isn’t empty:
o artist curr = q.dequeue
o mark_vertex(curr)
o get neighbors of curr
o for each neighbor:
 if not marked, mark curr as predecessor and enqueue
 NOTE: works because mark_predecessor won’t mark if artist already
has a predecessor
- DFS:
o mark_vertex(curr)
o get neighbors of curr
o for each neighbor:
 if not marked, mark curr as predecessor and recurse
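The DFS outline above might look like this in C++ over an adjacency-list graph (the `Graph` alias and predecessor map are illustrative stand-ins for the course's artist graph):

```cpp
#include <string>
#include <unordered_map>
#include <unordered_set>
#include <vector>

using Graph = std::unordered_map<std::string, std::vector<std::string>>;

// Recursive DFS: mark the current vertex, then for each unmarked
// neighbor record curr as its predecessor and recurse.
void dfs(const Graph& g, const std::string& curr,
         std::unordered_set<std::string>& marked,
         std::unordered_map<std::string, std::string>& predecessor) {
    marked.insert(curr);
    auto it = g.find(curr);
    if (it == g.end()) return;
    for (const auto& neighbor : it->second) {
        if (!marked.count(neighbor)) {
            predecessor[neighbor] = curr;  // mark curr as predecessor
            dfs(g, neighbor, marked, predecessor);
        }
    }
}
```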
BFS weighted graph: Dijkstra’s algorithm (greedy algorithm)
- for each Vertex v:
o v.dist = INFINITY
o v.known = false
o v.prev = NONE
- s.dist = 0 (s = source)
- while there is an unknown vertex:
o Vertex v = unknown vertex with smallest distance
o v.known = true
o for each Vertex w adjacent to v:
 if (not w.known):
 c = cost of edge from v to w
 if (v.dist + c < w.dist):
 w.dist = v.dist + c
 w.prev = v
Limitations: doesn’t work with negative edges
Time complexity: O(V^2 + E) = O(V^2)
Revised algorithm: puts unprocessed vertices into a min-priority queue based on curr
best distance
- dist[source] ← 0
- create vertex priority queue Q
- for each vertex v in Graph:
o if v ≠ source
 dist[v] ← INFINITY
 prev[v] ← UNDEFINED
o Q.add_with_priority(v, dist[v])
- while Q is not empty:
o u ← Q.extract_min()
o for each neighbor v of u:
 alt ← dist[u] + length(u, v)
 if alt < dist[v]
 dist[v] ← alt
 prev[v] ← u
 Q.decrease_priority(v, alt)
- return dist, prev
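A C++ sketch of the priority-queue version. Note that `std::priority_queue` has no `decrease_priority`, so this uses the common "lazy deletion" workaround: push a duplicate entry and skip stale ones on extraction (the adjacency-list format here is an assumption):

```cpp
#include <functional>
#include <limits>
#include <queue>
#include <utility>
#include <vector>

// adj[u] is a list of (neighbor, edge cost) pairs. Returns dist[].
std::vector<long> dijkstra(
        const std::vector<std::vector<std::pair<int, long>>>& adj,
        int source) {
    const long INF = std::numeric_limits<long>::max();
    std::vector<long> dist(adj.size(), INF);
    dist[source] = 0;
    using Entry = std::pair<long, int>;  // (distance, vertex), min first
    std::priority_queue<Entry, std::vector<Entry>, std::greater<Entry>> q;
    q.push({0, source});
    while (!q.empty()) {
        auto [d, u] = q.top();
        q.pop();
        if (d > dist[u]) continue;  // stale entry: skip it
        for (auto [v, cost] : adj[u]) {
            long alt = dist[u] + cost;
            if (alt < dist[v]) {
                dist[v] = alt;
                q.push({alt, v});   // stands in for decrease_priority
            }
        }
    }
    return dist;
}
```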
Time complexity: O((E + V) log(V))
Hashing:
- Implementing key-value stores with constant-time access (backed by an array list)
- Hash function: maps a key (whatever that is in your application) to an integer
deterministically, i.e., it always returns the same answer for a given key (can’t just
be random)
- Compression function: reduces hash function returned value and puts it in range
- Good hash function:
o Deterministic
o Fast
o Spreads out keys well
o Rarely assigns different keys to the same hash value (few collisions)
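A sketch of a simple deterministic string hash plus a modular compression function (the polynomial-rolling scheme and the constant 31 are illustrative choices, not the only good ones):

```cpp
#include <cstddef>
#include <string>

// Deterministic: the same key always yields the same hash value.
std::size_t hashString(const std::string& key) {
    std::size_t h = 0;
    for (unsigned char c : key)
        h = h * 31 + c;  // polynomial rolling hash, base 31
    return h;
}

// Compression: reduce the hash into the table's index range.
std::size_t compress(std::size_t hash, std::size_t tableSize) {
    return hash % tableSize;  // maps into [0, tableSize)
}
```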
Collisions:
- Avoid by using good hash functions and picking table sizes to minimize collisions
- Handling:
o Chaining: put a list in each slot in array
o Linear probing:
 table must always have extra space.
 Insert in the next unfilled slot.
 Must have a bool to tell if a slot is filled
Load factor:
- Number of keys stored / num buckets in table
- A measure of how full table is
- Open addressing cannot support load factors greater than 1
- Low load factor means lookups are probably O(1)
- High load factor means lookups degrade toward O(n)
- Most systems keep it under 0.7
- If load factor is high, grow the array
Growing the array:
- Can’t just copy over elements, because array is bigger and compression would
change hash value
- Have to rehash values
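Growing a chained table might look like this in C++ (a sketch using `std::hash`; the doubling growth factor is an illustrative choice). Every key is recompressed because `hash % capacity` changes with the new capacity:

```cpp
#include <functional>
#include <list>
#include <string>
#include <vector>

// Rehash every key into a table twice the size; elements can't just be
// copied slot-for-slot because the compression step changed.
std::vector<std::list<std::string>> grow(
        const std::vector<std::list<std::string>>& table) {
    std::vector<std::list<std::string>> bigger(table.size() * 2);
    std::hash<std::string> h;
    for (const auto& bucket : table)
        for (const auto& key : bucket)
            bigger[h(key) % bigger.size()].push_back(key);
    return bigger;
}
```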