Proving a function is of a certain O
• State: must find constants c > 0 and n0 ≥ 1 such that the function ≤ c x order for all n ≥ n0
• Simplify the inequality w/ the goal being a number on one side and c and n0 on the other
• Choose a value for c that simplifies the inequality the most (many values possible)
• Choose a value for n0 such that the inequality holds for every n ≥ n0 (many values possible)
• If given the equation as functions, keep c and n0 general and show they exist
• Show f(n)·h(n) is O(g(n)·h(n)) if f(n) is O(g(n)) and all functions are non-negative
   o Must show f(n)·h(n) ≤ c·g(n)·h(n) for all n ≥ n0
   o As f(n) is O(g(n)), there are constants c' > 0 and n0' ≥ 1 such that f(n) ≤ c'·g(n) for all n ≥ n0'
   o Multiply both sides by h(n) (non-negative, so the inequality is preserved) to get f(n)·h(n) ≤ c'·g(n)·h(n) for all n ≥ n0'
   o Choose n0 = n0' and c = c', so c and n0 exist, proving the inequality

Computing Time Complexity
• Find components that are constant and label them c#
• Find expensive parts (loops and recursive calls) and find out how many times they run
• If there are recursive calls in loops, figure out the # of loop runs without the recursive call,
   o then figure out the effect of the recursive call
• Add together all components, then cross out constants
• The highest time complexity left is the time complexity
• Fastest to slowest time complexities:
   O(1) < O(logn) < O(n) < O(nlogn) < O(n^2) < O(n^a) < O(b^n) < O(n!) < O(n^n)
   o Class P/Efficient Algorithms: O(1) through O(n^a)

Recurrence Equations
• Equations represented in terms of themselves
• f(0) = c1, f(n) = f((n−1)/2) + c2, n > 0

Other Time Complexities
• Linear search: O(n)
• Binary search: O(logn)

Hash Tables
• Uses a hash function (containing a hash code and a compression map) to map keys to the hash table (array)
• Hash code converts a key (of any variable type) to an integer
• Most popular hash code is the polynomial (fewest collisions)
• Collisions can be resolved by:
   o Linear probing → finds next available index in the table from the starting position
   o Separate chaining → each array index is a linked list
   o Double hashing → uses a secondary hash function after the initial hash function until a free index is found
• Average time complexity of get(k), put(k,d), and remove(k) is O(1) w/ a good hash function, O(n) w/ a bad one
• Average time complexity of smallest(k), largest(k), predecessor(k), and successor(k) is O(n)

Trees
• Height of the tree is the maximum depth of any node
• Degree of a node is the # of children (sum of degrees in tree = # of edges)
• Depth or level is the # of ancestors
• 3 different tree traversals:
   o Pre-order: visit(u), then left, then right
   o Post-order: left child, then right, then visit(u)
   o In-order: left child, visit(u), right child

Binary Search Trees
• Max 2 children and children are ordered (proper has exactly 2 children per internal node)
• #leaves = #internal nodes + 1 (proper tree)
• Max # of nodes at any level = 2^i, i = level number
• Max # of leaves = 2^h, h = height
• Every key in the left subtree of u is smaller than key(u); every key in the right subtree is larger
• Performing an in-order traversal returns keys from smallest to largest
• If removing internal node u, it is replaced by the smallest key larger than u
• If one child of the root is a leaf and the root is removed, the other child becomes the root
• Smallest, largest, get, put, remove, and successor are O(height)

AVL Trees
• Binary search tree where, at every internal node, the heights of its subtrees differ by at most 1
• Height is O(logn)
• Get, put, remove, smallest, largest, successor, and predecessor are O(logn)
• Always rebalance the smallest unbalanced subtree
• If the tree becomes unbalanced b/c of insertion, one rotation (single or double) will rebalance the tree
• When both a single and a double rotation can be applied to an unbalanced tree, the single rotation rebalances the tree
• If the tree becomes unbalanced after removal, several rotations may be needed to rebalance the tree

Multi-Way Search Tree → An ordered tree such that:
• Each internal node has at least 2 children and at most d children → # children = 1 + # data items
• An internal node w/ d children stores d − 1 data items (k_i, D_i)
• An internal node storing keys k_1 ≤ k_2 ≤ … ≤ k_(d−1) has d children v_1, v_2, …, v_d
• By convention we add sentinel keys k_0 = −∞ and k_d = ∞
• Leaves hold no items and serve as placeholders
• In-order traversal visits keys in increasing order
• Requires a secondary data structure
• Smallest, largest is O(height)
• Get, successor, and predecessor is O(height x logd)
• Put and remove is O(d + height x logd)

2-4 Trees
• Multiway search tree such that every internal node has 2, 3, or 4 children AND all of the leaves are on the same level
• Height is O(logn)
• Put and remove are O(logn)
• Insertion can cause overflow, which is handled with a split() operation around the 2nd key
   o Take the 2nd key and move it to the parent; the keys left of the 2nd key and right of the 2nd key become separate children
• Overflow may propagate to the parent node; if it reaches the root, the 2nd key becomes the new root
• 2 cases for deletion: node with or without leaf children
   o Without: replace the key with its in-order successor/predecessor
   o With (2 cases):
      - fusion → adjacent siblings are 2-nodes (underflow may propagate to parent)
         • parent key from between the children joins the other child
      - transfer → an adjacent sibling is a 3-node or 4-node (no underflow propagates)
         • take a key from the sibling to the parent, parent key goes to the underflowing node

B-Trees → Multiway search tree of order d such that:
• The root has at least 2 children and at most d
• All internal nodes other than the root have at least ⌈d/2⌉ (round up) and at most d children
• All the leaves are on the same level
• Height is O(log_d n)
• Put and remove are the same as in 2-4 trees and are O(dlogn)

Memory
• External memory is divided into blocks called disk blocks
• Transfer of a block between external and primary memory is called a disk transfer or I/O
• Want to minimize the number of disk transfers, referred to as the I/O complexity
• B-Trees fit in disk blocks better than AVL Trees

Sorting Methods
• Sorting w/ an ordered dictionary can have diff time complexities depending on the data structure
   o Unordered array (selection sort) is O(n^2)
   o Sorted array (insertion sort) is O(n^2)
   o AVL or 2-4 tree is O(nlogn)
• In-place sorting algorithm: in-place insertion sort is O(n^2)
   o Merge sort is O(nlogn): splits the array in half repeatedly, then merges the pieces in order
   o Quick sort is O(nlogn) on avg and O(n^2) in the worst case
   o Merge & quick sort use divide and conquer
   o Quick sort chooses a pivot, then moves smaller elements left and larger elements right
   o Then quick sort sorts the left and right pieces and combines it all

Sort Algo
   Var initializations
   While both vars in array indices
      If/else: add correct value
   If first is in array indices
      While first not out of array indices
         Add elements
   Else
      While second not out of array indices
         Add elements
   Return

Graphs
• A graph is a pair (V,E) where V is a set of vertices and E is a collection of edges (u,v)
• Edges can be directed: ordered pair of vertices (u,v), u is the start and v is the end
• Edges can be undirected: unordered pair of vertices (u,v)
• Directed graph → all edges directed / Undirected graph → no edges directed
• Mixed graph: directed and undirected edges
• Endpoints or end vertices of an edge: U and V are the endpoints of a
• a, d, and b are incident on V
• X has degree 5 / U and V are adjacent
• h and i are parallel edges / j is a self loop
• A graph without parallel edges or self loops is a simple graph
• Path is a sequence of adjacent vertices / simple path: path s.t. all vertices and edges are distinct
• Cycle is a circular sequence / simple cycle: all vertices and edges are distinct except start and end
• Sum of degrees in a graph equals 2m, m = # edges
• Max # of edges in a simple undirected graph: m = n(n−1)/2
• Graph is connected if there is a path from each node to all other nodes
• A tree is a connected graph without cycles (m = n−1) / A forest is a set of trees (m ≤ n−1)
• Subgraph is a subset of vertices and edges that forms a graph
• Connected component is a maximal connected subgraph

Graph operations (n = # vertices, m = # edges)

Operation            Edge list   Adjacency List            Adj Mtrx
Space                O(n+m)      O(n+m)                    O(n^2)
incidentEdges(u)     O(m)        O(deg(u))                 O(n)
areAdjacent(u,v)     O(m)        O(min{deg(u),deg(v)})     O(1)
insertVertex(u)      O(1)        O(1)                      O(n^2)
insertEdge(u,v,w)    O(1)        O(1)                      O(1)
removeVertex(u)      O(m)        O(deg(u))                 O(n^2)
removeEdge(u,v)      O(m)        O(deg(u)+deg(v))          O(1)

Graph Traversals: BFS DFS
• Depth first search visits all the vertices and edges in the connected component of node v
• The discovery edges labeled by DFS form a spanning tree of the connected component of v called a DFS tree
• Breadth first search visits all of the nodes adjacent to a node
   o Places each found node into the queue, then finds their adjacent nodes
• BFS visits all of the vertices and edges of a connected component
• Discovery edges from BFS form a spanning tree called a BFS tree

Graph
   Mark(u)
   If/else (sometimes)
   For each edge (u,v) incident on u
      If, then: operations + recursive call
   Return

Directed Graphs → each edge goes in one direction
• If the graph is simple, m ≤ n(n−1)
• BFS and DFS still work if modified
• DFS tree rooted at v: shows all vertices reachable from v via directed paths
• Strong connectivity is where every node can reach all others
• A directed acyclic graph (DAG) is a directed graph that has no directed cycles
• A topological ordering of a directed graph is a
   o numbering v_1, …, v_n of the vertices such that:
   o for every edge (v_i, v_j) we have i < j
• An ordering is topological if you can draw the arrows only to the right
• Must be a DAG to have a topological ordering

Shortest Paths
• For weighted graphs (each edge has a # value) use Dijkstra's
• For unweighted graphs use BFS
• A subpath of a shortest path is itself a shortest path
• There is a tree of shortest paths from a start vertex to all other vertices
• Dijkstra's assumes the graph is connected, the edges are undirected, and the edge weights are non-negative
• Edge relaxation is when you recalculate the minimum distance to a node that is not yet in the path
   o Equals min{current min, (path distance + edge length)}
• Dijkstra's is O(n^2) for adjacency list and adjacency matrix (when the closest node is found by scanning all nodes)

Minimum Spanning Trees → Prim's
• A spanning subgraph is a subgraph of a graph G containing all vertices of G
• A spanning tree is a spanning subgraph that is itself a tree
• Minimum spanning trees are spanning trees of a weighted graph w/ minimum total edge weight
• Use Prim's algo to find a minimum spanning tree
• Prim's is O(n^2) for adjacency list and adjacency matrix (when the closest node is found by scanning all nodes)

Prim's Algorithm
1. Starts the same as Dijkstra's and runs similarly
   • Difference is we only care about the min edge distance
2. Edge with min distance is chosen, then all edges incident on the new node are evaluated
   a. Node info is updated if there's an edge with smaller length
3. Edge with shortest length is traversed and the process is repeated
4. The end result of Prim's (note that the distance values are the shortest edge to each node)

Dijkstra's Algorithm
1. Estimated shortest path and predecessor node are stored for each node (from the starting node)
   • Nodes that can't be estimated from the discovered nodes have d = ∞ and p = null
2. Path of length 1 is chosen, then all edges incident on that node are evaluated
   a. Node info is updated if a shorter path can be found
3. No more paths of length 1 can be found, so one of the paths of length 2 is chosen
   a. Then all edges incident on that node are evaluated and node info is updated
   • Node 4 isn't updated b/c its current shortest path is d = 2 and the path through node 3 has d = 3
4. The other path of length 2 is chosen and this process is repeated until all nodes are found
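
The Dijkstra's steps above can be sketched in Python. This is a minimal sketch, not the course's reference implementation: the function name `dijkstra`, the dictionary-based graph format, and the sample graph are all assumptions made for illustration. It finds the closest unfinished node by scanning every node, which is what gives the O(n^2) bound quoted in these notes.

```python
import math


def dijkstra(graph, start):
    """Array-style Dijkstra's (hypothetical helper).

    graph: dict mapping each vertex to a list of (neighbour, weight) pairs,
           with non-negative weights.
    Returns (d, p): estimated shortest distances and predecessor nodes.
    """
    d = {v: math.inf for v in graph}  # d = infinity until a path is found
    p = {v: None for v in graph}      # p = null until a path is found
    d[start] = 0
    done = set()

    while len(done) < len(graph):
        # Choose the unfinished node with the smallest estimate (linear scan).
        u = min((v for v in graph if v not in done), key=lambda v: d[v])
        if d[u] == math.inf:
            break  # the remaining nodes cannot be reached from start
        done.add(u)
        # Edge relaxation: min{current min, path distance + edge length}.
        for v, w in graph[u]:
            if v not in done and d[u] + w < d[v]:
                d[v] = d[u] + w
                p[v] = u
    return d, p


# Made-up undirected example graph (each edge listed in both directions).
example_graph = {
    1: [(2, 1), (3, 2)],
    2: [(1, 1), (4, 4)],
    3: [(1, 2), (4, 3)],
    4: [(2, 4), (3, 3)],
}
dist, pred = dijkstra(example_graph, 1)
print(dist)
print(pred)
```

Following `pred` backwards from any node to the start recovers the tree of shortest paths mentioned under Shortest Paths.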
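
The Prim's Algorithm steps earlier in these notes differ from Dijkstra's in one place only: the value stored per node is the lightest single edge to the node, not the total path length. A minimal Python sketch of that idea follows; the name `prim`, the graph format, and the sample graph are illustrative assumptions, and the graph is assumed to be connected and undirected.

```python
import math


def prim(graph, start):
    """Scan-based Prim's (hypothetical helper).

    graph: dict mapping each vertex to a list of (neighbour, weight) pairs.
    Returns the spanning tree as a list of (parent, vertex, weight) edges.
    """
    d = {v: math.inf for v in graph}  # lightest known edge into each node
    p = {v: None for v in graph}      # other endpoint of that edge
    d[start] = 0
    in_tree = set()

    while len(in_tree) < len(graph):
        # Choose the node outside the tree with the smallest edge distance.
        u = min((v for v in graph if v not in in_tree), key=lambda v: d[v])
        if d[u] == math.inf:
            break  # graph is not connected; stop with a partial tree
        in_tree.add(u)
        # Update node info if an incident edge is shorter than the stored one.
        for v, w in graph[u]:
            if v not in in_tree and w < d[v]:
                d[v] = w
                p[v] = u
    return [(p[v], v, d[v]) for v in graph if p[v] is not None]


# Made-up undirected example graph (each edge listed in both directions).
mst_graph = {
    "a": [("b", 1), ("c", 4)],
    "b": [("a", 1), ("c", 2), ("d", 5)],
    "c": [("a", 4), ("b", 2), ("d", 3)],
    "d": [("b", 5), ("c", 3)],
}
mst_edges = prim(mst_graph, "a")
print(mst_edges)
print(sum(w for _, _, w in mst_edges))
```

The only line that changes relative to a Dijkstra's sketch is the comparison `w < d[v]`, which ignores the distance already travelled to reach `u`.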
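
The Sort Algo outline in these notes (initialise variables, loop while both indices are in range, then copy the leftovers) matches the merge step of merge sort. A minimal Python sketch under that reading follows; `merge` and `merge_sort` are hypothetical names, and this version returns new lists instead of sorting in place.

```python
def merge(first, second):
    """Merge two already-sorted lists into one sorted list."""
    merged = []
    i = j = 0  # var initializations
    # While both indices are inside their lists, add the correct value.
    while i < len(first) and j < len(second):
        if first[i] <= second[j]:
            merged.append(first[i])
            i += 1
        else:
            merged.append(second[j])
            j += 1
    # Only one list can still have elements; add whatever remains.
    if i < len(first):
        while i < len(first):
            merged.append(first[i])
            i += 1
    else:
        while j < len(second):
            merged.append(second[j])
            j += 1
    return merged


def merge_sort(items):
    """Split in half repeatedly, then merge the sorted pieces: O(nlogn)."""
    if len(items) <= 1:
        return list(items)
    mid = len(items) // 2
    return merge(merge_sort(items[:mid]), merge_sort(items[mid:]))


print(merge([1, 4, 7], [2, 3, 9]))
print(merge_sort([5, 2, 9, 1, 5, 6]))
```

Each call to `merge` does O(n) work and the halving gives O(logn) levels, which is where the O(nlogn) bound for merge sort comes from.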
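
The Graph outline in these notes (mark u, loop over the edges incident on u, recurse when a condition holds) is the shape of a recursive depth first search. A minimal Python sketch follows; the name `dfs`, the adjacency-list dictionary, and the sample graph are assumptions for illustration, and the "operation" performed here is simply recording the discovery edge.

```python
def dfs(graph, u, marked=None, tree_edges=None):
    """Recursive DFS (hypothetical helper).

    graph: dict mapping each vertex to a list of adjacent vertices.
    Returns (marked, tree_edges): the vertices reached from u and the
    discovery edges, which together form the DFS tree.
    """
    if marked is None:
        marked = set()
    if tree_edges is None:
        tree_edges = []
    marked.add(u)  # Mark(u)
    for v in graph[u]:  # for each edge (u,v) incident on u
        if v not in marked:  # if unvisited: operation + recursive call
            tree_edges.append((u, v))
            dfs(graph, v, marked, tree_edges)
    return marked, tree_edges


# Made-up undirected example graph; "e" is in a separate connected component.
dfs_graph = {
    "a": ["b", "c"],
    "b": ["a", "d"],
    "c": ["a"],
    "d": ["b"],
    "e": [],
}
reached, discovery = dfs(dfs_graph, "a")
print(sorted(reached))
print(discovery)
```

Vertex "e" is never reached, which illustrates that DFS only visits the connected component of the start node.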