Algorithms

Dynamic Programming

LCS (Longest Common Subsequence)
Example input: A = abcddeb, B = bdabaed

cost(i, j) = 1 + cost(i − 1, j − 1)                      if A_i = B_j
cost(i, j) = max{ cost(i, j − 1), cost(i − 1, j) }       if A_i ≠ B_j

(A partially filled DP table for the two example strings appeared here; only scattered entries survived extraction.)

Ramsey Theory
Erdős and Szekeres proved that in every permutation of {1, …, n} there is an increasing or decreasing subsequence of length at least √n. For example, observe the following permutation of {1, …, 10}:
10 5 8 1 4 3 6 2 7 9
(1,1) (1,2) (2,2) (1,3) …
The pairs record the lengths of the longest increasing subsequence and the longest decreasing subsequence ending at that point (respectively). Every number raises the length of one of the two subsequences by one (either the increasing or the decreasing), so each pair of lengths is unique. Since there are n distinct pairs, one of the two coordinates must reach at least √n.

Bandwidth Problem
The bandwidth problem is NP-hard.
Definition: Given a symmetric matrix, is there a way to permute the rows and columns (by the same permutation) so that the "bandwidth" of the matrix is at most a number received as input? Another variant is to find the permutation that produces the smallest bandwidth. Viewing the matrix as the adjacency matrix of a graph and the permutation as a numbering of the vertices, the bandwidth is the largest difference between the numbers of the endpoints of an edge.
(A small 0/1 example matrix appeared here; its rows were garbled in extraction.)

Special Cases
If the graph is a line, finding the smallest bandwidth is very easy (it is 1 – align the vertices along a line). For "caterpillar" graphs, however, the problem is already NP-hard. ("Caterpillars" are graphs that consist of a line with occasional side lines.)

Parameterized Complexity
Given a parameter k, does the graph have an arrangement with bandwidth ≤ k? One idea is to remember the last k vertices placed (including their order!), and in addition remember which vertices appear before and after them (without order), so that we know not to place a vertex twice. A trivial solution would be to simply remember them all.
But while it is feasible to remember the last k vertices, remembering the set of vertices we have already passed takes C(n, i) possibilities, where i is the step index. For i = n/2 this is exponential in n.
A breakthrough came in the early 80s by Saxe. The main idea is to remember the vertices we've passed implicitly. In order to do so, here are a few observations:
1) WLOG G is connected (otherwise we can find the connected sub-graphs in linear time, execute the algorithm independently on each of them, and concatenate the results at the end).
2) The maximal degree of the graph (if it has bandwidth ≤ k) is ≤ 2k.

Prefix | Active Region | Suffix

Due to these observations, a vertex is in the prefix if it is connected to the active region without using dangling edges. A vertex in the suffix must use dangling edges to connect to the active region.

-------- end of lesson 1

Color Coding
For a randomized algorithm A(x), the measure of complexity is the expected running time (over the random coin tosses of the algorithm) for the worst possible input.
Monte Carlo algorithms: there is no guarantee the solution is correct; it is probably correct.
Las Vegas algorithms: the answer is always correct, but the running time may vary.

Hamiltonian path: a simple path that visits all vertices of a graph.

Longest Simple Path
TODO: Draw vertices
We are looking for the longest simple path from some vertex v to some vertex u. A simple path is not allowed to repeat a vertex. This problem is NP-hard. Given a parameter k, find a simple path of length k. A trivial algorithm works in time n^k.

Algorithm 1
- Take a random permutation of the vertices.
- Remove every "backward" edge.
- Find the longest path in the remaining graph using dynamic programming.
TODO: Draw linear numbering
For a given permutation, some of the edges go forward and some go backward. After removing the backward edges we get a DAG. For each vertex v from left to right, record the length of the longest path that ends at v.
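A minimal sketch of one trial of Algorithm 1 (the function name and input format are mine, assuming a directed graph given as a list of (u, v) pairs over vertices 0..n−1):

```python
import random

def longest_path_trial(n, edges):
    """One trial: random vertex order, keep forward edges, DP on the DAG."""
    order = list(range(n))
    random.shuffle(order)                      # random permutation of the vertices
    pos = {v: i for i, v in enumerate(order)}
    # drop every "backward" edge; the remaining edges form a DAG
    dag = [(u, v) for (u, v) in edges if pos[u] < pos[v]]
    # left to right, record the longest path ending at each vertex
    best = [0] * n
    for u in sorted(range(n), key=lambda v: pos[v]):
        for (a, b) in dag:
            if a == u:
                best[b] = max(best[b], best[u] + 1)
    return max(best)
```

A path of length k survives a single trial with probability about 1/k!, so the trial is meant to be repeated many times and the best result kept.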
Suppose the graph contains a path of length k. What is the probability that all edges of the path survived the first step? The k vertices of the path can appear in k! relative orders, and only one of them is good for us. So with probability 1/k! we do not drop any edge of the path and the algorithm succeeds.
This is not too good! If k = 10 it is not too good! So what do we do? Keep running it until it succeeds. The probability of failure in a single run is (1 − 1/k!), but after m⋅k! runs it becomes (1 − 1/k!)^{m⋅k!} ≈ e^{−m}. The expected number of runs is k!, so the expected running time is O(n²⋅k!). You can approximate k! ≈ 2^{k log k}, so k! ≤ n as long as k ≅ log n / log log n.

Algorithm 2
- Color the vertices of the graph at random with k colors.
- Use dynamic programming to find a colorful path (a path all of whose vertices have distinct colors).
The DP table has a row for every vertex v_1, …, v_n and a column for every subset S ⊆ {1, 2, …, k}; S is the set of colors used so far.
Probability that all k vertices on a path of length k get different colors: k!/k^k ≈ e^{−k}. If k ≅ log n then the success probability is at least 1/poly(n).

Tournaments
A tournament is a directed graph in which for every pair of vertices there exists an edge (in exactly one direction). An edge exists if the player represented by the vertex of origin beat the player represented by the second vertex. We try to find the most consistent ranking of the players: a linear order that minimizes the number of upsets. An upset is an edge going backwards with respect to the order. A Feedback Arc Set in a Tournament (FAST) is the set of all arcs that go backwards.
Another motivation: suppose you want to build a meta search engine. How can you aggregate the answers given by the different engines?
TODO: Draw the ranking…
The problem is NP-Hard.

k-FAST
Find a linear arrangement in which at most k edges go backwards, where 0 ≤ k ≤ C(n, 2). We will describe an algorithm that is good for k up to O(n²).
Suppose there are t parts for which we know the optimal internal order, and we want to merge them into a bigger solution.
If the graph G is partitioned into t parts, and we are given the "true" order within each part, merging the parts takes time n^{O(t)} (the O is just for the bookkeeping).

Color Coding
Pick t (a function of k) and color the vertices independently with t colors. We want: for the minimum feedback arc set F, every arc of F has endpoints of distinct colors. Denote this "F is colorful". Why is this desirable? If F is colorful, then for every color class we know the true order. We use the fact that this is a tournament! The order within a color class is unique, because between every two vertices there is an arc (in some direction) and none of these arcs belongs to F.
Let's look at two extreme cases:
1) t > 2k – the probability that a fixed arc of F is monochromatic is 1/t, so the expected number of monochromatic arcs of F is k/t < 1/2. However, the runtime would be n^{O(k)}.
2) t < √k – not too good! Intuition: you have √k vertices where the direction of the arcs is essentially random, so one can create instances with very bad success probability (this is not a real proof).
If t ≈ √(8k) then with good probability F is colorful. The probability behaves roughly like
(1 − 1/t)^k ≈ e^{−k/t}  →  for t ≅ √k this is ≈ e^{−√k}
so the expected number of repetitions is about e^{√k}, and together with the previous running time the total running time is still n^{O(√k)}.

Lemma: Every graph with K edges has an ordering of its vertices in which no vertex has more than √(2K) forward edges.
Why is this true? Take an arbitrary graph with K edges, and build the ordering by repeatedly placing next the vertex of lowest degree in the remaining suffix. Denote deg_j = the degree of v_j (within the suffix), f_i = the number of forward edges of v_i, and write f = f_i for the vertex in question.
f ≤ n − i, because a vertex can have at most one forward edge to every vertex to its right.
f ≤ deg_j for every forward neighbor v_j, j > i, because v_i was chosen with the lowest degree in the suffix. Therefore
f ⋅ f ≤ Σ_{j>i} deg_j ≤ Σ_j deg_j = 2K,
so f² ≤ 2K → f ≤ √(2K).
The chance of failure at some vertex is bounded by f_i/t (each forward neighbor has chance 1/t of getting the wrong color). Therefore the chance of success at vertex i is at least 1 − f_i/t.
However, since f_i is bounded by √(2K), for the bound to be meaningful t must be larger than √(2K). The success probability is then

∏_i (1 − f_i/t) ≥ ∏_i e^{−2f_i/t} = e^{−(2/t)Σ_i f_i} = e^{−2K/t} = e^{−O(√K)}   (for t of order √K)

------- end of lesson 2

Repeating last week's lesson: in every graph with K edges, if we color its vertices at random with √(8K) colors, then with probability ≥ 2^{−√(8K)} the coloring is proper. Assume we color the graph with t colors.
TODO: Draw the example that shows the coloring is dependent.

Inductive Coloring
Given an arbitrary graph G, find the vertex with the least degree in the graph, then remove that vertex, then find the next one with the least degree, and so on. This determines an ordering of the vertices v_1, v_2, …, v_n where v_i has the least degree in G(v_i, …, v_n). Then start by coloring the last vertex; each time, color a vertex according to the vertices to its right (so the coloring stays proper). If d is the maximum right degree of any vertex, then inductive coloring uses at most d + 1 colors.
In every planar graph there is some vertex of degree at most 5. Corollary: planar graphs can be colored (easily) with 6 colors.
Every graph with K edges has an ordering of the vertices in which all right degrees are at most √(2K), so √(2K) + 1 colors suffice. But we don't want our chances of success (for a random coloring) to be too low, so we use twice that number of colors: t ≈ 2√(2K) = √(8K).
Let the list of right degrees be d_1, d_2, …, d_n, with Σ_{i=1}^{n} d_i = K.
What is the probability of violating properness at vertex i? The fraction of colors left for vertex i is (t − d_i)/t, so the success probability is
∏_i (t − d_i)/t = ∏_i (1 − d_i/t)
But we know d_i/t ≤ √(2K)/√(8K) = 1/2.
It's easier to evaluate sums than products. Suppose d_i/t = 1/x (a very small number). Why is the following true:
1 − 1/x ≥ 2^{−2/x} ?
Let's raise both sides to the power x/2:
(1 − 1/x)^{x/2} ≥ (2^{−2/x})^{x/2} = 1/2
As long as 1/x ≤ 1/2, each factor only chops off less than half of what remains, meaning the left side won't fall below 1/2. So the inequality is true.
So, back to the original formula:
∏_i (1 − d_i/t) ≥ ∏_i 2^{−2d_i/t} = 2^{−(2/t)Σ_i d_i} = 2^{−2K/t}

Maximum weight independent set in a tree
Given a tree in which each vertex has a non-negative weight, we need to select a maximum weight independent set. In general graphs this problem is NP-hard, but in trees we can do it in polynomial time.
TODO: Add a drawing of a tree
We pick an arbitrary vertex r as the root and think of the edges as directed "away" from the root. Given a vertex v, denote by T(v) the set of vertices reachable from v (in the direction away from the root). For each vertex we keep two variables:
A⁺(v) – the maximum weight of an independent set in the tree T(v) that contains v.
A⁻(v) – the maximum weight of an independent set in the tree T(v) that does not contain v.
We need to find A⁺(r) and A⁻(r); the answer is the larger of the two.
The initialization is trivial: for every leaf l of T, set A⁺(l) = w(l) and A⁻(l) = 0, and remove it from the tree. Then pick a leaf of the remaining tree, with (already processed) children u_1, …, u_k:
A⁺(v) = w(v) + Σ_{i=1}^{k} A⁻(u_i)
A⁻(v) = Σ_{i=1}^{k} max{ A⁻(u_i), A⁺(u_i) }
This algorithm can also work on graphs that are slightly different from trees (doing private calculations for the non-tree parts). Can we have a theorem for when the graph is just a bit different from a tree and the algorithm still runs in polynomial time?

Tree Decomposition of a Graph
We have some graph G, and we want to represent it by a tree T. Every node of the tree T is labeled by a set of vertices of the original graph G; denote these sets bags. We have the following constraints:
1) The union of all the bags is the whole vertex set of G.
2) Every edge ⟨v_i, v_j⟩ of G is contained in some bag.
3) For every vertex v ∈ G, the bags containing v are connected in T.
Meaning, the bags containing v form a connected subtree of T: one can move between any two of them through bags that contain v, without passing through bags that do not contain v. Given two bags B_1 and B_2 s.t. B_1 ⊆ B_2, they are connected through a single path (because T is a tree), and every bag on this path must contain all vertices of B_1 (apply property 3 to each vertex of B_1).

Tree Width of a Tree Decomposition
The tree width of T is k if the maximal bag size is k + 1. The tree width of G is the smallest k for which there is a tree decomposition of tree width k. Intuitively – a graph is closer to a tree when its k is smaller.

Properties regarding tree width of graphs
Lemma: If G has tree width k, then G has a vertex of degree at most k.
Observe a tree decomposition of G in which no bag contains another bag. It has some leaf bag; this bag has at most k + 1 vertices and only one neighboring bag (since it is a leaf). Since no bag contains another, there is some vertex in the leaf bag that is not in its neighbor.
TODO: Copy the rest
Fact: tw(G\v) ≤ tw(G), since we can take the original tree decomposition and remove v from every bag.
Corollary: If G has tree width k then G can be properly colored with k + 1 colors.
This is consistent with trees having tree width 1: a tree is bipartite and can indeed be colored with 2 colors.
A graph with no edges has tree width 0, since each bag can be a singleton vertex.
A complete graph on n vertices has tw = n − 1 (one bag holding all vertices).
A complete graph missing one edge ⟨u, v⟩ has tw = n − 2: we can construct two bags – G − u and G − v – and connect them.
Theorem: A connected graph G has tw = 1 iff G is a tree.
Assume G has tw = 1. It has a vertex v of degree at most 1. Remove v; the remaining graph still has tw ≤ 1, so we can continue. We assume the graph is connected, and it has no cycle: if it had a cycle, we would reach a contradiction, since removing vertices of degree ≤ 1 never destroys a cycle. A connected graph with no cycles is a tree.
Assume G is a tree. Let's construct the decomposition as follows: define, for each edge of G, a bag containing its two endpoints.
Two such bags are adjacent in the decomposition when their edges share a vertex (connected so that, for each vertex, the bags containing it form a subtree).

Series-Parallel graphs
TODO: Draw resistors…
Series-parallel graphs are exactly the graphs with tw ≤ 2. They are built starting from isolated vertices by:
1) Adding a vertex in series
2) Adding a self loop
3) Adding an edge in parallel to an existing edge
4) Subdividing an edge
Series-Parallel ⇒ tw ≤ 2. TODO: Draw
------ end of lesson 3

Graph Minors
A graph H is a minor of graph G if H can be obtained from G by:
(1) Removing vertices
(2) Removing edges
(3) Contracting edges
TODO: Draw graph
Definition: A sub-graph is a graph generated by removing edges and vertices.
Definition: An induced sub-graph is a graph on a subset of the vertices that keeps all remaining edges.
Contracting an edge joins its two endpoints into a single vertex that has edges to all the vertices the original two vertices had.
A graph is planar if and only if it contains neither K_5 nor K_{3,3} as a minor.
TODO: Draw the forbidden graphs
A graph is a tree or a forest iff it doesn't contain a cycle ⇔ it does not contain the clique on 3 vertices as a minor.
A graph is series-parallel iff it does not contain K_4 as a minor.
Theorem: There are planar graphs on n vertices with tree width Ω(√n).
Let's look at a √n by √n grid:
* - * - * - *
|   |   |   |
* - * - * - *
|   |   |   |
* - * - * - *
|   |   |   |
* - * - * - *
We can construct √n − 1 bags arranged in a path, where bag i contains columns i and i + 1. This is a tree decomposition satisfying all the properties (it has width about 2√n, which only gives an upper bound; the theorem is the matching lower bound, via separators).

Vertex Separators
A set S of vertices in a graph G is a vertex separator if after removing S from G, the graph decomposes into connected components of size at most 2n/3.
TODO: Draw schematic picture of a graph
It means we can partition the connected components into two groups, neither with more than 2n/3 vertices.
Every tree has a separator of size 1:
Let T be a tree. Pick an arbitrary root r. Let s(v) = the size of the sub-tree of v (according to r); s(r) = n.
All leaves have s = 1. So there exists v with s(v) > 2n/3 and s(u) ≤ 2n/3 for all children u of v. That v is the separator.
If a graph G has tree width k, then it has a separator of size k + 1.
My summary: Let D be some tree decomposition; each bag has at most k + 1 vertices. We can find the separator bag of D and take its k + 1 vertices as the separator of the graph G. Note that when we calculate s(v) for a bag v ∈ D, we should count the number of graph vertices inside the bags below it (not the number of bags).
His summary: Consider a tree decomposition T of width k. Let r serve as its root, and orient the edges of T away from r. Pick B to be the lowest bag whose sub-tree contains more than (2/3)n vertices.

Every separator of the √n by √n grid has size at least √n/6.
Why is this so? Let's assume otherwise: there is a separator S with fewer than √n/6 vertices. Then there are at least 5√n/6 rows that do not contain a vertex from S, and the same for columns. Let's ignore all vertices in the same row or column as a vertex from S; we ignore at most (√n/6)⋅√n + (√n/6)⋅√n = n/3 vertices.
Claim: all other vertices are in the same connected component, since we can walk freely along the S-free rows and columns. Therefore one component has at least 2n/3 vertices – a contradiction.

If G has tree width k, then G is colorable with k + 1 colors.

Every planar graph is 5-colorable.
Proof: A planar graph has a vertex of degree at most 5. We will use a variation of inductive coloring: pick the vertex with the smallest degree. The interesting case is a vertex of degree exactly 5. Its 5 neighbors cannot form a clique (since the graph is planar and K_5 is not), so one edge between two of the neighbors is missing. Contract the two endpoints of that missing edge together with the center vertex (the one of degree 5).
Fact: contraction of edges maintains planarity. This is immediately seen when thinking of bringing the endpoints closer in the plane.
Now we color the new graph (after contraction) recursively with 5 colors.
Then we give all "contracted" vertices the same color in the original graph (before contraction). This is legal because the two contracted neighbors are not adjacent, and the center vertex now sees at most 4 distinct colors among its 5 neighbors, so a fifth color remains for it.
Hadwiger's conjecture: every graph that does not contain K_t as a minor can be colored with t − 1 colors (a generalization of the 4-color theorem).

If a graph has tree width k, then we can find its chromatic number in time O(f(k)⋅n).
Note: the number of colors is definitely between 1 and k + 1 (we know it can be colored with k + 1 colors; the question is whether fewer suffice). So for each value t ≤ k we ask: is G t-colorable?
Theorem (without proof): Given a graph of tree width k, a tree decomposition of width k can be found in time f(k)⋅n, linear in n.
TODO: Draw the tree decomposition
For each bag we keep a table of size t^{k+1} of all possible colorings of its vertices. For each entry in the table we keep one bit that determines whether that coloring is consistent with the bags below. We can easily do it for every leaf (0/1 depending on: "is this coloring proper on the sub-graph inside the bag"). A coloring of a bag is legal if the sub-tree below it can be colored with no collision of color assignments.

A family F of graphs is minor-closed (or closed under taking minors) if whenever G ∈ F and H is a minor of G, then H ∈ F.
Characterizing F:
1) By some other property
2) List all graphs in the family (only if the family is finite)
3) List all graphs not in F (if the complement of the family is finite)
4) List all forbidden minors
5) List a minimal set of forbidden minors
For planar graphs, this minimal set is K_5 and K_{3,3}.
An antichain for the minor relation is a list of graphs G_1, G_2, … such that no graph is a minor of another. A conjecture by Wagner: the minor relation induces a "well quasi order" ⇔ there is no infinite antichain. In particular, the minimal set of forbidden minors of a minor-closed family is always finite!!! The conjecture was proved by Robertson and Seymour.
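As an illustration of the table-per-bag idea above, here is a minimal sketch (the names and the input format are mine) of deciding t-colorability given a rooted tree decomposition; each recursive call materializes that bag's table of legal colorings:

```python
from itertools import product

# `bags` maps a decomposition node to a tuple of graph vertices, `children`
# maps a node to its child nodes, `edges` is a set of frozenset pairs.
# Assumes every edge of the graph lies inside some bag (decomposition property).
def colorable(t, root, bags, children, edges):
    def legal(node):
        bag = bags[node]
        kid_tables = [(bags[c], legal(c)) for c in children[node]]
        table = set()
        for col in product(range(t), repeat=len(bag)):
            assign = dict(zip(bag, col))
            # the coloring must be proper on the edges inside the bag
            if any(assign[u] == assign[v] for u in bag for v in bag
                   if u != v and frozenset((u, v)) in edges):
                continue
            # and extendable into every child subtree (agree on shared vertices)
            ok = True
            for kbag, ktab in kid_tables:
                shared = [v for v in kbag if v in assign]
                if not any(all(dict(zip(kbag, kc))[v] == assign[v] for v in shared)
                           for kc in ktab):
                    ok = False
                    break
            if ok:
                table.add(col)
        return table
    return bool(legal(root))
```

With bags of size at most k + 1 this enumerates t^{k+1} colorings per bag, matching the table size in the text (the sketch recomputes child tables naively rather than storing one bit per entry).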
----- End of lesson 4

Greedy Algorithms
When we try to solve a combinatorial problem, we have many options:
- Exhaustive Search
- Dynamic Programming
Now we introduce greedy algorithms as a new approach, through scheduling theory and matroids.

Interval Scheduling
There are jobs that need to be performed at certain times, each occupying an interval:
J_1: [s_1, t_1]
J_2: [s_2, t_2]
⋮
We can't schedule two jobs at the same time!
TODO: Draw intervals
We can represent the problem as a graph in which two intervals share an edge if they intersect; then we look for a maximum independent set. Such a graph is called an interval graph. There are classes of graphs in which we can solve Independent Set in polynomial time; one such family is perfect graphs.
Algorithm: Sort the intervals by ending time and greedily select the interval that ends soonest. Remove all its intersecting intervals and repeat until no more intervals remain.
Why does it work?
Proof: Suppose the greedy algorithm picked intervals g_1, g_2, … Consider the optimal solution o_1, o_2, … that has the largest common prefix with the greedy one, and the first index i such that g_i ≠ o_i. Since g_i ends at the same time (or sooner) than o_i, we can generate a new optimal solution with g_i instead of o_i. Such a solution is still optimal (same number of intervals) and legal ⇒ there exists an optimal solution with a larger common prefix – a contradiction!

Interval Scheduling 2
J_1: w_1 > 0, t_1 > 0
J_2: w_2 > 0, t_2 > 0
⋮
w determines how important the job is; t determines how much time a CPU needs to perform the job. The penalty for job i in a given schedule σ is its completion time × w_i; the total penalty is Σ_i p_i(σ). Find the schedule σ that minimizes the total penalty.
J_1 … J_i J_j …
Let's flip two adjacent jobs:
J_1 … J_j J_i …
and suppose that by flipping, the total penalty grew. The penalty for the prefix and the suffix stays the same!
So we should only consider what happens to the penalty from J_i and J_j as a result of the switch. If the two jobs start at time t:
before the flip: w_i(t + t_i) + w_j(t + t_i + t_j)
after the flip:  w_j(t + t_j) + w_i(t + t_j + t_i)
The original order is better iff w_j t_i < w_i t_j ⇔ w_j/t_j < w_i/t_i.
So this suggests the following algorithm: schedule the jobs in decreasing order of w_i/t_i. This is optimal.
Proof: Consider any other schedule that does not respect this order. Then there are two adjacent jobs, J_i before J_j, with w_i/t_i < w_j/t_j. Reversing them gives a better schedule – a contradiction!

General definition of greedy-solvable problems – Matroids
S = {a_1, …, a_n} (sometimes we have matroids in which the items represent edges).
F = a collection of subsets of S (short for "feasible"; the members of F are often called independent sets).
We want F to be hereditary: if B ∈ F and A ⊂ B then A ∈ F.
We also need a cost function c: S → R⁺. The goal: find a set A ∈ F with maximum cost Σ_{a_i ∈ A} c(a_i).
Example: Given a graph G, the items are the vertices of G, F is the family of independent sets of G, and ∀x. c(x) = 1. This is MIS(G), which is NP-hard.
Definition: A hereditary family F is a matroid if and only if ∀A, B ∈ F, if |A| > |B| then ∃a ∈ A, a ∉ B such that B ∪ {a} ∈ F.
Proposition: All maximal sets in a matroid have the same cardinality. Suppose |A| > |B|; then there is some item of A that we can add to B keeping it feasible, so B does not have maximal cardinality.
The maximal sets are called "bases", and the common cardinality of the maximal sets is the "rank" of the matroid.
We have all sorts of terms: matroids, independent sets, bases, rank. How are they related? Consider a matrix. The items are the columns of the matrix; the independent sets are the linearly independent sets of columns. Note that this is hereditary: if a set of columns is linearly independent, then any subset is also linearly independent. A basis of the matroid is a largest set of columns that spans the column space, and the rank is the same as in linear algebra.
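The scheduling rule derived above (decreasing w_i/t_i) can be sketched as follows (the function name is mine; jobs are (w, t) pairs):

```python
# Schedule jobs in decreasing order of w/t and report the total penalty
# sum_i w_i * C_i, where C_i is the completion time of job i.
def schedule(jobs):
    order = sorted(jobs, key=lambda j: j[0] / j[1], reverse=True)
    elapsed, penalty = 0, 0
    for w, t in order:
        elapsed += t                 # completion time of this job
        penalty += w * elapsed
    return order, penalty
```

For example, for jobs (w=1, t=3) and (w=3, t=1), running the heavy short job first gives penalty 3⋅1 + 1⋅4 = 7, versus 1⋅3 + 3⋅4 = 15 the other way.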
Theorem: For every hereditary family F, the greedy algorithm finds the maximum solution for every cost function c ⇔ F is a matroid.
Greedy Algorithm: iteratively, add to the solution the item a_i of largest cost that still keeps the solution feasible.
Proof: Assume F is not a matroid. So (by definition) there are sets A, B ∈ F with |A| > |B| such that ∀a ∈ A\B, B ∪ {a} ∉ F. Pick ε < 1/|S| (or something similar) and define:
∀b ∈ B. c(b) = 1 + ε
∀a ∈ A\B. c(a) = 1
∀x ∉ A ∪ B. c(x) = ε/2
Since the elements of B have the highest cost, they are selected first, until all of B is chosen and the solution cannot be extended by any element of A\B. But the optimal solution is to select the elements of A – so the greedy algorithm does not solve the problem!
Now suppose F is a matroid. Since c: S → R⁺ (all weights positive), the optimal solution is a basis; likewise the greedy solution is a basis. Sort the items of each solution in order of decreasing cost: the greedy solution is g_1, …, g_r and the optimal solution is o_1, …, o_r, where r is the rank of the matroid. Suppose they are not identical, and let i be the first index where g_i ≠ o_i (so g_1, …, g_{i−1} = o_1, …, o_{i−1}). We know that c(g_i) ≥ c(o_i), since greedy could have taken o_i at step i. So let's build another optimal solution. Observe the set g_1, …, g_{i−1}, g_i. Because this is a matroid, we can repeatedly add to it some item of o_1, …, o_{i−1}, o_i, …, o_r while staying in F, until the set is just as large as o_1, …, o_r. All the added items come from {o_i, …, o_r}, so each of them has cost ≤ c(o_i) ≤ c(g_i); hence the new solution is also optimal, and it has a longer common prefix with the greedy one – a contradiction.
Greedy also works for a general cost function c if the requirement is to find a basis.

Graphical Matroid
Graph G; the items are the edges of G; F – the forests of G, i.e. sets of edges that do not close a cycle. If G is connected: bases ⇔ spanning trees.
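The matroid greedy algorithm, instantiated for the graphical matroid, can be sketched as follows (names are mine; the independence test "does not close a cycle" is done with union-find):

```python
# Greedy on the graphical matroid: repeatedly take the heaviest edge that
# keeps the chosen set a forest.  Edges are (weight, u, v) triples over
# vertices 0..n-1.
def max_weight_forest(n, weighted_edges):
    parent = list(range(n))
    def find(x):                      # union-find with path halving
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    chosen = []
    for w, u, v in sorted(weighted_edges, reverse=True):
        ru, rv = find(u), find(v)
        if ru != rv:                  # adding the edge keeps the set a forest
            parent[ru] = rv
            chosen.append((w, u, v))
    return chosen
```

On a connected graph this returns a maximum weight spanning tree; sorting in increasing order instead gives Kruskal's minimum spanning tree.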
Greedy algorithm – finds a maximum weight spanning tree. We can likewise find a minimum spanning tree – Kruskal's algorithm.
----- End of lesson 5

Matroids – a hereditary family of sets:
B ∈ F, A ⊂ B ⇒ A ∈ F
A, B ∈ F, |A| > |B| ⇒ ∃a ∈ A s.t. B ∪ {a} ∈ F
The natural greedy algorithm, given any cost function c: S → R⁺, finds the independent set (member of F) of maximum total weight/cost. The maximal sets in F all have the same cardinality r (the rank); the maximal sets are called "bases".
Restriction: take S′ ⊂ S and, for every A ∈ F, let A′ = A ∩ S′. This gives rise to a family F′. If (S, F) is a matroid, then so is (S′, F′); the rank of the new matroid is r′ ≤ r.

Dual of a Matroid
Given a matroid (S, F), its dual (S, F*) is the collection of all sets X such that S\X still contains a basis of (S, F).
In the graphical matroid: S = the edges of graph G, F = the forests of G, the bases are the spanning trees of the graph. F* = all collections of edges whose removal leaves the graph connected.
Theorem: The dual F* of a matroid F is a matroid by itself. Moreover, (F*)* = F.
Proof: We need to show that the dual is hereditary – but this is easy to see: removing an item from X still doesn't interfere with the bases contained in S\X.
Exchange: let A, B ∈ F* with |A| > |B|; we need x ∈ A\B with B ∪ {x} ∈ F*.
TODO: Draw sets A and B.
Let's look at S′ = S − (A ∪ B). We know that (S′, F′) is a matroid, with some rank r′. If r′ = r we can move any item from A\B to B and still be in F*. So the only problematic case is r′ < r (it can't be larger). Let B′ be a basis of (S′, F′), |B′| = r′. Since A ∈ F*, the set S\A = S′ ∪ (B\A) contains a basis, so B′ can be completed to a basis of F using only elements of B\A; hence r − r′ ≤ |B\A| < |A\B|. On the other hand, since B ∈ F*, the set S\B = S′ ∪ (A\B) also contains a basis, so B′ can be completed to a basis using exactly r − r′ elements of A\B. Since r − r′ < |A\B|, some item x of A\B wasn't used! The resulting basis avoids B ∪ {x}, and therefore B ∪ {x} ∈ F*.
(F*)* = F ?
Yes, because the bases of the dual are exactly the complements of the bases of the original matroid. With |S| = n, all the bases of the dual are of size n − r, i.e. r* = n − r.
How did we find the minimum spanning tree? We sorted the edges by their weight, and added an edge to the spanning tree as long as it doesn't close a cycle. Minimum weight basis for F = complement of the maximum weight basis for F*.

Graphical Matroids on Planar Graphs
TODO: Draw planar graphs
Every face (a region enclosed by edges) is denoted as a vertex of the dual graph; two such vertices are connected if their faces share a common side (edge). The exterior is a single vertex. A minimal cut set in the primal is a cycle in the dual. The complement of a spanning tree in the primal is a spanning tree in the dual.
Assume we have a planar graph with v vertices, f faces, and e edges. Euler's formula:
v − 1 + f − 1 = e
We can always triangulate a planar graph, increasing the number of edges but keeping it planar. In such graphs 3f = 2e, so
v − 1 + 2e/3 − 1 = e  →  v − 2 = e/3  →  e = 3v − 6
The average degree is 2e/v = 6 − 12/v < 6, so there is at least one vertex of degree 5 or less!

Matroid Intersection
Given two matroids (S, F₁) and (S, F₂), we can look at their intersection: the sets A with A ∈ F₁ and A ∈ F₂.
Partition Matroid: partition S into S_1, S_2, …, S_m; F consists of every set that contains at most one item from each part.
TODO: Draw stuff
A matching is a collection of edges where no two of them touch the same vertex. The set of matchings in a bipartite graph is the intersection of two partition matroids. In a bipartite graph the set of matchings is not a matroid.
TODO: Draw example graph.
Theorem: For every cost function c: S → R⁺, one can optimize over the intersection of two matroids in polynomial time.
Intersections of 3 matroids: TODO: Draw partitions for the matroid. Given a collection of triples – find a set of maximum cardinality of disjoint triples. This problem is NP-Hard.
Things we will see:
1) Algorithm for maximum cardinality matching in bipartite graphs.
2) Algorithm for maximum cardinality matching in arbitrary graphs.
3) Algorithm for maximum weight matching in bipartite graphs.
There is also an algorithm for maximum weight matching in arbitrary graphs, but we will not show it in class.
Vertex cover – a set of vertices that covers (touches) all edges.
Min vertex cover ≥ maximum matching.
Min vertex cover ≤ 2 ⋅ maximal matching.
For bipartite graphs: min V.C. = max matching.

Alternating Paths
TODO: Draw graphs
Vertices not touched by our current matching will be called "exposed". An alternating path connects two exposed vertices, and its edges alternate with respect to being in the matching. Given an arbitrary matching M and such an alternating path P, we can get a larger matching by switching the matching along P.
Alternating forest: a collection of rooted trees. The roots are the exposed vertices, and the trees are alternating.
TODO: Draw alternating forest
In a single step we either:
- connect two exposed vertices, or
- add two edges to a tree (some tree).
The procedure to extend a tree: pick an outer vertex and connect it to
- an exposed vertex – alternating path – extend the matching;
- an outer vertex of a different tree – then we can get from the root of one tree to the root of the other – again an alternating path;
- otherwise, extend the tree (alternating forest) by two edges.
Claim: when I'm stuck, I certainly have a maximum matching. Proof by picture: select the inner vertices as a vertex cover.
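For the bipartite case, growing the matching along alternating paths can be sketched as follows (names and input format are mine; this is the standard DFS formulation rather than an explicit forest):

```python
# Augmenting-path matching for bipartite graphs: for each exposed left
# vertex, search for an alternating path to an exposed right vertex and
# flip the matching along it.  adj[u] lists the right-neighbors of left u.
def max_bipartite_matching(adj):
    match_right = {}                    # right vertex -> its matched left vertex
    def augment(u, seen):
        for v in adj[u]:
            if v in seen:
                continue
            seen.add(v)
            # v is exposed, or its current partner can be rematched elsewhere
            if v not in match_right or augment(match_right[v], seen):
                match_right[v] = u
                return True
        return False
    return sum(augment(u, set()) for u in adj)
```

Each successful call to `augment` switches the matching along one alternating path, increasing the matching by one edge.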
--- end of lesson 6

Matchings
TODO: Draw an alternating forest
1) If you find an edge between two exposed vertices, just add it to the matching.
2) An edge between two outer vertices (in different trees) – an alternating path.
3) An edge from an outer vertex (exposed vertices are also considered outer vertices) to an unused matching edge – extend the forest.
In cases 1 and 2 we restart building the forest.

Gallai–Edmonds Decomposition
Outer vertices are denoted C (components).
Inner vertices are denoted S (separator).
The rest are denoted R (rest).
We don't have edges inside C, otherwise the algorithm would not have stopped. We also don't have edges between C and R, otherwise it wouldn't have stopped. In short, we may have edges between C and S, between S and itself, between S and R, and between R and itself.
The only matching edges used are internal edges of R and edges between C and S. All vertices of S are matched, all vertices of R are matched, and the number of vertices of C left unmatched is |C| − |S|.
Another definition of C, S and R:
C is the set of vertices that are left unmatched by some maximum matching.
S is the set of all neighbors of C (outside C).
R is the rest.
The number of vertices not matched in a maximum matching is exactly |C| − |S|.

General Graphs
In general graphs we might have odd cycles, so case 2 no longer works as is (such an edge can occur inside the same tree). We need a new case:
2a) An edge between two outer vertices of the same tree. In this case we get an odd cycle. Contract the odd cycle! The contracted vertex is an outer vertex.
Suppose we had a graph G, and now we have G′, the contracted graph. First note the following: the size of the maximum matching in G′ equals the size of the maximum matching in G minus ℓ, if the contracted odd cycle had length 2ℓ + 1. Lift the matching from G′ to G and restart building a forest (instead of just restarting).
Now we can see why we call the outer vertices C – they might represent components (contracted odd cycles). One can see that each component always represents an odd number of vertices. Components might have edges inside themselves, but there are no edges between two different components. So the separator really separates the components from each other and from R. All vertices of S are matched. Each component in C is matched to at most one vertex of S (possibly none). But then this is an optimal matching, which means the algorithm is correct: it finds a matching that misses exactly |C| − |S| vertices, and this is the minimal number of vertices missed by any matching – so our solution is optimal.

Theorem (Tutte–Berge): if a graph on n vertices has a set S of vertices whose removal leaves t components of odd size (and any number of components of even size), then the size of a maximum matching is at most
(n − (t − |S|)) / 2.

Minimum Weight Perfect Matching in Bipartite Graphs
Weights are non-negative, and we try to find the perfect matching of minimum total weight. Let's first observe the variant where we search for the maximum weight: if an edge is missing, we can add it with weight zero, so a perfect matching always exists. A simple transformation of the weights then turns the maximum variant into the minimum variant. So the problem is well defined even for graphs without a perfect matching.
We will use:
- Weight shifting
- The primal–dual method
In our case the primal problem is the minimum-weight perfect matching – a covering problem with minimal weight (cover the vertices by edges). The dual problem is a packing problem: assign non-negative weights to the vertices such that for every edge, the sum of the weights of its endpoints is at most the weight of the edge; maximize the sum of the vertex weights.
TODO: Draw a bipartite graph
The primal is a minimization problem, so its value is at least as large as the dual's. For the optimal solutions there is equality!
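The Tutte–Berge bound above can be checked by brute force on a tiny example (illustration only: the helper names are mine, and exhaustive search stands in for the real algorithms; by Berge's theorem the bound is actually attained by the best choice of S):

```python
from itertools import combinations

def components(vertices, edges):
    # connected components by DFS
    vs = set(vertices)
    comps = []
    while vs:
        stack = [vs.pop()]
        comp = set(stack)
        while stack:
            u = stack.pop()
            for a, b in edges:
                for x, y in ((a, b), (b, a)):
                    if x == u and y in vs:
                        vs.remove(y)
                        comp.add(y)
                        stack.append(y)
        comps.append(comp)
    return comps

def max_matching_size(vertices, edges):
    # brute force: largest set of pairwise-disjoint edges
    for k in range(len(edges), 0, -1):
        for sub in combinations(edges, k):
            used = [v for e in sub for v in e]
            if len(used) == len(set(used)):
                return k
    return 0

def tutte_berge_bound(vertices, edges):
    n = len(vertices)
    best = n
    for k in range(n + 1):
        for S in combinations(vertices, k):
            rest = [v for v in vertices if v not in S]
            sub = [e for e in edges if e[0] in rest and e[1] in rest]
            t = sum(1 for c in components(rest, sub) if len(c) % 2 == 1)
            best = min(best, (n - (t - k)) // 2)
    return best

# Star K_{1,3}: removing the center leaves 3 odd components.
V = [0, 1, 2, 3]
E = [(0, 1), (0, 2), (0, 3)]
print(max_matching_size(V, E), tutte_berge_bound(V, E))  # 1 1
```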
We will show that... This is a theorem of Egerváry from 1931.
Let's try to reduce the problem to an easier one. Let w_min be the smallest weight of an edge. Subtract w_min from the weight of every edge; every perfect matching loses exactly n · w_min from its weight. As for the dual problem, we can start with the weight of every vertex at 0 and increase the weight of every vertex by w_min / 2. The dual we got is feasible, and we decreased the weights of the edges.
More generally, let v be some vertex. We can subtract some w_v from the weights of all of its edges. Since one of them has to be in the final perfect matching, the perfect matching loses exactly w_v. In the dual, we increase the weight of v by w_v.
We will keep the following invariants true:
- For every edge, the amount of weight it loses is exactly equal to the amount of weight gained by its endpoints.
- At any state, edges have non-negative weights.
Consider only the edges of zero weight and find a maximum matching among them. There are two possibilities:
- It is a perfect matching. In this case it is optimal.
- It is not a perfect matching. Look at the Gallai–Edmonds decomposition of the zero-weight graph. Let ε be the minimum over the weight of an edge between C and R, and half the weight of an edge between C and C (an edge of weight 2ε). For every vertex in C, add ε to its weight; for every vertex in S, reduce its weight by ε. With respect to the current matching edges we did not change anything, but we created at least one more zero-weight edge (and note that no edge became negative). Either we can increase the matching (a new zero edge between C and C), or we increase S (a new zero edge between C and R: the R endpoint is now a neighbor of some vertex in C, so it belongs to S).
Every time we make progress. So in O(n²) weight-shifting steps we get a perfect matching of (reduced) weight zero, which is optimal.

--- end of lesson 7

Algebraic Methods
Searching for Triangles
By exhaustive search we can go over all triples in O(n³). The question is, can we do it better?
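A minimal illustration of one weight-shifting round on a bipartite cost matrix (my own sketch of the invariant, not the full algorithm: subtracting a row or column minimum is exactly a dual gain on that vertex, and every perfect matching loses exactly the total dual value):

```python
from itertools import permutations

def reduce_weights(W):
    """One round of weight shifting: subtract each row's minimum
    (a dual gain for the left vertex), then each column's minimum
    (a dual gain for the right vertex). Edge weights stay >= 0."""
    n = len(W)
    W = [row[:] for row in W]
    dual = 0
    for i in range(n):
        m = min(W[i])
        dual += m
        W[i] = [w - m for w in W[i]]
    for j in range(n):
        m = min(W[i][j] for i in range(n))
        dual += m
        for i in range(n):
            W[i][j] -= m
    return W, dual

def min_perfect_matching(W):
    # brute force over all perfect matchings
    n = len(W)
    return min(sum(W[i][p[i]] for i in range(n))
               for p in permutations(range(n)))

W = [[4, 1, 3], [2, 0, 5], [3, 2, 2]]
W2, dual = reduce_weights(W)
# Every perfect matching lost exactly `dual`, so the optima correspond:
print(dual, min_perfect_matching(W), dual + min_perfect_matching(W2))  # 4 5 5
```

Here one round is not enough to expose a zero-weight perfect matching; the full algorithm continues with the Gallai–Edmonds-based shifts described above.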
Multiplication of Two Complex Numbers
(a + γb)(c + γd) = ac − bd + γ(ad + bc)
Assume the numbers are large! Multiplications are rather expensive – much more than additions and subtractions.
Let's compute ac and bd, then compute a + b and c + d, and then calculate
(a + b)(c + d) = ac + ad + bc + bd.
Using the previously computed values we can extract ad + bc.
So the naive way uses 4 products and 2 additions/subtractions; the new way uses 3 products and 5 additions/subtractions.

Fast Matrix Multiplication
Suppose we have n × n matrices A = (a_ij) and B = (b_ij), and C = AB = (c_ij) with c_ij = Σ_k a_ik b_kj.
So we need n³ products to compute C. Unlike the previous problem, here we want to reduce both the number of products and the number of additions.
We will show a very well-known algorithm by Strassen. Assume for simplicity that n is a power of 2; if not, we can pad with zeroes up to the next power of 2. Let's partition A and B into blocks:
A = [A11 A12; A21 A22],  B = [B11 B12; B21 B22]
And then:
C = [A11B11 + A12B21   A11B12 + A12B22;
     A21B11 + A22B21   A21B12 + A22B22] = [C11 C12; C21 C22]
So T(n) = 8T(n/2) + O(n²), and after solving the recursion we get T(n) = O(8^{log n}) = O(n³).
If we could compute everything with 7 multiplications (instead of 8), the time would be
T(n) = 7T(n/2) + O(n²) = O(7^{log n}) = O(n^{log 7}) ≈ O(n^{2.8}).
Think of each product (a combination of A-blocks times a combination of B-blocks) as a 4 × 4 coefficient matrix, with rows indexed by A11, A12, A21, A22 and columns by B11, B12, B21, B22. What we want, for example for C11 = A11B11 + A12B21, is the pattern

       B11 B12 B21 B22
A11     +   0   0   0
A12     0   0   +   0
A21     0   0   0   0
A22     0   0   0   0

If you think of pluses and minuses as +1 and −1, then each target block C_ij corresponds to a matrix of rank 2. As we can see, every single multiplication gives a matrix of rank 1! So we must use 7 matrices of rank 1 to generate the four rank-2 target matrices.
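The three-multiplication trick for complex numbers can be written out directly (a small sketch; the name `mul3` is mine):

```python
def mul3(a, b, c, d):
    """(a + i*b)(c + i*d) using 3 multiplications instead of 4."""
    ac = a * c
    bd = b * d
    cross = (a + b) * (c + d) - ac - bd  # = ad + bc
    return ac - bd, cross  # (real part, imaginary part)

print(mul3(1, 2, 3, 4))  # (-5, 10), matching (1+2j)*(3+4j)
```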
M1 = (A11 + A21)(B11 + B12)
M2 = (A12 + A22)(B21 + B22)
We can observe that M1 + M2 = C11 + C12 + C21 + C22. In other words,
C22 = M1 + M2 − C11 − C12 − C21.
So with two products we got one of the four expressions for free; we only need to calculate the remaining three.
M3 = A21(B11 + B21)
M4 = (A22 − A21)B21
So we have C21 = M3 + M4.
M5 = A12(B12 + B22)
M6 = (A11 − A12)B12
So C12 = M5 + M6.
We have already used 6 products and generated C12 and C21, and we will be able to compute C22 later. We are only missing C11, and only one multiplication is left! Can we do it??? (the tension!)
M7 = (A12 + A21)(B21 − B12)
C11 = M7 + M1 − M3 − M6.
We did it!! The world was saved!
But you can multiply matrices even faster! If you divide the matrix into 70 parts, you can find algorithms that run in about O(n^{2.79}). The best known result is about O(n^{2.37}). Often people just specify that matrices can be multiplied in O(n^ω) time and use it as a subroutine. The only known lower bound is the trivial Ω(n²).

Multiplying Big Numbers
Multiplying numbers of n digits takes O(n²) by the naive approach. You can break every number into two halves and apply an approach similar to the one used when multiplying complex numbers. The best running time is something like n · log n · log log n...

Testing
How can you check that A · B = C is correct?
Probabilistic test: pick a random 0/1 vector x ∈_R {0,1}^n. If C is the right answer then
A · B · x = C · x.
And this is rather easy to check! A matrix-vector product takes O(n²) time, and by associativity we can multiply x by B first, and the result by A.
Suppose the true answer is C, but we got C′ ≠ C. With a random vector there is a good probability that Cx ≠ C′x: since they are different, there must be some entry with c_ij ≠ c′_ij, and the chance that we catch the bad matrix is at least ½.
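The probabilistic test at the end is Freivalds' algorithm; a sketch, with the number of rounds as an arbitrary choice:

```python
import random

def freivalds(A, B, C, rounds=20):
    """Check A*B == C probabilistically in O(rounds * n^2) time.
    Each round catches a wrong C with probability >= 1/2."""
    n = len(A)
    for _ in range(rounds):
        x = [random.randint(0, 1) for _ in range(n)]
        Bx = [sum(B[i][j] * x[j] for j in range(n)) for i in range(n)]
        ABx = [sum(A[i][j] * Bx[j] for j in range(n)) for i in range(n)]
        Cx = [sum(C[i][j] * x[j] for j in range(n)) for i in range(n)]
        if ABx != Cx:
            return False  # certainly wrong
    return True  # probably right

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
good = [[19, 22], [43, 50]]
bad = [[19, 22], [43, 51]]
print(freivalds(A, B, good), freivalds(A, B, bad))  # True False
```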
The reason is too long for the margins of these notes.
The inverse of a matrix can also be calculated using matrix multiplication, so finding an inverse takes O(n^ω) as well.

Boolean Matrix Multiplication
There are settings in which subtraction is not allowed. One such setting is Boolean matrix multiplication: here multiplication is an AND operation and addition is an OR operation, so it's clear why we can't subtract. In this case, we can replace OR with ordinary addition and AND with ordinary multiplication; then for the resulting integer product C′:
c′_ij = 0 ⇒ c_ij = 0
c′_ij > 0 ⇒ c_ij = 1
If you don't like big numbers, you can do everything modulo some prime p > n (an entry c′_ij is at most n, so it cannot vanish mod p). But in some variants things stop working, such as when the OR is replaced by XOR.

Back to the Search for Triangles
Let A be the adjacency matrix of the graph: A_ij = 1 ⇔ (i,j) ∈ E, and otherwise it's zero.
Let's look at A²: (A²)_ij counts the paths of length exactly 2 between vertices i and j, and in general (A^t)_ij = the number of length-t paths (walks) between i and j. So (A³)_ii is the number of length-3 walks from i to itself. But every such walk is a triangle! So
trace(A³) = 6 · (# triangles in G),
since every triangle is counted from each of its 3 vertices, with 2 possible walks (directions) from each. But you can calculate A³ in O(n^ω)!
If instead we let A^t_ij = 1 ⇔ there is a path of length at most t from i to j (Boolean powers, keeping ones on the diagonal), then A^n actually "encodes" all reachability in the graph. So by performing log n · O(n^ω) operations (repeated squaring) we can find the transitive closure of the graph.

--- end of lesson 8

Matrices and Permanents
Permanent: per(A) = Σ_σ Π_{i=1}^n a_{i,σ(i)}
Determinant: det(A) = Σ_σ (−1)^σ Π_{i=1}^n a_{i,σ(i)}
det(A) = 0 ⇔ A is singular.
Valiant showed that computing the permanent is #P-complete, hence NP-hard, even for 0/1 matrices.
TODO: Draw a neighbor matrix of a bipartite graph.
The permanent of a 0/1 matrix A is the number of perfect matchings in the associated bipartite graph.
per(A) (mod 2) = det(A) (mod 2) (because subtraction and addition are the same mod 2!)
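The trace(A³)/6 formula can be checked on a small graph (illustration only; naive cubic multiplication stands in for a fast one):

```python
def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def count_triangles(A):
    """# triangles = trace(A^3) / 6: each triangle is seen from each of
    its 3 vertices, walked in 2 directions."""
    A3 = matmul(matmul(A, A), A)
    return sum(A3[i][i] for i in range(len(A))) // 6

# The complete graph K4 has 4 triangles.
K4 = [[0 if i == j else 1 for j in range(4)] for i in range(4)]
print(count_triangles(K4))  # 4
```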
What about per(A) (mod 3)?
Suppose n = 30. The number of permutations is 30! ≈ (30/e)^30. There is an algorithm that is still exponential but substantially faster. It was suggested by Ryser, and its running time is around 2^n; in the case of n = 30, that's about 2^30, which is not so bad. The trick is to use the inclusion–exclusion principle.
Let A be an n × n matrix and S a non-empty set of columns. Let r_i(S) be the sum of the entries in row i over the columns of S. Then
per(A) = Σ_S (−1)^{n−|S|} Π_{i=1}^n r_i(S).
Since we go over all sets S, we have 2^n terms; the running time is around 2^n times some polynomial.
The permanent of the symbolic matrix (x_ij) is Σ_σ Π_i x_{i,σ(i)} – a multilinear polynomial with n! monomials. Each monomial is a product of n of the n² variables. This applies both to the determinant and to the permanent.
In Ryser's formula, after expanding, we only get monomials with n variables – one from each row. In the original definition we also get monomials with n variables, one from each row, but each of them must be from a different column.
First let's see that all the terms that should be in Ryser's formula are actually there: the monomial Π_{i=1}^n x_{i,σ(i)} of a permutation σ uses all n columns, so it appears only in the term for S = all columns, with coefficient (−1)^{n−n} = +1.
Now consider a monomial with n variables but fewer than n distinct columns – say, for n = 5, a monomial with 5 variables whose column set is T = {1,3,5} ⊊ [n]. Such a monomial appears in the term of every S with T ⊆ S, and the signs cancel:
Σ_{S : T ⊆ S} (−1)^{n−|S|} · 1 = 0
(for each column outside T we choose independently whether to include it, and the signs split evenly).
TODO: Draw many partially drawn matrices.
For every variable independently, substitute a random value from {0, …, p−1}, where p > n² is prime, and then compute the determinant. We can even do the computations modulo p.
Lemma: for every nonzero multilinear polynomial in n² variables, if one substitutes random values from {0, …, p−1} and computes modulo the prime p, the answer is 0 with probability at most n²/p.
For a single variable: ax + b = 0 (mod p) for exactly one value of x, so the probability is 1/p. Suppose the probability is k/p for k variables; we want to show it is (k+1)/p for k+1 variables.
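Ryser's formula can be checked against the definition on a small matrix (a sketch; both helper names are mine):

```python
from itertools import combinations, permutations

def per_ryser(A):
    """per(A) = sum over non-empty column sets S of
    (-1)^(n-|S|) * prod_i r_i(S), where r_i(S) sums row i over S."""
    n = len(A)
    total = 0
    for k in range(1, n + 1):
        for S in combinations(range(n), k):
            prod = 1
            for i in range(n):
                prod *= sum(A[i][j] for j in S)
            total += (-1) ** (n - k) * prod
    return total

def per_naive(A):
    # directly sum over all n! permutations
    n = len(A)
    total = 0
    for p in permutations(range(n)):
        prod = 1
        for i in range(n):
            prod *= A[i][p[i]]
        total += prod
    return total

# 0/1 matrix: the permanent counts perfect matchings of the
# associated bipartite graph.
A = [[1, 1, 0], [1, 1, 1], [0, 1, 1]]
print(per_ryser(A), per_naive(A))  # 3 3
```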
We can write the new polynomial as
P(x_1, …, x_{k+1}) = x_{k+1} · P1(x_1, …, x_k) + P2(x_1, …, x_k).
If P1 evaluates to zero, this happens with probability at most k/p (by induction); otherwise there is only one possible value of x_{k+1} that makes everything zero. So the total probability is at most (k+1)/p.

Kasteleyn: computing the number of perfect matchings in planar graphs can be done in polynomial time.
Kirchhoff: the matrix-tree theorem counts the number of spanning trees in a graph.
We have a graph G and we want the number of spanning trees. The Laplacian of G is the n × n matrix L with
L_{i,j} = L_{j,i} = −1 ⇔ ⟨i,j⟩ ∈ E,  L_{i,i} = degree of vertex i.
TODO: Draw the graph and its corresponding matrix (though we got the point)
The algorithm: generate the Laplacian matrix of the graph, cross out some row and the same-indexed column, and calculate the determinant. This is the number of spanning trees of the graph.
Given G, direct its edges arbitrarily and create the incidence matrix of the graph, for example:

      e1  e2  e3  e4
v1    +1   0   0   0
v2    −1  −1  +1   0
v3     0   0  −1  −1
v4     0  +1   0  +1

For each edge you put a +1 for the vertex it enters and a −1 for the vertex it leaves. Denote this matrix N. If we multiply N · N^T we get the Laplacian.
The incidence matrix has special properties. One of them is that it is "totally unimodular": every square sub-matrix has determinant +1, −1 or 0.
Theorem: every matrix in which every column has at most a single +1, at most a single −1, and zeroes elsewhere is totally unimodular.

--- end of lesson

Given a graph G, we have its Laplacian, denoted L, and its vertex-edge incidence matrix N. We know that N N^T = L. Another thing we said about N is that it is totally unimodular, which means every square submatrix has determinant 0, +1 or −1. Wlog we always look at the case m ≥ n (m is the number of edges, n is the number of vertices).
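The matrix-tree algorithm just described can be sketched in a few lines (exact rational arithmetic is my choice, to avoid floating-point issues; K4 is a convenient test case, with 4² = 16 spanning trees by Cayley's formula):

```python
from fractions import Fraction

def det(M):
    """Determinant by Gaussian elimination over exact rationals."""
    n = len(M)
    M = [[Fraction(x) for x in row] for row in M]
    d = Fraction(1)
    for c in range(n):
        piv = next((r for r in range(c, n) if M[r][c] != 0), None)
        if piv is None:
            return Fraction(0)
        if piv != c:
            M[c], M[piv] = M[piv], M[c]
            d = -d
        d *= M[c][c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n):
                M[r][k] -= f * M[c][k]
    return d

def spanning_trees(adj):
    """Kirchhoff: delete row/column 0 of the Laplacian, take the det."""
    n = len(adj)
    L = [[(sum(adj[i]) if i == j else -adj[i][j]) for j in range(n)]
         for i in range(n)]
    minor = [row[1:] for row in L[1:]]
    return int(det(minor))

K4 = [[0 if i == j else 1 for j in range(4)] for i in range(4)]
print(spanning_trees(K4))  # 16
```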
If the number of edges is smaller than the number of vertices, either all the edges form a spanning tree or there is no spanning tree.
rank(N) ≤ n − 1.
Let's observe the matrix transposed:
TODO: Draw a transposed incidence matrix.
If we take x = [1 … 1]^T, then x ∈ ker(N^T), since every column of N contains one +1 and one −1.
What do we know about a submatrix N_S given by a subset S of n − 1 of the edges (but all the vertices)? If S is a spanning tree, then its rank is n − 1; otherwise the rank is < n − 1. Remove one (fixed) row and denote the result N′_S; then
det(N′_S · (N′_S)^T) = det(N′_S) · det((N′_S)^T) = 1 if S is a spanning tree and 0 otherwise.
So the number of spanning trees = Σ_S det(N′_S) · det(N′_S).

Binet–Cauchy formula for computing a determinant: let A and B be two n × m matrices, m > n. Then
det(A · B^T) = Σ_S (det A_S)(det B_S)
where S ranges over all subsets of n columns.
If m = n then det(A · B^T) = (det A) · (det B). If m < n the determinant is zero, so it's not interesting.
For m > n, set A = B = N′ (the incidence matrix with one row removed). Then everything comes out.
Proof of the Binet–Cauchy formula: let Δ = diag(x_1, …, x_m) be an m × m diagonal matrix of variables. We will prove the stronger statement
det(A Δ B^T) = Σ_S (det A_S)(det B_S)(Π_{i∈S} x_i).
A Δ B^T is an n × n matrix, and every entry of it is a linear polynomial in the variables x_i – linear because we never multiply an x_i by an x_j, and in addition there are no constant terms. So det(A Δ B^T) is a homogeneous polynomial of degree n. We can take any monomial of degree n and see that the coefficients on both sides are the same. On the right-hand side we don't have any monomials of degree higher or lower than n; we need to prove they are zero on the left side as well. Let T be a set of fewer than n variables, and substitute 0 for all variables not in T. Then Δ′ has rank |T| < n, so A Δ′ B^T has rank < n and its determinant vanishes – so no monomial supported on T survives. BAAAAHHH. Can't write anything in this class!

Spectral Graph Theory
NP-hard problems on random instances.
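The Binet–Cauchy formula can be verified numerically on a small example (a sketch; the cofactor-expansion determinant is only suitable for tiny matrices):

```python
from itertools import combinations

def det(M):
    # cofactor expansion along the first row (fine for tiny matrices)
    n = len(M)
    if n == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] *
               det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(n))

def binet_cauchy_rhs(A, B):
    # sum of det(A_S) * det(B_S) over all n-subsets S of the m columns
    n, m = len(A), len(A[0])
    return sum(det([[A[i][j] for j in S] for i in range(n)]) *
               det([[B[i][j] for j in S] for i in range(n)])
               for S in combinations(range(m), n))

A = [[1, 2, 3], [4, 5, 6]]
B = [[7, 8, 9], [1, 0, 2]]
ABt = [[sum(A[i][k] * B[j][k] for k in range(3)) for j in range(2)]
       for i in range(2)]
print(det(ABt), binet_cauchy_rhs(A, B))  # -54 -54
```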
Like 3SAT, Hamiltonicity, k-clique, etc. We sometimes want to look at max-clique (which is the optimization version of k-clique). We can also look at the optimization version, MAX-3SAT.
3XOR or 3LIN: clauses of the form (x1 ⊕ x̄2 ⊕ x3) ... Deciding whether all clauses can be satisfied is in P (it's linear algebra over GF(2)), but if we look for an assignment maximizing the number of satisfied clauses we get an NP-hard problem.
Motivations:
- Average case (good news)
- Developing algorithmic tools
- Cryptography
- Properties of random objects
Heuristics for 3SAT:
- "yes" side – find a satisfying assignment in many cases (answers yes/maybe).
- Refutation / "no" side – find a "witness" that guarantees the formula is not satisfiable (answers no/maybe).
Hamiltonicity: G_{n,p} is the Erdős–Rényi random graph model: between any two vertices independently, place an edge with probability p. For example G_{n, p=5/n}; the p doesn't have to be a constant. One can also think of a process of putting edges into the graph at random, one by one.
3SAT: at first, a short list of clauses is satisfiable, but as the formula becomes larger it eventually becomes unsatisfiable. Where is that transition point? The conjecture is around 4.3 · n clauses; there is a proof for 3.5 · n, and there is a proof that this is a sharp transition. For refutation the condition is exactly opposite: the longer the formula, the easier it is to find a witness for "no".

--- end of lesson

Adjacency matrix: A_{i,j} = 1 ⇔ (i,j) ∈ E.
For regular graphs, there is no essential difference between the adjacency matrix and the Laplacian. For irregular graphs there are differences, and we will not get into them.
Properties: A is non-negative and symmetric, and the graph is connected ⇔ A is irreducible. Irreducible means we cannot permute it into a block matrix whose upper-right block is all zeroes.
λ1 ≥ λ2 ≥ … ≥ λn are all real; we might have multiplicities, so we also allow equality. v1, …, vn are the corresponding eigenvectors. If we take two distinct eigenvalues λi ≠ λj, then vi ⊥ vj. ℝ^n has an orthogonal basis of eigenvectors of A.
The eigenvector that corresponds to the largest eigenvalue is positive (all its entries are positive), λ1 > λ2, and λ1 ≥ |λn|. (The last two properties are only true if the graph is connected; this is the Perron–Frobenius theorem.)
x is an eigenvector with eigenvalue λ if Ax = λx. In graph terms, think of the entries of x as values assigned to the vertices; x is an eigenvector iff for every vertex, λ times its value equals the sum of the values of its neighbors.
TODO: Draw graph.
trace(A) = 0 and Σ λi = trace(A). Since λ1 > 0 ⇒ λn < 0.

Bipartite Graphs
Let x be an eigenvector with non-zero eigenvalue λ.
TODO: Draw bipartite graph
By flipping the sign of x on one side of the bipartite graph, we get a new eigenvector with eigenvalue −λ. For connected graphs, this symmetry of the spectrum is a property of bipartite graphs only.
Consider a vector x and observe
x^T A x / x^T x.
These are called Rayleigh quotients. If x is an eigenvector with eigenvalue λ, then
x^T A x / x^T x = λ · x^T x / x^T x = λ.
Let v1, …, vn be an orthonormal eigenvector basis of ℝ^n. Then x = c1 v1 + … + cn vn, where ci = ⟨x, vi⟩, and
A (Σ ci vi) = Σ ci λi vi,  so  x^T A x / x^T x = (Σ ci² λi) / (Σ ci²).
This is like a weighted average where every λi gets the weight ci². So this expression cannot be larger than λ1 and cannot be smaller than λn.
Suppose we know that x ⊥ v1? Then the result is a weighted average of all eigenvalues except λ1.
Another way of getting the same thing: x^T A x = Σ_{i,j} a_ij x_i x_j (the sum of the element-wise multiplication of the matrix A and the matrix x · x^T).
Large max cut ⇒ λn is close to −λ1. This can be shown using Rayleigh quotients. But the interesting thing is that the opposite direction is not true!
We looked at the relation between λ1 and λn. Now let's look at the relation between λ1 and λ2. If a graph is d-regular, the all-ones vector is an eigenvector with eigenvalue d, and λ1 = d.
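The Rayleigh-quotient bounds can be observed numerically on the 4-cycle, whose eigenvalues are 2, 0, 0, −2 (a sketch; `rayleigh` is my name for it):

```python
import random

def rayleigh(A, x):
    n = len(A)
    num = sum(A[i][j] * x[i] * x[j] for i in range(n) for j in range(n))
    den = sum(xi * xi for xi in x)
    return num / den

# 4-cycle: 2-regular and bipartite, eigenvalues 2, 0, 0, -2.
C4 = [[0, 1, 0, 1], [1, 0, 1, 0], [0, 1, 0, 1], [1, 0, 1, 0]]
print(rayleigh(C4, [1, 1, 1, 1]))  # 2.0: the all-ones vector attains d
for _ in range(100):
    x = [random.uniform(-1, 1) for _ in range(4)]
    # every quotient lies between the smallest and largest eigenvalue
    assert -2 - 1e-9 <= rayleigh(C4, x) <= 2 + 1e-9
```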
If we have a disconnected d-regular graph, λ1 and λ2 are equal: the all-ones vector restricted to each connected component (zero elsewhere) is an eigenvector with eigenvalue d, so d has multiplicity at least 2. Conversely, if they are equal, the graph is disconnected! What happens if they are only close to being equal? If λ2 is close to λ1, then G is close to being disconnected!
Suppose we have a graph.
TODO: Draw the graph we are supposed to have.
We perform a random walk on the graph. What is the mixing time? If a token starts at a certain vertex, what is the probability that it is at each of the other vertices after t steps? If the distribution quickly becomes uniform, then the mixing is good. Small cuts are obstacles to fast mixing.
If we start on the first vertex, we start with the (probability) vector [1 0 … 0] = Σ ci vi. After t steps of the random walk on a d-regular graph, the distribution is
(A/d)^t (Σ ci vi) = Σ ci (λi/d)^t vi.
If λ1 = d ≫ |λi| for all i ≠ 1, then all terms except the first decay quickly and we get (close to) the uniform distribution over all vertices.

λ1 for non-regular graphs
Suppose the graph has maximum degree Δ and average degree d̄. We know λ1 ≤ Δ, but this is also always true: λ1 ≥ d̄. Take x^T = (1, 1, …, 1) and consider the Rayleigh quotient:
x^T A x / x^T x = (sum over all entries of A) / n = 2m/n = average degree.
The c-core of a graph is the largest subgraph in which every vertex has degree ≥ c. Let c* be the largest c for which the graph G has a c-core. For d-regular graphs c* = d.
Claim: λ1 ≤ c*. Let v be the eigenvector associated with λ1. Sort the vertices by their value in v, so the largest value comes first and the smallest last; all the values are positive (by the Perron–Frobenius theorem). The graph cannot have a core of degree larger than λ1 (a vertex cannot be connected to more than λ1 neighbors above it, since then λ1 times its value could not equal the sum of the values of its neighbors).
If we look at a star graph: Δ = n − 1, the average degree is a bit less than 2, and the largest eigenvalue is λ1 = √(n − 1). We can give the center the value √(n − 1) and the rest the value 1.
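The star-graph eigenvalue λ1 = √(n − 1) can be estimated with power iteration (a sketch; the norm-ratio estimate converges to the spectral radius, which for the star equals λ1 even though the graph is bipartite):

```python
import math, random

def power_iteration(A, steps=200):
    """Estimate the spectral radius by repeated multiplication by A,
    tracking the ratio ||A x|| / ||x||."""
    n = len(A)
    x = [random.uniform(0.5, 1.0) for _ in range(n)]
    lam = 0.0
    for _ in range(steps):
        y = [sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
        lam = math.sqrt(sum(v * v for v in y)) / math.sqrt(sum(v * v for v in x))
        norm = math.sqrt(sum(v * v for v in y))
        x = [v / norm for v in y]
    return lam

# Star on 10 vertices, vertex 0 is the center: lambda_1 = sqrt(9) = 3.
n = 10
A = [[0] * n for _ in range(n)]
for i in range(1, n):
    A[0][i] = A[i][0] = 1
print(round(power_iteration(A), 6))  # ~3.0
```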
Sometimes, when we have a graph that is nearly regular but has a small number of vertices of high degree, we should remove the vertices of high degree, since they skew the spectrum!
Let α(G) denote the size of the maximum independent set in the graph, and assume G is d-regular. Then (Hoffman's bound):
α(G) ≤ n · (−λn) / (d − λn).
If G is bipartite, then λn = −λ1 = −d, and then α(G) = n · d / 2d = n/2.
Proof: λn ≤ x^T A x / x^T x for every x. Let S be the largest independent set in G, with |S| = s.
TODO: Draw x
Give the vertices of S the value n − s and all other vertices the value −s. Then
x^T x = s(n − s)² + (n − s)s² = s(n − s) · ((n − s) + s) = s(n − s) · n.
Since S spans no edges, the sd edges between S and the rest contribute 2 · sd · (n − s)(−s), and the remaining nd/2 − sd edges contribute 2 · (nd/2 − sd) · s², so
x^T A x = −2s²d(n − s) + (nd − 2sd)s² = −nds².
Therefore
λn ≤ x^T A x / x^T x = −nds² / (s(n − s)n) = −ds / (n − s),
so λn(n − s) ≤ −ds, i.e. s(d − λn) ≤ −λn · n, giving
s ≤ n · (−λn) / (d − λn).
If |λn| ≪ d we get a very good bound; otherwise we don't.
Σ λi = 0. We can also look at the trace of A², which equals Σ (λi)². On the other hand, it also equals Σ di, the sum of the degrees, which for regular graphs is n · d. So it follows that the average square value is Avg(λi²) = d, hence Avg|λi| ≤ √d. Recall that λ1 = d!
It turns out that in random graphs (regular or nearly regular), with high probability |λi| = O(√d) for every i ≠ 1. In almost all graphs, |λ1| ≫ |λn|.
Therefore, when −λn ≤ 2√d:
s ≤ n · (−λn) / (d − λn) ≤ n · 2√d / (d + 2√d) ≤ 2n / √d.
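Hoffman's bound is tight for the Petersen graph (3-regular, n = 10, smallest eigenvalue −2; I am assuming that known eigenvalue rather than computing it): the bound gives 10 · 2 / 5 = 4, and α is indeed 4. A brute-force sketch:

```python
from itertools import combinations

# Petersen graph: outer 5-cycle, inner 5-cycle with "skip-2" chords, spokes.
edges = ([(i, (i + 1) % 5) for i in range(5)] +           # outer cycle
         [(5 + i, 5 + (i + 2) % 5) for i in range(5)] +   # inner pentagram
         [(i, i + 5) for i in range(5)])                  # spokes
n, d = 10, 3

def alpha(n, edges):
    # brute-force maximum independent set
    for k in range(n, 0, -1):
        for S in combinations(range(n), k):
            S = set(S)
            if all(not (u in S and v in S) for u, v in edges):
                return k
    return 0

lam_n = -2  # smallest adjacency eigenvalue of the Petersen graph (assumed)
bound = n * (-lam_n) / (d - lam_n)
print(alpha(n, edges), bound)  # 4 4.0
```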