Trees and Cut-sets

- Trees
- Rooted Trees
- Path Lengths in Rooted Trees
- Prefix Codes
- Binary Search Trees
- Spanning Trees and Cut-sets
- Minimum Spanning Trees
- Transport Networks

Basic Definitions and Properties

tree: A connected (undirected) graph that contains no simple cycle.
forest: A collection of disjoint trees.
terminal node (leaf): A vertex of degree 1 in a tree.

Properties of trees:
- There is a unique path between every two vertices in a tree.
- The number of vertices is one more than the number of edges in a tree.
- A tree with two or more vertices has at least two leaves.

Equivalent definitions of trees:
- A graph in which there is a unique path between every pair of vertices is a tree.
- A connected graph with e = v - 1 is a tree.
- A graph with e = v - 1 that has no circuit (cycle) is a tree.

Rooted Trees

directed tree: A directed graph is said to be a directed tree if it becomes a tree when the directions of the edges are ignored.
rooted tree: A directed tree is called a rooted tree if there is exactly one vertex whose incoming degree is 0 and the incoming degrees of all other vertices are 1.
internal node (branch node): A vertex of degree larger than 1 in a tree.
son node, father node, and brother node
descendant c of node a: There is a directed path from a to c. Also, a is said to be an ancestor of c.
subtree with a as a root: The subgraph T' = (V', E') of T such that V' contains a and all of its descendants, and E' contains the edges in all directed paths emanating from a.
ordered tree: A rooted tree with the edges incident from each branch node labeled with integers 1, 2, ..., i, ...
tree isomorphism
m-ary tree: An ordered tree in which every branch node has at most m sons.
regular tree: An m-ary tree in which every branch node has exactly m sons.
binary tree, left subtree, and right subtree

Path Lengths in Rooted Trees

path length of a vertex: The number of edges in the path from the root to the vertex.
height of a tree: The maximum of the path lengths in the tree.
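The two definitions above can be checked on a small example. The following is a minimal sketch, assuming the rooted tree is stored as a dictionary that maps each branch node to the list of its sons; the names `children`, `path_lengths`, and `height` are illustrative and do not come from the slides.

```python
from collections import deque


def path_lengths(children, root):
    """Return {vertex: number of edges on the path from the root to that vertex}."""
    lengths = {root: 0}
    queue = deque([root])
    while queue:
        v = queue.popleft()
        # Every son lies one edge farther from the root than its father.
        for son in children.get(v, []):
            lengths[son] = lengths[v] + 1
            queue.append(son)
    return lengths


def height(children, root):
    """Height of the tree = maximum path length over all vertices."""
    return max(path_lengths(children, root).values())


# Hypothetical rooted tree: a is the root; c, d, and f are leaves.
children = {"a": ["b", "c"], "b": ["d", "e"], "e": ["f"]}
print(path_lengths(children, "a"))
print(height(children, "a"))
```

In this example the deepest leaf f is three edges away from the root, so the height is 3.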
Example (single-elimination tennis tournament): The number of games played is one less than the number of players in the tournament.

In a regular binary tree, let I denote the sum of the path lengths of all the branch nodes and E denote the sum of the path lengths of all the leaves. Then
E = I + (number of edges in the binary tree) = I + 2i,
where i is the number of branch nodes.

Prefix Codes

entropy code: One might wish to represent more frequently used letters with shorter sequences and less frequently used letters with longer sequences, so that the overall length of the string is reduced.
How can the receiving end unambiguously divide a long string of 0s and 1s into the sequences that represent letters? By using a prefix code.
prefix code: A set of sequences is said to be a prefix code if no sequence in the set is a prefix of another sequence in the set.
binary tree for the weights $w_1, w_2, \ldots, w_t$: Given a set of weights $w_1 \le w_2 \le \cdots \le w_t$, a binary tree that has t leaves with the weights $w_1, w_2, \ldots, w_t$ assigned to the leaves.
weight of a binary tree T for the weights $w_1, w_2, \ldots, w_t$: $W(T) = \sum_{i=1}^{t} w_i \, l(w_i)$, where $l(w_i)$ is the path length of the leaf to which the weight $w_i$ is assigned.
optimal tree: A binary tree T for the weights $w_1, w_2, \ldots, w_t$ is said to be an optimal tree if W(T) is minimum.

Huffman code algorithm: A procedure for constructing an optimal tree for a given set of weights.

Observation: We can obtain an optimal tree T for the weights $w_1, w_2, \ldots, w_t$ from an optimal tree T' for the weights $w_1 + w_2, w_3, \ldots, w_t$.

Proof. First, there is an optimal tree for the weights $w_1, w_2, \ldots, w_t$ in which the leaves to which $w_1$ and $w_2$ are assigned are brothers. Let a be a branch node of largest path length, and let $w_x$ and $w_y$ be the weights assigned to the sons of a. We may take $l(w_1) = l(w_2) = l(w_x) = l(w_y)$. [If $l(w_1) < l(w_x)$ and $w_1 < w_x$, exchanging $w_1$ and $w_x$ would lower the weight of the tree, a contradiction; if $w_1 = w_x$, the exchange leaves the weight unchanged.]
Let $\hat{T}$ denote an optimal tree for $w_1, w_2, \ldots, w_t$ in which the leaves to which $w_1$ and $w_2$ are assigned are brothers. Let $\hat{T}'$ denote the tree obtained from $\hat{T}$ by coalescing these two leaves into a single leaf of weight $w_1 + w_2$. Then
$W(\hat{T}) = W(\hat{T}') + w_1 + w_2$.
Let T' be an optimal tree for $w_1 + w_2, w_3, \ldots, w_t$. Let T be the tree obtained from T' by replacing the leaf to which $w_1 + w_2$ is assigned by a subtree with the two leaves $w_1$ and $w_2$. Then
$W(T) = W(T') + w_1 + w_2$.
If $W(T) > W(\hat{T})$, then $W(T') > W(\hat{T}')$, which contradicts the assumption that T' is optimal. Q.E.D.

The problem of constructing an optimal tree for t weights can therefore be reduced to that of constructing one for t - 1 weights.

Binary Search Trees

search tree for the keys $K_1, K_2, \ldots, K_n$: A binary tree with n branch nodes and n + 1 leaves. The branch nodes are labeled $K_1, K_2, \ldots, K_n$ and the leaves are labeled $K_0, K_1, K_2, \ldots, K_n$, such that, for the branch node with the label $K_i$, its left subtree contains only vertices with labels $K_j$, $j < i$, and its right subtree contains only vertices with labels $K_j$, $j \ge i$.

The maximum number of comparisons the search procedure carries out in the worst case = the height of the corresponding search tree.

For a given set of n keys, a search tree whose height is $\lceil \log_2 (n+1) \rceil$ corresponds to the best possible search procedure. Since there always exists a binary tree of height $\lceil \log_2 (n+1) \rceil$ with n + 1 leaves for any n, the problem is solved.

Search frequencies

key          K1    K2    ...   Kn
frequency    u1    u2    ...   un

key          < K1    > K1, < K2    ...   > Kn
frequency    w0      w1            ...   wn

The total number of comparisons will equal
$\sum_{j=1}^{n} u_j \, [\, l(K_j) + 1 \,] + \sum_{j=0}^{n} w_j \, l'(K_j)$,
where $l(K_j)$ is the path length of the branch node that is labeled $K_j$ and $l'(K_j)$ is the path length of the leaf that is labeled $K_j$ in the search tree.
D. Knuth gave an algorithm that constructs an optimal search tree for the above problem.

Spanning Trees and Cut-sets

spanning tree of a connected graph: A spanning subgraph of the graph that is a tree.
branch (or tree edge): An edge of the graph that is in the tree.
chord or link (or non-tree edge): An edge of the graph that is not in the tree.
complement of a tree: The set of the chords of the tree.
A connected graph always contains a spanning tree.
cut-set: A (minimal) set of edges in a graph such that the removal of the set increases the number of connected components in the remaining subgraph, whereas the removal of any proper subset of it does not.

The concepts of spanning trees, circuits, and cut-sets are closely related.

fundamental system of circuits: For a given spanning tree, adding any one chord to the spanning tree yields a unique circuit. The set of the e - v + 1 circuits obtained in this way, one per chord, is called the fundamental system of circuits relative to the spanning tree. A circuit in the fundamental system is called a fundamental circuit. Since a fundamental circuit contains exactly one chord of the spanning tree, it is referred to as the fundamental circuit corresponding to that chord.

fundamental system of cut-sets: For every branch in a spanning tree there is a corresponding cut-set. For a given spanning tree, the set of the v - 1 cut-sets corresponding to the v - 1 branches of the spanning tree is called the fundamental system of cut-sets relative to the spanning tree. A cut-set in the fundamental system of cut-sets is called a fundamental cut-set. Since a fundamental cut-set contains exactly one tree branch, it is referred to as the fundamental cut-set corresponding to that branch.

Theorem 6.1 A circuit and the complement of any spanning tree must have at least one edge in common.
Proof: If there were a circuit that had no edge in common with the complement of a spanning tree, the circuit would be contained in the spanning tree. This is impossible, as a tree cannot contain a circuit. Q.E.D.

Theorem 6.2 A cut-set and any spanning tree must have at least one edge in common.
Proof: If there were a cut-set that had no edge in common with a spanning tree, the removal of the cut-set would leave the spanning tree intact, so the graph would remain connected. This contradicts the definition of a cut-set. Q.E.D.

Theorem 6.3 Every circuit has an even number of edges in common with every cut-set.
Proof: Corresponding to a cut-set, there is a partition of the vertices of the graph into two subsets. Therefore, a path connecting two vertices in the same subset must traverse the edges in the cut-set an even number of times. Since a circuit is a path from some vertex to itself, it has an even number of edges in common with the cut-set. Q.E.D.

Theorem 6.4 For a given spanning tree, let $D = \{e_1, e_2, e_3, \ldots, e_k\}$ be a fundamental cut-set in which $e_1$ is a branch and $e_2, e_3, \ldots, e_k$ are chords of the spanning tree. Then $e_1$ is contained in the fundamental circuits corresponding to $e_i$ for $i = 2, 3, \ldots, k$. Moreover, $e_1$ is not contained in any other fundamental circuit.
Proof: Let C be the fundamental circuit corresponding to the chord $e_2$. Note that $e_2$ is in both C and D. Since C and D have an even number of edges in common, and $e_1$ is the only other edge that can possibly be in both C and D (every other edge of C is a branch, and every other edge of D is a chord), $e_1$ must be contained in C. A similar argument applies to the fundamental circuits corresponding to the chords $e_3, e_4, \ldots, e_k$. On the other hand, let C' be the fundamental circuit corresponding to any chord not in D. C' cannot contain $e_1$, because otherwise C' and D would have $e_1$ as their only common edge, which is an odd number of common edges. Q.E.D.

Theorem 6.5 For a given spanning tree, let $C = \{e_1, e_2, e_3, \ldots, e_k\}$ be a fundamental circuit in which $e_1$ is a chord and $e_2, e_3, \ldots, e_k$ are branches of the spanning tree. Then $e_1$ is contained in the fundamental cut-sets corresponding to $e_i$ for $i = 2, 3, \ldots, k$. Moreover, $e_1$ is not contained in any other fundamental cut-set.

Minimum Spanning Trees

weight of a spanning tree: The sum of the weights of the branches of the tree.
minimum spanning tree: A minimum spanning tree (MST) is a spanning tree with minimum weight.
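To make the two definitions above concrete, the sketch below finds a minimum spanning tree of a tiny weighted graph directly from the definition: it tries every set of v - 1 edges, keeps those that form a spanning tree, and returns the lightest one. This is an illustration only, not an efficient algorithm (the number of candidate edge sets grows exponentially). The function names and the example graph are hypothetical.

```python
from itertools import combinations


def is_spanning_tree(vertices, edges):
    """True if the given v - 1 edges contain no circuit (and hence span all vertices)."""
    parent = {v: v for v in vertices}

    def find(v):
        # Follow parent pointers to the representative of v's component.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for u, v, _ in edges:
        ru, rv = find(u), find(v)
        if ru == rv:
            return False  # this edge would close a circuit
        parent[ru] = rv
    return len(edges) == len(vertices) - 1


def brute_force_mst(vertices, weighted_edges):
    """Return (weight, edges) of a lightest spanning tree, or None if none exists."""
    best = None
    for subset in combinations(weighted_edges, len(vertices) - 1):
        if is_spanning_tree(vertices, subset):
            weight = sum(w for _, _, w in subset)  # weight = sum of branch weights
            if best is None or weight < best[0]:
                best = (weight, subset)
    return best


# Hypothetical weighted graph: edges are (endpoint, endpoint, weight).
vertices = ["a", "b", "c", "d"]
weighted_edges = [("a", "b", 1), ("b", "c", 2), ("c", "d", 3), ("a", "d", 4), ("a", "c", 5)]
mst_weight, mst_edges = brute_force_mst(vertices, weighted_edges)
print(mst_weight, mst_edges)
```

Here the three lightest edges happen to form a spanning tree, so the minimum weight is 1 + 2 + 3 = 6.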
Kruskal's algorithm

Construct a subgraph of the weighted graph step by step, examining the edges one at a time in increasing order of weight. An edge is added to the partially constructed subgraph if its inclusion does not yield a circuit, and is discarded otherwise. The construction terminates when all the edges have been examined. For a connected graph, the constructed subgraph is connected and contains no circuit, so it is a spanning tree, and it has minimum weight.

Prim's algorithm

Among all edges incident with a vertex, the edge with the smallest weight must be in an MST. [Notice that we assume that the weights of the edges are distinct here.]
- Let {v1, v2} be the smallest-weight edge incident with v1, and suppose T is a spanning tree not containing {v1, v2}.
- Let U = T ∪ {v1, v2}. The fundamental circuit corresponding to the chord {v1, v2} consists of {v1, v2} together with the tree path v1, vx, ..., v2.
- Removing the edge {v1, vx} from U yields a spanning tree whose weight is smaller than that of T.

Let G' be the graph obtained from G by coalescing the vertices v1 and v2 with which the edge e = {v1, v2} is incident in G, and let T' be an MST of G'. Let T denote the subgraph of G consisting of all the edges in T' together with the edge e. Then T is an MST of G.
Proof: We have W(T) = W(T') + w(e). If T were not an MST of G, there would be an MST $\hat{T}$ of G, which contains e, such that $W(\hat{T}) < W(T)$. Let $\hat{T}'$ be the tree obtained from $\hat{T}$ by coalescing v1 and v2 and removing e. Then $\hat{T}'$ is a spanning tree of G' with $W(\hat{T}') < W(T')$, which contradicts the assumption that T' is an MST of G'. Q.E.D.

Let {v1, v2} be the edge of the smallest weight. Since the edge {v1, v2} must be included in an MST, we can coalesce v1 and v2 to obtain G', and then determine an MST of G'. The step is repeated until G' becomes a single vertex.

Transport Networks

A weighted directed graph is said to be a transport network if the following conditions are satisfied:
1. It is connected and contains no loops.
2. There is one and only one vertex in the graph that has no incoming edge.
3. There is one and only one vertex in the graph that has no outgoing edge.
4. The weight of each edge is a nonnegative real number.

source: The vertex that has no incoming edge.
sink: The vertex that has no outgoing edge.
capacity: The weight of an edge.
flow $\phi$: An assignment of a nonnegative number $\phi(i, j)$ to each edge (i, j) such that
- $\phi(i, j) \le w(i, j)$ for each edge (i, j). [The amount of material to be shipped through a route cannot exceed the capacity.]
- $\sum_{\text{all } i} \phi(i, j) = \sum_{\text{all } k} \phi(j, k)$ for each vertex j except the source a and the sink z. [The amount of material flowing into a vertex must equal the amount of material flowing out of the vertex.]
value of the flow: $\phi_v = \sum_{\text{all } i} \phi(a, i) = \sum_{\text{all } k} \phi(k, z)$
saturated edge (i, j): $\phi(i, j) = w(i, j)$
unsaturated edge (i, j): $\phi(i, j) < w(i, j)$
maximum flow: A flow that achieves the largest possible value.

cut: A cut-set of the undirected graph, obtained from the transport network by ignoring the direction of the edges, that separates the source from the sink. The notation $(P, \bar{P})$ denotes a cut that divides the vertices into two subsets $P$ and $\bar{P}$, where $P$ contains the source and $\bar{P}$ contains the sink.
capacity of a cut, $w(P, \bar{P})$: The sum of the capacities of those edges incident from the vertices in $P$ to the vertices in $\bar{P}$; that is,
$w(P, \bar{P}) = \sum_{i \in P,\ j \in \bar{P}} w(i, j)$.

Theorem 6.6 The value of any flow in a given transport network is less than or equal to the capacity of any cut in the network.
Proof: Let $\phi$ be a flow and $(P, \bar{P})$ be a cut. For the source a,
$\sum_{\text{all } i} \phi(a, i) - \sum_{\text{all } j} \phi(j, a) = \sum_{\text{all } i} \phi(a, i) = \phi_v$.
For a vertex $p \ne a$ in $P$,
$\sum_{\text{all } i} \phi(p, i) - \sum_{\text{all } j} \phi(j, p) = 0$.
Summing over all $p \in P$, we have
$\phi_v = \sum_{p \in P} \left[ \sum_{\text{all } i} \phi(p, i) - \sum_{\text{all } j} \phi(j, p) \right] = \sum_{p \in P;\ \text{all } i} \phi(p, i) - \sum_{p \in P;\ \text{all } j} \phi(j, p)$
$= \sum_{p \in P;\ i \in P} \phi(p, i) + \sum_{p \in P;\ i \in \bar{P}} \phi(p, i) - \sum_{p \in P;\ j \in P} \phi(j, p) - \sum_{p \in P;\ j \in \bar{P}} \phi(j, p)$    (*)
Note that
$\sum_{p \in P;\ i \in P} \phi(p, i) = \sum_{p \in P;\ j \in P} \phi(j, p)$.
Thus, (*) becomes
$\phi_v = \sum_{p \in P;\ i \in \bar{P}} \phi(p, i) - \sum_{p \in P;\ j \in \bar{P}} \phi(j, p)$    (**)
But, since $\sum_{p \in P;\ j \in \bar{P}} \phi(j, p) \ge 0$, we have
$\phi_v \le \sum_{p \in P;\ i \in \bar{P}} \phi(p, i) \le \sum_{p \in P;\ i \in \bar{P}} w(p, i) = w(P, \bar{P})$. Q.E.D.

Eq. (**): For any cut $(P, \bar{P})$, the value of a flow equals the sum of the flows in the edges from the vertices in $P$ to the vertices in $\bar{P}$ minus the sum of the flows in the edges from the vertices in $\bar{P}$ to the vertices in $P$.

Constructing a Maximum Flow in a Transport Network

Iterated improvement: By Theorem 6.6, whenever we can construct a flow whose value is equal to the capacity of some cut, we can be certain that it is a maximum flow.

Labeling procedure:
1. The source a is labeled $(-, \infty)$. This means that (out of nowhere) the source can supply an infinite amount of material to the other vertices.
2. A vertex b that is adjacent from a is labeled $(a^+, \Delta_b)$, where $\Delta_b = w(a, b) - \phi(a, b)$, if $w(a, b) > \phi(a, b)$; it is not labeled if $w(a, b) = \phi(a, b)$.
3. [forward] Those vertices that are adjacent to or from the labeled vertices are scanned. A vertex q that is adjacent from a labeled vertex b is labeled $(b^+, \Delta_q)$, where $\Delta_q = \min\{\Delta_b, w(b, q) - \phi(b, q)\}$, if $w(b, q) > \phi(b, q)$. It is not labeled if $w(b, q) = \phi(b, q)$. The flow from b into q can be increased by $\Delta_q$.
4. [backward] A vertex q that is adjacent to a labeled vertex b is labeled $(b^-, \Delta_q)$, where $\Delta_q = \min\{\Delta_b, \phi(q, b)\}$, if $\phi(q, b) > 0$. It is not labeled if $\phi(q, b) = 0$. The flow from q to b can be decreased by $\Delta_q$.
5. The vertex q might be adjacent to or from more than one labeled vertex.

If we repeat the procedure of labeling the vertices that are adjacent to or from the labeled vertices, one of the following two cases will arise:
1. Sink z is labeled, say, $(y^+, \Delta_z)$. Increase the flow in (y, z) by $\Delta_z$. Notice that y must be labeled either $(q^+, \Delta_y)$ or $(q^-, \Delta_y)$ with $\Delta_y \ge \Delta_z$ for some q. If y is labeled $(q^+, \Delta_y)$, increase the flow in (q, y) by $\Delta_z$. If y is labeled $(q^-, \Delta_y)$, decrease the flow in (y, q) by $\Delta_z$.
So the increment from y to z is compensated. Continuing in the same way back to the source adjusts the flow along the whole labeled path.
2. Sink z is not labeled. Denote the set of all the labeled vertices by $P$ and the set of all the unlabeled vertices by $\bar{P}$. The fact that sink z is not labeled means that the flow in each of the edges incident from the vertices in $P$ to the vertices in $\bar{P}$ is equal to the capacity of that edge, and that the flow in each of the edges incident from the vertices in $\bar{P}$ to the vertices in $P$ is equal to zero. We have thus obtained a flow whose value, by (**), is equal to the capacity of the cut $(P, \bar{P})$. By Theorem 6.6, it is a maximum flow.

Intersection Graphs

hereditary property
intersection graph: Let F be a family of nonempty sets. The intersection graph of F is obtained by representing each set in F by a vertex and connecting two vertices by an edge if and only if their corresponding sets intersect.
Marczewski, 1945: When F is allowed to be an arbitrary family of sets, the class of graphs obtained as intersection graphs is simply the class of all undirected graphs.
interval graph: The intersection graph of F when F is a family of intervals on a linearly ordered set.
unit interval graph
proper interval graph
Roberts, 1969: The classes of unit interval graphs and proper interval graphs coincide.
circular-arc graph
proper circular-arc graph: Not only is no arc properly contained in another, but also no pair of arcs together covers the entire circle.
permutation graph: A permutation diagram consists of n points on each of two parallel lines and n straight line segments matching the points. The intersection graph of the line segments is called a permutation graph.
triangulated graph property: Every simple cycle of length strictly greater than 3 possesses a chord.
triangulated graph (chordal graph): G is a triangulated graph if and only if G is the intersection graph of a family of subtrees of a tree.
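As a concrete instance of the intersection graph construction defined above, the following minimal sketch builds the interval graph of a family of closed intervals on the real line. The function name `interval_graph` and the sample intervals are illustrative assumptions, not part of the slides.

```python
from itertools import combinations


def interval_graph(intervals):
    """Build the intersection graph of closed intervals.

    intervals: {name: (left, right)} with left <= right.
    Returns the edge set as a set of frozensets {u, v}.
    """
    edges = set()
    for (u, (lu, ru)), (v, (lv, rv)) in combinations(intervals.items(), 2):
        # Two closed intervals intersect iff each starts before the other ends.
        if lu <= rv and lv <= ru:
            edges.add(frozenset((u, v)))
    return edges


# Hypothetical family F of intervals.
F = {"A": (1, 4), "B": (3, 6), "C": (5, 8), "D": (9, 10)}
edges = interval_graph(F)
print(sorted(tuple(sorted(e)) for e in edges))
```

In this example A meets B and B meets C, while A and C are disjoint and D meets nothing, so the resulting graph is the path A, B, C plus the isolated vertex D.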
transitive orientation property: Each edge can be assigned a one-way direction in such a way that the resulting oriented graph (V, F) satisfies the following condition: $ab \in F$ and $bc \in F$ imply $ac \in F$ (for all $a, b, c \in V$).
comparability graph: An undirected graph that has the transitive orientation property.

Three typical graphs

Property $C$: G is a comparability graph.
Property $\bar{C}$: The complement $\bar{G}$ is a comparability graph.
Property $T$: G is a triangulated graph.
Property $\bar{T}$: The complement $\bar{G}$ is a triangulated graph.

interval graph       $T + \bar{C}$
permutation graph    $C + \bar{C}$
split graph          $T + \bar{T}$
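The transitive orientation property can be tested by brute force on very small graphs. The sketch below checks one orientation against the condition "ab in F and bc in F imply ac in F", and then tries every orientation of an undirected edge list. The function names are hypothetical, and the search over all 2^e orientations is meant only as an illustration of the definition, not as a practical recognition algorithm.

```python
from itertools import product


def is_transitive(F):
    """F is a set of ordered pairs; check that ab, bc in F imply ac in F."""
    return all((a, c) in F for (a, b) in F for (b2, c) in F if b == b2)


def has_transitive_orientation(edges):
    """Try every way of directing the undirected edges (exponential in len(edges))."""
    edges = list(edges)
    for flips in product([False, True], repeat=len(edges)):
        F = {(v, u) if flip else (u, v) for (u, v), flip in zip(edges, flips)}
        if is_transitive(F):
            return True
    return False


# A cycle on 4 vertices can be oriented transitively; a cycle on 5 vertices cannot.
cycle4 = [(1, 2), (2, 3), (3, 4), (4, 1)]
cycle5 = [(1, 2), (2, 3), (3, 4), (4, 5), (5, 1)]
print(has_transitive_orientation(cycle4))
print(has_transitive_orientation(cycle5))
```

In any orientation of the 5-cycle, two consecutive edges must point the same way along the cycle, and the edge that transitivity would then require is missing, so the 5-cycle is not a comparability graph.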