Introduction to Trees

Motivation

When at a node in a singly linked list, there is no choice as to which node can be visited next: from a given node we can only go to its one successor.

In a doubly linked list we have a choice of 2 nodes: from a given node we can go to either its predecessor or its successor.

If decisions or multiple possibilities are inherent in an application, then more than one or two nodes should be able to ...

Data structures allowing this include trees and graphs.

[Figure: a tree and a graph.]

Trees can be regarded as falling between linear lists and graphs in the hierarchy of structural complexity.

Tree Definitions

There are many ways of defining a tree. We will look at 4 definitions, and as we do, we will define many other aspects ...

1) Definition in terms of Graphs:

A tree is a connected, undirected graph that contains no circuit. In other words, there is a unique path between any 2 nodes.

[Figure: a graph that is not a tree, because it contains a circuit.]

We use the adjective connected because a single graph can be considered as consisting of two disconnected parts.

[Figure: two disconnected parts considered as one graph.]

The adjective undirected allows the definition to be more general:

directed graph or digraph: restricted, since you can't go against an arrow
undirected graph: general, since you can go both ways ...

Because computer applications of trees require that we start at one designated node in the tree in order to reach the root ...

Using directed edges allows us to designate the node where we should start as the root node.

Definition: The root node has an indegree of 0.

Indegree ≡ the number of directed edges pointing at a node.
Outdegree ≡ the number of directed edges pointing away from a node.

Leaf nodes have an outdegree of 0.
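The indegree and outdegree definitions above can be checked mechanically. The following is a minimal sketch, assuming a directed tree stored as a list of (parent, child) edge pairs; the node labels and the helper name `degrees` are illustrative choices and do not come from the original notes.

```python
from collections import defaultdict


def degrees(edges):
    """Return (indegree, outdegree) dictionaries for a list of directed edges."""
    indeg = defaultdict(int)
    outdeg = defaultdict(int)
    for parent, child in edges:
        outdeg[parent] += 1  # the edge points away from the parent
        indeg[child] += 1    # the edge points at the child
        # Make sure both endpoints appear in both dictionaries.
        indeg[parent] += 0
        outdeg[child] += 0
    return dict(indeg), dict(outdeg)


# Hypothetical tree: node "A" has 4 children and node "B" has 3 children.
edges = [("A", "B"), ("A", "C"), ("A", "D"), ("A", "E"),
         ("B", "F"), ("B", "G"), ("B", "H")]

indeg, outdeg = degrees(edges)
root = [n for n in indeg if indeg[n] == 0]            # indegree 0
leaves = sorted(n for n in outdeg if outdeg[n] == 0)  # outdegree 0
print("root:", root)
print("leaves:", leaves)
```

In this sketch the root is found purely from the indegree rule and the leaves purely from the outdegree rule, so no node needs to be marked in advance.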
For example:

The root in the figure has indegree = 0 and outdegree = 4.
An internal node in the figure has indegree = 1 and outdegree = 3.

To impose some standardization so that we can visually pick out the root more easily, computer scientists requ...

[Figure: the same tree drawn bottom rooted (European), top rooted (American), and side rooted (compromise).]

When we root a directed tree, it is no longer necessary to draw the arrows on the directed edges. They are und...

(From now on, we will usually draw trees as ...)

2) Definition in terms of Number of Edges and Nodes:

A tree is a finite connected graph such that:

no. of nodes = no. of edges + 1

For example: 3 nodes and 2 edges; 5 nodes and 4 edges; 8 nodes and 7 edges.

3) Definition in terms of Connectivity:

A connected graph is a tree iff the removal of any edge would disconnect the graph.

[Figure: removing the marked edge, or any other edge, disconnects the graph.]

Definition: A collection of disconnected trees is called a forest.

[Figure: a forest of 5 trees.]

4) Recursive Definition (defining trees in terms of trees)

(This is the definition that computer scientists prefer.)

A tree is a finite set of nodes that is either empty (root = NULL), or consists of one node called the root node with the remaining nodes partitioned into disjoint sets, each of which is also a tree.

[Figure: a root whose remaining nodes form 3 disjoint partitions, numbered 1, 2 and 3.]

These 3 disjoint partitions are also trees, and with respect to the larger structure they are called subtrees.

This recursive definition for trees gives us some easily defined search algorithms, which we will study later.

Defining Parts of a Tree:

Many scientific and medical terms seem obscure to us because these disciplines originated millennia ago within ancie...

Many medical terms derive from ancient Greek:

stetho·scope: chest + to watch
ophthalmo·scope: eye + to watch
osteo·myel·itis: bone + marrow + inflammation

Even though the terminology came from common, everyday words, it is now foreign to us because of the language di...

More recent scientific disciplines kept Latin as their common language of discourse.
Fortunately for us, computer science did not make the mistake of obscuring its definitions in ancient Latin or Greek. It uses current, common everyday English words like mouse, bugs, chips, etc. (More obscure terms mainly come fro...)

However, there is a drawback. For some concepts within our discipline, there may be more than one common word, ...

Familial relations and their generic equivalents:

Familial                      Generic
ancestor, parent, father      predecessor
descendant, child, son        successor
brothers or siblings

Botanical terms and their generic equivalents:

Botanical       Generic
root            start node
branch          edge or arc
branch node     internal node
leaf node       external node

(Vertex is another name for node.)

Kinds of Trees, in terms of branching factor (outdegree)

Unary trees: outdegree of every node ≤ 1 (a singly linked list).
Binary trees: outdegree of every node ≤ 2.
Ternary trees: outdegree of every node ≤ 3.
...
m-ary trees: outdegree of every node ≤ m.

Note: The indegree of every node in a directed tree, except the root, is 1.

Mathematical Properties of Trees

Max no. of nodes per level:

Level    Binary tree    Ternary tree    ...    m-ary tree
0        1              1               ...    1
1        2              3               ...    m
2        4              9               ...    m^2
3        8              27              ...    m^3
...      ...            ...             ...    ...
h        2^h            3^h             ...    m^h

Total max no. of nodes in a tree of height h (for m ≥ 2):

$$\sum_{k=0}^{h} 2^k = 2^{h+1} - 1, \qquad \sum_{k=0}^{h} 3^k = \frac{3^{h+1} - 1}{2}, \qquad \sum_{k=0}^{h} m^k = \frac{m^{h+1} - 1}{m - 1}$$

These formulas may be proved by induction.

Definition: The path length of a node in a tree is the no. of edges from the root to that node.

[Figure: a tree with nodes at path lengths 0, 1, 2, 3 and 4; height (or depth) = 4.]

The height or depth of a tree is the maximum over all the path lengths of its nodes.

Definition: An m-ary tree is regular if every one of its internal nodes has exactly m sons.
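The definitions of height and regularity, and the closed form for the total number of nodes, can be tried out on small trees. This is only a sketch under a stated assumption: a node is represented by the Python list of its children, so an empty list is a leaf. That representation and the names `height`, `is_regular` and `max_nodes` are illustrative and are not taken from the notes.

```python
def height(tree):
    """Height of a tree: the longest path length from the root to any node.

    A node is the list of its children, so [] is a leaf with height 0.
    """
    if not tree:
        return 0
    return 1 + max(height(child) for child in tree)


def is_regular(tree, m):
    """True if every internal node has exactly m children."""
    if not tree:  # a leaf imposes no constraint
        return True
    return len(tree) == m and all(is_regular(child, m) for child in tree)


def max_nodes(m, h):
    """Closed form (m^(h+1) - 1) / (m - 1), valid for m >= 2."""
    return (m ** (h + 1) - 1) // (m - 1)


leaf = []
regular_binary = [[leaf, leaf], leaf]  # every internal node has 2 children
not_regular = [[leaf], leaf]           # one internal node has only 1 child

print(height(regular_binary))
print(is_regular(regular_binary, 2), is_regular(not_regular, 2))

# Compare the closed form with the level-by-level sum m^0 + m^1 + ... + m^h.
print(max_nodes(3, 2), sum(3 ** k for k in range(3)))
```

Both functions follow the recursive definition of a tree directly: each call handles one root and delegates to its subtrees. The last line is a numerical spot check of the closed form, not a substitute for the induction proof.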
For each of the following kinds of trees, state which are regular and which are not regular:

[Figure: example binary and ternary trees, each labelled regular or not regular.]

Question: For a regular binary tree of height h, what is the maximum no. of leaves it can have?
Answer: $2^h$

Question: What is the max no. of leaves for a regular m-ary tree of height h?
Answer: $m^h$

Such trees are called full trees.

[Figure: examples of full binary trees and non-full binary trees.]

Definition: The min no. of leaves that a regular m-ary tree of height h can have is:

$$1 + h(m - 1)$$

An example, for binary trees: m = 2, h = 4. So for a regular binary tree of height 4, the min no. of leaves is 1 + 4(2 − 1) = 1 + 4 = 5.

An example, for ternary trees: m = 3, h = 5. So for a regular ternary tree of height 5, the min no. of leaves is 1 + 5(3 − 1) = 1 + 10 = 11.

Definition: Complete Binary Tree. Such a tree of height h is a full tree up to level h − 1 and is filled in from left to right at level h.

[Figure: examples of complete binary trees and of trees that are not complete.]

More Mathematical Properties

Question: How is the height h related to the total number of nodes, n, in a:
1) full binary tree?
2) full m-ary tree?

Recall that the no. of nodes for a full binary tree of height h is $2^{h+1} - 1$. So, to get h in terms of n, just solve the following equation for h:

$$2^{h+1} - 1 = n$$
$$2^{h+1} = n + 1$$
$$h + 1 = \log_2(n + 1)$$
$$h = \log_2(n + 1) - 1$$

Similarly, since $\frac{m^{h+1} - 1}{m - 1} = n$ for full m-ary trees, the solution is easily determined to be:

$$h = \log_m[n(m - 1) + 1] - 1$$

Questions: Will these log functions always return integer values? Why, or why not?

Mathematical Properties

Question: How do we determine h in terms of n for complete binary trees?
Hint: Derive the formula by looking at the trees below, and filling in and observing the number pattern in the table:

[Figure: complete binary trees with 1 to 8 nodes.]

nodes n    height h
1          0
2          1
3          1
4          2
5          2
6          2
7          2
8          3
...        ...

Note that the height h increases by 1 at every power of 2, and is simply the logarithm to base 2 of that power of 2.

But the log2 of the intervening integers results in decimal numbers, not integers. How can we denote the truncation ...

For complete binary trees: $h = \lfloor \log_2 n \rfloor$

The pair of delimiters $\lfloor\ \rfloor$ is called the floor function, and is defined as:

$\lfloor x \rfloor$ ≡ the largest integer ≤ x, e.g. $\lfloor 2.135 \rfloor = 2$ and $\lfloor 2.975 \rfloor = 2$

The ceiling function $\lceil\ \rceil$ is the counterpart of the floor function, and is defined as:

$\lceil x \rceil$ ≡ the smallest integer ≥ x, e.g. $\lceil 2.135 \rceil = 3$ and $\lceil 2.975 \rceil = 3$

For complete m-ary trees: $h = \lceil \log_m[n(m - 1) + 1] \rceil - 1$

For m = 2 this reduces to $\lfloor \log_2 n \rfloor$. The simpler form $\lfloor \log_m n \rfloor$ does not carry over to m ≥ 3: a complete ternary tree with 5 nodes has height 2, but $\lfloor \log_3 5 \rfloor = 1$.
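The table and the floor formula for complete binary trees can be reproduced with a few lines of code. The sketch below assumes that a complete tree is filled level by level, with level k holding at most m^k nodes; the function name `complete_height` and its parameters are illustrative, not part of the notes. It counts levels with integer arithmetic rather than calling a logarithm, which sidesteps floating-point rounding for large n.

```python
import math


def complete_height(n, m=2):
    """Height of a complete m-ary tree with n >= 1 nodes.

    Levels 0, 1, 2, ... are filled in order; level k holds up to m**k nodes.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    h = 0
    capacity = 1    # number of nodes that fit in levels 0..h
    level_size = 1  # m**h
    while capacity < n:
        level_size *= m
        capacity += level_size
        h += 1
    return h


# Reproduce the n versus h table for complete binary trees.
heights = [complete_height(n) for n in range(1, 9)]
print(heights)  # [0, 1, 1, 2, 2, 2, 2, 3]

# For these small n the result agrees with floor(log2(n)).
print(all(complete_height(n) == math.floor(math.log2(n)) for n in range(1, 9)))  # True
```

For positive integers, `n.bit_length() - 1` gives the same value as the floor of log2(n) and is a common integer-only alternative in Python.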