Binary Trees (and Big “O” notation) CS-2303 System Programming Concepts (Slides include materials from The C Programming Language, 2nd edition, by Kernighan and Ritchie and from C: How to Program, 5th and 6th editions, by Deitel and Deitel) CS-2303, C-Term 2010 Binary Trees 1 Definitions • Linked List • A data structure in which each element is dynamically allocated and in which elements point to each other to define a linear relationship • Singly- or doubly-linked • Stack, queue, circular list • Tree • A data structure in which each element is dynamically allocated and in which each element has more than one potential successor • Defines a partial order CS-2303, C-Term 2010 Binary Trees 2 Binary Tree • A linked list but with two links per item struct treeItem { type payload; treeItem *left; treeItem *right; }; payload left payload left payload payload left left right payload right left payload left right right right CS-2303, C-Term 2010 Binary Trees 3 right payload left right Binary Tree (continued) • Binary tree needs a root struct treeItem { type payload; treeItem *left; treeItem *right; }; struct treeItem *root; • Binary trees often drawn with root at top! • Unlike ordinary trees in the forest • More like the root systems of a tree CS-2303, C-Term 2010 Binary Trees 4 Definitions (continued) See Deitel & Deitel, §12.7 K & R, §6.5 • Binary Tree • A tree in which each element has two potential successors • Subtree • The set of nodes that are successors to a specific node, either directly or indirectly • Root of a tree • The node of the tree that is not the successor to any other node, all other nodes are (directly or indirectly) successors to it CS-2303, C-Term 2010 Binary Trees 5 Binary Tree struct treeItem { type payload; treeItem *left; treeItem *right; payload left }; struct treeItem *root; right payload payload left left right right payload left payload right left payload left right right CS-2303, C-Term 2010 Binary Trees 6 payload left right Purpose of a Tree • (Potentially) a very large data structure • Capable of storing very many items • In an orderly way • Need to find items by value • I.e., need to search through the data structure to see if it contains an item with the value we want • Need to add new items • If value is not already in the tree, add a new item … • …so that it can be easily found in future • Why not use a linked list? CS-2303, C-Term 2010 Binary Trees 7 Searching and Adding to a Binary Tree • Look recursively down sequence of branches until either payload left – Desired node is found; or – Null branch is encountered • Replace with ptr to new item • Decide which branch to follow based on payload right payload payload left left right right payload left payload right left payload left right right CS-2303, C-Term 2010 Binary Trees 8 payload left right Example — Searching a Tree typedef struct _treeItem { char *word; // part of payload int count; // part of payload _treeItem *left, *right; } treeItem; treeItem *findItem(treeItem *p, char *w) { if (p == NULL) return NULL; // item not found int c = strcmp(w, p->word); if (c == 0) return p; else if (c < 0) return findItem(p->left, w); else return findItem(p->right, w); } CS-2303, C-Term 2010 Binary Trees 9 Example — Adding an Item treeItem *addItem(treeItem *p, char *w) { if (p == NULL){ p = malloc(sizeof(treeItem)); char *c = malloc(strlen(w)+1); Why do this? p->word = strcpy(c, w); p->count = 1; p->left = p->right = NULL; return p; }; int c = strcmp(w, p->word); if (c == 0) p->count++; else if (c < 0) p->left = addItem(p->left, w); else p->right = addItem(p->right, w); return p; } CS-2303, C-Term 2010 Binary Trees 10 Binary Tree • Question:– how many calls to addItem for a tree with 106 nodes? – Assume balanced – I.e., approx same number of nodes on each subtree payload left right payload payload left left right right payload left payload right left payload left right right CS-2303, C-Term 2010 Binary Trees 11 payload left right Answer • Approximately 20 calls to addItem • Note:– – 210 = 1024 103 – Therefore 106 220 – Therefore it takes approximately 20 two-way branches to cover 106 items! • How many comparisons would it take to search a linked list of 106 items? CS-2303, C-Term 2010 Binary Trees 12 Observation • Problems like this occur in real life all the time • Need to maintain a lot of data • Usually random • Need to search through it quickly • Need to add (or delete) items dynamically • Need to sort “on the fly” • I.e., as you are adding and/or deleting items CS-2303, C-Term 2010 Binary Trees 13 Questions? CS-2303, C-Term 2010 Binary Trees 14 Binary Trees (continued) • Binary tree does not need to be “balanced” • i.e., with approximate same # of nodes hanging from right or left • However, it often helps with performance • Multiply-branched trees • Like binary trees, but with more than two links per node CS-2303, C-Term 2010 Binary Trees 15 Binary Trees (continued) • Binary tree does not need to be “balanced” • i.e., with approximate same # of nodes hanging from right or left • However, it helps with performance • Time to reach a leaf node is O(log2 n), where n is number of nodes in tree • Multiply-branched trees • Like binary trees, but with more than two links per node CS-2303, C-Term 2010 Binary Trees 16 Order of Traversing Binary Trees • In-order • Traverse left sub-tree (in-order) • Visit node itself • Traverse right sub-tree (in-order) • Pre-order • Visit node first • Traverse left sub-tree • Traverse right sub-tree • Post-order • Traverse left sub-tree • Traverse right sub-tree • Visit node last CS-2303, C-Term 2010 Binary Trees 17 Question • Suppose we wish to print out the strings stored in the tree of the previous example in alphabetical order? • What traversal order of the tree should we use? CS-2303, C-Term 2010 Binary Trees 18 Another Example of Binary Tree x = (a.real*b.imag - b.real*a.imag) / sqrt(a.real*b.real – a.imag*b.imag) = x / sqrt * . a - * . real CS-2303, C-Term 2010 b . imag b Binary Trees … . real a 19 imag Question • What kind of traversal order is required for this expression? • In-order? • Pre-order? • Post-order? CS-2303, C-Term 2010 Binary Trees 20 Binary Trees in Compilers • Used to represent the structure of the compiled program Note: Deitel & Deitel, Ch 12 exercises, • Optimizations contain a series on building a compiler • • • • • Common sub-expression detection Code simplification Loop unrolling Parallelization Reductions in strength – e.g., substituting additions for multiplications, etc. • Many others CS-2303, C-Term 2010 Binary Trees 21 Questions? CS-2303, C-Term 2010 Binary Trees 22 “Big O” notation New Topic CS-2303, C-Term 2010 Binary Trees 23 Linked Lists Again • Linear data structure • Easy to grow and shrink • Easy to add and delete items • Time to search for an item – O(n) CS-2303, C-Term 2010 Binary Trees 24 Binary Trees Again • Non-linear data structure • Easy to grow and shrink • Easy to add and delete items • Time to search for an item – O(log n) CS-2303, C-Term 2010 Binary Trees 25 Definition — Big-O “Of the order of …” • A characterization of the number of operations in an algorithm in terms of a mathematical function of the number of data items involved • O(n) means that the number of operations to complete the algorithm is proportional to n • E.g., searching a list with n items requires, on average, n/2 comparisons with payloads CS-2303, C-Term 2010 Binary Trees 26 Big-O (continued) • • • • • O(n): proportional to n – i.e., linear O(n2): proportional to n2 – i.e., quadratic O(kn) – proportional to kn – i.e., exponential … O(log n) – proportional to log n – i.e., sublinear • O(n log n) • Worse than O(n), better than O(n2) • O(1) – independent of n; i.e., constant CS-2303, C-Term 2010 Binary Trees 27 Anecdote & Questions:– • In the design of electronic adders, what is the order of the carry-propagation? • What is the order of floating point divide? • What is the order of floating point square root? • What program have we studied in this course that is O(2n)? i.e., exponential? CS-2303, C-Term 2010 Binary Trees 28 Questions on Big-O? CS-2303, C-Term 2010 Binary Trees 29