Connecting with Computer Science, 2e
Chapter 8 Data Structures

Objectives
• In this chapter you will:
  – Learn what a data structure is and how it’s used
  – Learn about single-dimensional and multidimensional arrays and how they work
  – Learn what a pointer is and how it’s used in data structures
  – Learn that a linked list allows you to work with dynamic information
  – Understand that a stack is a linked list and how it’s used
  – Learn that a queue is another form of a linked list and how it’s used
  – Learn that a binary tree is a data structure that stores information in a hierarchical order
  – See an overview of several sorting routines

Why You Need to Know About…Data Structures
• Data structures:
  – Organize the data in a computer
    • Efficiently access and process data
  – All programs use some form of data structure
  – There are many occasions for using data structures

Data Structures
• Defined as a way of organizing data
• Types of data structures in memory
  – Arrays, lists, stacks, queues, trees
• File structures organize storage media data
• Computer memory is organized into cells
  – Memory cell has a memory address and content
  – Memory addresses are organized consecutively
  – Data structures hide memory implementation details

Arrays
• Set of contiguous memory cells
  – Used for storing the same type of data
• Simplest memory data structure
• Usefulness:
  – Storing similar kinds of information in memory
    • Sorted or left as entered
  – One array name for a number of similar items
Figure 8-2, Arrays make program logic easier to understand and use

How an Array Works
• Java example
  – int[] aGrades = new int[5];
    • “int[]” indicates array will hold integers
    • “aGrades” identifies the array name
    • “new” keyword specifies new array being created
    • “int[5]” reserves five memory locations
    • “=” sign assigns aGrades as “manager” of the array
    • “;” (semicolon) indicates end of statement reached
• Hungarian notation standard is used to name “aGrades”

Figure 8-3, Five contiguous memory cells managed by aGrades in a single-dimensional array

• Element: memory cell in an array
• Dimension: levels created to hold array elements
  – aGrades: single-dimensional array (row of mailboxes)
• Offset specifies distance between memory locations
  – Array’s first position: referenced as position 0
  – Next array position: referenced as position 1
    • Found by using starting memory location plus one offset
  – Third array position: referenced as position 2
    • Found by using starting memory location plus two offsets

Figure 8-5, Arrays start at position 0 and use an offset to know where the next element is located in memory

• Index (subscript)
  – Indicates memory cell to access in the array
    • Looks at element’s position
    • Placed between square brackets ( [ ] ) after array name
    • Integer placed in “[ ]” for access
  – Example: aGrades[0] = 50;
    • Position 0: first position
  – Array with positions 0 to 4
    • Five memory cells or addresses
• Upper bound: highest array position
• Lower bound: lowest array position
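The aGrades declaration and index access can be tried out in a short program. This is a minimal sketch: only aGrades[0] = 50 comes from the slides, while the other grade values and the names ArrayDemo, buildGrades, and total are made up for illustration.

```java
// Hypothetical demo class; only aGrades and the value 50 come from the slides.
class ArrayDemo {
    // Builds the five-element array and fills each position by index.
    static int[] buildGrades() {
        int[] aGrades = new int[5];   // reserves five int cells, positions 0 to 4
        aGrades[0] = 50;              // first position (lower bound)
        aGrades[1] = 65;              // assumed value
        aGrades[2] = 70;              // assumed value
        aGrades[3] = 85;              // assumed value
        aGrades[4] = 100;             // last position (upper bound = length - 1)
        return aGrades;
    }

    // Sums the elements by visiting each index from lower to upper bound.
    static int total(int[] grades) {
        int sum = 0;
        for (int i = 0; i < grades.length; i++) {
            sum += grades[i];
        }
        return sum;
    }

    public static void main(String[] args) {
        int[] aGrades = buildGrades();
        System.out.println("Upper bound: " + (aGrades.length - 1));
        System.out.println("Total: " + total(aGrades));
    }
}
```

Accessing aGrades[5] in this sketch would throw an ArrayIndexOutOfBoundsException, because the upper bound of a five-element array is position 4.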
Figure 8-6, The array with all elements stored

Multidimensional Arrays
• Multidimensional arrays
  – Consist of two or more single-dimensional arrays
  – Multiple rows stacked on top of each other
    • Apartment building mailboxes and tic-tac-toe boards
• Creating the tic-tac-toe board
  – char[][] aTicTacToe = new char[3][3];
• Assignment: aTicTacToe[1][1] = 'X';
  – Places X in the second row, second column
• Arrays beyond three dimensions are difficult to manage

Figure 8-7, A multidimensional array is like apartment mailboxes stacked on top of each other
Figure 8-8, Tic-tac-toe board
Figure 8-9, First row of the tic-tac-toe board
Figure 8-10, Second and third rows of the tic-tac-toe board
Figure 8-11, Storing a value in an array location
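The tic-tac-toe board can be sketched as a small program. The declaration and the aTicTacToe[1][1] assignment come from the slides; the class name TicTacToeDemo, the helper methods, and the use of '-' for an empty square are assumptions made for illustration.

```java
// Hypothetical demo class built around the aTicTacToe example from the slides.
class TicTacToeDemo {
    // Creates a 3x3 board; '-' marks an empty square (an assumption for display).
    static char[][] newBoard() {
        char[][] aTicTacToe = new char[3][3];
        for (int row = 0; row < 3; row++) {
            for (int col = 0; col < 3; col++) {
                aTicTacToe[row][col] = '-';
            }
        }
        return aTicTacToe;
    }

    // Renders the board as three lines of text, one line per row.
    static String render(char[][] board) {
        StringBuilder sb = new StringBuilder();
        for (char[] row : board) {
            sb.append(new String(row)).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        char[][] aTicTacToe = newBoard();
        aTicTacToe[1][1] = 'X';   // second row, second column (the center square)
        System.out.print(render(aTicTacToe));
    }
}
```

The first index selects the row and the second selects the column within that row, so each row behaves like its own single-dimensional array.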
Figure 8-12, Three-dimensional array

Uses of Arrays
• Advantages
  – Allow direct access to any memory cell by index, as well as sequential access
  – Retrieve and store data using the array name and element index; all elements share one data type
  – Easy to create
  – Useful for people who write computer programs
• Disadvantages
  – Require a lot of overhead for insertions
  – Searching for a value in an unsorted array requires sequential access

Lists
• Hold dynamic lists of data
  – Lists vary in size
  – Examples: class enrollment, cars being repaired, email in-boxes
• Appropriate whenever amount of data is unknown or can change
• Three basic list forms:
  – Linked lists
  – Queues
  – Stacks

Linked Lists
• Use noncontiguous memory locations to store data
  – Each element points to the next element in line
    • Does not have to be contiguous with previous element
  – Used when exact number of items is unknown
  – Store data noncontiguously
  – Maintain data and address of next linked cell
  – Examples: names of students visiting a professor, points scored in a video game, list of spammers
• Basic constructs for more advanced data structures
  – Queues and stacks: pointer based
• Pointer: memory variable containing the address of a memory cell as its data
• Illustration: linked list game
  – Students sit in a circle with piece of paper
  – Paper has box in the upper left corner and center
  – Upper left box indicates a student number
  – Center box divided into two parts
  – Students indicate favorite color in left part of center
  – Professor has a piece of paper with a number only

Figure 8-14, Structure of a linked list
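A linked list node that keeps its data plus the address of the next cell can be sketched in Java, where an object reference plays the role of the pointer. This is an illustrative sketch only: the class LinkedListDemo, its method names, and the color values are assumptions, not code from the chapter.

```java
// Hypothetical sketch of a singly linked list; all names and values are illustrative.
class LinkedListDemo {
    // One node: the data plus a reference to the next node (null at the end).
    static class Node {
        String data;
        Node next;
        Node(String data) { this.data = data; }
    }

    // Follows the next references from the head and collects the data in order.
    static String traverse(Node head) {
        StringBuilder sb = new StringBuilder();
        for (Node current = head; current != null; current = current.next) {
            if (sb.length() > 0) sb.append(" -> ");
            sb.append(current.data);
        }
        return sb.toString();
    }

    // Inserts a new node after the given node by realigning two references.
    static void insertAfter(Node node, String data) {
        Node added = new Node(data);
        added.next = node.next;
        node.next = added;
    }

    // Deletes the node after the given node; no other elements move.
    static void removeAfter(Node node) {
        if (node.next != null) {
            node.next = node.next.next;
        }
    }

    public static void main(String[] args) {
        Node head = new Node("red");
        insertAfter(head, "green");
        insertAfter(head, "blue");            // list is now red, blue, green
        System.out.println(traverse(head));
        removeAfter(head);                    // unlinks "blue"
        System.out.println(traverse(head));
    }
}
```

Neither insertAfter nor removeAfter copies or shifts existing elements; only the neighboring references change, which is the main contrast with inserting into an array.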
• Piece of paper represents a two-part node
  – Data (the first part, the color)
  – Pointer (the student ID number)
• Professor’s piece: head pointer with no data
• Last student: pointer’s value is NULL
• Inserting new elements
  – No resizing needed
  – Create new “piece of paper” with dual node structure
  – Realign pointers to accommodate new node (paper)

Figure 8-15, Inserting an element into a linked list

• Similar procedure for deleting items
  – Modify pointer of element preceding target item
  – Students deleted from list without moving elements
• Use dynamic memory allocation
  – More efficient than arrays for insertions and deletions
  – Memory cells need not be contiguous

Figure 8-16, Deleting an element from a linked list

Stacks
• List in which the next item to be removed is the item most recently stored
  – “Push” items onto the list to store new items
  – “Pop” items off the list to retrieve current items
• Examples
  – Restaurant spring-loaded plate holder or text editor
• Peeking
  – Looking at the stack’s top item without removing it
• LIFO (last in, first out) data structure
  – Last or most recent item pushed (put) onto the stack
    • Becomes first item popped (removed) from the stack
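Push, pop, and peek can be sketched on top of a linked list, where the top of the stack is simply the head node. This is one possible implementation under stated assumptions: the class name LinkedStack, the int payload, and the choice to throw an exception on an empty stack are illustrative, not taken from the chapter.

```java
// Hypothetical linked-list-backed stack; names and error handling are assumptions.
class LinkedStack {
    private static class Node {
        final int value;
        final Node next;
        Node(int value, Node next) { this.value = value; this.next = next; }
    }

    private Node top;   // null when the stack is empty

    boolean isEmpty() { return top == null; }

    // Push: the new node points at the old top and becomes the new top.
    void push(int value) { top = new Node(value, top); }

    // Pop: remove and return the most recently pushed value.
    int pop() {
        if (isEmpty()) throw new IllegalStateException("stack is empty");
        int value = top.value;
        top = top.next;
        return value;
    }

    // Peek: look at the top value without removing it.
    int peek() {
        if (isEmpty()) throw new IllegalStateException("stack is empty");
        return top.value;
    }

    public static void main(String[] args) {
        LinkedStack plates = new LinkedStack();
        plates.push(1);
        plates.push(2);
        plates.push(3);
        System.out.println(plates.peek()); // 3 is on top and stays there
        System.out.println(plates.pop());  // 3 comes off first (last in, first out)
        System.out.println(plates.pop());  // then 2
    }
}
```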
Figure 8-17, The stack concept

Uses of a Stack
• Processes lines of program source code
• Source code is logically organized into procedures
  – Keep track of procedure calls with a stack
  – Address of procedure is popped off the stack

Back to Pointers
• Stack pointer
  – Keeps track of where to remove or add an item in a data structure
• Check stack (empty or full) before applying pop or push operations
• Stacks
  – Memory locations organized into logical structures
    • Facilitates reading from them and writing to them

Figure 8-18, Stack pointer is decremented when the item is popped off

Queues
• Another type of linked list
  – Implements first in, first out (FIFO) storage system
  – Insertions made at the end
  – Deletions made at the beginning
  – Similar to a waiting line

Uses of a Queue
• Printer example
  – First item printed
    • Document waiting longest
  – Current item deleted from queue
    • Next item printed
  – New documents
    • Placed at the end of the queue
• Insertions of new data occur at the rear of the queue
• Removal of data occurs at the front of the queue

Pointers Again
• Head pointer tracks beginning of queue
• Tail pointer tracks end of queue
• Queue containing no items
  – Both the head and tail pointer point to same location
• Dequeue operation
  – Remove item (oldest entry) from the queue
    • Head pointer changed to point to the next item in list
• Enqueue operation
  – Item placed at list end
  – Tail pointer updated

Figure 8-19, A queue uses a FIFO structure
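The head and tail pointers of a queue can be sketched with a linked list of print jobs. This is a minimal sketch, assuming a class named LinkedQueue and String job names, both invented for illustration; in this version an empty queue is represented by head and tail both being null.

```java
// Hypothetical linked-list-backed queue; names and job strings are illustrative.
class LinkedQueue {
    private static class Node {
        final String item;
        Node next;
        Node(String item) { this.item = item; }
    }

    private Node head;  // front of the queue: next item to remove
    private Node tail;  // rear of the queue: most recently added item

    boolean isEmpty() { return head == null; }

    // Enqueue: place the item at the end and update the tail pointer.
    void enqueue(String item) {
        Node node = new Node(item);
        if (isEmpty()) {
            head = node;
        } else {
            tail.next = node;
        }
        tail = node;
    }

    // Dequeue: remove the oldest item and move the head pointer forward.
    String dequeue() {
        if (isEmpty()) throw new IllegalStateException("queue is empty");
        String item = head.item;
        head = head.next;
        if (head == null) {
            tail = null;   // queue became empty, so both pointers are reset
        }
        return item;
    }

    public static void main(String[] args) {
        LinkedQueue printQueue = new LinkedQueue();
        printQueue.enqueue("report.doc");
        printQueue.enqueue("photo.png");
        System.out.println(printQueue.dequeue()); // the document that waited longest
        System.out.println(printQueue.dequeue());
    }
}
```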
Figure 8-20, Removing an item from the queue
Figure 8-21, Inserting an item into the queue

Trees
• Hierarchical data structure similar to organizational or genealogy charts
  – Node or vertex: position in the tree

Figure 8-22, Tree data structure

• Binary tree
  – Each node has at most two child nodes
  – Node can have zero, one, or two child nodes
  – Left child: child node to the left of the parent node
  – Right child: child node to the right of the parent node
  – Root: node that begins the tree
  – Leaf node: node that has no child nodes
  – Depth (level): distance from root node
  – Height: length of the longest path from the root to a leaf

Figure 8-23, Tree nodes
Figure 8-24, The level and height of a binary tree

Uses of Binary Trees
• Binary search tree: a type of binary tree
  – Data value of left child node is less than the value of parent node
  – Data value of right child node is greater than the value of parent node
• Useful for searching through stored data
  – Storing information in a hierarchical representation

Figure 8-25, A file system structure can be stored as a binary search tree

Searching a Binary Tree
• Three components in a binary search tree node:
  – Left child pointer, right child pointer, data
• Root pointer contains root node’s address
  – Provides initial access to the tree
• If left or right child pointers contain a null value
  – Node is not a parent to other nodes down that specific path
• If both left and right pointers contain null values
  – Node is not a parent down either path
• Binary tree must be defined properly to be searchable
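The three-part node (left child pointer, right child pointer, data) and a search that starts at the root can be sketched as follows. This is an illustrative sketch: the class name BinarySearchTree, the insert helper, the decision to ignore duplicate values, and the sample values are all assumptions rather than the chapter's own code.

```java
// Hypothetical binary search tree; names, sample values, and duplicate handling are assumptions.
class BinarySearchTree {
    private static class Node {
        final int data;
        Node left;    // null when there is no smaller child down this path
        Node right;   // null when there is no larger child down this path
        Node(int data) { this.data = data; }
    }

    private Node root;  // provides initial access to the tree

    void insert(int value) { root = insert(root, value); }

    // Smaller values go left, larger values go right; duplicates are ignored.
    private static Node insert(Node node, int value) {
        if (node == null) return new Node(value);
        if (value < node.data) {
            node.left = insert(node.left, value);
        } else if (value > node.data) {
            node.right = insert(node.right, value);
        }
        return node;
    }

    // Starts at the root and moves left or right until the value or a null pointer is reached.
    boolean contains(int value) {
        Node current = root;
        while (current != null) {
            if (value == current.data) return true;
            current = (value < current.data) ? current.left : current.right;
        }
        return false;
    }

    public static void main(String[] args) {
        BinarySearchTree tree = new BinarySearchTree();
        for (int value : new int[] {5, 3, 8, 1, 4}) {
            tree.insert(value);
        }
        System.out.println(tree.contains(8));  // true
        System.out.println(tree.contains(7));  // false
    }
}
```

Each comparison discards one whole subtree, which is why a properly ordered tree is faster to search than scanning every element in turn.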
Figure 8-26, A node in a binary search tree

• Search routine
  – Start at the root position
  – Determine if path moves to left child or right
  – Move in direction of data (left or right)
  – If left pointer NULL
    • No node to traverse down the left side
  – If left pointer does have a value
    • Path continues down that side
  – If value looking for is found
    • Stop at that node

Figure 8-27, Searching a binary tree for the value 8
Figure 8-28, Searching a binary tree for the value 1

Sorting Algorithms
• Sorting: leverages data structures to organize data
  – Examples of data being sorted:
    • Words in a dictionary
    • Files in a directory
    • Index of a book
    • Course offerings at a university
• Many algorithms for sorting
  – Each has advantages and disadvantages
• Focus: selection and bubble sorts

Selection Sort
• Selection sort: mimics manual sorting
  – Starts at first value in the list
  – Processes each element looking for smallest value
  – After smallest value found, it is placed in first position
    • Moves first position value to location originally containing smallest value
  – Sort moves on looking for next smallest value
  – Continues to “swap places”
• Simple to use and implement
• Inefficient for large lists
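The selection sort steps above can be sketched in Java. The class name SelectionSortDemo and the sample values are assumptions for illustration; the method sorts a copy so the caller's array is left unchanged, which is a design choice of this sketch rather than part of the algorithm.

```java
import java.util.Arrays;

// Hypothetical selection sort sketch; class name and sample data are illustrative.
class SelectionSortDemo {
    // Returns a sorted copy of the input, smallest value first.
    static int[] sort(int[] input) {
        int[] list = input.clone();
        for (int first = 0; first < list.length - 1; first++) {
            // Look through the unsorted part for the smallest value.
            int smallest = first;
            for (int i = first + 1; i < list.length; i++) {
                if (list[i] < list[smallest]) {
                    smallest = i;
                }
            }
            // Swap places: the smallest value moves to the first unsorted position.
            int temp = list[first];
            list[first] = list[smallest];
            list[smallest] = temp;
        }
        return list;
    }

    public static void main(String[] args) {
        int[] values = {29, 10, 14, 37, 13};
        System.out.println(Arrays.toString(sort(values)));
    }
}
```

The nested loops compare every remaining element on every pass, which is why the routine becomes slow as the list grows.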
Figure 8-29, A selection sort

Bubble Sort
• Bubble: older and slower sort method
  – Start with the last element in the list
  – Compare its value to that of the item just above
  – If smaller, change positions and continue up list
    • Continue comparison until smaller item found
  – If not smaller, next item compared to item above
  – Check until smallest value “bubbles” to the top
  – Process repeated for list less first item
• Simple to implement
• Inefficient for large lists

Figure 8-30, A bubble sort
Figure 8-31, The bubble sort continues

Other Types of Sorts
• Quicksort: incorporates “divide and conquer” logic
  – Two small lists are easier to sort than one large list
  – Uses recursion to break down problem
  – All sorted sub-lists are combined into single sorted set
  – Fast but difficult to comprehend
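The divide-and-conquer idea behind quicksort can be sketched with lists: split around a pivot, sort each smaller list recursively, then combine the results. This is a simplified teaching version, not the in-place partitioning usually found in libraries; the class name QuickSortDemo, the first-element pivot choice, and the sample values are assumptions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical list-based quicksort sketch; pivot choice and names are assumptions.
class QuickSortDemo {
    // Returns a new sorted list and leaves the input list unchanged.
    static List<Integer> sort(List<Integer> items) {
        if (items.size() <= 1) {
            return new ArrayList<>(items);   // nothing left to divide
        }
        int pivot = items.get(0);
        List<Integer> smaller = new ArrayList<>();
        List<Integer> larger = new ArrayList<>();
        for (int i = 1; i < items.size(); i++) {
            int value = items.get(i);
            if (value < pivot) {
                smaller.add(value);
            } else {
                larger.add(value);           // values equal to the pivot go here
            }
        }
        // Combine: sorted smaller values, then the pivot, then sorted larger values.
        List<Integer> result = sort(smaller);
        result.add(pivot);
        result.addAll(sort(larger));
        return result;
    }

    public static void main(String[] args) {
        System.out.println(sort(Arrays.asList(33, 4, 15, 8, 42, 16)));
    }
}
```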
• Merge sort: similar to the quicksort
  – Continuously halves data sets using recursion
  – Sorted halves are merged back into one list
  – Time efficient, but not as space efficient as quicksort
• Insertion sort: simulates manual sorting of cards
  – Requires two lists
  – Not complex, and efficient for lists of fewer than 1000 items
• Shell sort: uses insertion sort against expanding data set

One Last Thought
• Algorithms
  – Used everywhere in the computer industry
  – Knowing how to work with data structures and sorting algorithms is necessary to begin writing computer programs
  – Many algorithms are written and available for use
• Knowing the tools available and which sort routine will perform best for a situation saves time

Summary
• Data structures organize data
• Basic data structures
  – Arrays, lists, queues, stacks, trees
• Arrays
  – Store data contiguously
  – May have one or more dimensions
• Linked lists
  – Store data in dynamic containers
  – Use pointers for noncontiguous storage
• Pointer: contains memory cell address as its data
• Stack
  – Linked list structured as LIFO container
• Queue
  – Linked list structured as FIFO container
• Tree
  – Hierarchical structure consisting of nodes
  – Binary tree: nodes have at most two children
  – Binary search tree: efficient for searching for information
• Sorting algorithms
  – Organize data within structure
    • Examples: selection sort, bubble sort, quicksort, merge sort, insertion sort, shell sort
• Sorting routines
  – Analyzed by code, space, and time complexities