Connecting with Computer
Science, 2e
Chapter 8
Data Structures
Objectives
• In this chapter you will:
– Learn what a data structure is and how it’s used
– Learn about single-dimensional and multidimensional
arrays and how they work
– Learn what a pointer is and how it’s used in data
structures
– Learn that a linked list allows you to work with
dynamic information
Objectives (cont’d.)
• In this chapter you will (cont’d.):
– Understand that a stack is a linked list and how it’s
used
– Learn that a queue is another form of a linked list and
how it’s used
– Learn that a binary tree is a data structure that stores
information in a hierarchical order
– See an overview of several sorting routines
Why You Need to Know About…Data
Structures
• Data structures:
– Organize the data in a computer
• Efficiently access and process data
– All programs use some form of data structure
– There are many occasions for using data structures
Data Structures
• Defined as a way of organizing data
• Types of data structures in memory
– Arrays, lists, stacks, queues, trees
• File structures organize storage media data
• Computer memory is organized into cells
– Memory cell has a memory address and content
– Memory addresses are organized consecutively
– Data structures hide memory implementation details
Arrays
• Simplest memory data structure
• Consists of a set of contiguous memory cells
– Memory cells store the same type of data
• Usefulness:
– Storing similar kinds of information in memory
• Sorted or left as entered
– One array name for a number of similar items
Arrays (cont’d.)
Figure 8-2, Arrays make program logic easier to understand and use
How an Array Works
• Java example
– int[ ] aGrades = new int[5];
• “int[ ]” indicates array will hold integers
• “aGrades” identifies the array name
• “new” keyword specifies new array being created
• “int[5]” reserves five memory locations
• “=” sign assigns aGrades as “manager” of the array
• “;” (semicolon) indicates end of statement reached
• Hungarian notation standard is used to name
“aGrades”
How an Array Works (cont’d.)
Figure 8-3, Five contiguous memory cells managed by
aGrades in a single-dimensional array
How an Array Works (cont’d.)
• Element: memory cell in an array
• Dimension: levels created to hold array elements
– aGrades: single-dimensional array (row of mailboxes)
• Offset specifies distance between memory locations
– Array’s first position: referenced as position 0
– Next array position: referenced as position 1
• Found by using starting memory location plus one
offset
– Third array position: referenced as position 2
• Found by using starting memory location plus two
offsets
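The offset rule above can be sketched as simple arithmetic: the address of position i is the starting memory location plus i offsets. Java does not expose real memory addresses, so the starting address (1000) and offset size (4) below are made-up values, and the class and method names are hypothetical.

```java
// Sketch only: simulates address arithmetic with invented numbers,
// because Java hides the real memory addresses of array elements.
class OffsetDemo {
    // Address of a position = starting location + (position * offset).
    static long addressOf(long start, int offset, int position) {
        return start + (long) position * offset;
    }

    public static void main(String[] args) {
        long start = 1000;  // hypothetical starting memory location
        int offset = 4;     // hypothetical distance between cells
        for (int position = 0; position < 5; position++) {
            System.out.println("position " + position
                    + " -> address " + addressOf(start, offset, position));
        }
    }
}
```

Because every element is the same distance from its neighbor, any position can be located with one multiplication and one addition.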
How an Array Works (cont’d.)
Figure 8-5, Arrays start at position 0 and use an offset to
know where the next element is located in memory
How an Array Works (cont’d.)
• Index (subscript)
– Indicates memory cell to access in the array
• Looks at element’s position
• Placed between square brackets ( [ ] ) after array name
• Integer placed in “[ ]” for access
– Example: aGrades[0] = 50;
• Position 0: first position
– Array with positions 0 to 4
• Five memory cells or addresses
• Upper bound: highest array position
• Lower bound: lowest array position
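A minimal sketch of indexing in Java, using the aGrades array from the earlier slide. Only the assignment aGrades[0] = 50 comes from the slides; the other grade values and the class and method names are illustrative assumptions.

```java
// Sketch: creating, filling, and reading a single-dimensional array.
class GradesDemo {
    static int[] buildGrades() {
        int[] aGrades = new int[5];  // positions 0 through 4
        aGrades[0] = 50;             // lower bound: position 0
        aGrades[1] = 100;            // values from here on are illustrative
        aGrades[2] = 78;
        aGrades[3] = 89;
        aGrades[4] = 95;             // upper bound: position 4
        return aGrades;
    }

    public static void main(String[] args) {
        int[] aGrades = buildGrades();
        int upperBound = aGrades.length - 1;
        for (int index = 0; index <= upperBound; index++) {
            System.out.println("aGrades[" + index + "] = " + aGrades[index]);
        }
        // Using an index outside 0..4, such as aGrades[5], would throw
        // an ArrayIndexOutOfBoundsException at run time.
    }
}
```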
How an Array Works (cont’d.)
Figure 8-6, The array with all
elements stored
Multidimensional Arrays
• Multidimensional arrays
– Consist of two or more single-dimensional arrays
– Multiple rows stacked on top of each other
• Apartment building mailboxes and tic-tac-toe boards
• Creating the tic-tac-toe board
– char[ ][ ] aTicTacToe = new char[3][3];
• Assignment: aTicTacToe[1][1] = ’X’;
– Places X in the second row, second column
• Arrays beyond three dimensions are difficult to
manage
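The tic-tac-toe declaration and assignment above can be tried out with a short sketch. The class name, the helper methods, and the choice to fill empty squares with spaces are assumptions made for illustration.

```java
// Sketch: a 3x3 two-dimensional array used as a tic-tac-toe board.
class TicTacToeDemo {
    static char[][] newBoard() {
        char[][] aTicTacToe = new char[3][3];
        for (int row = 0; row < 3; row++) {
            for (int col = 0; col < 3; col++) {
                aTicTacToe[row][col] = ' ';  // blank square (assumption)
            }
        }
        return aTicTacToe;
    }

    // Renders the board one row per line.
    static String render(char[][] board) {
        StringBuilder text = new StringBuilder();
        for (char[] row : board) {
            text.append(new String(row)).append('\n');
        }
        return text.toString();
    }

    public static void main(String[] args) {
        char[][] aTicTacToe = newBoard();
        aTicTacToe[1][1] = 'X';  // second row, second column
        System.out.print(render(aTicTacToe));
    }
}
```

The first index selects the row and the second selects the column, and both start at 0.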
Multidimensional Arrays (cont’d.)
Figure 8-7, A multidimensional array is
like apartment mailboxes stacked on
top of each other
Figure 8-8, Tic-tac-toe board
Multidimensional Arrays (cont’d.)
Figure 8-9, First row of the tic-tac-toe board
Figure 8-10, Second and third rows of the tic-tac-toe board
Multidimensional Arrays (cont’d.)
Figure 8-11, Storing a value in an array location
Multidimensional Arrays (cont’d.)
Figure 8-12, Three-dimensional array
Uses of Arrays
• Advantages
– Allow sequential access of memory cells
– Retrieve and store data using the array name and an
element index
– Easy to create
– Useful for people who write computer programs
• Disadvantages
– Require a lot of overhead for insertions, because
existing elements must be moved
– Searching unsorted data means checking memory cells
sequentially
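The insertion overhead of arrays can be illustrated with a small sketch: placing one value in the middle means copying or shifting every element that follows it. The class name, method name, and sample values are hypothetical.

```java
// Sketch: inserting into an array requires moving the later elements.
class ArrayInsertDemo {
    // Returns a new, larger array with `value` placed at `index`.
    static int[] insertAt(int[] source, int index, int value) {
        int[] result = new int[source.length + 1];
        for (int i = 0; i < index; i++) {
            result[i] = source[i];        // elements before the gap stay put
        }
        result[index] = value;
        for (int i = index; i < source.length; i++) {
            result[i + 1] = source[i];    // elements after the gap shift by one
        }
        return result;
    }

    public static void main(String[] args) {
        int[] grades = {10, 20, 40};
        int[] updated = insertAt(grades, 2, 30);
        System.out.println(java.util.Arrays.toString(updated));
    }
}
```

A Java array cannot grow after it is created, which is why this sketch allocates a new array for the result.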
Lists
• Hold dynamic lists of data
– Lists vary in size
– Examples: class enrollment, cars being repaired, email in-boxes
• Appropriate whenever amount of data is unknown or
can change
• Three basic list forms:
– Linked lists
– Queues
– Stacks
Linked Lists
• Use noncontiguous memory locations to store data
– Each element points to the next element in line
• Does not have to be contiguous with previous element
– Used when exact number of items is unknown
– Store data noncontiguously
– Maintain data and address of next linked cell
– Examples: names of students visiting a professor,
points scored in a video game, list of spammers
• Basic constructs for more advanced data structures
– Queues and stacks: pointer based
Linked Lists (cont’d.)
• Pointers: memory variable containing the address of
a memory cell as its data
• Illustration: linked list game
– Students sit in a circle with piece of paper
– Paper has box in the upper left corner and center
– Upper left box indicates a student number
– Center box divided into two parts
– Students indicate favorite color in left part of center
– Professor has a piece of paper with a number only
Linked Lists (cont’d.)
Figure 8-14, Structure of a linked list
Linked Lists (cont’d.)
• Piece of paper represents a two-part node
– Data (the first part, the color)
– Pointer (the second part, the number of the next student)
• Professor’s piece: head pointer with no data
• Last student: pointer’s value is NULL
• Inserting new elements
– No resizing needed
– Create new “piece of paper” with dual node structure
– Realign pointers to accommodate new node (paper)
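The two-part node and the pointer realignment for an insertion might look like the following sketch. It models the classroom game loosely: each node holds a color and a reference to the next node. The class, field, and method names are assumptions, and Java object references stand in for pointers.

```java
// Sketch: a singly linked list of colors with insertion by pointer realignment.
class ColorList {
    // Two-part node: data plus a pointer to the next node.
    static class Node {
        String color;
        Node next;  // null marks the last node in the list

        Node(String color) {
            this.color = color;
        }
    }

    Node head;  // head pointer: holds no data, only the first node's location

    // Inserts after `previous`, or at the front when `previous` is null.
    Node insertAfter(Node previous, String color) {
        Node node = new Node(color);
        if (previous == null) {
            node.next = head;
            head = node;
        } else {
            node.next = previous.next;  // new node points at the old successor
            previous.next = node;       // predecessor now points at the new node
        }
        return node;
    }

    @Override
    public String toString() {
        StringBuilder text = new StringBuilder();
        for (Node n = head; n != null; n = n.next) {
            if (text.length() > 0) {
                text.append(" -> ");
            }
            text.append(n.color);
        }
        return text.toString();
    }

    public static void main(String[] args) {
        ColorList list = new ColorList();
        Node red = list.insertAfter(null, "red");
        list.insertAfter(red, "blue");
        list.insertAfter(red, "green");  // lands between red and blue
        System.out.println(list);
    }
}
```

No existing node is moved or resized; only two references change per insertion.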
Linked Lists (cont’d.)
Figure 8-15, Inserting an element into a linked list
Linked Lists (cont’d.)
• Similar procedure for deleting items
– Modify pointer of element preceding target item
– Students deleted from list without moving elements
• Use dynamic memory allocation
– More efficient than arrays when items are inserted or
deleted often
– Memory cells need not be contiguous
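Deletion can be sketched the same way: the node before the target is found, and its pointer is changed to skip over the target. The class and method names and the student numbers are hypothetical.

```java
// Sketch: deleting from a singly linked list by modifying one pointer.
class StudentList {
    static class Node {
        int id;
        Node next;

        Node(int id, Node next) {
            this.id = id;
            this.next = next;
        }
    }

    Node head;

    void addFirst(int id) {
        head = new Node(id, head);
    }

    // Removes the first node holding `id`; returns false when it is absent.
    boolean remove(int id) {
        if (head == null) {
            return false;
        }
        if (head.id == id) {
            head = head.next;           // the head pointer skips the old first node
            return true;
        }
        Node previous = head;
        while (previous.next != null && previous.next.id != id) {
            previous = previous.next;
        }
        if (previous.next == null) {
            return false;               // reached the end without a match
        }
        previous.next = previous.next.next;  // bypass the target node
        return true;
    }

    @Override
    public String toString() {
        StringBuilder text = new StringBuilder();
        for (Node n = head; n != null; n = n.next) {
            if (text.length() > 0) {
                text.append(" -> ");
            }
            text.append(n.id);
        }
        return text.toString();
    }

    public static void main(String[] args) {
        StudentList list = new StudentList();
        list.addFirst(3);
        list.addFirst(2);
        list.addFirst(1);
        list.remove(2);
        System.out.println(list);
    }
}
```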
Linked Lists (cont’d.)
Figure 8-16, Deleting an element from a linked list
Stacks
• List in which the next item to be removed is the item
most recently stored
– “Push” items onto the list to store new items
– “Pop” items off the list to retrieve current items
• Examples
– Restaurant spring-loaded plate holder or text editor
• Peeking
– Looking at the stack’s top item without removing it
• LIFO data structure
– Last or most recent item pushed (put) onto the stack
• Becomes first item popped (removed) from the stack
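Push, pop, and peek can be sketched on top of a linked list, in line with the chapter's view of a stack as a linked list. The class name, the String item type, and the choice to throw an exception on an empty stack are assumptions.

```java
// Sketch: a LIFO stack built from linked nodes.
class LinkedStack {
    private static class Node {
        final String item;
        final Node below;

        Node(String item, Node below) {
            this.item = item;
            this.below = below;
        }
    }

    private Node top;  // most recently pushed item, or null when empty

    boolean isEmpty() {
        return top == null;
    }

    void push(String item) {
        top = new Node(item, top);
    }

    // Removes and returns the most recently pushed item.
    String pop() {
        if (top == null) {
            throw new IllegalStateException("stack is empty");
        }
        String item = top.item;
        top = top.below;
        return item;
    }

    // Returns the top item without removing it.
    String peek() {
        if (top == null) {
            throw new IllegalStateException("stack is empty");
        }
        return top.item;
    }

    public static void main(String[] args) {
        LinkedStack plates = new LinkedStack();
        plates.push("first plate");
        plates.push("second plate");
        System.out.println(plates.peek());
        System.out.println(plates.pop());
        System.out.println(plates.pop());
    }
}
```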
Stacks (cont’d.)
Figure 8-17, The stack concept
Uses of a Stack
• Processes lines of program source code
• Source code is logically organized into procedures
– Keep track of procedure calls with a stack
– Return address is pushed when a procedure is called and
popped off the stack when the procedure finishes
Back to Pointers
• Stack pointer
– Keeps track of where to remove or add an item in a
data structure
• Check stack before applying pop or push operations
• Stacks
– Memory locations organized into logical structures
• Facilitates reading from them and writing to them
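A stack pointer and the checks before push and pop can be sketched with an array-backed stack. This is one possible layout, assuming the stack pointer holds the index of the next free cell; the class and method names are hypothetical.

```java
// Sketch: a fixed-size stack that tracks its top with a stack pointer.
class ArrayStack {
    private final int[] cells;
    private int stackPointer = 0;  // index of the next free cell (assumption)

    ArrayStack(int capacity) {
        cells = new int[capacity];
    }

    boolean isEmpty() {
        return stackPointer == 0;
    }

    boolean isFull() {
        return stackPointer == cells.length;
    }

    void push(int value) {
        if (isFull()) {                   // check before pushing
            throw new IllegalStateException("stack is full");
        }
        cells[stackPointer] = value;
        stackPointer++;                   // incremented after a push
    }

    int pop() {
        if (isEmpty()) {                  // check before popping
            throw new IllegalStateException("stack is empty");
        }
        stackPointer--;                   // decremented when an item is popped
        return cells[stackPointer];
    }

    public static void main(String[] args) {
        ArrayStack stack = new ArrayStack(3);
        stack.push(7);
        stack.push(9);
        System.out.println(stack.pop());
        System.out.println(stack.pop());
    }
}
```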
Back to Pointers (cont’d.)
Figure 8-18, Stack pointer is decremented when the item
is popped off
Queues
• Another type of linked list
– Implements first in, first out (FIFO) storage system
– Insertions made at the end
– Deletions made at the beginning
– Similar to a waiting line
Uses of a Queue
• Printer example
– First item printed
• Document waiting longest
– Current item deleted from queue
• Next item printed
– New documents
• Placed at the end of the queue
• Insertions of new data occur at the rear of the queue
• Removal of data occurs at the front of the queue
Pointers Again
• Head pointer tracks beginning of queue
• Tail pointer tracks end of queue
• Queue containing no items
– Both the head and tail pointer point to same location
• Dequeue operation
– Remove item (oldest entry) from the queue
• Head pointer changed to point to the next item in list
• Enqueue operation
– Item placed at list end
– Tail pointer updated
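The head and tail pointers can be sketched with a linked queue. In this version an empty queue is represented by both pointers being null, which is one way to make them "point to the same location"; the class name and the document names in the example are hypothetical.

```java
// Sketch: a FIFO queue with head and tail pointers.
class LinkedQueue {
    private static class Node {
        final String item;
        Node next;

        Node(String item) {
            this.item = item;
        }
    }

    private Node head;  // front of the queue: next item to be removed
    private Node tail;  // rear of the queue: most recently added item

    boolean isEmpty() {
        return head == null;
    }

    // Enqueue: place the item at the end and update the tail pointer.
    void enqueue(String item) {
        Node node = new Node(item);
        if (tail == null) {
            head = node;        // first item: head and tail share one node
        } else {
            tail.next = node;
        }
        tail = node;
    }

    // Dequeue: remove the oldest item and move the head pointer forward.
    String dequeue() {
        if (head == null) {
            throw new IllegalStateException("queue is empty");
        }
        String item = head.item;
        head = head.next;
        if (head == null) {
            tail = null;        // queue became empty
        }
        return item;
    }

    public static void main(String[] args) {
        LinkedQueue printQueue = new LinkedQueue();
        printQueue.enqueue("report.doc");
        printQueue.enqueue("photo.png");
        System.out.println(printQueue.dequeue());  // document waiting longest
        System.out.println(printQueue.dequeue());
    }
}
```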
Pointers Again (cont’d.)
Figure 8-19, A queue uses a FIFO structure
Pointers Again (cont’d.)
Figure 8-20, Removing an item
from the queue
Figure 8-21, Inserting an item into
the queue
Trees
• Hierarchical data structure similar to organizational
or genealogy charts
– Node or vertex: position in the tree
Figure 8-22, Tree data structure
Trees (cont’d.)
• Binary tree
– Each node has at most two child nodes
– Node can have zero, one, or two child nodes
– Left child: child node to the left of the parent node
– Right child: child node to the right of the parent node
– Root: node that begins the tree
– Leaf node: node that has no child nodes
– Depth (level): distance from root node
– Height: longest path length in the tree
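The binary tree terms above can be made concrete with a small sketch. It assumes height is counted in links (edges) from the root to the deepest leaf, so a single-node tree has height 0; some texts count nodes or levels instead. The class and method names and the node values are hypothetical.

```java
// Sketch: binary tree nodes with leaf detection and height calculation.
class TreeDemo {
    static class Node {
        int value;
        Node left;   // left child, or null
        Node right;  // right child, or null

        Node(int value, Node left, Node right) {
            this.value = value;
            this.left = left;
            this.right = right;
        }
    }

    // A leaf node has no child nodes.
    static boolean isLeaf(Node node) {
        return node.left == null && node.right == null;
    }

    // Longest path from this node down to a leaf, counted in links.
    // An empty subtree is given height -1 so that a leaf has height 0.
    static int height(Node node) {
        if (node == null) {
            return -1;
        }
        return 1 + Math.max(height(node.left), height(node.right));
    }

    public static void main(String[] args) {
        Node leaf = new Node(4, null, null);
        Node root = new Node(1, new Node(2, leaf, null), new Node(3, null, null));
        System.out.println("height = " + height(root));
        System.out.println("root is a leaf? " + isLeaf(root));
    }
}
```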
Trees (cont’d.)
Figure 8-23, Tree nodes
Figure 8-24, The level and height
of a binary tree
Uses of Binary Trees
• Binary search tree: a type of binary tree
– Data value of left child node is less than the value of
parent node
– Data value of right child node is greater than the
value of parent node
• Useful for searching through stored data
– Storing information in a hierarchical representation
Uses of Binary Trees (cont’d.)
Figure 8-25, A file system structure
can be stored as a binary search tree
Searching a Binary Tree
• Three components in a binary search tree node:
– Left child pointer, right child pointer, data
• Root pointer contains root node’s address
– Provides initial access to the tree
• If left or right child pointers contain a null value
– Node is not a parent to other nodes down that specific
path
• If both left and right pointers contain null values
– Node is not a parent down either path
• Binary tree must be defined properly to be
searchable
Searching a Binary Tree (cont’d.)
Figure 8-26, A node in a binary search tree
Searching a Binary Tree (cont’d.)
• Search routine
– Start at the root position
– Determine if path moves to left child or right
– Move in direction of data (left or right)
– If the left pointer is NULL
• No node to traverse down the left side
– If the left pointer does have a value
• Path continues down that side
– If the value being searched for is found
• Stop at that node
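The search routine can be sketched as a loop that starts at the root and moves left or right until it finds the value or reaches a null pointer. The insert helper is included only to build a properly ordered tree for the example; the class and method names and the sample values are assumptions, and duplicates are ignored.

```java
// Sketch: building and searching a binary search tree.
class BstDemo {
    static class Node {
        int data;
        Node left;
        Node right;

        Node(int data) {
            this.data = data;
        }
    }

    // Smaller values go to the left subtree, larger values to the right.
    static Node insert(Node root, int value) {
        if (root == null) {
            return new Node(value);
        }
        if (value < root.data) {
            root.left = insert(root.left, value);
        } else if (value > root.data) {
            root.right = insert(root.right, value);
        }
        return root;  // equal values are ignored in this sketch
    }

    static boolean contains(Node root, int target) {
        Node current = root;                 // start at the root position
        while (current != null) {            // a null pointer ends the path
            if (target == current.data) {
                return true;                 // value found: stop at that node
            }
            current = (target < current.data) ? current.left : current.right;
        }
        return false;
    }

    public static void main(String[] args) {
        Node root = null;
        for (int value : new int[] {5, 3, 8, 1, 4}) {
            root = insert(root, value);
        }
        System.out.println("contains 8? " + contains(root, 8));
        System.out.println("contains 7? " + contains(root, 7));
    }
}
```

Each comparison discards one whole subtree, which is what makes a well-ordered tree quick to search.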
Searching a Binary Tree (cont’d.)
Figure 8-27, Searching a binary tree for the value 8
Searching a Binary Tree (cont’d.)
Figure 8-28, Searching a binary tree for the value 1
Sorting Algorithms
• Sorting: leverages data structures to organize data
– Example of data being sorted:
• Words in a dictionary
• Files in a directory
• Index of a book
• Course offerings at a university
• Many algorithms for sorting
– Each has advantages and disadvantages
• Focus: selection and bubble sorts
Selection Sort
• Selection sort: mimics manual sorting
– Starts at first value in the list
– Processes each element looking for smallest value
– After smallest value found, it is placed in first position
• Moves first position value to location originally
containing smallest value
– Sort moves on looking for next smallest value
– Continues to “swap places”
• Simple to use and implement
• Inefficient for large lists
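The selection sort steps above can be sketched in Java as follows. The class and method names and the sample values are hypothetical.

```java
// Sketch: selection sort on an int array, sorting in ascending order.
class SelectionSortDemo {
    static void selectionSort(int[] list) {
        for (int first = 0; first < list.length - 1; first++) {
            // Look through the unsorted part for the smallest value.
            int smallest = first;
            for (int i = first + 1; i < list.length; i++) {
                if (list[i] < list[smallest]) {
                    smallest = i;
                }
            }
            // Swap places: the smallest value moves to the first position,
            // and the old first value takes the smallest value's old spot.
            int temp = list[first];
            list[first] = list[smallest];
            list[smallest] = temp;
        }
    }

    public static void main(String[] args) {
        int[] values = {5, 2, 9, 1, 3};
        selectionSort(values);
        System.out.println(java.util.Arrays.toString(values));
    }
}
```

The nested loops compare roughly n × n / 2 pairs for n items, which is why the method slows down sharply on large lists.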
Selection Sort (cont’d.)
Figure 8-29, A selection sort
Bubble Sort
• Bubble sort: older and slower sort method
– Start with the last element in the list
– Compare its value to that of the item just above
– If smaller, change positions and continue up list
• Continue comparison until smaller item found
– If not smaller, next item compared to item above
– Check until smallest value “bubbles” to the top
– Process repeated for the list minus its first item
• Simple to implement
• Inefficient for large lists
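A sketch of the bubble sort variant described above, which starts at the last element and lets the smallest value bubble up to the top (position 0) on each pass. The class and method names and the sample values are hypothetical.

```java
// Sketch: bubble sort that moves the smallest value toward the top.
class BubbleSortDemo {
    static void bubbleSort(int[] list) {
        // After each pass, positions 0..top hold their final values.
        for (int top = 0; top < list.length - 1; top++) {
            // Start with the last element and compare with the item above it.
            for (int i = list.length - 1; i > top; i--) {
                if (list[i] < list[i - 1]) {
                    int temp = list[i];       // smaller item changes places
                    list[i] = list[i - 1];    // with the item just above
                    list[i - 1] = temp;
                }
            }
        }
    }

    public static void main(String[] args) {
        int[] values = {4, 2, 5, 1, 3};
        bubbleSort(values);
        System.out.println(java.util.Arrays.toString(values));
    }
}
```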
Bubble Sort (cont’d.)
Figure 8-30, A bubble sort
Bubble Sort (cont’d.)
Figure 8-31, The bubble sort continues
Other Types of Sorts
• Quicksort: incorporates “divide and conquer” logic
– Two small lists are easier to sort than one large list
– Uses recursion to break down problem
– All sorted sub-lists are combined into single sorted set
– Fast but difficult to comprehend
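One common way to write quicksort is sketched below. It is an in-place variant that uses the last element as the pivot (often called the Lomuto partition scheme), which is an assumption and not necessarily the version the textbook has in mind. Because it sorts within the original array, the sub-lists are already in position and need no separate combining step.

```java
// Sketch: in-place quicksort using the last element as the pivot.
class QuickSortDemo {
    static void quickSort(int[] list) {
        quickSort(list, 0, list.length - 1);
    }

    static void quickSort(int[] list, int low, int high) {
        if (low >= high) {
            return;  // zero or one element: already sorted
        }
        int pivotIndex = partition(list, low, high);
        quickSort(list, low, pivotIndex - 1);   // sort the smaller values
        quickSort(list, pivotIndex + 1, high);  // sort the larger values
    }

    // Moves values smaller than the pivot to the left of it and
    // returns the pivot's final position.
    static int partition(int[] list, int low, int high) {
        int pivot = list[high];
        int boundary = low;  // next slot for a value smaller than the pivot
        for (int i = low; i < high; i++) {
            if (list[i] < pivot) {
                swap(list, i, boundary);
                boundary++;
            }
        }
        swap(list, boundary, high);
        return boundary;
    }

    static void swap(int[] list, int a, int b) {
        int temp = list[a];
        list[a] = list[b];
        list[b] = temp;
    }

    public static void main(String[] args) {
        int[] values = {6, 3, 8, 1, 9, 2};
        quickSort(values);
        System.out.println(java.util.Arrays.toString(values));
    }
}
```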
Other Types of Sorts (cont’d.)
• Merge sort: similar to the quicksort
– Continuously halves data sets using recursion
– Sorted halves are merged back into one list
– Time efficient, but not as space efficient as quicksort
• Insertion sort: simulates manual sorting of cards
– Requires two lists
– Not complex, and efficient for list sizes of fewer than
1000
• Shell sort: uses insertion sort against expanding
data set
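Of the sorts on this slide, merge sort is sketched below: the data set is halved recursively, and the sorted halves are merged back into one list. The extra arrays created for the halves show why it needs more space than an in-place sort. The class and method names and the sample values are hypothetical.

```java
// Sketch: top-down merge sort that returns a new sorted array.
class MergeSortDemo {
    static int[] mergeSort(int[] list) {
        if (list.length <= 1) {
            return list;  // nothing left to halve
        }
        int middle = list.length / 2;
        int[] left = mergeSort(java.util.Arrays.copyOfRange(list, 0, middle));
        int[] right = mergeSort(java.util.Arrays.copyOfRange(list, middle, list.length));
        return merge(left, right);
    }

    // Merges two sorted arrays into one sorted array.
    static int[] merge(int[] left, int[] right) {
        int[] merged = new int[left.length + right.length];
        int l = 0;
        int r = 0;
        int m = 0;
        while (l < left.length && r < right.length) {
            if (left[l] <= right[r]) {
                merged[m++] = left[l++];
            } else {
                merged[m++] = right[r++];
            }
        }
        while (l < left.length) {
            merged[m++] = left[l++];   // copy what remains of the left half
        }
        while (r < right.length) {
            merged[m++] = right[r++];  // copy what remains of the right half
        }
        return merged;
    }

    public static void main(String[] args) {
        int[] sorted = mergeSort(new int[] {7, 2, 5, 3, 1});
        System.out.println(java.util.Arrays.toString(sorted));
    }
}
```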
One Last Thought
• Algorithms
– Used everywhere in the computer industry
– Knowing how to work with data structures and sorting
algorithms is necessary to begin writing computer
programs
– Many algorithms are written and available for use
• Knowing the tools available and which sort routine
will perform best for a situation saves time
Summary
• Data structures organize data
• Basic data structures
– Arrays, lists, queues, stacks, trees
• Arrays
– Store data contiguously
– May have one or more dimensions
• Linked lists
– Store data in dynamic containers
– Use pointers for noncontiguous storage
• Pointer: contains memory cell address as its data
Summary (cont’d.)
• Stack
– Linked list structured as LIFO container
• Queue
– Linked list structured as FIFO container
• Tree
– Hierarchical structure consisting of nodes
– Binary tree: nodes have at most two children
– Binary search tree: efficient for searching for
information
Summary (cont’d.)
• Sorting algorithms
– Organize data within structure
• Examples: selection sort, bubble sort, quicksort, merge
sort, insertion sort, shell sort
• Sorting routines
– Analyzed by code, space, and time complexities