CS 122 – Jan. 9 • • • • • The nature of computer science General solutions are better What is a program? OO programming Broad outline of topics: – Program design, GUIs, recursion, sorting, collections • Handout: – Review of Java (errors, I/O, ArrayList) Computer science • • • • The study of how we solve problems How we manipulate and represent information Ultimately, think of rep’n of brain & knowledge Inspiration for hierarchical data structures • In problem solving, we crave general solutions – Peanut butter & jelly sandwich any sandwich – Average of 3 numbers No restriction Program • What kinds of programs have you seen, written? • Software can be classified in many ways. First, consider this breakdown: simple & not-so-simple • “Simple” problems can be solved easily with 1 source file – Input, calculations, output – Ex. How many vowels does a 9-letter word typically have? • Object oriented technique for larger concerns • Critical for us to design first – why? Review • You can find my notes for previous course here – http://cs.furman.edu/~chealy/cs11/notes.ppt • Java API documentation – http://docs.oracle.com/javase/7/docs/api/ • Handout highlights some important things to remember: common errors, I/O, ArrayList class CS 122 – Jan. 11 • OO programming – Good for development of larger programs – Encapsulation • Examples • Handout – Review elementary Java OO programming • In the problem description, look for nouns first • A noun can be a primitive type or a class • Often we define our own classes – E.g. Card, Deck, Board, Player • An object has – Attributes (i.e. little nouns) – Operations • In a larger program, we have several source files (i.e. classes) Philosophy • OO programming means we look for nouns first • A “noun” could be – Primitive type – Class object, a complex noun with attributes + operations • We define classes that accurately reflect the information in our program, but not every detail (abstraction). • Defining your own class might not be necessary. – Sometimes the type/class we want already exists in the language or API. – Your class might have just 1 attribute or just 1 instance Encapsulation • Separate implementation from use – Outside class implementation (e.g. in Driver), we should not be concerned with object rep’n or how operations implemented – Logic & strategy of a game does not depend on how information is stored internally – This is why attributes are declared private • Only be concerned with “interface” information – Names of methods (including constructors), necessary parameters, return type Example • Let’s design a small OO program pertaining to information about mountains. • Mountain class – Attributes for: continent, name, height – Constructors – Can have get/set methods for each attribute • Driver class with main( ) function – Contains the logic that actually does something with the data. Let’s read a text file containing mountain information, and list those over 10,000 feet. continued • Outline of driver class: – Read file – For each line: • Parse information contained in the line • Create mountain object • Add to array or Arraylist – Close file – For each mountain • If it’s over 10,000 feet, print it out • Have we solved the problem? • See review handouts Consider these (From handout) • Count the number of words in a text file • Modify mountain program to display average height of all the mountains • Compute the cost of tiling a floor – What class should be defined? Its attributes & operations? CS 122 – Jan. 13 • • • • Briefly discuss room tiling problem Aggregation Interfaces Inheritance • Handouts: – Encryption (interfaces) – Shapes (Please show me your lab if you have not already done so.) Generalizing • If you can tile/carpet a single room, you can do a whole house, etc. • A House contains Rooms – Create a House class that has an attribute representing a collection (i.e. ArrayList) of Room objects – To find total cost for house, write a loop that traverses this list of rooms, finding the cost for each one. • Can extend outward to more layers: Subdivision, City • This class relationship is aggregation, or “has a” Illustration • After extending our Room program to include a House class, we now have 3 classes: – Driver class • Create a House object • Find the total cost to tile/carpet this house – House class • Contains collection of Rooms • For each room, call appropriate room.findCost( ) and add them all up. – Room class • Already has a formula to find its cost Organization • How is an OO program organized? • Look at Driver/main( ) to see what a program is doing. • Look at the other classes to see what the program is made of (big nouns), and what it is capable of (operations) Class building helpers • Designing and implementing a class can be time consuming. There are 2 OO features to help. They start with the letter “I” – Interface – Inheritance • An interface is a to-do list of methods we want for a class • An interface is not a class! • We say that a class implements an interface. • It’s possible for 2+ classes to implement the same interface. Examples • Geometrical shapes like Rectangle, Circle – ShapeInterface.java – Rectangle.java – See handout • Encryption – Different ways to do it – Need to define functions to encrypt & decrypt – Many possible implementations! • Note: compiler will complain if you forget to implement all methods listed in interface Inheritance • Another class relationship – “is a” rather than “has a” • Useful for what? • To improve existing class, add functionality – Random has nextInt( ) and nextDouble( ), but not nextLetter( ) or nextWord( ) • To make a more specialized class – Publication Book or Magazine – Animal Fish, Reptile, Bird • When you inherit something, you can override it Shape example • Shape class • Rectangle, Cube and Sphere inherit from shape • 3-D inherits from 2-D – Yes, you can have inheritance among interfaces! • Rectangle implements 2-D • Cube and Sphere implement 3-D • Can represent all these relationships with UML diagram CS 122 – Jan. 18 • Inheritance • Polymorphism • Handouts: – Zoo – Speaker – Pet Zoo program • What inheritance relationships do we have among the classes: Animal, Bird, Fish, Penguin Reptile, Snake? • ArrayList<Animal> • “super” is called implicitly, for default constructor – For other kinds of constructors, call super( ) first to initialize inherited attributes. • exercise( ) uses instanceof – saves us from having to implement exercise( ) in every subclass • We override toString( ) Override • When a method is called, how do we know where to find its implementation? Penguin p = new Penguin( ); p.fun( ); – Which fun( ) is being called? – What if fun( ) does not exist in Penguin.java? • You can look up, but not down when looking for attribute/method. – E.g. equals( ) and toString( ) are defined in Object, and we normally override these. Polymorphism • What does this word mean? • At any time, an object can change its dynamic type by a call to a constructor. • When declared, object is given a static type to ensure program will compile. • Examples – Speaker (interface) – Pet (inheritance) CS 122 – Jan. 20 • Handling exceptions • Begin study of GUIs – Start with simple applets – Responding to events • Handouts: Exceptions Steps in writing a GUI Applet examples Exceptions • What can go wrong when a program runs? (run-time errors) – File not found, number format, etc. • Important errors have their own classes that we can include • To allow program to continue gracefully, use try/catch block • Example: Read a menu, and average the prices, ignoring all the words – See handout Steps in creating GUI • Define & initialize the components you want – Where? In main( ), init( ), or constructor – Component = object to show on screen • Decide how things should be drawn – Details in paint function • Create listener objects – Figure out which components (e.g. button) will generate events for this listener. – Define how listener should respond to event Typical Applet Structure • My applet class – Attributes – Constructor • Inner Listener class – somewhere in here call repaint( ) • Create objects that we will draw • Call the listener – Paint method Applet examples • Applet = small program, not stand-alone application. To be run in a browser. • In each program, you’ll see 3 parts – init( ) or constructor – paint( ) if we want to draw geometric shapes – Listener class, with object declared in init( ) • FirstApplet – doesn’t have listener. We just take input string and go. • MouseApplet3 – User can click, and blue square follows! Look at mousePressed( ) • MouseApplet4 – alternate blue/red Applets, continued • In mouseApplet4: How is the grid color changing? • How often does a certain code fragment get executed? – init( ), constructor once at the beginning – Listener when the event occurs – paint when we need to refresh the screen CS 122 – Jan. 23 • • • • Review mouse applets Applet that draws polygon “Containment hierarchy” GUI Application • Handouts: – Draw polygon (PolygonApplet) – Containment hierarchy – EasyFrame Drawing polygon • GUI components: – Frame contains panel, which contains: 2 Labels, 2 Text fields, 1 Button • Need to attach listener to button! • Data structure for our polygon – ArrayList of Point objects – When ready to draw, create Polygon to paint in applet: • X-array • Y-array • Number of points We can create Polygon in the listener or in paint( ) GUIs • GUIs have components, and some components generate events – Components are placed into “panels” • Ex. Polygon applet – Components of the pop-up frame: Frame has a Panel Panel has 2 labels, 2 text fields, 1 button – Attach listener to button, so we know how to add a new point – Draw polygon using x array, y array, # points GUIs • Containment hierarchy for realistic GUI: – Usually 1 frame (window), comes with its own default panel: “content pane” – Create more panels to add to content pane – Put components into panels; panel has some layout that determines where stuff goes • What events could take place? – ActionEvent for button – ChangeEvent for slider – CaretEvent for typing in text field (ignore this one) Don’t memorize – these events are listed in handout First examples • “EasyFrame” has no events, but is application. – Note use of JFrame. This is the starting point for any GUI application. Example • Containment hierarchy can look like this • Frame – Panel • North panel – 2 labels • West panel – Panel for first name » Label » Text field – Panel for last name » Label » Text field CS 122 – Jan. 25 • Any questions on containment hierarchy? – Note that we define frame in driver, and have separate classes for panels – In panel constructor, good place to add listener • Handouts: – Text area – Tabs & layouts – Rectangle frame (with control panel) First examples • “EasyFrame” has no events, but is application. – Note use of JFrame. This is the starting point for any GUI application. • “Text area” example: – Scroll pane – Has an input frame, and an output frame – Listener for button Tabs & layouts • There are different ways to lay out components in a GUI. • In the “Tab” program, a frame contains a “tabbed pane”, which contains 5 panels • 4 panels show possible layouts – – – – Flow (default) Border (when you add stuff, say where it goes) Grid Box (uses “glue”; when initialized, specify vert or horiz orientation) • With JLabel, you can center / justify text. RectangleFrame • We can draw geometric objects in panel instead of applet – This program is similar to one in the lab • Frame contains 2 panels – Specified in frame’s constructor – Input panel listens to Action event from a button – Output panel repaints the rectangle • paintComponent( ) is analogous to paint( ) that you saw in applets. CS 122 – Jan. 27 • More interactive features in a GUI! – Radio buttons • Button group to enforce mutual exclusion – Check boxes • Making a GUI easier to use • Handouts: – Pizza (choices) – Day/night (special features) Pizza GUI • Allow user to make choices – Radio buttons – mutually exclusive choice – Check boxes – can have any # checked – We’ll use the isSelected( ) method • Any change will affect price, which is shown in a JLabel object. Can use the same listener. • Button group for radio buttons – to make sure only 1 can be selected • Where / how do we update price? • I decided not to have separate classes for each panel Day / Night GUI • User-friendly features in a GUI – Enable / disable a button whenever appropriate – Mnemonics – Tool tips • We have a picture panel and button panel – We have listener for each of 2 buttons – One panel needs to talk to the other send parameter! • Can you guess appearance / behavior from the code? When are the various functions invoked? CS 122 – Jan. 30 • Jlist – Maintains a list of objects (e.g. file names) • • • • • Sliders Communication between panels Review event train of thought Border, split pane, scroll pane Handouts: – JList (pick image) and borders – JSlider – Events Review events Component Interface Must implement To get input, use in implementation [mouse] MouseListener mousePressed( ) getX( ), getY( ) JButton ActionListener [text field].getText( ) actionPerformed( ) [radio button].isSelected( ) JList ListSelectionListener valueChanged( ) getSelectedValue( ) JSlider ChangeListener getValue( ) stateChanged( ) CS 122 – Feb. 1 • Recursion – Definition – How function calls work – Purpose & examples • Handouts: – Odd number / factorial / Fibonacci – Exponential and Pascal formulas Recursion • When a function calls itself • A more technical definition – within a single thread of control, when a function is called and an earlier instance of the same function has not yet returned • It’s a problem-solving technique – You have a problem but only know how to solve a small case – Break a problem down until it can be expressed in simple cases Function calling • Review what happens when main calls f, and f calls g. – While in g, how do we know where we’ve been? How do we know where to return to? Function calls to g may appear many times throughout the program. • In recursion, the common scenario is: main calls f, and f calls itself! – When f calls itself, it’s not yet returning – How does process not go on forever? – Need a “base case”; it must always be achievable. • Direct vs. indirect recursion Recursion • We’ll start with math formulas, as in defining a sequence of numbers • You’ll notice definitions with 2 parts: – Base case – Recursive case • How would you define these sequences? 8, 13, 18, 23, 28, … 3, 6, 12, 24, … • Well known recursive problems: – Factorial, Fibonacci numbers, Pascal’s triangle Thinking recursively • Recursion is a different way of approaching the problem of having to repeat something – It’s accomplishing the same thing as a loop – Theoretically, loops and recursion are equivalent, and we can convert one to the other. • Express some problem in simpler terms – an important skill – Ex. fact(6) = 6 * fact(5) – How does this continue? Formula examples • Let’s examine recursive implementations of – Computing the nth odd number – Factorial – Fibonacci • Refer to handout CS 122 – Feb. 3 • For HW: how to check cube color validity? • Simple recursion examples – Exponentiation – Pascal’s triangle • Number of recursive calls inefficiency • Towers of Hanoi • Handouts: – Towers of Hanoi Inefficiency • Recursion can be inefficient when you run it • For Fibonacci and Pascal’s triangle, the number of function calls increases rapidly. Why? • How many total function calls are needed for Pascal(4,2)? – Pascal(a, b) calls Pascal(a – 1, b – 1) and Pascal(a – 1, b) – We can draw a tree of all the function calls made. – Implementations vary, but we can encounter base case whenever b = 0 or a = b. # calls for Pascal(a, b) • • • • a = b or b = 0 1 call 2,1 1 + 1 + myself = 3 3,1 calls(2,0) + calls(2,1) + myself = 1+3+1 = 5 4,1 calls(3,0) + calls(3,1) + myself = 1+5+1 = 7 • • • • 2,2 3,2 4,2 5,2 1 call calls(2,1) + calls(2,2) + myself = 3+1+1 = 5 calls(3,1) + calls(3,2) + myself = 5+5+1 = 11 calls(4,1) + calls(4,2) + myself = 7+11+1 = 19 Towers of Hanoi • How can we move n disks from peg A to peg B if: – Can only move one disk at a time – Cannot place a disk on top of a smaller one – See handout CS 122 – Feb. 6 Recursion • Any repetitive action can be expressed recursively – In other words, we can convert from loop to recursion – Some programming languages only allow recursion • Examples • Handout: – Practice recursion Remarks • Earlier we saw how to define a sequence recursively. What if we interleave 2 sequences? – 8, 12, 14, 20, 20, 28, 26, 36, … • Beware of infinite recursion – stack overflow – In my example “Limit” program, we could call ourselves thousands of times before running out of memory for the program. Practice • For lab, see handout Practice.java • How would you approach the problem of – Counting from 1 to 10? 10 to 1? Counting by 5’s? Print the alphabet? – Printing 10 stars? N stars? – You always need a base case, and a recursive case (at least 1 of each) • Converting an integer to/from binary string CS 122 – Feb. 8 Recursion: • More on number base conversion • Another array application: search – Linear search strategy – Binary search strategy • (Some searches require backtracking) • Handouts: – Recursive searches – 8 queens CS 122 – Feb. 10 Backtracking • Recursion that takes care of mistakes, wrong turns – Good for various types of searches • Need to find solution to some problem without trying every combination • We’ll look at a couple of examples from chess – 8 queens problem – Knight’s tour • Handout: – Knight’s Tour 8 Queens • Is there some way to place 8 queens on a chess board so that no one can get captured? – (There are no other pieces on the board, just the queens) • There are over 4 billion ways to arrange the queens • Don’t write 8 nested loops – waste Outline of solution • Place one queen in the first column. • Place the next queen in the next column, in a place safe from attack • If not able to place this queen, back up! – It means previous combination of choices does not produce a solution. We got stuck. • In the code, look for – Pre and post condition (comment for us) – Base case = we’re done – When we backtrack, see value of done • To find all solutions, set done to false to pretend we failed. Print solution in base case. Knight’s tour • Is it possible for a knight to make legal moves from square to square on a (5x5) chess board visiting all the squares exactly once? (8x8 board would be too slow) • Code is a little ugly, due to knight moves and checking if off the board • Details – We start in a corner – There are 8 possible directions to choose from – Cells are numbered 1-25 • Actually, an elegant iterative solution exists: – Start in a corner & continue to rotate in same direction using outermost possible squares. CS 122 – Feb. 13 • Backtracking – We saw some chess examples – How would we find all solutions to a problem, not just one? • Something amazing: finding our way through a maze • Handouts: – Maze Maze • Famous example of backtracking • solve( ) is heart of solution – returns boolean • Start at solve(0, 0) – Done = false – If blocked, return false // base cases – If at exit, return true Else // recursive case: – Mark this square as being tried – Done = solve (go up) – if not done, try another direction If done, mark the successful path cascade returns continued • blocked( ) checks – Wall – Out of bounds – Already tried that cell • Where do we see backtracking – In the code? – In the output? • Similar problems: 3-D, finding size of contiguous area or tumor CS 122 – Feb. 15 • Example: Upside-down numbers • Defining things with recursion – Specifically, sets of strings – Recursive function to test whether a given string has this property – Generating these strings. (How do we decide on length?) • Handout – Defining strings Recursive definitions • Recursion is good for defining things. If we can define a set of strings, then we can – Generate possible strings in that set – Test for membership – like a compiler checking syntax • We start with a description of a set of strings (“language”) – Ex. Strings consisting of 1 or more a’s. • Can be helpful to write down example words • Identify base case, recursive rule • Write a formal definition (can express it in Java) – See handout for examples Accepting a string • Starts with an ‘a’, followed by zero or more ‘b’s Function isGood(String s) if s is “a”, return true else if s length <= 1, but not “a”, return false else // length 2 or more verify that s ends with ‘b’ call isGood(s minus its last character) • String is of the form anbn Function isGood(String s) if s is “”, return true else if s length is odd, return false else // even length verify s begins with ‘a’ and ends with ‘b’ call isGood(s minus first and last char’s) CS 122 – Feb. 17 • Recursion with sets of strings: – Accepting a string – Generating possible strings: how do we decide on the length? concept of random recursion • Handouts – Accept – Generate – Lobster Random recursion • Let the machine decide how far to go • Often, this means we are counting up, rather than down – Ex. To return a random factorial, we effectively multiply 1 * 2 * 3, … and stop at some random place • Technique is useful when you want to generate random text – How many words do you want? – Inside recursive function, we “decide” whether to take base case or not. CS 122 – Feb. 20 • Using recursion to … – Generate all possible permutations – Evaluate a numerical expression • Handouts – Review for midterm, including old test w/answers – Permute – Evaluator Permutations • How do we permute 4 objects { W, X, Y, Z } ? – – – – Write down W, then permute the 3 objects { X, Y, Z } Write down X, then permute the 3 objects { W, Y, Z } Write down Y, then permute the 3 objects { W, X, Z } Write down Z, then permute the 3 objects { W, X, Y } • See what is happening in general? – Each object gets swapped to the front temporarily – Rest of list is permuted recursively • What is the base case? Expression evaluation • • • • What is a mathematical expression? Expr = “terms” separated by + or – Term = “factors” separated by * or / Factor = single number or another expression! • Indirect recursion • For simplicity, implementation assumes 1 digit operands • Program works also for non-nested expressions: start with these CS 122 – Feb. 22 • Recap grammar in Old Mother Hubbard • More about expressions – Define recursively in form of a grammar – Introduce alternate notations: prefix and postfix • Recursively defined image • Handouts: – GUI recursion (Fractal) Making couplets rhyme • She went to the ____ to buy him a { ____ [ But when she came back, he was ____ the ] ____ } • The 2nd and 4th blanks must rhyme. Select these words at the same time! (at….at) • She went to the ____ to buy him a (ee….ee) • Between the rhyming words, we have a phrase that can be chosen independently. Expression grammar • “An expression is 1+ terms separated by + or –” • “A term is 1+ factors separated by * or /” • “A factor is a single number or an expression” expr term | term + term | term – term term factor | factor * factor | factor / factor factor number | expr number 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 • In later courses, we’ll look at this in more detail to work out subtleties. Notations • There are 3 ways to write down a mathematical expression. We’ll study them in detail soon. • Infix (default): operator sits between operands • Prefix: operator appears before operands • Postfix: operator appears after operands • Advantage of prefix & postfix – No ambiguity about order of operations – So, we never need grouping symbols • Can we define prefix and postfix expressions formally? GUI recursion • Drawing tiled images – To make this work, the applet has to know where images are. – drawImage() lets us display image without JLabel. It has several versions based on parameters. • Fractal image. Organization: • Applet contains panel which contains: – Button panel • 2 buttons • 2 labels – Koch snowflake panel Keeps track of level. For each level, 5 points on the side of a triangle. CS 122 – Feb. 27 • Design • Software life cycle • Classes – discovery – Class relationships • State • Handouts: – OO design concepts – Store design example Design • Software life cycle… includes maintaining code • Don’t try to do everything at once – Waterfall model: risk of over-designing, creating classes you’ll never use – Spiral approach: go thru iterations, phases • Creating a design – Discover classes & their relationships – Consider sequence of events “states”, although this is not needed for simple programs Design • In real life, classes can be complex, with many attributes and operations. • For example: Airline, Army, Bank, Casino, Hotel, Restaurant, Store • Programs including such a class may need other classes as well, such as Employee, Customer, Product • The state you are in determines the effect / functionality of input devices (buttons, dials). – Consider accelerator pedal. Examples • Class relationships (database) for a Store • What states would we encounter in… – – – – – – a microwave oven? a vending machine? an ATM? a calculator? a casino? buying an airline ticket? • A GUI has buttons that need to respond based on state; multi-purpose buttons. CS 122 – Feb. 29 • Lab recap (designs) • ATM design example • Handouts – – – – Answers to test Complete list of lab questions ATM design ATM button & state table Other examples • In any design, we need to consider – What buttons do we need? – What events do we pay attention to? – How do events and states relate? • For next day, please be ready to discuss: – Design a child’s game that will display a clock and ask the user to enter the time. Determine if the time entered is correct. Keep track of score. Different players may play. – Design a similar game that tests arithmetic – Design a vending machine CS 122 – March 2 • Design examples – ATM – Game – Vending machine • What does it mean for a program to be in some state? – What states are in a solitaire game? • Handout: – Game design Review design • For ATM, buttons A, B, C have different purpose depending on state – that’s why they have generic names – Otherwise we’d need a lot more buttons! • Take note of – Attributes and operations of each class – Table of states and buttons • For vending machine, – What buttons do we need? – What events do we pay attention to? – How do events and states relate? Vending machine design • Machine contains products • Product has price, quantity – can be replenished • Need to accept coins and user selection as “input event” that may signal a state change. • Machine class: Product class: – Attributes are: product [ ] totalMoney password / key moneyInserted – Operations are: emptyMoney( ) create all products Attributes are: name price quantityAvailable Operations are: buy( ) replenish( ) Need coins & selection Operator choice “coin detector” OK Coin return Operator key Increment moneyInserte d Grab selection from text box If money >= price --# product Refund if excess moneyInserted =0 Grab password If correct, state = operator State = coins Grab choice Go to appropriate state Replenish me Incr # product State = operator Empty me totalMoney = 0 State = operator CS 122 – March 12 • Sorting – Many possible ways to accomplish the task • Brings up concept of algorithm efficiency • Handout: – Selection and insertion sort Sorting • Some methods do better depending on type of data or how distributed • Easy algorithms are not the fastest – Most efficient algorithms are clever • One possible method, “swap” sort: for i = 0 to n – 1, for j = i+1 to n – 1, if a[ i ] and a[ j ] are out of order, swap them Famous methods Ordinary Selection Bubble Insertion Swap Recursive Merge Quick Special Counting Radix Bucket Stooge Examples • Selection sort: Find the largest value and swap it into first, find 2nd largest value and put it 2nd, etc. • Bubble sort: Scan the list and see which consecutive values are out of order & swap them • Insertion sort: Place the next element in the correct place by shifting other ones over to make room • Swap sort: For each pair of elements, see if they are in correct sequence. If not, swap them. Check out • Enjoy this Web site demo of sorting methods http://cg.scs.carleton.ca/~morin/misc/sortalg • Doing operations in parallel can speed up sorting – In bubble sort, we can swap #1 and #2 at the same time as #3 and #4, etc. CS 122 – March 14 Recursive sorts • Merge sort: Split list in half until just 1-2 elements. Merge adjacent lists by collating them. • Quick sort: We want “smaller” elements on left, and “larger” elements on right. Recursively continue on these 2 subarrays. – Can be tricky to code • Handouts: – Code for recursive sorts – Quick sort Quick sort • Reminiscent of merge sort, but the tough part is how we partition the (sub)-array • Given a slice of the array a[ p .. r ], – Need to decide on suitable pivot/threshold value. – Start with left and right hands at both ends – Work inward. Stop when you see • Big number (> pivot) on the left • Small number (< pivot) on the right, and – Swap numbers if necessary – The loop stops when the hands converge. This marks end of the first partition for the recursive calls. CS 122 – March 16 • Sorting using the Java API • Radix sort • Bucket sort • Handout: – Sorting using the API Sorting with API • If a is an array of primitive values, use Arrays.sort(a) • If a is an array of objects, use Arrays.sort(a, comparatorObject) • If a is an ArrayList, use Collections.sort(a, comparatorObject) • Purpose of comparator is to tell Java how to compare 2 values belonging to some class. We need to create our own comparator class first! • Arrays and Collections also have other helpful methods for max, min, search, shuffle…. Special sorts • If we make some assumptions about the input data, it is possible to sort in linear time. In other words, by only using singly nested loops. • Radix sort – Dates from the time of punch cards – Assume an upper bound on # digits of each number – For each decimal place from right to left: • Create 10 buckets labelled 0-9 • Place each number into bucket based on its digit @ this decimal place • Concatenate the buckets Bucket sort • Works well if values evenly distributed between lowest and highest • Partition the range of values into n buckets – Typically n = size of array – For example, if you want 10 buckets, and the range of all values is 0-200, then the first bucket holds values 0-20, the next bucket holds values 20-40, etc. • For each value in array, place into appropriate bucket • Within each bucket, sort elements using any method. Ideally there would be very few elements per bucket. • Concatenate the buckets. CS 122 – March 19 • Finish sorting – Counting sort – Stooge sort • Collections (data structures) – Aggregate data types other than array – Defining our own data types that can hold many objects – Example: “Bag” data structure • Handout: – Halloween (bag with ArrayList) Sorting • Counting sort – Good when you have small number of distinct values or a small range repetition – Use 2nd array to record distribution of values – Use distribution to tell you how many of each value to put into final array – Example A = [ 3, 1, 2, 1, 2, 0, 3, 0, 1, 3, 2, 3 ] C = [ 2, 3, 3, 4 ] B = [ 0, 0, 1, 1, 1, 2, 2, 2, 3, 3, 3, 3 ] Sorting • Stooge sort – Needlessly complex, approx. on order of n2.7 – Purpose purely to study complexity of algorithms, originally a textbook exercise – Given a (sub) array a[p..r], First, swap endpoints a[p] and a[r] if they are out of order if p..r consists of 3+ elements, recursively call yourself 3 times: first 2/3 of list, last 2/3 of list, first 2/3 of list again Collections • Very often we want to maintain a lot of data of the same type • Many ways to conceive of a collection of data • May be linear or nonlinear • A general procedure: – Write an interface – what ops should it do? – Choose an underlying representation. Then you are ready to implement the interface – Test the implementation. If not satisfied, you may need to change the rep’n Considerations • Common operations to a collection include: – Insert, delete, search/get, sort, shuffle, how many, print whole thing out • Design questions for the rep’n – – – – Where do elements get added? Does it matter? Can I have duplicate values? Are the values always sorted? Among (add, search, remove), which operation will be performed most frequently? – Answers to these questions will help decide on rep’n Iterator • Once you have a collection implemented, may be helpful to have systematic way to visit all elements. • Iterator = another class whose sole purpose is to traverse your collection – A collection may have > 1 type of iterator. • Iterator is an interface in the Java API. Methods: – next( ) – hasNext( ) – remove( ) -- we rarely need this one Iterators • Here’s one way to include an iterator with your collection: • Somewhere you need to define the iterator class – Implement the 3 required methods – Since we are not interested in removing elements, just throw UnsupportedMethodException • In your collection’s implementation, include a method that will return an iterator object. • Whoever has an iterator object (e.g. Driver class) can call next( ) and hasNext( ) as needed. Example: Bag • Let’s create our own collection: the Bag • Purpose – to simulate contents of a bag/sack/box – we are generally uninterested in object’s location inside the Bag. – Drawing numbers for bingo or similar game. • Bag interface should include what operations? – In particular, what operation(s) might a Bag have that other collections would not? And vice versa? CS 122 – March 21 • Data often collected into some kind of collection or data structure – May be linear or nonlinear, as appropriate • Another linear data structure: linked list • Handouts: – Linked list guide – Linked list implementation Linked List • Linear data structure, with individual elements called nodes. • Unlike an ArrayList, the individual elements are not numbered. – There is no direct way to obtain, say, the 10th element. • Very little information is needed to maintain list: – List itself consists of a head and tail. Head refers to first element in the list. Tail = last. – Each node in the list knows its previous and next element • So, instead of an array(list) of data, we have a linked list of node objects, each of which holds some data LL Implementation • First, create Node class – 3 attributes: data, prev, next – The data can be of any type. For this reason, we typically define Node generically. – The prev and next are also Node objects. • LinkedList class – Attributes: head, tail (optional: # of elements) – Operations: insert & remove are most important! – What else? – (Almost every class should have toString( ), at least for debugging) LL handout • Examine the various operations such as remove( ) and sort( ) • We can test the LL implementation with Driver and Tester classes. – Prints node address so we can see the actual “links” • We have a separate Node class Notes on LL • Be sure to work out insert/delete operations. There are special cases to consider (see handout). • Singly linked versus doubly linked • Compared to ArrayList, which LL operations are slower, faster? • The Java API already has a LinkedList class. Creating our own is a good exercise to learn how to implement collections, but in practice not necessary. CS 122 – March 23 • 2 more linear data structures: – Stack – Queue (we’ll study this a little later) • Handouts: – – – – Intro to Stacks and Queues Stack implementations Application: Balanced parentheses Application: Postfix evaluation Stack • A linear data structure where we insert & remove elements from the same end, e.g. the “top”. – – – – “Last in – first out” philosophy. The insert operation is called push Delete operation is called pop For example, what is left in a stack after performing these operations: push(3); push(7); push(4); pop( ); push(1); pop( ); pop( ); • Stacks are good when you want things to be processed backwards & forwards – Ex. Maze; certain card games; Recursion & backtracking Implementation • Stack Interface – Useful operations: push, pop, peek, size, toString( ) • (Several implementations possible based on rep’n) • Representation: ArrayList of any type of object. – In other words, we can define a generic stack to make the implementation as general (useful) as possible • Boundary cases – Need to handle empty stack situation. When is this a problem? – Could also impose a logical limit on size. How? Why? • API has a Stack implementation Applications • Some applications we will examine: – Checking balanced parentheses in a mathematical expression – Evaluating a postfix expression – Converting an infix to a postfix expression • You will see stacks in later courses – When a computer program runs, the system must maintain a run-time stack to keep track of nested function calls. When a function returns, how does the system remember where to return back to? Balanced ( ) • • • • Push every ( Pop a ( on every ) Ignore all other characters Result: When do we say input is OK? – Stack should be empty when done with input. – Bad if we have extra ( at the end, or no ( to pop when we see a ). – Need to guard against a run-time error (empty stack). Evaluate postfix • Every operator needs 2 operands (the most recent 2) • So, push numbers on the stack • When you see an operator: – – – – Pop a number, and call it y Pop another number, and call it x Evaluate x OP y (don’t get this backwards!) Push the result • When done, pop the answer. Example • • • • Our postfix expression is: 25 5 – 6 7 + * 9 + When you see a number… push. When you see an operator… pop 2, evaluate, push. When no more input, pop answer. 25 25 5 5 25 – 20 6 7 + * 9 + 6 20 7 6 20 13 9 20 260 260 269 CS 122 – March 26 • Math expressions – Evaluating postfix – Evaluating prefix – Converting from infix to postfix • Handout: – Infix to postfix worksheet – Infix to postfix (Convert.java incomplete for lab) Evaluating expressions • How can a computer evaluate a numerical expression? • First, need to introduce some notation: – Infix: operator goes between numbers – Prefix: operator goes before numbers – Postfix: operator goes after numbers • It turns out that evaluating postfix expressions is relatively quick & easy. • So, our approach is a 2-step process: – Convert an infix expression into postfix – Evaluate the postfix expression Eval prefix ? • Incidentally, how would you evaluate a prefix expression? Not as easy. Try some examples: • +23 Push +, push 2 … • +–234 Push +, push -, push 2, … • +23–4 Push +, push 2, … Infix Postfix • Observations: – The order of the numbers inside the expression does not change – Operators appear later in the postfix version, and possibly out of their original order • Let’s look at simple examples first – How does 1 + 2 become 1 2 + ? – How does 1 + 2 – 3 become 1 2 + 3 – ? – Compare what happens to 1 + 2 * 3 and 1 * 2 + 3. What is different, and why? continued • 1 + 2 * 3 becomes 1 2 3 * +: push +, push *, and when done pop *, pop + • 1 * 2 + 3 becomes 1 2 * 3 +: push *, pop * & push +, when done pop + • 1 * 2 / 3 + 4 – 5 becomes 1 2 3 * / 4 + 5 –: push *, pop * push /, pop / push +, pop + push -, at end pop • 1 + 2 * 3 – 4 becomes 1 2 3 * + 4 –: push +, push *, pop * pop + push -, at end pop - More examples • ( 1 + 2 ) * 3 becomes 1 2 + 3 * What happens when we get to the ) ? • 2 – (3 – (4 – 5 * 6) + 7) * 8 becomes 23456*––7+8*– push -, push (, push -, push (, push -, push *, at ) pop *-(, at + pop – push +, at ) pop +(, … • Contrast these similar cases: 1 – 2 – 3 becomes 1 2 – 3 – 1 – (2 – 3) becomes 1 2 3 – – Moral of story • Don’t worry about the operands. Just print them out as you see them. • Always push ( • When you see ), pop until you see matching ( • When you see an operator, pop those of same or higher precedence. Stop popping when you get to ( or an operator of lower precedence. Then push me. CS 122 – March 28 • Collections – array, ArrayList, LinkedList, Stack, Queue, … • Issues – How do you want to use your data? – How should they be (logically) arranged? In a linear fashion? • Let’s look at queues and hashing • Handout: – Queue – Hashing (to see how fast searches are) Queue • Similar to stack, except: – Philosophy is “first in, first out” not “last in, first out” – As a result, we add elements to one end of the queue, and remove from the opposite end – Names of operations: enquque and dequeue • Purpose – Collect data before it can be processed – Data needs to be buffered for some time (Has this happened to you?) – Producer / consumer, Reader / writer, input / output, client / server • Can be tricky to implement with array • (Java API has no Queue class) Set interface • Often we want to store values in a collection, and our situation calls for: – No duplicates – Order of elements doesn’t matter (freedom to place elements arbitrarily) • Desired operations: – Add, remove, contains, iterator • Java API has 2 Set implementations that are more efficient than ArrayList or Linked List. – HashSet and TreeSet • Important to understand concepts of hashing & trees. Hashing • Hash table = A fancy array-type of data structure • Purpose – We want quick search; respectable insert, delete times – May want to access any element any time, so we don’t want the restrictions of stack or queue – There is no notion of a previous or next value – We have no need to sort the data – Our “bag” data structure could be a hash table if we desire • Major feature: each element has a hash value or hash “code” that determines where into the hash table the object belongs. How to hash • Where does an object go into a hash table? • The object needs to have some intrinsic hash value. Can be determined by an instance method hashCode( ). – Example of hash code could be some unique attribute value, such as ID number. • Next, we use a hash function to convert a hash value into the index into the hash table. – Why? Because hash values can be very large. – Use a mod function based on the size of the hash table. In other words • Hash values are usually not very interesting. • hashCode( ) functions just have to make sure that the objects have a good chance of having unique hash values. • The interesting part of hashing is figuring out where data winds up in the hash table – this is determined by the hash function. • Hash table size is usually a prime number. – We would like objects to map to different places in the hash tables. Otherwise we have a collision. CS 122 – March 29 • • • • Hash code Hash function Collisions, and what to do about them Java’s “Hashtable” – An array-type data structure allowing you to index on any value, not just a number. – Hashing implementation done behind the scenes. – Useful methods: put, get, containsKey, keys • Handout: Hashtable of foods Collisions • Let’s suppose our hash table has size 11. • A simple hash function would say index = hashCode % 11 • What happens if we insert the values 23 and 34? 23 % 11 = 1, and 34%11 = 1. Both values map to the [ 1 ] entry of the table. • Ex. What objects can live at position [ 0 ] ? • There are 2 ways to resolve this situation. – Chaining: allow multiple values to reside inside each cell of the hash table. – Probing: On a collision, find a different nearby place to sit. Chaining • Basically it means that each cell of the hash table maintains a linked list of objects. Each object has a hash code that maps to this address. • The Java API HashSet class uses this technique. Probing • Also known as open addressing • If cell c is already occupied, find a nearby alternative. • Linear probing – Try c+1, c+2, c+3, … until you find a space – At end of the table, wrap around to [ 0 ] and continue – Try example with size 11 table: 24,45,3,87,65,32,1,12,17, using h(k) = k mod 11. • Quadratic probing – Try c+1, c+4, c+9, … instead • Example: Apply hash function h(k)=(2k+1) mod 7 for k = 10, 15, 5, 8, 3, 20, 6. What happens? Load factor • Tells you how much of the hash table is occupied • In practice, when load factor exceeds some threshold, we enlarge hash table and rehash. • In Java’s HashSet, the initial size is 16, and the threshold load factor is 0.75 before doubling size. CS 122 – April 2 • The last major subject of course is nonlinear data structures – Graphs (e.g. network) – Trees (e.g. hierarchy) • Consider graphs first, since trees are a special case – To model something that “looks like” a graph – Relationships between objects we can quantify/measure • Graph terminology • Creating our own Graph class: – Representation, desired ops, applications Graph Graph • A nonlinear data structure • Useful to model any kind of network, or set of relationships • Questions we may want to ask: – How many vertices / edges are there? – Does an edge exist from x to y? – How far apart are x and y? – How many edges incident on x? (i.e. find the degree) – How many nodes are within some distance from x? – Is y reachable from x? – Is there a systematic way to visit every node and return back to the beginning? Graph ideas • A graph consists of – A set of nodes (i.e. vertices) – A set of edges • The purpose of an edge is to “connect” two vertices: to make them adjacent to each other. • Sometimes… – The graph may be weighted: We may want to label a cost or distance on each edge. – The graph may be directed: This indicates one-way traffic. Representations • Adjacency list – For each vertex in the graph, we maintain a list (e.g. linked list or array list) of other vertices that are directly connected to this one • Adjacency matrix – A 2-d array – The vertices are in some order, such as alphabetical order (A, B, C, …) – The entry in row / column indicates whether there is an edge or not. (1 or 0) – Elegantly handles weighted and directed graphs. Example A A B C D 1 1 1 1 1 B 1 C 1 1 D 1 1 E G H 1 1 1 1 1 E F F 1 1 1 G 1 H 1 1 1 1 1 What can we determine about the graph, when given its adjacency matrix? Adjacency matrix • For an ordinary graph – The matrix is symmetric. For example, if A is adjacent to C, then C is adjacent to A. – No vertex is adjacent to itself. So, the main diagonal is all 0’s. • For a directed graph – Matrix not likely to be symmetric • For a weighted graph – We still put 0’s along the main diagonal, to indicate zero cost or distance. – To represent non-adjacent, use ∞ value. Possible design • Graph Attributes – # vertices, # edges – List of vertices – Adjacency matrix • Graph Operations – – – – – – – – Create empty graph Add vertex; add edge getVertex(key) edgeExists(v1, v2) degree(v) Neighbor iterator(v) BFS & DFS iterators (v) Is connected, etc. • Design of Vertex depends on graph’s application… • Vertex attributes – – – – Name Marked Level Others depending on app. • Vertex operations – Mark / unmark / is marked – Get and set level – equals Some details • What do entries in adjacency matrix mean? >0 adjacent 0 myself –1 not adjacent (be careful) • addVertex( ) – need to re-allocate array space – This is the price we pay for using array not ArrayList • addEdge( ) – the no parameter version uses default 1 • findVertex(String) – used by iterator, edgeExists( ), addEdge( ) • neighborIterator( ): just look at 1 row of adj matrix! CS 122 – April 4 • Graph implementation – Relies on Vertex class – Array representation of Vertices and edges – Desired operations? • Neighbor iterator • Handout: – Graph Implementation • Attributes: Vertex [ ], int [ ] [ ] • When creating a graph we may do operations in this order: – Create empty graph (in default constructor) – Add in all the vertices, then the edges • Desired operations? – – – – – Add vertex; add edge Return # vertices; # edges See if some edge exists (or return the distance label) Find degree of a vertex Neighbor iterator Applications • Besides basic graph operatons, we may have specific questions, depending on why we are using a graph • Finding a path (sequence of vertices) between 2 vertices – In particular, find the shortest path • Finding a cycle • Seeing if the graph is connected or not • Let us visit all the vertices Iterator • Here is one way to implement an iterator • Create an inner class, containing – – – – – Attributes needed during the traversal Constructor initializing attributes hasNext( ) next( ) remove( ) – honestly, we won’t need this so you can leave its implementation empty • Create a method that returns iterator object • In driver/test class, use hasNext/next as needed Neighbor iterator • A systematic way to traverse all neighbors of a vertex • Read across one row of the adjacency matrix – Row number will remain constant – Column number will change • Constructor: initialize column number to first column representing a neighbor • hasNext: have we reached the end of adj mat? • Next: return corrent column #, and advance it to next neighbor’s column • (Alternative approach: create ArrayList in constructor) CS 122 – April 11 • Systematic ways to visit all vertices in a (connected) graph – Breadth-first search (BFS) – Depth-first search (DFS) • Handouts: – Worksheet for BFS, DFS – BFS / DFS code Breadth-first search • We assign a level to each vertex. • Start with some vertex, and assign it level 0. • Fan out in all directions at once (well, in some order). – All the neighbors of the first vertex become the “level 1” vertices. – Neighbors of the level 1 vertices become “level 2” – In general, at level n, all the neighbors that do not already have a level number become “level n+1” – Continue until all vertices assigned. Depth-first search • Start at some vertex. • Continue down a path as far as you can until you reach a dead end. • Then you backtrack to some point where you had a choice / fork in the road, and you now try a different route. • For this to work, there needs to be a way to select a “next” vertex to go to – we use the neighbor iterator (e.g. alphabetical order). Using BFS / DFS • Iterators for BFS or DFS can be used to answer questions about a graph. Let’s try these… • Finding path between 2 vertices – DFS: handle it like a maze – BFS: may be easier, and can find all paths simultaneously • How to find a cycle – Which iterator should we use? The Vertex “mark” attribute may be useful here. • Is everybody connected? CS 122 – April 13 • Important graph application: Dijkstra’s shortest path algorithm • Handout: – Dijkstra worksheet (2 pages: instructions plus examples) Dijkstra’s algorithm • How do you find the shortest path in a network? • General case solved by Edsger Dijkstra, 1959 4 7 9 6 7 3 8 4 2 1 3 6 • Let’s say we want to go from “A” to “Z”. • The idea is to label each vertex with a number – its best known distance from A. As we work, we may find a cheaper distance, until we “mark” or finalize the vertex. 1. Label A with 0, and mark A. 2. Label A’s neighbors with their distances from A. 3. Find the lowest unmarked vertex and mark it. Let’s call this vertex “B”. 4. Recalculate distances for B’s neighbors via B. Some of these neighbors may now have a shorter known distance. 5. Repeat steps 3 and 4 until you mark Z. A 4 7 2 B 3 C 4 Z First, we label A with 0. Mark A as final. The neighbors of A are B and C. Label B = 4 and C = 7. Now, the unmarked vertices are B=4 and C=7. The lowest of these is B. Mark B, and recalculate B’s neighbors via B. The neighbors of B are C and Z. – If we go to C via B, the total distance is 4+2 = 6. This is better than the old distance of 7. So re-label C = 6. – If we go to Z via B, the total distance is 4 + 3 = 7. A 4 7 2 B 3 C 4 Z Now, the unmarked vertices are C=6 and Z=7. The lowest of these is C. Mark C, and recalculate C’s neighbors via B. The only unmarked neighbor of C is Z. – If we go to Z via C, the total distance is 6+4 = 10. This is worse than the current distance to Z, so Z’s label is unchanged. The only unmarked vertex now is Z, so we mark it and we are done. Its label is the shortest distance from A. A 4 7 2 B 3 C 4 Z A Postscript. I want to clarify something… The idea is to label each vertex with a number – its best known distance from A. As we work, we may find a cheaper distance, until we “mark” or finalize the vertex. When you mark a vertex and look to recalculate distances to its neighbors: – We don’t need to recalculate distance for a vertex if marked. So, only consider unmarked neighbors. – We only update a vertex’s distance if it is an improvement: if it’s shorter than what we previously had. 4 7 2 B 3 C 4 Z Dijkstra path • The answer is not just a number • In implementation, need some way to store the actual path taken. • Possibilities: – When Dijkstra’s algorithm is done, work backwards from the destination vertex to deduce the path. Look at the vertex labels of Z and its neighbors to see which neighbor logically comes before Z. For example, if label (Y) + edge YZ = label(Z). – During Dijkstra, when labeling a vertex, also store the list of vertices encountered along the way from A. CS 122 – April 16 • Tree: a special kind of graph – Good for storing information hierarchically – Often we want (rooted) binary trees • Traversing a tree – More interesting options besides BFS and DFS! • Handouts: – Tree with preorder (simple implementation) – Tree traversals (worksheet) Tree • Defined to be a connected acyclic graph – All vertices are connected to all the others (note that this does not mean adjacent to all the others). not disconnected, no isolated vertices – No cycle exists anywhere in the graph • In CS most trees we use are binary trees – There is a root vertex at the top – Each node has up to 2 children. – Specialized terminology: parent, left & right children, sibling, ancestor, descendant • How would we design a Node class for trees? Trees vs. graph • A tree is a special kind of graph • Trees may be: – Rooted • Among the rooted trees, one important kind is the binary tree! In it, each node has left & right subtree. – Non-rooted. Also called “free trees” – (We could make use of inheritance if we wanted all these possibilities implemented.) • Binary tree class design: – Attributes for a root and number of nodes – Constructor(s), size( ), toString( ), preorderIterator( ) Building a binary tree • Suppose we want to create a mathematical expression • Simplest tree is a root with 2 children. – Ex. 1 + 2: the root is “+”, 1 is left child, 2 is right child • We don’t add nodes / edges one at a time! • Instead, combine 2 subtrees with a common parent. – Ex. For (1 + 2) * (3 + 4), we create little trees for 1 + 2, and 3 + 4, and then combine them with * on top. • In this application, not interested in removing Binary tree traversals • Important operation is to visit all vertices iterator – General procedures like BFS and DFS do not take the hierarchical nature of the data into account. They “skip around” too much. • 3 ways to traverse a binary tree, each done recursively. – Preorder = process this node, and then call myself with each child – Inorder = call myself with my left child, process this node, then call myself with my right child – Postorder = call myself with each child, then process myself – If “process” means print, traversals give rise to 3 exprression notations: prefix, infix and postfix. Practice • Is it possible to determine the tree, given a traversal? • If we’re given an expression in prefix, infix, or postfix notation, we can build the tree. • Some notes: – Prefix and postfix notations never need parentheses – In prefix, an operator is applied to the 2 “numbers” (atoms or subexpressions) that follow – Similarly, in postfix, an operator is applied to the 2 numbers that precede it. • A little more difficult for general binary tree. Tree & traversal • Given a (binary) tree, we can find its traversals. √ • How about the other way? – Mathematical expression had enough context information that 1 traversal would be enough. – But in general, we need 2 traversals, one of them being inorder. • Example: Draw the binary tree having these traversals. Postorder: SCXHRJQT Inorder: SRCHXTJQ – Hint: End of the postorder is the root of the tree. Find where the root lies in the inorder. This will show you the 2 subtrees. Continue with each subtree, finding its root and subtrees, etc. • Another example... Inorder: Postorder: RHBJEGCAKDF RBEJHCKFDAG CS 122 – April 18 • Review tree traversals • Binary Search Trees • Handout – 5 ways to arrange 3 nodes Tree & traversal • Given a (binary) tree, we can find its traversals. √ • How about the other way? – Mathematical expression had enough context information that 1 traversal would be enough. – But in general, we need 2 traversals, one of them being inorder. • Example: Draw the binary tree having these traversals. Postorder: SCXHRJQT Inorder: SRCHXTJQ – Hint: End of the postorder is the root of the tree. Find where the root lies in the inorder. This will show you the 2 subtrees. Continue with each subtree, finding its root and subtrees, etc. • Another example... Inorder: Postorder: RHBJEGCAKDF RBEJHCKFDAG Binary search tree • Special kind of tree, an excellent data structure – Efficient searches, insertions, deletions – Operations should take logarithmic time • In the Java API, the TreeSet class is a kind of binary search tree called a red-black tree. – BST with extra functionality to make sure the tree stays “balanced” as we insert/delete elements – In ordinary BST, it’s possible for it to degenerate into a linked list if we’re unlucky • Binary search idea may already be familiar to you: guessing games, spell checker. BST Motivation • Store data so it will be easy to find later. – Does this list contain 52? 5,8,12, 13, 15, 18, 20, 25, 30, 32, 36, 38, 44, 45, 58, 61, 62, 77, 80 • Key property of BST: each node’s value is: – For all nodes: left child you right child – Or better yet: all in L subtree you all elements in R subtree • Then, it’s easy to tell where to search for something, or where a new element must go. BST operations • • • • • • insert (x) delete (x) find (x) predecessor (x) successor (x) And we can generalize, in case x is not inside: – closestBefore (x) – closestAfter (x) Search, min, max • Search(k) – Just follow binary search strategy. – Starting at root, follow left or right child… until you find k or discover it can’t exist. – May be recursive or just a while loop. • Next, how would you find …? – Min key value in the tree – Max key value in the tree CS 122 – April 20 • BST operations – find, insert – predecessor, successor, – delete • Efficiency of BST – Tradeoff ? – balancing Successor • Pred & succ are analogous/symmetric cases, so let’s just look at successor function. Try example. succ(x): if right(x) not empty return min(right(x)) else // succ is lowest ancestor of x whose // left child is also ancestor of x y = parent(x) while y && x == right(y) x = y The while loop goes up y = parent(x) the tree until x is the left return y child of its parent Insert, delete • Insert(k) is like search – The new key becomes a left/right child as appropriate. – Special case if tree initially empty. • Delete(victim) – Careful… there are 3 cases, depending on how many children the victim has. – If victim has no children: it’s a leaf. Just tell parent to set its left or right child pointer to null. – If victim has 1 child: Reset the victim’s parent and child to point to each other now. – What if victim has 2 children? Delete, continued • Deleting a node that has 2 children: – Replace the victim with succ(victim) – Now, need to repair pointers where succ(victim) came from. Turns out that succ(victim) has at most 1 child. It can’t have a left child… if it did, then it’s not really the succ. Go back to how we calculated succ(x) to see why it’s true. So, treat the removal of succ(victim) like the 0 or 1 child delete case. • Example: insert 10, 5, 16, 3, 7, 12; then delete 3, 16, 10. Comparison • Comparing a TreeSet with an ArrayList, how long does it take to insert 1,000,000 elements, and then perform a futile search? • TreeSet: 10.504s to insert; 0.001s to search • ArrayList: 1.891s to insert; 0.161s to search • What is the tradeoff? Balance • BST operations should take O(log2 n) time. • But in the worst case, a BST acts like a linked list. – Example: insert elements in sorted order • Goal: as we insert elements, the BST should maintain balance • How to define/quantify balance? – We can measure the height of each node – For each node, the height of its 2 children do not differ by more than 1. • What if unbalanced? Rotate. CS 122 – April 23 • Analyzing binary trees – Some definitions – Quantitative relationship between height of tree and # of nodes. (balance) Binary trees • Each node has 2 children. • Very useful for CS applications • Special cases – Full binary tree = each node has 0 or 2 children. Suitable for arithmetic expressions. Also called “proper” binary tree. – Complete binary tree = taking “full” one step further: All leaves have the same depth from the root. As a consequence, all other nodes have 2 children. Binary tree properties • Suppose we have a full binary tree • n = total number of nodes, h = height of tree • Let’s establish lower and upper bounds on… – – – – The number of leaves The number of internal nodes The total number of nodes The height (The answer to this one will show us why it’s important for a tree to maintain balance!) • Let’s repeat the above assuming it’s a full binary tree.