UIL Computer Science Contest – Advanced Content
Mike Scott
University of Texas at Austin

Computer Science - Advanced Content

Agenda
• A brief look at Computer Science
• New topics for contest (Java 5.0)
  – enhanced for loop
  – Regular Expressions
  – printf
  – Scanner class
  – Generic Collections
  – new general data structures
  – auto boxing and unboxing
• Doing programming problems

A Brief Look at Computer Science
• The UIL CS contest emphasizes programming
• Most introductory CS classes, both at the high school and college level, teach programming
• … and yet, computer science and computer programming are not the same thing!
• So what is Computer Science?

What is Computer Science?
• Poorly named in the first place.
• It is not so much about the computer as it is about Computation.
• “Computer Science is more the study of managing and processing information than it is the study of computers.”
  - Owen Astrachan, Duke University

So why Study Programming?
• Generally the first thing that is studied in Chemistry is stoichiometry.
  – Why? It is a skill necessary in order to study more advanced topics in Chemistry
• The same is true of programming and computer science.

• “What is the linking thread which gathers these disparate branches into a single discipline? …it is the art of programming a computer. It is the art of designing efficient and elegant methods of getting a computer to solve problems, theoretical or practical, small or large, simple or complex.”
  - C. A. R. Hoare
• Sir Tony Hoare. Turing Award Winner. Inventor of the quicksort algorithm

• “Programming is unquestionably the central topic of computing. In addition to being important, programming is an enormously exciting intellectual activity. In its purest form, it is the systematic mastery of complexity.
For some problems, the complexity is akin to that associated with designing a fine mechanical watch, i.e., discovering the best way to assemble a relatively small number of pieces into a harmonious and efficient mechanism. For other problems, the complexity is more akin to that associated with putting a man on the moon, i.e., managing a massive amount of detail. In addition to being important and intellectually challenging, programming is a great deal of fun. Programmers get to build things and see them work. What could be more satisfying?”
  - John V. Guttag, Professor at MIT; research in AI, medical systems, wireless networking

Outcomes of Computer Science

Computer Science Skills
• Yes, the number of pure programming jobs has declined
• But the total number of jobs in information technology has surpassed the previous peak from 2000
• “Even as computer science students are being encouraged to take more courses outside their major, students in other disciplines are finding more often that they need to use, design and sometimes write computer programs.”
  - Steve Lohr, NY Times, August 2005

New Topics for UIL
• Newest version of Java is 5.0
• A major release that added many new features, many of which make it easier to write programs
• more features = more flexibility = greater complexity

The enhanced for loop
• a.k.a.
the for-each loop
• alternative for iterating through a set of values
    for(Type loop-variable : set-expression)
        statement
• Set-expression is an array or a collection such as ArrayList
  – more generally, anything that implements the Iterable interface
• Type must be the data type of the elements of the array or collection
• logic error (not a syntax error) if you try to modify an element of the array via the enhanced for loop

Enhanced for loop
    public static int sumListOld(int[] list) {
        int total = 0;
        for(int i = 0; i < list.length; i++) {
            total += list[i];
            System.out.println( list[i] );
        }
        return total;
    }

    public static int sumListEnhanced(int[] list) {
        int total = 0;
        for(int val : list) {
            total += val;
            System.out.println( val );
        }
        return total;
    }

Enhanced for loop
    int[] list = {1, 2, 3, 4};
    for(int val : list) {
        val = val + 2; // logic error: changes only the local copy, not the array
        System.out.println( val );
    }
    for(int val : list)
        System.out.print( val );

Regular Expressions
• “A regular expression (abbreviated as regexp, regex or regxp) is a string that describes or matches a set of strings, according to certain syntax rules. Regular expressions are used by many text editors and utilities to search and manipulate bodies of text based on certain patterns. Many programming languages support regular expressions for string manipulation.”

Regular Expressions in Java
• In Java the Pattern class represents regular expressions
• Regular expressions are specified as Strings
• Regular expressions are used by the split method in the String class to parse the String based on the regular expression

Syntax of Regular Expressions
Construct        Matches
x                The character x
.                Any character
+, e.g. x+       One or more times, matches patterns of 1 or more x’s in a row
*, e.g. x*       Zero or more times, matches patterns of 0 or more x’s in a row
[], e.g.
[abc]            Used to enclose character classes. Match one of the included characters or expressions. [abc] matches any a, b, or c
-, e.g. [a-z]    Used to specify a range. [a-z] matches any lower case letter
^, e.g. [^a]     Negation. Everything but the specified expression. [^a] matches everything besides lower case a. (^ by itself is start of line)
\d               Predefined character class (PCC). Any digit. Same as [0-9]
\D               PCC. A non digit. Same as [^0-9]
\s               PCC. A whitespace character. Same as [ \t\n\x0B\f\r]
\S               PCC. A non whitespace character. Same as [^\s]

More Regular Expressions
Construct        Matches
\w               PCC. A word character. Same as [a-zA-Z_0-9]
\W               PCC. A non word character. Same as [^\w]
• traditional split questions
    String s = "Fair's got nothing to do with it.";
    String[] result = s.split( "\\s+" );
    // Regular expression for 1 or more whitespace characters.
    // The split method uses the regular expression as the
    // "delimiter".
    for(String val : result)
        System.out.println( val );
• What if there were two spaces between each word and the regular expression was just “\s”?

Another Regex Example
• Given the regular expression “a*b”
• What is the output of the following code?
    String s = "Someabrandombtextabout nothing"
             + "aaaabdoes itaaamakeabSense?";
    String[] result = s.split( "a*b" );
    for(String val : result)
        System.out.println( val );

Regular Expression Example
• Given a*b
• Where do matches occur in:
• “Someabrandombtextabout nothingaaaabdoes itaaamakeabSense?”
• 5 matches: ab, b, ab, aaaab, ab

Regular Expression Example
• Order matters
• a*b is not the same as b*a
• Given b*a and the previous String
• “Someabrandombtextabout nothingaaaabdoes itaaamakeabSense?”
• 12 matches (each a on its own, since no b in the String is directly followed by an a)

printf
From Cay Horstmann’s homepage.
“The march of progress.”
1980: C
    printf("%10.2f", x);
1988: C++
    cout << setw(10) << setprecision(2) << showpoint << x;
1996: Java
    java.text.NumberFormat formatter = java.text.NumberFormat.getNumberInstance();
    formatter.setMinimumFractionDigits(2);
    formatter.setMaximumFractionDigits(2);
    String s = formatter.format(x);
    for (int i = s.length(); i < 10; i++)
        System.out.print(' ');
    System.out.print(s);
2004: Java
    System.out.printf("%10.2f", x);

printf details
• public PrintStream printf(String format, Object... args)
• format - A format string as described in Format string syntax. May contain fixed text and one or more embedded format specifiers
• args - Arguments referenced by the format specifiers in the format string. If there are more arguments than format specifiers, the extra arguments are ignored. The number of arguments is variable and may be zero.

printf details – Formatter class
• The format specifiers for general, character, and numeric types have the following syntax:
    %[argument_index$][flags][width][.precision]conversion
• The optional argument_index is a decimal integer indicating the position of the argument in the argument list. The first argument is referenced by "1$", the second by "2$", etc.
• The optional flags is a set of characters that modify the output format. The set of valid flags depends on the conversion.
• The optional width is a non-negative decimal integer indicating the minimum number of characters to be written to the output.
• The optional precision is a non-negative decimal integer usually used to restrict the number of characters. The specific behavior depends on the conversion.
• The required conversion is a character indicating how the argument should be formatted. The set of valid conversions for a given argument depends on the argument's data type.
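
As a rough illustration of the specifier syntax above, the sketch below exercises each optional part (argument_index, flags, width, precision) once. The class name FormatSpecifierDemo and the sample values are made up for this example. It uses String.format, which accepts the same format strings as printf, and passes Locale.US only so the decimal separator is predictable.

```java
import java.util.Locale;

public class FormatSpecifierDemo {
    // Returns one formatted string per optional part of the specifier syntax.
    // Locale.US is passed so the decimal separator is always '.'.
    public static String[] examples() {
        return new String[] {
            // argument_index: 2$ selects the second argument, 1$ the first
            String.format(Locale.US, "%2$s %1$s", "world", "hello"),
            // flag '-' with width 6: left justified, padded on the right
            String.format(Locale.US, "%-6d|", 42),
            // flag '0' with width 6: padded with leading zeros
            String.format(Locale.US, "%06d", 42),
            // width 8 and precision 2 on a floating point value
            String.format(Locale.US, "%8.2f", 3.14159),
            // precision on %s restricts the number of characters written
            String.format(Locale.US, "%.3s", "abcdef")
        };
    }

    public static void main(String[] args) {
        // Brackets make the padding visible in the output.
        for (String s : examples()) {
            System.out.println("[" + s + "]");
        }
    }
}
```

Running main prints each result in square brackets so that leading and trailing padding can be seen.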

Conversion Types for Printf
Conversion    Description
‘b’ or ‘B’    Result is a boolean
‘c’ or ‘C’    Result is a character
‘d’           Result is a decimal integer
‘o’           Result is an octal (base 8) integer
‘x’ or ‘X’    Result is a hexadecimal (base 16) integer
‘e’ or ‘E’    The result is a floating point number formatted as a decimal (base 10) number in computerized scientific notation
‘f’           The result is a floating point number formatted as a decimal (base 10) number.

printf Flags
Flag          Description
‘-’           Result will be left justified
‘+’           Result will always include a sign
‘ ’ (space)   Result will include a leading space for positive values
‘0’           Result will be 0 padded
‘(’           Result will enclose negative numbers in parentheses

printf Examples
    double d = 12.555;
    System.out.printf("%8.2f", d);
    System.out.printf("%+8.2f", d);
    System.out.printf("%8.2d", d);         // throws IllegalFormatPrecisionException: %d does not accept a precision
    System.out.printf("%-+8.2f", (int)d);  // throws IllegalFormatConversionException: %f does not accept an Integer
    System.out.printf("%+010.4f", d);

    double[] gpas; //code to init
    String[] names; //code to init
    for(int i = 0; i < gpas.length; i++)
        System.out.printf("Name: %-s GPA: %5.2f", names[i], gpas[i]);  // throws MissingFormatWidthException: the '-' flag requires a width

Scanner class
• A new class to allow simple parsing of input.
• Makes it easier to get input from System.in
• Methods include nextLine, nextInt, nextDouble, and next (which returns the next token as a String)
• Set delimiters with regular expressions, default is whitespace
    Scanner s = new Scanner(System.in);
    System.out.print("Enter your name: ");
    String name = s.nextLine();
    System.out.print("Press Enter to continue: ");
    s.nextLine();

Hooking a Scanner up to a File
    import java.util.Scanner;
    import java.io.File;
    import java.io.IOException;

    public class ReadAndPrintScores {
        public static void main(String[] args) {
            try {
                Scanner s = new Scanner( new File("scores.dat") );
                while( s.hasNextInt() ) {
                    System.out.println( s.nextInt() );
                }
            }
            catch(IOException e) {
                System.out.println( e );
            }
        }
    }

scores.dat: 12 35 12 12 45 12 12 13 57

Quick and Dirty Version
    import java.util.*;
    import java.io.*;

    public class ReadAndPrintScores {
        public static void main(String[] args) throws IOException {
            Scanner s = new Scanner( new File("scores.dat") );
            while( s.hasNextInt() ) {
                System.out.println( s.nextInt() );
            }
        }
    }

Generic Classes
• Java 5.0 introduces “generics”
• Previously genericity was achieved through inheritance and polymorphism
  – every object in Java was an Object
• Caused a lot of casting and uncertainty about whether data types were correct
• UIL will focus on generic collections

Generic ArrayList
• old ArrayList
    ArrayList list = new ArrayList();
    list.add("UIL");
    list.add("CS");
    String s = (String)list.get(0);
• new ArrayList
    ArrayList<String> list = new ArrayList<String>();
    list.add("UIL");
    list.add("CS");
    String s = list.get(0);
• Old style still works
  – raw collections that generate compile time warnings

New Data Structures – Hash Table
• Hash Table
  – no Java class
  – HashSet and HashMap use Hash Tables as the internal storage container to implement a Set and a Map
  – Hash Tables rely on fast access
into arrays if index is known
  – Can achieve O(1) performance for add, access, and remove
  – challenge is handling collisions

New Data Structures – Priority Queue
• Priority Queue
  – like a Queue, but every item added has a priority
  – items get to move in front of other items already present in the Queue that have a lower priority
  – example: how patients are seen in an emergency room
  – PriorityQueue class in the Java Standard Library

New Data Structures - Heap
• A complete binary tree where every node has a value more extreme (greater or less) than or equal to the value of its parent
• min and max heaps
• example below is a min heap
• min value at the root
[Figure: min heap with 5 at the root; other node values 12, 32, 37, 7, 45, 55, 13, 50, 9]

Auto Boxing and Unboxing
• In Java there is a big difference between the primitive variables and objects
• Collections could only contain objects
• Adding primitives was a pain
    ArrayList list = new ArrayList();
    for(int i = 0; i < 10; i++)
        list.add( new Integer(i) );
    for(int i = 0; i < list.size(); i++)
        System.out.println( ((Integer)list.get(i)).intValue() );
• Primitives have to be boxed or wrapped into objects

Auto boxing and unboxing
• The system now automatically boxes and unboxes primitives when adding to collections or at other times an object is needed
    ArrayList<Integer> list = new ArrayList<Integer>();
    for(int i = 0; i < 10; i++)
        list.add( i );
    for(int i = 0; i < list.size(); i++)
        System.out.println( list.get(i) + 2 );
    // OR
    for( int val : list )
        System.out.println( val + 2 );

Programming Problems
• Regional and State level only, for now…
• I am a terrible artist
  – if I practiced I might get better
• Programming problems
  – the only way to get better is to practice
• Sources of problems
  – UIL CS website.
Old problems
  – online contests and judges

Keys to Success
• Reading from an input file
  – setting up a Scanner
• Hard problems -> Design first
• Test your solution against the sample data
• Realize the judges' data is different and may test boundary cases you didn’t consider

Example Problem 1
Brain Plan
Program Name: brain.java    Input File: brain.in
Researchers are developing non-invasive devices that allow patients to control robotic arms using their minds. These devices examine the brainwave readings in patients to determine what action the robotic arm should take. Before the device can function, it needs to be programmed to associate certain brainwave patterns with robotic arm movements. To aid in this effort, two test subjects have had their brainwave readings taken when trying to get the robotic arm to perform specific actions. Your job is to generalize the sets of brainwave readings into brainwave patterns that the device can use for comparisons in upcoming trials.
Brainwave scans from the test subjects show only active and inactive portions of the brain. From these scans, a pattern can be deduced by determining where the scans agree and where they disagree. Areas of agreement indicate portions of the pattern that should be active (or inactive) while disagreements indicate portions of the pattern that should be marked as unimportant.

Example Problem 1
Brainwave scans are strings of 18 characters where each character represents a portion of the brain that is either 'A'=Active or 'I'=Inactive. Patterns are also strings of 18 characters where 'A'=Active, 'I'=Inactive, and '*'=Unimportant.
For instance, the following pair of brainwave scans:
    AAAAIIIIIIIIIAAIAI
    IIAAIIIIAIIIIAIIAI
gives rise to the pattern:
    **AAIIII*IIIIA*IAI

Example Problem 1
Input
The first line will contain a single integer n indicating the number of brainwave scan pairs that need to have their patterns calculated. Each pair of the next 2n lines will contain brainwave scans for different actions.
Output
For each brainwave scan pair in the input, output the corresponding brainwave pattern on its own line.
Example Input File
    3
    AAAAIIIIIIIIIAAIAI
    IIAAIIIIAIIIIAIIAI
    AAAAAAAAAIIIIIIIII
    IIIIIIIIIAAAAAAAAA
    AIAIAIAIAAIAIAIAIA
    IAIAIAIAIAIAAAIAAI
Example Output To Screen
    **AAIIII*IIIIA*IAI
    ******************
    *********AIA*AIA**

Brain Problem Design
• Hook up Scanner to file.
  – No path info!
• Read line for number of data sets
• For each data set
  – read in line 1
  – read in line 2
  – iterate through each line and compare to create result
• print out result
• a relatively easy problem

Example Problem 2
Juggling Numbers
Program Name: juggle.java    Input File: juggle.in
Did you ever wonder how jugglers can keep track of all of those balls while not letting any of them fall? They do it by mastering a certain number of basic throws and then chaining them together in exciting ways. One convenient side-effect of this technique is that it is possible to represent a juggling pattern fairly well with a string of single digits such as “5313” or “441441”. Each digit represents a single throw, with the height of the throw corresponding to the size of the number (the exception is the 0 digit, which represents that no ball is thrown during that step). For instance, a ball thrown with a height of 5 will have to be caught five steps later in the sequence. Not all sequences are possible for jugglers, however, since they can only catch one ball during any given step.
In the sequence “321” all three balls would be landing during step 4! It’s very useful to be able to determine whether or not a sequence is possible.

Input
The first line of input will contain a single integer n indicating the number of datasets. The following n lines will each contain a sequence of digits (0-9) representing a juggling pattern. Each sequence will contain from 1 to 40 digits.
Output
For each dataset in the input, determine if there is any step where more than one ball must be caught. If there is no such step, then the sequence is valid and the string “VALID” should be displayed. Otherwise, for sequences with steps where multiple balls have to be caught simultaneously, the sequence is invalid, and we want to know the number of balls that will drop the first time the juggler misses. This value should be one less than the total number of balls that need to be caught during the first step where the juggler has to catch more than one. In the case of an invalid sequence, display “DROPPED X on step Y”, where the first drop occurs on step Y, with X balls missed, and steps are numbered starting at 1.
Note the sequences will likely end with some balls still in the air. Solutions should treat this situation as if the sequence ended in just enough zeros (“0”) to ensure all balls were caught.

Example Input File
    4
    333333333
    441441441441
    333321
    445441441441
Example Output To Screen
    VALID
    VALID
    DROPPED 2 on step 7
    DROPPED 1 on step 8

Judges' data file for Juggle    Expected Output
10
333333333                       VALID
441441441441                    VALID
333321                          DROPPED 2 on step 7
445441441441                    DROPPED 1 on step 8
5                               VALID
09                              VALID
90                              VALID
0123456789                      VALID
9876543210                      DROPPED 8 on step 10
0009070301                      DROPPED 1 on step 11
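
One possible way to approach the juggling problem is sketched below; it is not the official solution. It assumes that a throw of height h on step i lands on step i + h, counts the landings per step, and reports the lowest-numbered step with more than one landing. The class name Juggle and the method name check are chosen for this example; the input file name juggle.in comes from the problem statement.

```java
import java.io.File;
import java.io.IOException;
import java.util.Scanner;

public class Juggle {
    // Returns "VALID" or "DROPPED X on step Y" for one pattern of digits.
    public static String check(String pattern) {
        // A throw on step i (numbered from 1) with height h lands on step i + h,
        // so the latest possible landing step is pattern.length() + 9.
        int[] landings = new int[pattern.length() + 10];
        for (int i = 0; i < pattern.length(); i++) {
            int height = pattern.charAt(i) - '0';
            if (height > 0) {              // 0 means no ball is thrown
                landings[i + 1 + height]++;
            }
        }
        // Scan steps in order so the first overloaded step is reported.
        for (int step = 1; step < landings.length; step++) {
            if (landings[step] > 1) {
                return "DROPPED " + (landings[step] - 1) + " on step " + step;
            }
        }
        return "VALID";
    }

    public static void main(String[] args) throws IOException {
        // No path info: the data file sits next to the program.
        Scanner in = new Scanner(new File("juggle.in"));
        int n = in.nextInt();
        for (int i = 0; i < n; i++) {
            System.out.println(check(in.next()));
        }
    }
}
```

Sizing the array nine steps past the end of the pattern handles balls still in the air when the sequence ends, which has the same effect as padding the sequence with zeros.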