Finite-State Machines, Part 2

Derived from the on-line notes by Dr. Matt Stallmann.
Lecture 22, Programming Concepts - Java. CSC 216 Lecture Notes, Fall 2006.

Stripping out comments

Problem: Output each of the words contained in a Java source file while excluding any that occur in comments. For this example:

• A word is defined as any string of alphabetic characters.
• A comment is defined as a string that begins with the characters // and ends with the newline character '\n'.

The FSM and the program used to accomplish this task are as follows.

Table form of FSM:

    State        A–Z a–z      /            \n       other
    IGNORE       IN_WORD      ONE_SLASH    IGNORE   IGNORE
    IN_WORD      IN_WORD      ONE_SLASH    IGNORE   IGNORE
    ONE_SLASH    IN_WORD      IN_COMMENT   IGNORE   IGNORE
    IN_COMMENT   IN_COMMENT   IN_COMMENT   IGNORE   IN_COMMENT

    import java.io.*;

    public class WordsNotInComments {
        // Words are output on transitions out of the IN_WORD state.
        public static void main(String[] args) {
            if (args.length == 1) {
                try {
                    BufferedReader br = new BufferedReader(
                        new FileReader(args[0]));
                    char ch;
                    int next;
                    final int IGNORE = 0;
                    final int IN_WORD = 1;
                    final int ONE_SLASH = 2;
                    final int IN_COMMENT = 3;
                    int state = IGNORE;
                    String s = "";
                    while ((next = br.read()) != -1) {
                        ch = (char) next;
                        switch (state) {
                            case IGNORE:
                                if (Character.isLetter(ch)) {
                                    s = "" + ch;
                                    state = IN_WORD;
                                }
                                else if (ch == '/')
                                    state = ONE_SLASH;
                                break;
                            case IN_WORD:
                                if (Character.isLetter(ch)) {
                                    s += ch;
                                    // state = IN_WORD;
                                }
                                else {
                                    System.out.println(s);
                                    if (ch == '/')
                                        state = ONE_SLASH;
                                    else
                                        state = IGNORE;
                                }
                                break;
                            case ONE_SLASH:
                                if ('/' == ch)
                                    state = IN_COMMENT;
                                else if (Character.isLetter(ch)) {
                                    s = "" + ch;
                                    state = IN_WORD;
                                }
                                else
                                    state = IGNORE;
                                break;
                            case IN_COMMENT:
                                if (ch == '\n')
                                    state = IGNORE;
                                break;
                            default:
                                System.out.println("Invalid state: " + state);
                        }
                    }
                    // End of input also leaves IN_WORD, so print a pending word.
                    if (state == IN_WORD)
                        System.out.println(s);
                    br.close();
                }
                catch (IOException e) {
                    System.out.println("File error: " + e);
                    System.out.println("usage: java WordsNotInComments filename");
                }
            }
            else
                System.out.println("usage: java WordsNotInComments filename");
        }
    }
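The table form of the FSM can also be used directly as data instead of being spelled out in a switch statement. The following is a minimal sketch of that idea, not part of the original notes. The names TableDrivenWords, NEXT, classify, and words are hypothetical, and the sketch reads its input from a String rather than a file so that it stays self-contained.

```java
import java.util.ArrayList;
import java.util.List;

class TableDrivenWords {
    // States, numbered as in WordsNotInComments.
    static final int IGNORE = 0, IN_WORD = 1, ONE_SLASH = 2, IN_COMMENT = 3;

    // Input classes, one per column of the table (hypothetical names).
    static final int LETTER = 0, SLASH = 1, NEWLINE = 2, OTHER = 3;

    // NEXT[state][inputClass] is the table form of the FSM, row for row.
    static final int[][] NEXT = {
        /* IGNORE     */ { IN_WORD,    ONE_SLASH,  IGNORE, IGNORE     },
        /* IN_WORD    */ { IN_WORD,    ONE_SLASH,  IGNORE, IGNORE     },
        /* ONE_SLASH  */ { IN_WORD,    IN_COMMENT, IGNORE, IGNORE     },
        /* IN_COMMENT */ { IN_COMMENT, IN_COMMENT, IGNORE, IN_COMMENT },
    };

    // Map a character to the column of the table it belongs to.
    static int classify(char ch) {
        if (Character.isLetter(ch)) return LETTER;
        if (ch == '/') return SLASH;
        if (ch == '\n') return NEWLINE;
        return OTHER;
    }

    // Return the words in text that do not occur in // comments.
    static List<String> words(String text) {
        List<String> out = new ArrayList<>();
        StringBuilder word = new StringBuilder();
        int state = IGNORE;
        for (int i = 0; i < text.length(); i++) {
            char ch = text.charAt(i);
            int next = NEXT[state][classify(ch)];
            // A word is complete on any transition out of IN_WORD.
            if (state == IN_WORD && next != IN_WORD) {
                out.add(word.toString());
                word.setLength(0);
            }
            // Any transition into (or within) IN_WORD consumes a letter.
            if (next == IN_WORD) word.append(ch);
            state = next;
        }
        // End of input also ends a word that is still being collected.
        if (state == IN_WORD) out.add(word.toString());
        return out;
    }

    public static void main(String[] args) {
        System.out.println(words("if (ch == '/') // a comment\nstate"));
    }
}
```

With this arrangement, changing the machine means editing the NEXT array; the loop itself only needs to know which transitions produce output.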
Consider the input "if (ch == '/')". Let's trace through the states WordsNotInComments moves into, starting with the initial state and then listing one state per input character:

IGNORE, IN_WORD, IN_WORD, IGNORE, IGNORE, IN_WORD, IN_WORD, IGNORE, IGNORE, IGNORE, IGNORE, IGNORE, ONE_SLASH, IGNORE, IGNORE.

What is output?

    if
    ch

Exercise: How would you modify the table form of the FSM to strip out comments but pass anything else through to the output?

Answer:

    State        /            \n      Other
    PRINT        ONE_SLASH    PRINT   PRINT
    ONE_SLASH    COMMENT      PRINT   PRINT
    COMMENT      COMMENT      PRINT   COMMENT

When will output be performed? When we're in the PRINT state.

When is extra output required? When we're in state ONE_SLASH and encounter something other than a slash. The slash that was held back must then be output ahead of the current character.

Horner's Rule

A finite-state machine can also be used to convert an ASCII string of characters representing a real number to its actual numerical value. It performs the same basic operation as the java.lang.Float.parseFloat(String s) method and uses an algorithm known as Horner's Rule. Horner's Rule builds the value one digit at a time: each new digit d before the decimal point updates the value as v = 10 * v + d, so the digits 3, 1, 4 give (3 * 10 + 1) * 10 + 4 = 314.

The letters shown in the FSM stand for the following:

    c - current character
    s - sign of the number
    v - value of the number
    p - power

Note that this FSM assumes that the string contains a valid floating-point number that

• starts with an optional + or -,
• has at least one digit and an optional decimal point,
• and has any number (including 0) of digits before and after the decimal point.

A value of 0 is returned if an invalid string is encountered.

Table form of FSM:

    State      + or -     .          digit      other
    START      INTEGER    DECIMAL    INTEGER    ERROR
    INTEGER    ERROR      DECIMAL    INTEGER    ERROR
    DECIMAL    ERROR      ERROR      DECIMAL    ERROR

Here is a function that implements this FSM.

    public class Parser {
        static double toDouble(String s) {
            double sign = 1;     // sign of number (either 1 or -1)
            double value = 0;    // current value of the number
            double power = 0.1;  // current power of 10 for
                                 // digits after decimal point
            int i = 0;
            final int START = 0;
            final int INTEGER = 1;
            final int DECIMAL = 2;
            final int ERROR = 3;
            int state = START;
            char ch;             // current character in string
            while (state != ERROR && i < s.length()) {
                ch = s.charAt(i++);
                switch (state) {
                    case START:
                        if (ch == '.')
                            state = DECIMAL;
                        else if (ch == '-') {
                            sign = -1.0;
                            state = INTEGER;
                        }
                        else if (ch == '+')
                            state = INTEGER;
                        else if (Character.isDigit(ch)) {
                            value = ch - '0';
                            state = INTEGER;
                        }
                        else
                            state = ERROR;
                        break;
                    case INTEGER:
                        if (ch == '.')
                            state = DECIMAL;
                        else if (Character.isDigit(ch))
                            value = 10.0 * value + (ch - '0');
                        else {
                            value = 0.0;
                            state = ERROR;
                        }
                        break;
                    case DECIMAL:
                        if (Character.isDigit(ch)) {
                            value += power * (ch - '0');
                            power /= 10.0;
                        }
                        else {
                            value = 0.0;
                            state = ERROR;
                        }
                        break;
                    default:
                        System.out.println("Invalid state: " + state);
                }
            }
            return sign * value;
        }

        public static void main(String[] args) {
            if (args.length == 1)
                System.out.println(toDouble(args[0]));
        }
    }

Question: What would be the right way to add error-checking to the program?

The State pattern

The code that we have given is not very object oriented. It tests which state the FSM is in at several points. In an object-oriented program, we can use polymorphism (Lecture 7, Horstmann §11.3) to achieve the same effect. This will allow us to check the state in only one place.

The first step is to define a State interface. Consider the table form of the FSM. The rows of the table represent the different states. The columns of the table represent the different behaviors of each state.
Therefore, what methods should be defined in the State interface?

    public interface State {
        public void onPlus();
        public void onMinus();
        public void onPoint();
        public void onDigit();
        public void onOther();
    }

How should the States be defined?

    class Start implements State { ... }
    class Integr implements State { ... }
    class Decimal implements State { ... }

Where should the States be defined? As inner classes of Parser.

How should these methods be implemented for each state? Let's take the onPoint() method, for example.

In class Start, it is defined as

    public void onPoint() { state = decimal; }

(We need a variable of type State to hold the current state.)

In class Integr, it is defined as

    public void onPoint() { state = decimal; }

In class Decimal, it is defined as

    public void onPoint() {
        throw new NumberFormatException();
    }

Exercise: Divide up into groups, with each group responsible for implementing one of the following methods in all three classes.

    void onMinus();
    void onPlus();
    void onDigit();
    void onOther();

Write each method on a separate index card. Then divide into three groups, with each group taking a different class. Pass the cards for each method to the group that is taking that class, and have the group assemble the various methods into a class definition.
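For comparison after the exercise, here is one possible way the assembled classes might look. This is a sketch under stated assumptions, not the notes' official answer. The outer class name StateParser, the parse method, and the fields start and integr are hypothetical (the notes only name state and decimal). The sketch also assumes that onDigit() reads the current character from a field c of the enclosing class, and that every invalid transition throws NumberFormatException, as onPoint() does in class Decimal.

```java
class StateParser {
    interface State {
        void onPlus();
        void onMinus();
        void onPoint();
        void onDigit();
        void onOther();
    }

    private double sign = 1;     // sign of the number (1 or -1)
    private double value = 0;    // value accumulated so far
    private double power = 0.1;  // weight of the next digit after the point
    private char c;              // current character, read by onDigit()

    private final State start = new Start();
    private final State integr = new Integr();
    private final State decimal = new Decimal();
    private State state = start; // the current state

    class Start implements State {
        public void onPlus()  { state = integr; }
        public void onMinus() { sign = -1; state = integr; }
        public void onPoint() { state = decimal; }
        public void onDigit() { value = c - '0'; state = integr; }
        public void onOther() { throw new NumberFormatException(); }
    }

    class Integr implements State {
        public void onPlus()  { throw new NumberFormatException(); }
        public void onMinus() { throw new NumberFormatException(); }
        public void onPoint() { state = decimal; }
        public void onDigit() { value = 10.0 * value + (c - '0'); }
        public void onOther() { throw new NumberFormatException(); }
    }

    class Decimal implements State {
        public void onPlus()  { throw new NumberFormatException(); }
        public void onMinus() { throw new NumberFormatException(); }
        public void onPoint() { throw new NumberFormatException(); }
        public void onDigit() { value += power * (c - '0'); power /= 10.0; }
        public void onOther() { throw new NumberFormatException(); }
    }

    // Classify each character (the table's columns) and let the
    // current state object decide what to do (the table's rows).
    double parse(String s) {
        for (int i = 0; i < s.length(); i++) {
            c = s.charAt(i);
            if (c == '+') state.onPlus();
            else if (c == '-') state.onMinus();
            else if (c == '.') state.onPoint();
            else if (c >= '0' && c <= '9') state.onDigit();
            else state.onOther();
        }
        return sign * value;
    }

    // Each call uses a fresh parser, because the state lives in fields.
    static double toDouble(String s) {
        return new StateParser().parse(s);
    }

    public static void main(String[] args) {
        System.out.println(toDouble("-12.5"));
    }
}
```

Note that parse never asks which state the machine is in. It only classifies the character, and the call through the state variable selects the right row of the table.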