CS 1302 – Ch 19, Binary I/O

Sections: 19.1-19.4.1, 19.6-19.6
Pages: 710-715, 724-729
Review Questions: Liang’s Site
Programming Exercises: any

Section 19.1 – Introduction

1. An important part of programming is the ability to store data in files and read it back in. All files are stored in binary. For example:

0100101001100001011101100110000100100000011010010111001100100000011000010010000001
1001110111001001100101011000010111010000100000011011000110000101101110011001110111
0101011000010110011101100101001000000111010001101111001000000110110001100101011000
01011100100110111000101110

2. Definitions:

a. Character Encoding – Associates a (usually) unique value (number) with each character (glyph) in a character set. For example, the integer 97 is associated with the letter “a” in the English character set. This mapping is a convenient way to store and transmit data.

b. Encode – The process of turning a character into its unique value. In this chapter, we think of encoding as writing characters to a file by storing their associated values.

c. Decode – The process of turning a value into a character. In this chapter, we think of decoding as reading a value from a file and turning it into a character.

3. There are two general ways for a program to save data to disk:

a. Using a standard encoding scheme. ASCII and UTF-8 are both widely accepted and are identical for the most common characters in the English character set. When we persist data to a file using a standard encoding scheme, we call the file a text file. We say that such a file is human readable because practically any software we use to open the file will detect that it is stored using the UTF-8 encoding and render the characters properly.

b. Using a custom encoding scheme. For example, programming languages can do binary encoding, where numbers (in base 10) are represented as their binary equivalent. For example, the integer 6 is “110” in binary.
Thus, when a program reads “110”, it can decode that as the integer 6. A program might therefore store data in a format like this: int, int, double, string, and then repeat. Such a file is called a binary file. Binary encoding also allows you to encode objects, save them as objects, and read them back in as objects. The focus of this chapter is on reading and writing binary files.

4. Standard Encoding

a. ASCII (American Standard Code for Information Interchange) is a character encoding scheme that maps the English alphabet, some punctuation, and a few other characters to unique numbers, which can be expressed in decimal, hexadecimal, or binary. The table below shows a few of the mappings.

Letter  ASCII Code  HEX  Binary      Letter  ASCII Code  HEX  Binary
a       097         61   01100001    A       065         41   01000001
b       098         62   01100010    B       066         42   01000010
c       099         63   01100011    C       067         43   01000011
d       100         64   01100100    D       068         44   01000100
e       101         65   01100101    E       069         45   01000101
…       …           …    …           …       …           …    …

b. Originally, ASCII represented 128 characters. Since 2^7 = 128, 7 bits are enough to represent each unique character. Soon after, ASCII was extended to 8 bits, since 8 bits are stored in a byte and the extra bit accommodates 128 additional characters. Thus, in a file such as the one above, every 8 bits would be decoded as a particular character. ASCII was the most common encoding on the web until about 2007, when UTF-8 overtook it.

c. UTF-8 (Universal Coded Character Set Transformation Format – 8-bit) can represent every character in the Unicode character set. Unicode is a standard for encoding most of the world’s written languages. UTF-8 was designed for backwards compatibility with ASCII. Thus, an ASCII code is the same as the UTF-8 code (code point) for the same character. Finally, Java stores char and String values internally as UTF-16; UTF-8 is the usual encoding for text files and has been Java’s default charset since JDK 18.

d. A text file is binary data encoded with a character encoding such as UTF-8, so that (usually) every 8 bits represent a character.
A program that uses a text file (WordPad, NotePad, Word, Eclipse, etc.) decodes the data and presents it as human-readable text. For example, the first 9 bytes of the file above are decoded as:

01001010  J
01100001  a
01110110  v
01100001  a
00100000  (space)
01101001  i
01110011  s
00100000  (space)
01100001  a

5. Binary Encoding – A binary file written with a binary writer class (ObjectOutputStream) in Java (or some other language) is not human-readable; it is designed to be read by programs. For example, Java source programs are stored in text files and can be read by a text editor, but Java classes are stored in binary files and are read by the JVM. The advantage of binary files is that they are more efficient to process than text files. In the example below, in (a), the number 199 is encoded with UTF-8 as the characters “1”, “9”, and “9”. In (b), the number 199 is converted to its binary representation as a number (11000111, or 0xC7). Thus, one byte is used to store the number in binary form, and three bytes are used to store the text-based encoding.

(a) Text I/O program: the characters of "199" are encoded, and the encoding of each character is stored in the file.
    00110001 00111001 00111001   (0x31 0x39 0x39)

(b) Binary I/O program: the byte 199 is written, and the same byte is stored in the file.
    11000111   (0xC7)

Section 19.2 – How is I/O Handled in Java?

1. A File object encapsulates the properties of a file or a path, but does not contain the methods for reading/writing data from/to a file. In order to perform I/O, you need to create objects using the appropriate Java I/O classes.

[Figure: a program reads from a file through an input stream (an input object created from an input class) and writes to a file through an output stream (an output object created from an output class).]

Section 19.3 – Text I/O vs. Binary I/O

1. Text I/O requires encoding and decoding. The JVM converts Unicode characters to a file-specific encoding when writing a character and converts the file-specific encoding back to Unicode when reading a character.
Binary I/O does not require conversions. When you write a byte to a file, the original byte is copied into the file. When you read a byte from a file, the exact byte in the file is returned.

2. Why are binary files necessary? http://chortle.ccsu.edu/java5/notes/chap86/ch86_6.html

Section 19.4 – Binary I/O Classes

1. The binary I/O classes:

Object
    InputStream
        FileInputStream
        FilterInputStream
            DataInputStream
            BufferedInputStream
        ObjectInputStream
    OutputStream
        FileOutputStream
        FilterOutputStream
            BufferedOutputStream
            DataOutputStream
            PrintStream
        ObjectOutputStream

2. InputStream

java.io.InputStream

+read(): int
    Reads the next byte of data from the input stream. The byte is returned as an int value in the range 0 to 255. If no byte is available because the end of the stream has been reached, the value -1 is returned.
+read(b: byte[]): int
    Reads up to b.length bytes into array b from the input stream and returns the actual number of bytes read. Returns -1 at the end of the stream.
+read(b: byte[], off: int, len: int): int
    Reads up to len bytes from the input stream and stores them into b[off], b[off+1], …, b[off+len-1]. The actual number of bytes read is returned. Returns -1 at the end of the stream.
+available(): int
    Returns an estimate of the number of bytes that can be read from the input stream without blocking.
+close(): void
    Closes this input stream and releases any system resources associated with the stream.
+skip(n: long): long
    Skips over and discards n bytes of data from this input stream. The actual number of bytes skipped is returned.
+markSupported(): boolean
    Tests if this input stream supports the mark and reset methods.
+mark(readlimit: int): void
    Marks the current position in this input stream.
+reset(): void
    Repositions this stream to the position at the time the mark method was last called on this input stream.

3. OutputStream

java.io.OutputStream

+write(b: int): void
    Writes the specified byte to this output stream. The parameter b is an int value; (byte)b is written to the output stream.
+write(b: byte[]): void
    Writes all the bytes in array b to the output stream.
+write(b: byte[], off: int, len: int): void
    Writes b[off], b[off+1], …, b[off+len-1] into the output stream.
+close(): void
    Closes this output stream and releases any system resources associated with the stream.
+flush(): void
    Flushes this output stream and forces any buffered output bytes to be written out.

4. FileInputStream/FileOutputStream associates a binary input/output stream with an external file. All the methods in FileInputStream/FileOutputStream are inherited from their superclasses.

5. FileInputStream – To construct a FileInputStream, use the following constructors:

    public FileInputStream(String filename)
    public FileInputStream(File file)

A java.io.FileNotFoundException will occur if you attempt to create a FileInputStream with a nonexistent file.

6. FileOutputStream – To construct a FileOutputStream, use the following constructors:

    public FileOutputStream(String filename)
    public FileOutputStream(File file)
    public FileOutputStream(String filename, boolean append)
    public FileOutputStream(File file, boolean append)

If the file does not exist, a new file is created. If the file already exists, the first two constructors delete the current contents of the file. To retain the current contents and append new data to the file, use the last two constructors and pass true for the append parameter.

7. (Optional) DataInputStream/DataOutputStream – DataInputStream reads bytes from the stream and converts them into the appropriate primitive type values or strings. DataOutputStream converts primitive type values or strings into bytes and outputs the bytes to the stream.
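The file stream constructors and the data streams described above can be combined as in the following minimal sketch. The class name DataStreamDemo, its two helper methods, and the sample record are hypothetical and chosen only for illustration; the point is that values must be read back in the same order and with the same types used to write them.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;

// Hypothetical demo class: wraps file streams in data streams.
class DataStreamDemo {

    // Writes one record (name, id, score), replacing any old file contents.
    static void writeRecord(File file, String name, int id, double score)
            throws IOException {
        try (DataOutputStream out =
                 new DataOutputStream(new FileOutputStream(file))) {
            out.writeUTF(name);      // 2 length bytes + encoded characters
            out.writeInt(id);        // 4 bytes
            out.writeDouble(score);  // 8 bytes
        }
    }

    // Reads the record back in the same order it was written.
    static String readRecord(File file) throws IOException {
        try (DataInputStream in =
                 new DataInputStream(new FileInputStream(file))) {
            String name = in.readUTF();
            int id = in.readInt();
            double score = in.readDouble();
            return name + " " + id + " " + score;
        }
    }

    public static void main(String[] args) throws IOException {
        File file = File.createTempFile("record", ".dat");
        file.deleteOnExit();
        writeRecord(file, "Ann", 7, 91.5);
        System.out.println(readRecord(file));  // prints: Ann 7 91.5
        System.out.println(file.length());     // prints: 17
    }
}
```

The try-with-resources statement closes the outer stream, which also closes the FileOutputStream or FileInputStream it wraps.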
InputStream
    FilterInputStream
        DataInputStream (implements java.io.DataInput)
            +DataInputStream(in: InputStream)

OutputStream
    FilterOutputStream
        DataOutputStream (implements java.io.DataOutput)
            +DataOutputStream(out: OutputStream)

java.io.DataInput

+readBoolean(): boolean
    Reads a boolean from the input stream.
+readByte(): byte
    Reads a byte from the input stream.
+readChar(): char
    Reads a character from the input stream.
+readFloat(): float
    Reads a float from the input stream.
+readDouble(): double
    Reads a double from the input stream.
+readInt(): int
    Reads an int from the input stream.
+readLong(): long
    Reads a long from the input stream.
+readShort(): short
    Reads a short from the input stream.
+readLine(): String
    Reads a line of characters from input (deprecated in DataInputStream).
+readUTF(): String
    Reads a string in modified UTF-8 format.

java.io.DataOutput

+writeBoolean(b: boolean): void
    Writes a boolean to the output stream.
+writeByte(v: int): void
    Writes to the output stream the eight low-order bits of the argument v.
+writeBytes(s: String): void
    Writes the lower byte of each character in a string to the output stream.
+writeChar(c: int): void
    Writes a character (composed of two bytes) to the output stream.
+writeChars(s: String): void
    Writes every character in the string s to the output stream, in order, two bytes per character.
+writeFloat(v: float): void
    Writes a float value to the output stream.
+writeDouble(v: double): void
    Writes a double value to the output stream.
+writeInt(v: int): void
    Writes an int value to the output stream.
+writeLong(v: long): void
    Writes a long value to the output stream.
+writeShort(v: int): void
    Writes a short value (the two low-order bytes of v) to the output stream.
+writeUTF(s: String): void
    Writes two bytes of length information to the output stream, followed by the modified UTF-8 representation of every character in the string s.

Section 19.5 – Problem: Copying Files

Skip.

Section 19.6 – Object I/O

1. ObjectInputStream/ObjectOutputStream enables you to perform binary I/O for objects in addition to primitive type values and strings.
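As a minimal sketch of that statement, the following writes a primitive value, a string, and an object through one ObjectOutputStream and reads them back in the same order. The class name ObjectStreamDemo, its helper methods, and the sample values are hypothetical; try-with-resources is used here as one possible way to make sure the streams are closed.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class: one stream carries primitives, strings, and objects.
class ObjectStreamDemo {

    // Writes an int, a string, and a serializable list to the file.
    static void write(File file, int count, String label, List<String> items)
            throws IOException {
        try (ObjectOutputStream out =
                 new ObjectOutputStream(new FileOutputStream(file))) {
            out.writeInt(count);
            out.writeUTF(label);
            out.writeObject(new ArrayList<>(items)); // ArrayList is Serializable
        }
    }

    // Reads the three values back in the order they were written.
    static String read(File file) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in =
                 new ObjectInputStream(new FileInputStream(file))) {
            int count = in.readInt();
            String label = in.readUTF();
            @SuppressWarnings("unchecked")
            List<String> items = (List<String>) in.readObject();
            return count + " " + label + " " + items;
        }
    }

    public static void main(String[] args) throws Exception {
        File file = File.createTempFile("objects", ".dat");
        file.deleteOnExit();
        write(file, 2, "fruit", Arrays.asList("apple", "pear"));
        System.out.println(read(file)); // prints: 2 fruit [apple, pear]
    }
}
```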
The code examples below omit most exception handling; see “Dealing with Exceptions” later in this section.

Section 19.6 – Writing Binary Values

    // Some sample data to write.
    String s = "Hello Binary";
    double x = 3.34;
    int y = 13;
    double[] vals = new double[] {3.4, 4.2, 6.7};

    // Create output stream.
    ObjectOutputStream oos =
        new ObjectOutputStream(new FileOutputStream("out1.dat"));

    // Write data.
    oos.writeObject(s);
    oos.writeDouble(x);
    oos.writeInt(y);
    oos.writeObject(vals);

    // Flush and close output stream.
    oos.flush();
    oos.close();
    oos = null;

Section 19.6 – Reading Binary Values

    // Create input stream.
    ObjectInputStream ois =
        new ObjectInputStream(new FileInputStream("out1.dat"));

    try {
        while ( true ) {
            // Read data values in the same order they were written.
            s = (String)ois.readObject();
            x = ois.readDouble();
            y = ois.readInt();
            vals = (double[])ois.readObject();
        }
    }
    catch (EOFException eofEx) {
        System.out.println(" End of file...");
    }
    finally {
        ois.close();
        ois = null;
    }

Section 19.6 – Writing Binary (Custom) Objects

In this example we will write Book objects to a binary file. An instance of any class can be written to binary if the class implements the Serializable interface. The Serializable interface is a marker interface. When an object is written to binary, we say that it is serialized. Serialization refers to the process of converting an object into a sequence of bytes (0’s and 1’s).

    class Book implements Serializable {
        String name;
        int pages;

        public Book(String name, int pages) {
            this.name = name;
            this.pages = pages;
        }

        public String toString() {
            return name + ":" + pages;
        }
    }

Writing Book objects to a binary file:

    // Create some books.
    Book b1, b2;
    b1 = new Book("Zen and the", 343);
    b2 = new Book("Moby Dick", 875);

    // Create output stream.
    ObjectOutputStream oos =
        new ObjectOutputStream(new FileOutputStream("out2.dat"));

    // Write books.
    oos.writeObject(b1);
    oos.writeObject(b2);

    // Close the stream so that all data reaches the file.
    oos.close();

Section 19.6 – Reading Binary (Custom) Objects

Deserialization refers to the process of converting binary data into objects. This is an example of reading Book objects from a binary file.
    ArrayList<Book> books = new ArrayList<Book>();

    // Create input stream.
    ObjectInputStream ois =
        new ObjectInputStream(new FileInputStream("out2.dat"));

    // Try to read books. Will throw an exception at EOF.
    // The input stream does not provide a way to check for EOF.
    try {
        while ( true ) {
            Object obj = ois.readObject();
            if (obj instanceof Book) {
                b1 = (Book)obj;
                books.add( b1 );
            }
        }
    }
    catch (EOFException eofEx) {
        System.out.println(" End of file...");
    }
    finally {
        ois.close();
        ois = null;
    }

Section 19.6 – Writing Binary Objects Associated with Other Objects

In this example, a Person has an ArrayList of Books. Note that both classes must implement the Serializable interface.

    class Book implements Serializable {
        String name;
        int pages;

        public Book(String name, int pages) {
            this.name = name;
            this.pages = pages;
        }

        public String toString() {
            return name + ":" + pages;
        }
    }

    class Person implements Serializable {
        String name;
        ArrayList<Book> books = new ArrayList<Book>();

        public Person(String name) {
            this.name = name;
        }

        public String toString() {
            StringBuilder sb = new StringBuilder();
            sb.append( name + "'s books:");
            for( Book b : books )
                sb.append(b + ", ");
            return sb.toString();
        }
    }

We write Person objects, which contain a list of Book objects.

    // Create some books.
    Book b1 = new Book("Zen and the", 343);
    Book b2 = new Book("Moby Dick", 875);
    Book b3 = new Book("Kite Runner", 224);

    // Create a person.
    Person p = new Person( "Dave" );

    // Add books to person.
    p.books.add(b1);
    p.books.add(b2);
    p.books.add(b3);

    // Create output stream.
    ObjectOutputStream oos =
        new ObjectOutputStream(new FileOutputStream("out3.dat"));

    // Write Person (its books are written along with it).
    oos.writeObject(p);
    oos.close();

Section 19.6 – Reading Binary Objects Associated with Other Objects

    ObjectInputStream ois =
        new ObjectInputStream(new FileInputStream("out3.dat"));
    p = (Person)ois.readObject();
    ois.close();

Section 19.6 – Dealing with Exceptions

The code as written above can throw a number of exceptions that must be caught or declared thrown. For example:

1.
Writing Binary – Approach 1

    import java.util.*;
    import java.io.*;

    public class AccountTester {
        public static void main(String[] args) {
            ...
            saveAccounts2( accounts, FILE_NAME );
        }

        public static void saveAccounts2( List<Account> accounts, String fileName ) {
            try {
                ObjectOutputStream oos =
                    new ObjectOutputStream(new FileOutputStream( fileName ));
                for( Account a : accounts )
                    oos.writeObject(a);
                oos.close();
                oos = null;
            }
            catch( Exception e ) {
                // Report the failure instead of silently ignoring it.
                System.out.println( "Could not save accounts: " + e );
            }
        }
    }

Some of the exceptions that can be thrown from this code:

    catch ( NotSerializableException nse ) {}
    catch ( InvalidClassException ice ) {}
    catch ( FileNotFoundException fnfe ) {}
    catch ( IOException ioe ) {}

2. Reading Binary – Approach 1

    import java.util.*;
    import java.io.*;

    public class AccountTester {
        public static void main(String[] args) {
            ...
            // readAccounts2 takes only the file name and returns the list.
            accounts = readAccounts2( FILE_NAME );
        }

        public static List<Account> readAccounts2( String fileName ) {
            List<Account> accounts = new ArrayList<Account>();
            ObjectInputStream ois = null;
            try {
                ois = new ObjectInputStream( new FileInputStream( fileName ) );
                try {
                    while ( true ) {
                        Object obj = ois.readObject();
                        if (obj instanceof Account)
                            accounts.add( (Account)obj );
                    }
                }
                catch (EOFException eof) {
                    // End of file reached: close the stream and return the list.
                    ois.close();
                    ois = null;
                    return accounts;
                }
            }
            catch (Exception e) {
                return null;
            }
        }
    }

Section 19.6 – The transient Modifier

Static variables are not serialized. The transient modifier is used to mark instance variables that are not to be serialized. It tells the JVM to ignore the variable when serializing the object.
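A minimal round-trip sketch of this rule follows. The Login class and the roundTrip helper are hypothetical names used only for illustration, and in-memory byte array streams stand in for a file. After deserialization, the transient field holds its default value (null for a reference type).

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical demo class: shows that a transient field is not serialized.
class TransientDemo {

    static class Login implements Serializable {
        private static final long serialVersionUID = 1L;
        String user;
        transient String password; // skipped during serialization

        Login(String user, String password) {
            this.user = user;
            this.password = password;
        }
    }

    // Serializes the object to a byte array and deserializes a copy from it.
    static Login roundTrip(Login original)
            throws IOException, ClassNotFoundException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(original);
        }
        try (ObjectInputStream in = new ObjectInputStream(
                 new ByteArrayInputStream(buffer.toByteArray()))) {
            return (Login) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Login copy = roundTrip(new Login("dave", "secret"));
        System.out.println(copy.user);     // prints: dave
        System.out.println(copy.password); // prints: null
    }
}
```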
    class Employee implements Serializable {
        String name;
        // Not serialized; Assignment does not need to be Serializable.
        transient Assignment assignment;

        public Employee(String name) {
            this.name = name;
        }

        public void initJob() {
            assignment = new Assignment( "Programmer" );
        }

        public String toString() {
            StringBuilder sb = new StringBuilder();
            sb.append( name + "'s job type:" + assignment );
            return sb.toString();
        }
    }

    class Assignment {
        String jobType;

        public Assignment(String jobType) {
            this.jobType = jobType;
        }

        public String toString() {
            return jobType;
        }
    }

Section 19.6 – Appending a Binary File

Each time you open a new ObjectOutputStream, a header is written to the binary file. If you open the stream again to write, you will write another header. This causes a problem when you read the file back in. Thus, when we are appending to a file, we must override the writeStreamHeader method to do nothing.

    File f = new File("out5.dat");

    // Determine whether the file already holds data (check before opening it).
    boolean append = f.exists() && f.length() > 0;

    // Create FileOutputStream in append mode.
    FileOutputStream fos = new FileOutputStream(f, true);

    // Create ObjectOutputStream.
    ObjectOutputStream oos;
    if (append) {
        // If appending, create an OOS that overrides writeStreamHeader
        // to do nothing.
        oos = new ObjectOutputStream(fos) {
            protected void writeStreamHeader() throws IOException {}
        };
    }
    else {
        // If file is new, use the default implementation of OOS, which
        // writes the header.
        oos = new ObjectOutputStream(fos);
    }

    // Write books and person.
    oos.writeObject(b1);
    oos.writeObject(b2);
    oos.writeObject(p);
    oos.close();

Section 19.7 – Random Access Files

Skip.

Chapter 9 Review – Text I/O

Writing a Text File

    // Declare output file.
    File outFile = new File( "output.txt" );

    // We will write these numbers to a file, 3 per line,
    // comma delimited.
    int[] nums = {33, 44, 55, 66, 12, 33, 55, 66, 77, 22};

    // Declare and create a PrintWriter.
    PrintWriter writer = new PrintWriter( outFile );

    // Write the numbers, ending the line after every third one.
    for( int i = 0; i < nums.length; i++ ) {
        writer.print( nums[i] );
        if( i % 3 == 2 || i == nums.length - 1 )
            writer.println();
        else
            writer.print( "," );
    }

    writer.close();

Reading a Text File

    String name;
    double salary;

    // Declare input file.
    File inFile = new File( "employees.txt" );

    // Declare and create Scanner.
    Scanner input = new Scanner( inFile );

    // While there are more things to read...
    while( input.hasNext() ) {
        // Read name and salary.
        name = input.next();
        salary = input.nextDouble();
    }

    // Close the Scanner.
    input.close();

Appending a Text File

    // We will append these numbers to a file, 1 per line.
    int[] nums = {33, 44, 55, 66 };

    // Declare output file.
    File outFile = new File( "output3.txt" );

    // Create a FileWriter in append mode.
    FileWriter fw = new FileWriter( outFile, true );

    // Declare and create a PrintWriter.
    PrintWriter writer = new PrintWriter( fw );

    // Write each number on its own line.
    for( int i = 0; i < nums.length; i++ )
        writer.println( nums[i] );

    // Close the PrintWriter.
    writer.close();

Chapter 2 Review – Console I/O

Writing to the Console (System.out)

    // We will write these numbers to the console, 3 per line,
    // comma delimited.
    int[] nums = {33, 44, 55, 66, 12, 33, 55, 66, 77, 22};

    // Declare and create a PrintWriter.
    PrintWriter writer = new PrintWriter( System.out );

    // Write the numbers, ending the line after every third one.
    for( int i = 0; i < nums.length; i++ ) {
        writer.print( nums[i] );
        if( i % 3 == 2 || i == nums.length - 1 )
            writer.println();
        else
            writer.print( "," );
    }

    // Flush rather than close: closing the PrintWriter would also
    // close System.out.
    writer.flush();

Reading From the Console (System.in)

    String name, title;
    double salary;

    // Declare and create Scanner.
    Scanner input = new Scanner( System.in );

    // Read name, salary, and title.
    System.out.print( "Type in name:");
    name = input.next();
    System.out.print( "Type in salary:");
    salary = input.nextDouble();
    System.out.print( "Type in title:");
    title = input.next();

    // Close the Scanner.
    input.close();
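To connect this text I/O review back to the 199 example in Section 19.1, the following minimal sketch writes the same number once with a PrintWriter (text I/O) and once with a FileOutputStream (binary I/O), then shows the bytes that end up in each file. The class name TextVsBinaryDemo and its helper methods are hypothetical, and temporary files are used so that no existing file is overwritten.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.PrintWriter;
import java.nio.file.Files;

// Hypothetical demo class: compares text and binary storage of one number.
class TextVsBinaryDemo {

    // Text I/O: the digits of the value are encoded as characters.
    static byte[] asText(int value) throws IOException {
        File file = File.createTempFile("text", ".txt");
        file.deleteOnExit();
        try (PrintWriter writer = new PrintWriter(file, "UTF-8")) {
            writer.print(value);
        }
        return Files.readAllBytes(file.toPath());
    }

    // Binary I/O: the low-order 8 bits of the value are copied as one byte.
    static byte[] asBinary(int value) throws IOException {
        File file = File.createTempFile("binary", ".dat");
        file.deleteOnExit();
        try (FileOutputStream out = new FileOutputStream(file)) {
            out.write(value);
        }
        return Files.readAllBytes(file.toPath());
    }

    // Formats bytes as space-separated hexadecimal values.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(String.format("0x%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        System.out.println(hex(asText(199)));   // prints: 0x31 0x39 0x39
        System.out.println(hex(asBinary(199))); // prints: 0xC7
    }
}
```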