CS 1302 – Ch 19, Binary I/O

Sections: 19.1-19.4.1, 19.6
Pages: 710-715, 724-729
Review Questions: Liang’s Site
Programming Exercises: any
Section 19.1 – Introduction
1. An important part of programming is the ability to store data in files and read it back in. All files are
stored in binary. For example:
0100101001100001011101100110000100100000011010010111001100100000011000010010000001
1001110111001001100101011000010111010000100000011011000110000101101110011001110111
0101011000010110011101100101001000000111010001101111001000000110110001100101011000
01011100100110111000101110
2. Definitions:
a. Character Encoding – Associates a unique (usually) value (number) with each character (glyph) in a
character set. For example, the integer 97 is associated with the letter “a” in the English character
set. This mapping is a convenient way to store and transmit data.
b. Encode – The process of turning a character into its unique value. In this chapter, we think of
encoding as writing characters to a file by storing their associated values.
c. Decode – The process of turning a value into a character. In this chapter, we think of decoding
as reading a value from a file and turning it into a character.
3. There are two general ways for a program to save data to disk:
a. Using a standard encoding scheme. ASCII and UTF-8 are both universally accepted and are identical
for the most common characters in the English character set. When we persist data to file using a
standard encoding scheme, we call this file a text file. We say that such a file is human readable
because practically any software we use to open the file will detect that it is stored using the UTF-8
encoding, thus properly rendering the characters.
b. Using a custom encoding scheme. For example, programming languages can do binary encoding
where numbers (in base 10) are represented as their binary equivalent. For example, the integer 6
is “110” in binary. Thus, when a program reads “110”, it can decode that as the integer 6. Thus, a
program might store data in a format like this: int, int, double, string, and then repeat. Such a file is
called a binary file. Binary encoding also allows you to encode objects, save them as objects, and
read them back in as objects. The focus of this chapter is on reading and writing binary files.
4. Standard Encoding
a. ASCII (American Standard Code for Information Interchange) is a character encoding scheme that
maps the English alphabet, some punctuation, and a few other characters into unique identifiers,
numbers which can be expressed in either hexadecimal or binary. The table below shows a few of
the mappings.
Letter  ASCII Code  Hex  Binary      Letter  ASCII Code  Hex  Binary
a       097         61   01100001    A       065         41   01000001
b       098         62   01100010    B       066         42   01000010
c       099         63   01100011    C       067         43   01000011
d       100         64   01100100    D       068         44   01000100
e       101         65   01100101    E       069         45   01000101
…       …           …    …           …       …           …    …
b. Originally, ASCII represented 128 characters. Since 2^7 = 128, 7 bits are enough to represent each
unique character. Thus, in a 7-bit ASCII file, every 7 bits would be decoded as a
particular character. Soon after, ASCII was extended to 8 bits, since 8 bits are stored in a byte and
the extra bit accommodates 128 additional characters. ASCII was the most common encoding on the
web until about 2007, when UTF-8 overtook it and became the de facto world-wide standard.
c. UTF-8 (Unicode Transformation Format – 8-bit) represents every character in the
Unicode character set, using one to four bytes per character. Unicode is a standard for encoding
most of the world’s written languages. UTF-8 was designed for backwards compatibility with ASCII.
Thus, the ASCII code for a character is the same as its UTF-8 encoding. Finally, Java represents
characters in Unicode, and UTF-8 is the usual encoding for text files (it is the default in recent
Java versions).
d. A text file is binary data that is UTF-8 encoded so that (usually) every 8 bits represent a character.
A program that opens a text file (WordPad, Notepad, Word, Eclipse, etc.) decodes the data and
presents it as human-readable text. For example, the first 9 bytes of the file above are decoded as:
Byte      Character
01001010  J
01100001  a
01110110  v
01100001  a
00100000  (space)
01101001  i
01110011  s
00100000  (space)
01100001  a
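The decoding of the first 9 bytes can be reproduced with a few lines of Java. This is a minimal sketch: the class name BitDecoder and the method decode are made up for illustration, and it assumes the bit string contains only 8-bit (ASCII-range) characters.

```java
import java.nio.charset.StandardCharsets;

class BitDecoder {
    // Decodes a string of '0'/'1' characters, 8 bits per character, into text.
    static String decode(String bits) {
        if (bits.length() % 8 != 0) {
            throw new IllegalArgumentException("bit count must be a multiple of 8");
        }
        byte[] bytes = new byte[bits.length() / 8];
        for (int i = 0; i < bytes.length; i++) {
            // Parse each group of 8 bits as a base-2 number (0..255).
            String group = bits.substring(i * 8, i * 8 + 8);
            bytes[i] = (byte) Integer.parseInt(group, 2);
        }
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // The first 9 bytes of the file shown in Section 19.1.
        String bits = "01001010" + "01100001" + "01110110" + "01100001" + "00100000"
                    + "01101001" + "01110011" + "00100000" + "01100001";
        System.out.println(decode(bits)); // prints "Java is a"
    }
}
```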
5. Binary Encoding – A binary file written with a binary writer class (ObjectOutputStream) in Java (or
some other language) is not human-readable; it is designed to be read by programs. For example, Java
source programs are stored in text files and can be read by a text editor, but Java classes are stored in
binary files and are read by the JVM. The advantage of binary files is that they are more efficient to
process than text files. In the example below, in (a), the number 199 is encoded with UTF-8 as the
characters “1”, “9”, and “9”. In (b) below, the number 199 is converted to its binary representation as a
number (11000111 or C7). Thus, one byte was used to store the number and three bytes were used to
store the text based encoding.
(a) Text I/O program: the characters of "199" are encoded, and the encoding of each character is
stored in the file:
    00110001 00111001 00111001   (0x31 0x39 0x39)
(b) Binary I/O program: the value 199 is written as a byte, and the same byte is stored in the file:
    11000111   (0xC7)
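The 199 example can be checked in memory without touching a file. The sketch below is illustrative only (the class and method names are invented); it encodes 199 once as text and once as a single byte, then prints both byte sequences in hex.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

class TextVsBinary {
    // Text encoding: each digit character becomes its own byte.
    static byte[] asText(int value) {
        return Integer.toString(value).getBytes(StandardCharsets.UTF_8);
    }

    // Binary encoding: a value in the range 0..255 is written as a single byte.
    static byte[] asByte(int value) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(value); // writes the low-order 8 bits of value
        return out.toByteArray();
    }

    // Formats bytes as space-separated hex, e.g. "0x31 0x39".
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("0x%02X ", b & 0xFF));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        System.out.println(hex(asText(199))); // 0x31 0x39 0x39
        System.out.println(hex(asByte(199))); // 0xC7
    }
}
```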
Section 19.2 – How is I/O Handled in Java?
1. A File object encapsulates the properties of a file or a path, but does not contain the methods for
reading/writing data from/to a file. In order to perform I/O, you need to create objects using
appropriate Java I/O classes.
[Figure: The program reads from a file (01011…1001) through an input object created from an
input class, over an input stream. It writes to a file (11001…1011) through an output object
created from an output class, over an output stream.]
Section 19.3 – Text I/O vs. Binary I/O
1. Text I/O requires encoding and decoding. The JVM converts a Unicode character to a file-specific
encoding when writing a character and converts a file-specific encoding back to Unicode when reading
a character. Binary I/O does not require conversions. When you write a byte to a file, the original
byte is copied into the file. When you read a byte from a file, the exact byte in the file is returned.
2. Why are binary files necessary? http://chortle.ccsu.edu/java5/notes/chap86/ch86_6.html
Section 19.4 – Binary I/O Classes
1. The binary I/O classes:
Object
    InputStream
        FileInputStream
        FilterInputStream
            DataInputStream
            BufferedInputStream
        ObjectInputStream
    OutputStream
        FileOutputStream
        FilterOutputStream
            BufferedOutputStream
            DataOutputStream
            PrintStream
        ObjectOutputStream
2. InputStream:
java.io.InputStream
+read(): int
Reads the next byte of data from the input stream. The byte is returned as
an int value in the range 0 to 255. If no byte is available because the end of
the stream has been reached, the value -1 is returned.
+read(b: byte[]): int
Reads up to b.length bytes into array b from the input stream and returns the
actual number of bytes read. Returns -1 at the end of the stream.
+read(b: byte[], off: int, len: int): int
Reads up to len bytes from the input stream and stores them into b[off], b[off+1], …,
b[off+len-1]. The actual number of bytes read is returned. Returns -1 at the
end of the stream.
+available(): int
Returns an estimate of the number of bytes that can be read from the input stream
without blocking.
+close(): void
Closes this input stream and releases any system resources associated with the
stream.
+skip(n: long): long
Skips over and discards n bytes of data from this input stream. The actual
number of bytes skipped is returned.
+markSupported(): boolean
Tests if this input stream supports the mark and reset methods.
+mark(readlimit: int): void
Marks the current position in this input stream.
+reset(): void
Repositions this stream to the position at the time the mark method was last
called on this input stream.
3. OutputStream:
java.io.OutputStream
+write(b: int): void
Writes the specified byte to this output stream. The parameter b is an int value;
(byte)b is written to the output stream.
+write(b: byte[]): void
Writes all the bytes in array b to the output stream.
+write(b: byte[], off: int, len: int): void
Writes b[off], b[off+1], …, b[off+len-1] into the output stream.
+close(): void
Closes this output stream and releases any system resources associated with the
stream.
+flush(): void
Flushes this output stream and forces any buffered output bytes to be written out.
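The read(b) and write(b, off, len) methods are usually combined in a loop that runs until read returns -1. The sketch below is one way to do it; the class name StreamCopy, the helper copyInMemory, and the deliberately tiny buffer are assumptions made for the demonstration, not part of the textbook.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;

class StreamCopy {
    // Copies all bytes from in to out and returns the number of bytes copied.
    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[4]; // tiny on purpose, so the loop runs several times
        long total = 0;
        int n;
        // read(buffer) returns how many bytes were actually read, or -1 at end of stream.
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n); // write only the bytes that were read
            total += n;
        }
        out.flush();
        return total;
    }

    // Convenience wrapper that copies a byte array through in-memory streams.
    static byte[] copyInMemory(byte[] data) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try {
            copy(new ByteArrayInputStream(data), out);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] copy = copyInMemory(new byte[] {10, 20, 30, 40, 50, 60, 70});
        System.out.println(copy.length + " bytes copied"); // 7 bytes copied
    }
}
```

The same copy method works unchanged with FileInputStream and FileOutputStream, since it depends only on the InputStream and OutputStream superclasses.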
4. FileInputStream/FileOutputStream associate a binary input/output stream with an external file. All
the methods in FileInputStream/FileOutputStream are inherited from their superclasses.
5. FileInputStream – To construct a FileInputStream, use the following constructors:
public FileInputStream(String filename)
public FileInputStream(File file)
A java.io.FileNotFoundException will occur if you attempt to create a FileInputStream with a
nonexistent file.
6. FileOutputStream – To construct a FileOutputStream, use the following constructors:
public FileOutputStream(String filename)
public FileOutputStream(File file)
public FileOutputStream(String filename, boolean append)
public FileOutputStream(File file, boolean append)
If the file does not exist, a new file is created. If the file already exists, the first two constructors delete
the current contents in the file. To retain the current content and append new data into the file, use
the last two constructors by passing true to the append parameter.
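The difference between the overwriting and appending constructors can be seen in a short experiment. This is a sketch under assumptions: the class name FileStreamDemo and the temporary scratch file are invented for the example.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

class FileStreamDemo {
    // Writes bytes 1..5 (overwrite mode), appends 6..10, then returns the sum of all bytes read back.
    static int writeAppendAndSum(File file) throws IOException {
        try (FileOutputStream out = new FileOutputStream(file)) {       // discards existing content
            for (int i = 1; i <= 5; i++) {
                out.write(i);
            }
        }
        try (FileOutputStream out = new FileOutputStream(file, true)) { // append = true keeps it
            for (int i = 6; i <= 10; i++) {
                out.write(i);
            }
        }
        int sum = 0;
        try (FileInputStream in = new FileInputStream(file)) {
            int value;
            while ((value = in.read()) != -1) { // -1 signals the end of the stream
                sum += value;
            }
        }
        return sum;
    }

    // Runs the experiment on a temporary scratch file (a hypothetical location).
    static int run() {
        try {
            File file = File.createTempFile("filestream-demo", ".dat");
            file.deleteOnExit();
            return writeAppendAndSum(file);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(run()); // 55, the sum of 1..10
    }
}
```

The try-with-resources statements close each stream automatically, even if a write or read fails.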
7. (Optional) DataInputStream/DataOutputStream – DataInputStream reads bytes from the stream and
converts them into appropriate primitive type values or strings. DataOutputStream converts primitive
type values or strings into bytes and outputs the bytes to the stream.
DataInputStream extends FilterInputStream (which extends InputStream) and implements java.io.DataInput.
+DataInputStream(in: InputStream)
DataOutputStream extends FilterOutputStream (which extends OutputStream) and implements java.io.DataOutput.
+DataOutputStream(out: OutputStream)
java.io.DataInput
+readBoolean(): boolean          Reads a boolean from the input stream.
+readByte(): byte                Reads a byte from the input stream.
+readChar(): char                Reads a character from the input stream.
+readFloat(): float              Reads a float from the input stream.
+readDouble(): double            Reads a double from the input stream.
+readInt(): int                  Reads an int from the input stream.
+readLong(): long                Reads a long from the input stream.
+readShort(): short              Reads a short from the input stream.
+readLine(): String              Reads a line of characters from input.
+readUTF(): String               Reads a string in (modified) UTF-8 format.
java.io.DataOutput
+writeBoolean(b: boolean): void  Writes a boolean to the output stream.
+writeByte(v: int): void         Writes to the output stream the eight low-order bits
                                 of the argument v.
+writeBytes(s: String): void     Writes the lower byte of each character in a string to
                                 the output stream.
+writeChar(v: int): void         Writes a character (composed of two bytes) to the
                                 output stream.
+writeChars(s: String): void     Writes every character in the string s to the output
                                 stream, in order, two bytes per character.
+writeFloat(v: float): void      Writes a float value to the output stream.
+writeDouble(v: double): void    Writes a double value to the output stream.
+writeInt(v: int): void          Writes an int value to the output stream.
+writeLong(v: long): void        Writes a long value to the output stream.
+writeShort(v: int): void        Writes a short value (the two low-order bytes of v) to
                                 the output stream.
+writeUTF(s: String): void       Writes two bytes of length information to the output
                                 stream, followed by the (modified) UTF-8 representation
                                 of every character in the string s.
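A short round trip shows how these methods pair up: values must be read back in the same order and with the same types in which they were written. The sketch below uses in-memory streams, and the names DataStreamDemo, write, and read are illustrative only.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

class DataStreamDemo {
    // Encodes an int, a double, and a string as raw bytes.
    static byte[] write(int id, double price, String title) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(id);       // 4 bytes
            out.writeDouble(price); // 8 bytes
            out.writeUTF(title);    // 2 bytes of length, then the encoded characters
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Decodes the bytes in the same order they were written.
    static String read(byte[] data) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            int id = in.readInt();
            double price = in.readDouble();
            String title = in.readUTF();
            return id + " " + price + " " + title;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] data = write(7, 9.5, "Moby Dick");
        System.out.println(data.length); // 23 = 4 + 8 + 2 + 9
        System.out.println(read(data));  // 7 9.5 Moby Dick
    }
}
```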
Section 19.5 – Problem: Copying Files
Skip.
Section 19.6 – Object I/O
1. ObjectInputStream/ObjectOutputStream enable you to perform binary I/O for objects in addition to
primitive type values and strings. The code examples below show how to write and read values
with these classes.
Section 19.6 – Writing Binary Values
// Some sample data to write.
String s = "Hello Binary";
double x = 3.34;
int y = 13;
double[] vals = new double[] {3.4, 4.2, 6.7};
// Create output stream.
ObjectOutputStream oos =
new ObjectOutputStream(new FileOutputStream("out1.dat"));
// Write data
oos.writeObject(s);
oos.writeDouble(x);
oos.writeInt(y);
oos.writeObject(vals);
// Flush and close output stream.
oos.flush();
oos.close();
oos = null;
Section 19.6 – Reading Binary Values
// Create input stream.
ObjectInputStream ois = new ObjectInputStream(new FileInputStream("out1.dat"));
try
{
while ( true )
{
// Read data values
s = (String)ois.readObject();
x = ois.readDouble();
y = ois.readInt();
vals = (double[])ois.readObject();
}
}
catch (EOFException eofEx)
{
System.out.println("End of file...");
}
finally
{
ois.close();
ois = null;
}
Section 19.6 – Writing Binary (Custom) Objects
In this example we will write Book objects to a binary file. An instance of any class can be written to binary
if it implements the Serializable interface. The Serializable interface is a marker interface. When an object
is written to binary, we say that it is serialized. Serialization refers to the process of converting an object to
0’s and 1’s.
class Book implements Serializable
{
String name;
int pages;
public Book(String name, int pages)
{
this.name = name;
this.pages = pages;
}
public String toString()
{
return name + ":" + pages;
}
}
Writing Book objects to a binary file:
// Create some books.
Book b1, b2;
b1 = new Book("Zen and the", 343);
b2 = new Book("Moby Dick", 875);
// Create output stream.
ObjectOutputStream oos =
new ObjectOutputStream(new FileOutputStream("out2.dat"));
// Write books
oos.writeObject(b1);
oos.writeObject(b2);
// Close output stream so that all data reaches the file.
oos.close();
Section 19.6 – Reading Binary (Custom) Objects
Deserialization refers to the process of converting binary data into objects. This is an example of reading
Book objects from a binary file.
ArrayList<Book> books = new ArrayList<Book>();
// Create input stream
ObjectInputStream ois = new ObjectInputStream(new FileInputStream("out2.dat"));
// Try to read books. Will throw exception when EOF.
// Input stream does not provide a way to check EOF.
try
{
while ( true )
{
Object obj = ois.readObject();
if (obj instanceof Book)
{
b1 = (Book)obj;
books.add( b1 );
}
}
}
catch (EOFException eofEx)
{
System.out.println("End of file...");
}
finally
{
ois.close();
ois = null;
}
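The writing and reading fragments can be combined into one self-contained program. The sketch below is a possible arrangement, not the textbook's code: the class name BookRoundTrip, the helper methods, and the temporary file are assumptions, and try-with-resources replaces the explicit finally block.

```java
import java.io.EOFException;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class BookRoundTrip {
    static class Book implements Serializable {
        private static final long serialVersionUID = 1L;
        String name;
        int pages;
        Book(String name, int pages) { this.name = name; this.pages = pages; }
        @Override public String toString() { return name + ":" + pages; }
    }

    // Serializes each book to the file, one writeObject call per book.
    static void save(File file, List<Book> books) throws IOException {
        try (ObjectOutputStream oos = new ObjectOutputStream(new FileOutputStream(file))) {
            for (Book b : books) {
                oos.writeObject(b);
            }
        }
    }

    // Reads objects until readObject signals the end of the file with EOFException.
    static List<Book> load(File file) throws IOException, ClassNotFoundException {
        List<Book> books = new ArrayList<>();
        try (ObjectInputStream ois = new ObjectInputStream(new FileInputStream(file))) {
            while (true) {
                Object obj = ois.readObject();
                if (obj instanceof Book) {
                    books.add((Book) obj);
                }
            }
        } catch (EOFException endOfFile) {
            // Expected: there is nothing more to read.
        }
        return books;
    }

    // Saves two books to a temporary scratch file and loads them back.
    static List<Book> demo() {
        try {
            File file = File.createTempFile("books", ".dat");
            file.deleteOnExit();
            save(file, Arrays.asList(new Book("Zen and the", 343), new Book("Moby Dick", 875)));
            return load(file);
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // [Zen and the:343, Moby Dick:875]
    }
}
```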
Section 19.6 – Writing Binary Objects Associated with Other Objects
In this example, a Person has an arraylist of Books. Note that both classes must implement the Serializable
interface.
class Book implements Serializable
{
    String name;
    int pages;
    public Book(String name, int pages)
    {
        this.name = name;
        this.pages = pages;
    }
    public String toString()
    {
        return name + ":" + pages;
    }
}
class Person implements Serializable
{
    String name;
    ArrayList<Book> books = new ArrayList<Book>();
    public Person(String name)
    {
        this.name = name;
    }
    public String toString()
    {
        StringBuilder sb = new StringBuilder();
        sb.append( name + "'s books:");
        for( Book b : books )
            sb.append(b + ", ");
        return sb.toString();
    }
}
We write Person objects which contain a list of Book objects.
// Create some books.
Book b1 = new Book("Zen and the", 343);
Book b2 = new Book("Moby Dick", 875);
Book b3 = new Book("Kite Runner", 224);
// Create a person
Person p = new Person( "Dave" );
// Add books to person
p.books.add(b1); p.books.add(b2); p.books.add(b3);
// Create output stream.
ObjectOutputStream oos =
new ObjectOutputStream(new FileOutputStream("out3.dat"));
// Write Person
oos.writeObject(p);
// Close output stream so that all data reaches the file.
oos.close();
Section 19.6 – Reading Binary Objects Associated with Other Objects
ObjectInputStream ois =
new ObjectInputStream(new FileInputStream("out3.dat"));
// The Person and all of its Books are restored by a single call.
p = (Person)ois.readObject();
ois.close();
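Writing one Person serializes the whole object graph reachable from it, including the list and every Book in it. The following sketch demonstrates this with in-memory streams instead of out3.dat; the class PersonGraphDemo and the roundTrip helper are hypothetical names introduced for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

class PersonGraphDemo {
    static class Book implements Serializable {
        private static final long serialVersionUID = 1L;
        String name;
        int pages;
        Book(String name, int pages) { this.name = name; this.pages = pages; }
    }

    static class Person implements Serializable {
        private static final long serialVersionUID = 1L;
        String name;
        List<Book> books = new ArrayList<>();
        Person(String name) { this.name = name; }
    }

    // Serializes a Person to bytes and deserializes a fresh copy from those bytes.
    static Person roundTrip(Person original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream oos = new ObjectOutputStream(bytes)) {
                oos.writeObject(original); // one call writes the Person and its Books
            }
            try (ObjectInputStream ois =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Person) ois.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Person p = new Person("Dave");
        p.books.add(new Book("Moby Dick", 875));
        p.books.add(new Book("Kite Runner", 224));
        Person copy = roundTrip(p);
        System.out.println(copy.name + " has " + copy.books.size() + " books"); // Dave has 2 books
    }
}
```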
Section 19.6 – Dealing with Exceptions
The code as written above can throw a number of Exceptions that must be caught or declared thrown. For
example:
1. Writing Binary – Approach 1
import java.util.*;
import java.io.*;
public class AccountTester
{
public static void main(String[] args)
{
...
saveAccounts2( accounts, FILE_NAME );
}
public static void saveAccounts2( List<Account> accounts, String fileName )
{
try
{
ObjectOutputStream oos =new ObjectOutputStream(new FileOutputStream( fileName ));
for( Account a : accounts )
oos.writeObject(a);
oos.close();
oos = null;
}
catch( Exception e )
{
}
}
Some of the exceptions that can be thrown from this code:
catch ( NotSerializableException nse ) {}
catch ( InvalidClassException ice ) {}
catch ( FileNotFoundException fnfe ) {}
catch ( IOException ioe ) {}
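To see one of these exceptions in action, the sketch below tries to serialize an object whose class does not implement Serializable. The Account class here is a stand-in invented for the example (the notes do not show the real one), and the output stream is in memory. The more specific exception is caught first because NotSerializableException is a subclass of IOException.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;

class NotSerializableDemo {
    // Hypothetical class that deliberately does NOT implement Serializable.
    static class Account {
        double balance = 100.0;
    }

    // Attempts to serialize obj and returns a short description of what happened.
    static String trySave(Object obj) {
        try (ObjectOutputStream oos = new ObjectOutputStream(new ByteArrayOutputStream())) {
            oos.writeObject(obj);
            return "saved";
        } catch (NotSerializableException nse) {
            // The message names the offending class.
            return "not serializable: " + nse.getMessage();
        } catch (IOException ioe) {
            return "I/O error: " + ioe.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(trySave("a String is serializable"));
        System.out.println(trySave(new Account()));
    }
}
```

Catching specific exceptions and reporting them, rather than using an empty catch (Exception e) block, makes failures like this visible.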
2. Reading Binary – Approach 1
import java.util.*;
import java.io.*;
public class AccountTester
{
public static void main(String[] args)
{
...
List<Account> accounts = readAccounts2( FILE_NAME );
}
public static List<Account> readAccounts2( String fileName )
{
List<Account> accounts = new ArrayList<Account>();
ObjectInputStream ois = null;
try
{
ois = new ObjectInputStream( new FileInputStream( fileName ) );
try
{
while ( true )
{
Object obj = ois.readObject();
if (obj instanceof Account)
accounts.add( (Account)obj );
}
}
catch (EOFException eof)
{
ois.close();
ois = null;
return accounts;
}
}
catch (Exception e)
{
return null;
}
}
Section 19.6 – The transient Modifier
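Static variables are not serialized. The transient modifier marks an instance variable that should not be
serialized; it tells the JVM to ignore that variable when serializing the object. In the example below,
Assignment does not implement Serializable, so the assignment field must be transient for an Employee to be
serialized.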
class Employee implements Serializable
{
String name;
transient Assignment assignment;
public Employee(String name)
{
this.name = name;
}
public void initJob()
{
assignment = new Assignment( "Programmer" );
}
public String toString()
{
StringBuilder sb = new StringBuilder();
sb.append( name + "'s job type:" + assignment );
return sb.toString();
}
}
class Assignment
{
String jobType;
public Assignment(String jobType)
{
this.jobType = jobType;
}
public String toString()
{
return jobType;
}
}
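A round trip makes the effect of transient visible: after deserialization the transient field holds its default value (null for object references). The sketch below reuses the Employee/Assignment idea in a reduced form; the class TransientDemo and the roundTrip helper are assumptions for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class TransientDemo {
    // Not Serializable, so it can only be referenced from a transient field.
    static class Assignment {
        String jobType;
        Assignment(String jobType) { this.jobType = jobType; }
    }

    static class Employee implements Serializable {
        private static final long serialVersionUID = 1L;
        String name;
        transient Assignment assignment; // skipped during serialization
        Employee(String name) { this.name = name; }
    }

    // Serializes an Employee to bytes and deserializes a fresh copy.
    static Employee roundTrip(Employee original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream oos = new ObjectOutputStream(bytes)) {
                oos.writeObject(original);
            }
            try (ObjectInputStream ois =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Employee) ois.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Employee e = new Employee("Dave");
        e.assignment = new Assignment("Programmer");
        Employee copy = roundTrip(e);
        System.out.println(copy.name);       // Dave
        System.out.println(copy.assignment); // null: the transient field was not saved
    }
}
```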
Section 19.6 – Appending a Binary File
Each time you open a new ObjectOutputStream, a header is written to the binary file. If you open the stream
again to write, you will write another header. This causes a problem when you read the file back in. Thus,
when we are appending to a file, we must override the writeStreamHeader method to do nothing.
File f = new File("out5.dat");
// Create FileOutputStream
FileOutputStream fos=new FileOutputStream(f,true);
// Determine if file is new.
boolean append=f.exists() && f.length()>0;
// Create ObjectOutputStream.
ObjectOutputStream oos;
if (append)
{
// If appending, create an OOS that overrides writeStreamHeader
// to do nothing.
oos = new ObjectOutputStream(fos)
{
protected void writeStreamHeader() throws IOException{}
};
}
else
{
// If file is new, use default implementation of OOS which
// prints header.
oos = new ObjectOutputStream(fos);
}
// Write books
oos.writeObject(b1);
oos.writeObject(b2);
oos.writeObject(p);
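The appending technique can be tested end to end by appending in several separate sessions and then reading everything back with a single ObjectInputStream. This is a sketch under assumptions: the names AppendDemo, headerless, append, and readAll are invented, the scratch file is temporary, and each session writes a single String to keep the example simple.

```java
import java.io.EOFException;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.OutputStream;
import java.util.ArrayList;
import java.util.List;

class AppendDemo {
    // Creates an ObjectOutputStream that writes no stream header.
    static ObjectOutputStream headerless(OutputStream out) throws IOException {
        return new ObjectOutputStream(out) {
            @Override
            protected void writeStreamHeader() {
                // Intentionally empty: the file already starts with a header.
            }
        };
    }

    // Appends one entry, writing the header only if the file is still empty.
    static void append(File file, String entry) throws IOException {
        boolean hasContent = file.exists() && file.length() > 0;
        try (FileOutputStream fos = new FileOutputStream(file, true);
             ObjectOutputStream oos = hasContent ? headerless(fos) : new ObjectOutputStream(fos)) {
            oos.writeObject(entry);
        }
    }

    // Reads every entry in the file until the end of the file is reached.
    static List<String> readAll(File file) throws IOException, ClassNotFoundException {
        List<String> entries = new ArrayList<>();
        try (ObjectInputStream ois = new ObjectInputStream(new FileInputStream(file))) {
            while (true) {
                entries.add((String) ois.readObject());
            }
        } catch (EOFException endOfFile) {
            // Expected: no more entries.
        }
        return entries;
    }

    // Appends in three separate sessions, then reads the whole file back.
    static List<String> demo() {
        try {
            File file = File.createTempFile("append-demo", ".dat");
            file.deleteOnExit();
            append(file, "first");
            append(file, "second");
            append(file, "third");
            return readAll(file);
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // [first, second, third]
    }
}
```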
Section 19.7 – Random Access Files
Skip
Chapter 9 Review – Text I/O
Writing a Text File
// Declare output file.
File outFile = new File( "output.txt" );
// We will write these numbers to a file, 3 per line,
// comma delimited.
int[] nums = {33, 44, 55, 66, 12, 33, 55, 66, 77, 22};
// Declare and create a PrintWriter.
PrintWriter writer = new PrintWriter( outFile );
// Write a number.
writer.print( nums[0] );
writer.close();
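The fragment above writes only the first number. One way to produce the "3 per line, comma delimited" layout mentioned in its comment is sketched below. The class name TextFormat and the method format are made up, and a StringWriter stands in for the file so the result can be inspected; passing a File to the PrintWriter constructor would write the same text to disk.

```java
import java.io.PrintWriter;
import java.io.StringWriter;

class TextFormat {
    // Writes the numbers perLine to a line, separated by commas, and returns the text.
    static String format(int[] nums, int perLine) {
        StringWriter buffer = new StringWriter();
        try (PrintWriter writer = new PrintWriter(buffer)) {
            for (int i = 0; i < nums.length; i++) {
                writer.print(nums[i]);
                boolean endOfLine = (i + 1) % perLine == 0 || i == nums.length - 1;
                if (endOfLine) {
                    writer.println();  // finish the current line
                } else {
                    writer.print(","); // delimiter between numbers on the same line
                }
            }
        }
        return buffer.toString();
    }

    public static void main(String[] args) {
        int[] nums = {33, 44, 55, 66, 12, 33, 55, 66, 77, 22};
        System.out.print(format(nums, 3));
    }
}
```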
Reading a Text File
String name;
double salary;
// Declare input file.
File inFile = new File( "employees.txt" );
// Declare and create Scanner.
Scanner input = new Scanner( inFile );
// While there are more things to read...
while( input.hasNext() )
{
// Read name and salary.
name = input.next();
salary = input.nextDouble();
}
// Close the Scanner.
input.close();
Appending a Text File
// We will append these numbers to a file, 1 per line
int[] nums = {33, 44, 55, 66 };
// Declare output file.
File outFile = new File( "output3.txt" );
// Create a FileWriter in append mode.
FileWriter fw = new FileWriter( outFile, true );
// Declare and create a PrintWriter.
PrintWriter writer = new PrintWriter( fw );
// Write the numbers, one per line.
for( int i = 0; i < nums.length; i++ )
    writer.println( nums[i] );
// Close the PrintWriter.
writer.close();
Chapter 2 Review – Console I/O
Writing to the Console (System.out)
// We will write these numbers to the console, 3 per line,
// comma delimited.
int[] nums = {33, 44, 55, 66, 12, 33, 55, 66, 77, 22};
// Declare and create a PrintWriter.
PrintWriter writer = new PrintWriter( System.out );
// Write a number.
writer.print( nums[0] );
// Flush the PrintWriter. (Closing it would also close System.out.)
writer.flush();
Reading From the Console (System.in)
String name, title;
double salary;
// Declare and create Scanner.
Scanner input = new Scanner( System.in );
// Read name, salary, and title.
System.out.print( "Type in name:");
name = input.next();
System.out.print( "Type in salary:");
salary = input.nextDouble();
System.out.print( "Type in title:");
title = input.next();
// Close the Scanner.
input.close();