Problem Set 10
Due: 4:30PM, Friday May 10, 2002
Problem 1. Files and hashing, preliminary question (30%)
This problem focuses on the use of the hashCode() method, and touches on the
toString() and equals() methods. It also asks you to use streams. There is
nothing deep about this problem; it's just an exercise for you to work with methods
that each class has inherited from the Java Object class that you should generally
override to have a complete implementation. It also shows you how Java computes
hash codes by default, and it has you write a hash function for a class. Do not be
concerned with the fact that the code you write for this problem doesn't do anything
particularly useful.
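To make the default behavior concrete, here is a small sketch of what hashCode() returns for a few standard classes. The class name DefaultHashDemo and the helper stringHash are made up for illustration and are not part of the assignment. The formula in stringHash is the one documented for java.lang.String.

```java
// Hypothetical demo class; the names here are assumptions, not part of the assignment.
class DefaultHashDemo {

    // Recomputes the documented String hash, s[0]*31^(n-1) + ... + s[n-1],
    // using int arithmetic that wraps around on overflow.
    static int stringHash(String s) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            h = 31 * h + s.charAt(i);
        }
        return h;
    }

    public static void main(String[] args) {
        Integer i = Integer.valueOf(42);
        String s = "ab";
        Object o = new Object();

        System.out.println(i.hashCode());                  // an Integer hashes to its own value: 42
        System.out.println(s.hashCode());                  // 97*31 + 98 = 3105
        System.out.println(stringHash(s) == s.hashCode()); // true
        // The hash inherited from Object is identity-based and can differ between runs.
        System.out.println(o.hashCode() == System.identityHashCode(o)); // true
    }
}
```

A class that does not override hashCode() gets the identity-based value, so two objects with identical fields will usually hash differently.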
A helpful hint: if you are reading or writing files, we suggest that you first create a variable
that indicates the complete path to the directory where the file resides. For example,
// class variable that gives the path where my text files are
public static final String path = "C:\\java\\sampledir\\PS10";
Then, in your method you can create one or more File objects using that path:
// first create file object
File myFile = new File(path, "myfile.dat");
BufferedReader reader = new BufferedReader(new FileReader(myFile));
If you don't use a full path, you will discover that Forte may not be able to find your file.
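As a rough sketch of how the hint above fits together, the following builds a File from a directory path, reads it line by line, and splits each line into tokens. The class name PathDemo, the method readTokens, the file name pathdemo.dat, and the use of the system temporary directory are all assumptions made so the example runs anywhere. Substitute your own path and file name.

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;
import java.io.PrintWriter;
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical demo class; not part of the assignment.
class PathDemo {

    // Returns every whitespace-separated token in the named file of the given directory.
    static List<String> readTokens(String dir, String name) throws IOException {
        File file = new File(dir, name);
        List<String> tokens = new ArrayList<String>();
        // try-with-resources (Java 7 and later) closes the reader even on an exception.
        try (BufferedReader reader = new BufferedReader(new FileReader(file))) {
            String line;
            while ((line = reader.readLine()) != null) {
                StringTokenizer st = new StringTokenizer(line);
                while (st.hasMoreTokens()) {
                    tokens.add(st.nextToken());
                }
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Assumption: the temp directory stands in for a path like C:\java\sampledir\PS10.
        String dir = System.getProperty("java.io.tmpdir");
        try {
            try (PrintWriter out = new PrintWriter(new FileWriter(new File(dir, "pathdemo.dat")))) {
                out.println("3 2.5 a hello");
            }
            System.out.println(readTokens(dir, "pathdemo.dat")); // [3, 2.5, a, hello]
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
```

Each token arrives as a String, so numeric values still need converting, for example with Integer.parseInt or Double.parseDouble.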
Write a program (you may do this in a main() method) to:
a. Open a file called "indata.dat". You should create this file using a text editor.
b. Read two Integers, two Doubles, four chars and two Strings from the file. You may
lay out the file as you wish.
c. Write a new class, called CharPair, that has the following methods:
i. Constructor that takes two chars
ii. toString() method to allow CharPair objects to be output using "println"
iii. equals() method that returns true if the first char and second char are equal
in two CharPair objects. See pp 204-206 in the text.
iv. hashCode() method that hashes CharPair objects with the same chars into
the same hash code, and meets the general desiderata for a hash function.
v. You may only use the getNumericValue() method of the Character class;
otherwise you must use just the primitive chars in your class.
d. Create two CharPair objects and a reference to each. Also create a reference for one
of your Strings, Integers and Doubles. ("Create a reference" simply means that you
should declare a variable and assign it to the object. You will use the reference below.)
e. Invoke hashCode() on all of the objects and references in your program and print out
the results.
f. Invoke the equals() and toString() methods on your CharPair objects.
g. Raise the first Double to the first Integer power, and the second Double to the
second Integer power.
h. Output the new Doubles, one of the Strings you read in, and your new CharPairs to
a file called "outdata.dat"
You should use the BufferedReader and StringTokenizer classes to read from
your file.
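One possible shape for the CharPair class described above is sketched below. The field names, the output format of toString(), and the particular hash formula (31 times the first char plus the second) are illustrative assumptions, not the required answer. Any formula that gives equal pairs equal hash codes and spreads distinct pairs reasonably well would do.

```java
// A sketch only; field names and the hash formula are assumptions.
class CharPair {
    private final char first;
    private final char second;

    CharPair(char first, char second) {
        this.first = first;
        this.second = second;
    }

    @Override
    public String toString() {
        return "(" + first + ", " + second + ")";
    }

    // Two pairs are equal when both chars match, position by position.
    @Override
    public boolean equals(Object other) {
        if (this == other) return true;
        if (!(other instanceof CharPair)) return false;
        CharPair that = (CharPair) other;
        return first == that.first && second == that.second;
    }

    // Equal pairs give equal codes; the multiplier makes the order of the chars matter.
    @Override
    public int hashCode() {
        return 31 * first + second;
    }
}

// Hypothetical driver class for the sketch.
class CharPairDemo {
    public static void main(String[] args) {
        CharPair p = new CharPair('a', 'b');
        CharPair q = new CharPair('a', 'b');
        CharPair r = new CharPair('b', 'a');
        System.out.println(p);                              // (a, b)
        System.out.println(p.equals(q));                    // true
        System.out.println(p.hashCode() == q.hashCode());   // true
        System.out.println(p.equals(r));                    // false
        System.out.println(p.hashCode() + " " + r.hashCode()); // 3105 3135
    }
}
```

Without the multiplier, ('a', 'b') and ('b', 'a') would collide even though equals() reports them as different, which is legal but gives a poorer distribution.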
Problem 2. Dictionary (70%)
You will build a dictionary of words for spell checking and then use it to correct the
spelling in a test file.
a. Read the dictionary from a file and construct a hash table containing all the words. The
dictionary is contained in the file "dictionary.dat" on the course Web site and
contains the 250 most common English words. Use a BufferedReader and
StringTokenizer to process the dictionary. Choose an appropriate hash table size for
a dictionary of 250 to 300 words.
b. Read a second file to add it to the dictionary. This file, dictionary2.dat, contains
some commonly misspelled words, based on a study of Usenet discussion groups.
Each line of the file contains a misspelled word followed by the tab character and the
correct spelling of the word. For example,
minascule     minuscule
millenium     millennium
...
[Did you know that the word millennium is misspelled 57% of the time?]
c. Read a third file, which is to be spell-checked in a simple way: You'll read every word
in the file and look for the word in the dictionary. You will also look for variations on
the word: remove an ending `s', `es', `ing' or `ed' if they exist. A test file for you to
use is in "testfile.dat". To make this problem easier, you do not need to correctly
handle words that include multiple endings. (I.e., the word "endings" has both the
`ing' and `s' inflections. Your solution can simply consider it to be an unknown word,
even if the word `end' is in the dictionary.) Your solution should ignore case when
checking a word. You don't need to preserve the original case in your output; you
can just output everything in lowercase.
d. If you find the word or one of its variations in the dictionary, assume it is spelled
correctly or that it is a common misspelling that you know how to correct. If it is a
misspelling, you will correct it, as noted below. Be careful with the alternate endings
for commonly misspelled words.
e. If you don't find the word, you assume it is misspelled and that you don't know how to
correct it.
f. Output a file, called "checked.dat" that is a copy of the third input file with the
following differences:
i. If a word is misspelled and you know how to correct it, output the
correctly spelled word
ii. If a word is misspelled and you don't know how to correct it, output
the misspelled word in ALL CAPS.
iii. If the word is spelled correctly, output it `as is'.
iv. Keep the same line breaks as in the original file, but you do not need
to preserve capitalization or other whitespace.
g. Remember to close your files. Also, handle IOExceptions in a simple way; printing
the stack trace is sufficient.
h. This spell checker is extremely crude, of course. We don't keep track of whether a
word is a noun, verb, etc., which would limit the valid endings. There is no
capitalization or punctuation. All of these can be handled, but require substantial
additional effort. This problem does demonstrate the general approach taken by
spell-checking programs, though.
We provide the HashTable and HashNode classes and the SimpleMap interface for
your use in this problem. They are on the course Web site.
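The lookup rules above can be sketched roughly as follows. This handout does not show the API of the provided HashTable, HashNode and SimpleMap, so java.util.HashMap stands in for them here as an explicit assumption. The class SpellSketch and its methods addWord, addCorrectionLine and check are hypothetical names. Re-attaching the removed ending to a corrected stem is one reading of "be careful with the alternate endings", not a stated requirement.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch only: java.util.HashMap is a stand-in for the course's HashTable class.
class SpellSketch {
    private static final String[] ENDINGS = {"s", "es", "ing", "ed"};

    // Maps each known word to its correct spelling; a correct word maps to itself.
    private final Map<String, String> dictionary = new HashMap<String, String>();

    void addWord(String word) {
        String w = word.toLowerCase();
        dictionary.put(w, w);
    }

    // Parses one "misspelled<TAB>correct" line; lines without a tab are ignored.
    void addCorrectionLine(String line) {
        int tab = line.indexOf('\t');
        if (tab < 0) return;
        String wrong = line.substring(0, tab).trim().toLowerCase();
        String right = line.substring(tab + 1).trim().toLowerCase();
        dictionary.put(wrong, right);
    }

    // Returns the word to output: lowercase if known or corrected, ALL CAPS if unknown.
    String check(String word) {
        String w = word.toLowerCase();
        String hit = dictionary.get(w);
        if (hit != null) return hit;
        // Try removing exactly one ending; multiple endings are treated as unknown.
        for (String ending : ENDINGS) {
            if (w.length() > ending.length() && w.endsWith(ending)) {
                String stem = w.substring(0, w.length() - ending.length());
                String fixed = dictionary.get(stem);
                if (fixed != null) return fixed + ending;
            }
        }
        return w.toUpperCase();
    }

    public static void main(String[] args) {
        SpellSketch checker = new SpellSketch();
        checker.addWord("end");
        checker.addWord("walk");
        checker.addCorrectionLine("millenium\tmillennium");
        System.out.println(checker.check("Walked"));     // walked
        System.out.println(checker.check("milleniums")); // millenniums
        System.out.println(checker.check("endings"));    // ENDINGS
    }
}
```

Storing correct words as entries that map to themselves lets a single lookup answer both "is it known?" and "what should be output?".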
Extra Credit
You may write a graphical user interface (GUI) for the primary program in the problem set for
a maximum of 40 extra credit points. This is an option in problem sets 6 through 10; you may
get extra credit for building a GUI for one (only) of these problem sets. In general, you are
free to design the GUI as you wish; this is a 'blank piece of paper' exercise. You may not use
System.out.println in your solution; all input and output must be done using Swing. You
should hand in two solutions if you create a GUI:
- Hand in the regular assignment, without a GUI, as described in the assignment. This allows
us to grade the main part of the assignment without having to worry about any potential
errors introduced by your GUI.
- Hand in the full solution using the GUI. This second submission will be graded based only on
the GUI, for a maximum of 40 points. Only the GUI (and its immediate interfaces to the rest of
the code) will be graded. You will receive between 0 and 40 points; even receiving 0 points
cannot hurt or lower your grade on the homework overall.
Specific requirements for Problem Set 10: Your program should use the same files as before
to read in the dictionary words, but instead of reading the text to be spelled from testfile.dat,
your GUI should allow the user to enter the text directly using Swing components.
Additionally, it should display the output in Swing, and the interface should be easy to use.
One sample interface is shown below. The user can enter text in the left text area. The text
area is an instance of JTextArea, which is similar to JTextField, but allows the user to input
multiple lines of text. Pressing the "Spellcheck Text" button runs the spell checker on the text
in the left area, and outputs the result in the right text area. The right text area is not
editable.
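A layout along the lines just described might be sketched as follows. The class name SpellCheckPanel, its field names, and the idea of passing the checker in as a UnaryOperator<String> are assumptions made for illustration. The placeholder checker in main simply upper-cases the text and would be replaced by your own spell-checking code.

```java
import java.awt.BorderLayout;
import java.awt.GridLayout;
import java.util.function.UnaryOperator;
import javax.swing.JButton;
import javax.swing.JFrame;
import javax.swing.JPanel;
import javax.swing.JScrollPane;
import javax.swing.JTextArea;
import javax.swing.SwingUtilities;

// Hypothetical panel: editable input on the left, read-only output on the right.
class SpellCheckPanel extends JPanel {
    final JTextArea input = new JTextArea(12, 30);
    final JTextArea output = new JTextArea(12, 30);
    final JButton button = new JButton("Spellcheck Text");

    // checker is any function from the entered text to the checked text.
    SpellCheckPanel(final UnaryOperator<String> checker) {
        super(new BorderLayout());
        output.setEditable(false);

        JPanel areas = new JPanel(new GridLayout(1, 2));
        areas.add(new JScrollPane(input));
        areas.add(new JScrollPane(output));
        add(areas, BorderLayout.CENTER);
        add(button, BorderLayout.SOUTH);

        // Pressing the button runs the checker on the left text and shows the result on the right.
        button.addActionListener(e -> output.setText(checker.apply(input.getText())));
    }

    public static void main(String[] args) {
        // Swing components should be created on the event dispatch thread.
        SwingUtilities.invokeLater(() -> {
            JFrame frame = new JFrame("Spell Checker");
            frame.setDefaultCloseOperation(JFrame.EXIT_ON_CLOSE);
            // Placeholder checker (assumption): replace with real spell-checking logic.
            frame.add(new SpellCheckPanel(text -> text.toUpperCase()));
            frame.pack();
            frame.setVisible(true);
        });
    }
}
```

Keeping the checker separate from the panel means the non-GUI submission and the GUI submission can share the same spell-checking code.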
Turnin
Turnin Requirements
Hardcopy and electronic copy of ALL source code (all .java files).
Place a comment with your name, username, section, TA's name, assignment
number, and list of people with whom you have discussed the problem set on ALL files
you submit.
Do NOT turn in electronic or hardcopies of compiled byte code (.class files).
Electronic Turnin
To electronically turn in your problem sets, run Netscape. Then go to the 1.00 web
page at:
http://command.mit.edu/1.00Spring02
Click on the "Submit Assignment" button. Be sure to set the Selection Bar to Problem
Set 10 or your files may be lost. Finally, go back to the home page and click on the
"View" section and be sure that your files were received. If you submit a file twice, the
latest version will be graded.
Collaboration
For this problem set, you may work together with at most one other person in the class. If you
choose to work with a partner, you must include both of your names on your submission. (If
you have different TAs, be sure you write both TAs' names on your PS.) You will not be
allowed to add the name of your partner after submitting your problem set. Only submit the
problem set ONCE (choose either person).
Penalties
Missing Hardcopy: 10% off the problem score.
Missing Electronic Copy: 30% off the problem score.
Late Turnin: 20% off the problem score if 1 day late. More than 1 day late = NO CREDIT.