Maps - Princeton High School

Maps
In our discussions of collections, we have made
two assumptions about the way elements are
stored in a collection:
 Where we store an element has nothing to
do with the element itself, as in the case of
our set and unordered lists. In order to
find any item we must search the entire
collection.
 The element is stored based on an index
(numeric integer) value that may or may
not (often not) be related to any piece of
the item itself.
A conceptual view of a map
Map Operations
There are many ways to implement maps.
We will start by looking at a MapADT class and
identifying properties that are common to all
maps.
 Insertion into a Map: adding items to a
map is called associating a value with
its key. Both the value being inserted into
the map and the key for that value must
be stored in a way that makes it easy or at
least possible to find it again.
Map insertion
 In this implementation, a new element is
inserted at the end of the map.
 Where elements are in a map depends on
the type of data storage used in the map
implementation.
 Because values in a map are stored by a
unique key, no two items in the map can
have the same key.
 The usual way of associating a new object
with an already existing key is to replace
the current object within the map’s
storage, returning the current object from
the put method.
The insertion of an element with a preexisting
key value
The value 2 is returned
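The replace-and-return behavior described above can be observed with java.util.HashMap, whose put method returns the value previously stored under the key, or null if the key was new. This is only a small sketch: the class name PutReplaceDemo and the sample keys and values are made up for illustration.

```java
import java.util.HashMap;
import java.util.Map;

class PutReplaceDemo {
    // Stores two associations, then reuses the key "b" and reports what put returns.
    static Integer replaceExisting() {
        Map<String, Integer> map = new HashMap<>();
        map.put("a", 1);           // new key: put returns null
        map.put("b", 2);           // new key: put returns null
        return map.put("b", 5);    // existing key: 5 replaces 2, and 2 is returned
    }

    public static void main(String[] args) {
        System.out.println(replaceExisting());   // prints 2
    }
}
```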
 Removing associations from a Map:
Both the value being stored and the key
value are removed.
 If the data structure is asked for a set of all
of the items stored in the map, it will not
return either the removed value or the key.
Removing associations from a map
The value 3 is returned
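A short sketch of removal, again using java.util.HashMap as a stand-in for any map implementation. The class name RemoveDemo and the sample data are assumptions chosen for illustration; remove returns the value that was stored, and afterwards neither the key nor the value is reported by the map.

```java
import java.util.HashMap;
import java.util.Map;

class RemoveDemo {
    // Builds a small map with three associations.
    static Map<String, Integer> sample() {
        Map<String, Integer> map = new HashMap<>();
        map.put("a", 1);
        map.put("b", 2);
        map.put("c", 3);
        return map;
    }

    public static void main(String[] args) {
        Map<String, Integer> map = sample();
        System.out.println(map.remove("c"));            // prints 3, the removed value
        System.out.println(map.keySet().contains("c")); // prints false: the key is gone
        System.out.println(map.containsValue(3));       // prints false: the value is gone
        System.out.println(map.remove("c"));            // prints null: nothing left to remove
    }
}
```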
Implementing a Map with Arrays
One way to implement a map is to use an array.
Consider the class ArrayMap, designed to
implement a MapADT interface.
The MapADT interface needs the
implementation to have the following:
 The ArrayMap Class: In order to store
all of the keys and values of a map we are
going to use a MapElement object, which
will store the key and the value together.
 The ArrayMap class will then have a
private-data array of MapElement objects.
 Because we want to add and remove
elements efficiently we are going to use a
count variable to keep track of the number
of associations and only resize the array
when we need to.
 In this application, by storing both the key
and value in one class we can be sure
that we will not mix up associations when
we add and remove elements.
 No changes of the actual value will be
allowed.
 If a new association is added to the map,
a new MapElement instance will be
created for that association.
 The indexOffKey operation searches the
array for a key and returns the index of
its association.
 The put Operation: Because things do not
need to be in order for a map, we can just
insert the element at the end of the array.
This method uses the helper method to
determine whether the key-value
association already exists.
 The get Method lets us retrieve an item
using its key value. If the key value is not
associated with any value in the map, the
get method returns null.
 The keySet Method: One way to traverse
a map is to get a set of the keys, iterate
over that set, and then retrieve each
element from the map based on its
individual keys. Unlike a set, there is no
direct way to iterate over a map. The
keySet method returns a set of all of the
keys stored in the map. The keySet method
uses the SetADT interface and the
LinkedSet class.
 The remove Method removes the key-value
association for the given key. The
remove method will return the value
associated with that key, if one exists, or
null if it does not exist.
 As part of the remove operation, the
contents array is shifted so that there are
no blank spaces. Because the order of
the elements does not matter, the last
element is simply moved to fill in the
empty space.
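The pieces described above can be put together in one possible sketch of an array-based map. This is not the chapter's actual ArrayMap: it does not implement MapADT, the search helper is spelled indexOfKey here, the starting capacity of 4 and the doubling policy are arbitrary choices, keys are assumed to be non-null, and java.util.HashSet stands in for the SetADT and LinkedSet classes, which are not shown in these notes.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Holds one key-value pair. There are no setters, so an association never changes.
class MapElement<K, V> {
    private final K key;
    private final V value;

    MapElement(K key, V value) { this.key = key; this.value = value; }

    K getKey() { return key; }
    V getValue() { return value; }
}

class ArrayMap<K, V> {
    private MapElement<K, V>[] contents;
    private int count = 0;   // number of associations currently stored

    @SuppressWarnings("unchecked")
    ArrayMap() { contents = (MapElement<K, V>[]) new MapElement[4]; }

    // Helper: linear search for the key. Returns its index, or -1 if it is absent.
    private int indexOfKey(K key) {
        for (int i = 0; i < count; i++) {
            if (contents[i].getKey().equals(key)) return i;
        }
        return -1;
    }

    // Associates value with key. Returns the replaced value, or null for a new key.
    V put(K key, V value) {
        MapElement<K, V> element = new MapElement<>(key, value);
        int index = indexOfKey(key);
        if (index >= 0) {                      // key exists: swap in a new element
            V old = contents[index].getValue();
            contents[index] = element;
            return old;
        }
        if (count == contents.length) {        // array full: resize only when needed
            contents = Arrays.copyOf(contents, count * 2);
        }
        contents[count++] = element;           // order does not matter: add at the end
        return null;
    }

    // Returns the value stored under key, or null if the key is not in the map.
    V get(K key) {
        int index = indexOfKey(key);
        return index >= 0 ? contents[index].getValue() : null;
    }

    // Removes the association for key and returns its value, or null if absent.
    V remove(K key) {
        int index = indexOfKey(key);
        if (index < 0) return null;
        V old = contents[index].getValue();
        contents[index] = contents[count - 1]; // move the last element into the gap
        contents[count - 1] = null;
        count--;
        return old;
    }

    // Collects every key into a set so that callers can iterate over the map.
    Set<K> keySet() {
        Set<K> keys = new HashSet<>();
        for (int i = 0; i < count; i++) keys.add(contents[i].getKey());
        return keys;
    }

    int size() { return count; }
}
```

Because indexOfKey scans the array, put, get, and remove all take time proportional to the number of associations, even though the final insertion or swap is a single step.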
Using a Hash Table to Implement a Map
A hashing function determines the location
of an item
within a collection.
Each location in the table is called a cell or
bucket.
 Consider a simple example. We create an
array that will hold 26 elements. We want
to store names in our array, so we create
a hashing function that links each name to
a position in the array based on the first
letter of the name.
 For example, the name Ann, first letter A,
is mapped to position 0 of the array, and
Doug, first letter D, is mapped to position
3.
 Another simple example: suppose we are
storing important dates in an array of size
10 using hashing. We could define the
hash function to be (year + month +
day)%10. The remainder of any number
divided by 10 will be in the range 0 to 9.
 In an array map the values and keys are
not stored in any particular order, so the
time it takes to access a particular element
depends on the size of the array.
 But in a hash table the time it takes to
access an element has nothing to do with
the number of elements in the table
because the key value takes you straight
to the element. We don’t have to do
comparisons to find a particular key. We
simply calculate where the key should be.
This means the operations on an element
in a hash table are all O(1).
 However, this only works if each element
maps to a unique position in the table.
What will happen if we try to store the
name Ann and the name Andrew? When
two elements or keys map to the same
location in the table, we have a collision.
But even though collisions can slow things
down, many of the objects we use for key
values have efficient hash methods that
will result in minimal collisions and O(1)
access.
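The two hashing functions from the examples above can be sketched as follows. The class and method names are made up for illustration, firstLetterHash assumes every name starts with a letter from A to Z, and the sample date is arbitrary.

```java
class SimpleHashes {
    // Maps a name to a position 0..25 based on its first letter.
    static int firstLetterHash(String name) {
        return Character.toUpperCase(name.charAt(0)) - 'A';
    }

    // Maps a date to a position 0..9 using (year + month + day) % 10.
    static int dateHash(int year, int month, int day) {
        return (year + month + day) % 10;
    }

    public static void main(String[] args) {
        System.out.println(firstLetterHash("Ann"));     // prints 0
        System.out.println(firstLetterHash("Doug"));    // prints 3
        System.out.println(firstLetterHash("Andrew"));  // prints 0: a collision with Ann
        System.out.println(dateHash(2024, 7, 4));       // prints 5, since 2035 % 10 = 5
    }
}
```

Ann and Andrew land in the same cell, which is exactly the collision described above; a real hash table needs some policy for storing both.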
HashMaps
The HashMap implementation provided in the
Java class library is part of the
java.util package.
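A brief usage sketch of java.util.HashMap, traversing the map through its key set. The class name KeySetTraversal and the names and ages are invented sample data. HashMap makes no promise about the order in which keys come back, so the example adds the values up rather than relying on print order.

```java
import java.util.HashMap;
import java.util.Map;

class KeySetTraversal {
    // Builds a small map from names to ages.
    static Map<String, Integer> sample() {
        Map<String, Integer> ages = new HashMap<>();
        ages.put("Ann", 16);
        ages.put("Doug", 17);
        ages.put("Andrew", 15);
        return ages;
    }

    // Walks the key set and looks each value up by its key.
    static int sumByKeys(Map<String, Integer> map) {
        int total = 0;
        for (String key : map.keySet()) {
            total += map.get(key);
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sumByKeys(sample()));   // prints 48
    }
}
```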
Analysis of the two implementations
of a Map
 The two types of maps, array and
hash, use space in similar ways. They
both store the value and keys for all
information contained within the map.
 But because the hash functionality is
provided through a method and not a
lookup table, a hashmap does not
need additional storage space.
Analysis of time complexity of
each operation separately
Analysis of put
 Because the order of the elements in
the ArrayMap implementation of Map
does not matter, we can simply add to
the end of the array. Therefore placing
an element in the map takes the
following steps:
1. Check to see whether
the key is already in
the map.
2. Create a new
MapElement to be
added to the map.
3. Check to see whether
the array is full, and if it
is, resize it.
4. Insert the new
MapElement where it
belongs.
 Checking to see whether the key is
already in the map takes O(n), so the
array implementation of put as a whole
has a complexity of O(n).
 In the hash implementation, the
elements are ordered by a hashing
method based on the key value.
Therefore the steps are as follows:
1. Check to see whether
the key is already in
the map.
2. Create a new
MapElement to be
added to the map.
3. Insert the MapElement
where appropriate
based upon the
hashing function.
 Because getting an element based on
its key is O(1), the hash table
implementation of put as a whole has
a complexity of O(1).
Analysis of remove
 Because the order of the elements in
an array does not matter, when we
remove an element the last element in
the array can move to fill in the space
left by the previous association pair of
key and value.
 The steps required for this are:
1. Determine whether the
key given exists in the
map.
2. Remove the
association (if it exists)
from the map.
 Determining whether the key exists in
the map has a worst-case complexity
of O(n), so the time for this operation
in an array implementation of a map
is, as before, O(n).
 A hash implementation of a map
would take the same steps, but
determining whether the key exists in
the map is an O(1) operation
thanks to the hashing function, so the
remove method in a hashmap has a
complexity of O(1).
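The hash-table steps for put and remove analyzed above might look like the sketch below. The notes do not say how collisions are handled, so this example assumes separate chaining (each cell holds a small list of entries); that choice, the class name ChainedHashMap, the fixed capacity, and the requirement that keys be non-null are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class ChainedHashMap<K, V> {
    // One immutable key-value association.
    private static class Entry<K, V> {
        final K key;
        final V value;
        Entry(K key, V value) { this.key = key; this.value = value; }
    }

    private final List<Entry<K, V>>[] buckets;

    @SuppressWarnings("unchecked")
    ChainedHashMap(int capacity) {
        buckets = (List<Entry<K, V>>[]) new List[capacity];
        for (int i = 0; i < capacity; i++) buckets[i] = new ArrayList<>();
    }

    // The hashing function: turn the key's hash code into a cell of the table.
    private List<Entry<K, V>> bucketFor(K key) {
        return buckets[Math.floorMod(key.hashCode(), buckets.length)];
    }

    // Searches only the one bucket the key hashes to. Returns -1 if the key is absent.
    private int indexIn(List<Entry<K, V>> bucket, K key) {
        for (int i = 0; i < bucket.size(); i++) {
            if (bucket.get(i).key.equals(key)) return i;
        }
        return -1;
    }

    // Returns the replaced value, or null if the key was new.
    V put(K key, V value) {
        List<Entry<K, V>> bucket = bucketFor(key);
        int i = indexIn(bucket, key);                    // 1. is the key already present?
        Entry<K, V> element = new Entry<>(key, value);   // 2. create the new element
        if (i >= 0) return bucket.set(i, element).value; // 3a. replace the old element
        bucket.add(element);                             // 3b. insert into its bucket
        return null;
    }

    // Returns the value stored under key, or null if the key is absent.
    V get(K key) {
        List<Entry<K, V>> bucket = bucketFor(key);
        int i = indexIn(bucket, key);
        return i >= 0 ? bucket.get(i).value : null;
    }

    // Removes the association and returns its value, or null if the key is absent.
    V remove(K key) {
        List<Entry<K, V>> bucket = bucketFor(key);
        int i = indexIn(bucket, key);
        return i >= 0 ? bucket.remove(i).value : null;
    }
}
```

Each operation looks at a single bucket, so the cost stays near O(1) as long as the hash codes spread keys evenly and buckets stay short. If many keys collide into one bucket, the search inside it degrades toward O(n).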
Short Questions
1. How is data stored in a map?
2. Why would Social Security numbers be
better map associations for data on
people than, say, birthdays?
3. Why is it important that when values
are removed from the map, their key is
removed also?
4. How many arrays would it take to store
key-value associations if the
MapElement object were not used?
5. Why does storing a count improve the
efficiency of a map? Aside from
returning the size of the map, what
other ways is the count useful?
6. What is the benefit of storing the key
and its value together in a class used
by the map?
7. Why is the indexOffKey method of
ArrayMap a private method?
8. Why is the grouping of keys returned as
a set?
9. What type of value should a hash code
be?
10. What part of the hashmap
implementation makes it more efficient
than an array map?
Exercises
1. Using the expression
x.put(myString.substring(0,1), myString)
and given that x is a map and myString
is a string variable, what is the
relationship between key and value?
2. Describe possible solutions for an
imperfect mapping (where keys share
multiple values). Reworking the keys to
be used is not an option.
3. How would the time complexity of put
and get change in an array map if there
were no count variable?
4. What other data structures can you
think of for a map besides an array or a
hash table?
5. Consider a database to store research
information for elementary students studying
countries. They need to record the
name of the country, population, and
name of the current head of state
(ruler). Describe how and why this
could be implemented as a map.
6. Look at the Java API and find another
implementation of a map besides array
and hash. Describe how it is different.
7. Why does a hashmap accept an object
variable if every item needs a hashing
function? What guarantees that the
function will exist?
8. Use the Java API to find out what other
type of collection also has a hash table
implementation. (Hint: it’s in the same
package as hashmap.)
9. Why will the keySet method of a map
never have a time better than O(n) for
any implementation of a map?
Programming Projects
1. Implement the clear method of
ArrayMap based on public void clear( ),
shown in this chapter.
2. Implement the containsKey method of
ArrayMap based on public boolean
containsKey(Object key), shown in this
chapter.
3. Implement the keySet method of
ArrayMap based on public SetADT
keySet( ), shown in this chapter.
4. Consider implementing a map to store
information about countries and their
capital cities. Use the name of the
country as the key. Write a program
that uses your completed ArrayMap
and stores country names and their
capitals. The program should store
several countries and then use an
iteration over the set of keys in order to
print a complete list.
5. Create your own implementation of a
hashmap. Consider a program to store
student records for a high school. The
school has at most 500 students. Write
a program that uses a student number
(from 1 to 500) as a hash-code value in
order to store it in an array. The key
value for the map should be an integer
representation of that student number.
Write a Student class as well as an
ArrayHashMap class.
6. Repeat problem 4 using
java.util.TreeMap as your data
structure. Read the documentation for
TreeMap. What is the time analysis for
retrieval of any value stored in the map
with a given key?
7. Repeat problem 5 using
java.util.HashMap. What principles of
object-oriented design made this easy
to implement with your already created
design?
8. Sometimes it is necessary to store
more than one object at a particular
key. Consider a mailbox. Items in the
mailbox are in no particular order.
Implement a Mail object that is able to
store the name of the sender as well as
the receiver of the mail. A map can be
used by a post office to track the mail
that comes through the systems. Each
mailbox has a unique key (its postal
address). However, each must store
multiple pieces of mail. Implement a
program that uses a map, with postal
addresses for keys, that stores a set of
mail currently in the mailbox. Use
java.util.HashMap for your
implementation of a map.