Maps

In our discussions of collections, we have made two assumptions about the way elements are stored in a collection:

1. Where we store an element has nothing to do with the element itself, as in the case of our set and unordered lists. In order to find any item we must search the entire collection.
2. The element is stored based on an index (a numeric integer value) that may or may not (often not) be related to any piece of the item itself.

A map takes a different approach: each value is stored and retrieved by means of a key that is associated with it.

Figure: A conceptual view of a map

Map Operations

There are many ways to implement maps. We will start by looking at a MapADT interface and identifying properties that are common to all maps.

Insertion into a Map: Adding an item to a map is called associating a value with its key. Both the value being inserted into the map and the key for that value must be stored in a way that makes it easy, or at least possible, to find the value again.

Figure: Map insertion

In this implementation, a new element is inserted at the end of the map. Where elements are placed in a map depends on the type of data storage used in the map implementation. Because values in a map are stored by a unique key, no two items in the map can have the same key. The usual way of associating a new object with an already existing key is to replace the current object within the map’s storage and return the replaced object from the put method.

Figure: The insertion of an element with a preexisting key value (the value 2 is returned)

Removing Associations from a Map: Both the value being stored and its key are removed. If the data structure is then asked for a set of all of the items stored in the map, it will return neither the removed value nor its key.

Figure: Removing associations from a map (the value 3 is returned)

Implementing a Map with Arrays

One way to implement a map is to use an array. Consider the class ArrayMap, designed to implement a MapADT interface.
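Before turning to ArrayMap, the put and remove behavior described under Map Operations can be tried out with java.util.HashMap from the Java class library, whose put and remove methods follow the same conventions. The following is only a minimal sketch: the class name MapOperationsDemo and the keys and values are made up for illustration and are not taken from the chapter's figures.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo class; the keys and values are illustrative only.
class MapOperationsDemo {
    /** Performs a few map operations and returns what each call handed back. */
    static Integer[] run() {
        Map<String, Integer> map = new HashMap<>();
        Integer first = map.put("b", 2);     // new key: nothing replaced, returns null
        Integer replaced = map.put("b", 5);  // existing key: value replaced, old value 2 returned
        map.put("c", 3);
        Integer removed = map.remove("c");   // key and value both removed, value 3 returned
        Integer missing = map.get("c");      // key no longer present, returns null
        return new Integer[] {first, replaced, removed, missing, map.size()};
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(run())); // prints [null, 2, 3, null, 1]
    }
}
```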
The MapADT interface requires an implementation to provide the map operations discussed in this chapter, such as put, get, remove, containsKey, keySet, and clear.

The ArrayMap Class: In order to store all of the keys and values of a map we are going to use a MapElement object, which stores a key and its value together. The ArrayMap class will then have a private array of MapElement objects. Because we want to add and remove elements efficiently, we are going to use a count variable to keep track of the number of associations, and we will resize the array only when we need to. By storing both the key and the value in one class, we can be sure that we will not mix up associations when we add and remove elements. A MapElement's value cannot be changed after it is created; if a new association is added to the map, a new MapElement instance is created for that association.

The indexOfKey Operation searches the array for a given key and returns the index at which it is stored.

The put Operation: Because the elements of a map do not need to be in order, we can simply insert a new element at the end of the array. This method uses the indexOfKey helper method to determine whether an association for the key already exists.

The get Method lets us retrieve an item using its key. If the key is not associated with any value in the map, the get method returns null.

The keySet Method: Unlike a set, a map offers no direct way to iterate over its contents. One way to traverse a map is to get a set of the keys, iterate over that set, and then retrieve each element from the map by its key. The keySet method returns a set of all of the keys stored in the map. It uses the SetADT interface and the LinkedSet class.

The remove Method removes the key-value association for the given key. It returns the value associated with that key if one exists, or null if it does not. As part of the remove operation, the contents array is adjusted so that there are no blank spaces.
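The ArrayMap design described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the chapter's own listing: the default capacity of 10, the doubling growth policy, the -1 result for a missing key, and the accessor names getKey and getValue are all hypothetical. Keys are assumed to be non-null. The sketch does not declare that it implements MapADT, and it omits keySet, because the MapADT, SetADT, and LinkedSet types are not shown here.

```java
import java.util.Arrays;

/** Key-value pair that cannot be changed; a new instance is made per association. */
class MapElement<K, V> {
    private final K key;
    private final V value;

    MapElement(K key, V value) {
        this.key = key;
        this.value = value;
    }

    K getKey() { return key; }
    V getValue() { return value; }
}

/** Array-backed map sketch; capacity and growth policy are assumptions. */
class ArrayMap<K, V> {
    private static final int DEFAULT_CAPACITY = 10; // assumed starting size
    private MapElement<K, V>[] contents;
    private int count; // number of associations currently stored

    @SuppressWarnings("unchecked")
    ArrayMap() {
        contents = (MapElement<K, V>[]) new MapElement[DEFAULT_CAPACITY];
        count = 0;
    }

    /** Linear search: returns the index holding key, or -1 if it is absent. */
    private int indexOfKey(K key) {
        for (int i = 0; i < count; i++) {
            if (contents[i].getKey().equals(key)) {
                return i;
            }
        }
        return -1;
    }

    /** Associates value with key; returns the replaced value, or null if none. */
    V put(K key, V value) {
        MapElement<K, V> element = new MapElement<>(key, value);
        int index = indexOfKey(key);
        if (index >= 0) {
            V old = contents[index].getValue();
            contents[index] = element; // replace the existing association
            return old;
        }
        if (count == contents.length) {
            contents = Arrays.copyOf(contents, contents.length * 2); // resize only when full
        }
        contents[count++] = element; // order does not matter: append at the end
        return null;
    }

    /** Returns the value for key, or null if the key is not in the map. */
    V get(K key) {
        int index = indexOfKey(key);
        return index >= 0 ? contents[index].getValue() : null;
    }

    /** Removes the association for key; returns its value, or null if absent. */
    V remove(K key) {
        int index = indexOfKey(key);
        if (index < 0) {
            return null;
        }
        V old = contents[index].getValue();
        contents[index] = contents[count - 1]; // last element fills the gap
        contents[count - 1] = null;
        count--;
        return old;
    }

    int size() { return count; }
}
```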
Because the order of the elements does not matter, the last element is simply moved to fill in the empty space.

Using a Hash Table to Implement a Map

A hashing function determines the location of an item within a collection. Each location in the table is called a cell or bucket.

Consider a simple example. We create an array that will hold 26 elements. We want to store names in our array, so we create a hashing function that links each name to a position in the array based on the first letter of the name. For example, the name Ann, first letter A, is mapped to position 0 of the array, and Doug, first letter D, is mapped to position 3.

Another simple example: suppose we are storing important dates in an array of size 10 using hashing. We could define the hash function to be (year + month + day) % 10. The remainder of any nonnegative integer divided by 10 will be in the range 0 to 9.

In an array map the values and keys are not stored in any particular order, so the time it takes to access a particular element depends on the number of elements in the array. In a hash table, by contrast, the time it takes to access an element has nothing to do with the number of elements in the table, because the key takes you straight to the element. We don’t have to do comparisons to find a particular key; we simply calculate where the key should be. This means the operations on an element in a hash table are all O(1).

However, this only works if each element maps to a unique position in the table. What will happen if we try to store the name Ann and the name Andrew? When two elements or keys map to the same location in the table, we have a collision. Although collisions can slow things down, many of the objects we use as keys have efficient hash methods that result in minimal collisions and O(1) access.

HashMaps

The HashMap implementation provided in the Java class library is part of the java.util package.
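The two hashing examples from this section, along with the Ann and Andrew collision, can be sketched as follows. The class name HashSketch and its method names are hypothetical, names are assumed to be non-empty and to begin with a letter from A to Z, and the date used is arbitrary.

```java
// Hypothetical sketch of the two hashing functions described in the text.
class HashSketch {
    /** Maps a name to a cell 0..25 by its first letter (assumes it starts with A-Z or a-z). */
    static int firstLetterHash(String name) {
        return Character.toUpperCase(name.charAt(0)) - 'A';
    }

    /** (year + month + day) % 10, which lies in 0..9 for nonnegative inputs. */
    static int dateHash(int year, int month, int day) {
        return (year + month + day) % 10;
    }

    public static void main(String[] args) {
        String[] table = new String[26]; // one cell per letter
        for (String name : new String[] {"Ann", "Doug", "Andrew"}) {
            int cell = firstLetterHash(name);
            if (table[cell] == null) {
                table[cell] = name;
                System.out.println(name + " stored in cell " + cell);
            } else {
                // Two keys computed the same location: a collision.
                System.out.println(name + " collides with " + table[cell] + " in cell " + cell);
            }
        }
        System.out.println(dateHash(2024, 7, 4)); // (2024 + 7 + 4) % 10 = 5
    }
}
```

Ann and Andrew both compute cell 0, so the second of them cannot simply be stored there. This sketch only reports the collision; it does not resolve it.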

Analysis of the Two Implementations of a Map

The two types of maps, array and hash, use space in similar ways. They both store the values and keys for all information contained within the map. Because the hash functionality is provided through a method and not a lookup table, a hashmap does not need additional storage space for hashing.

Analysis of the time complexity of each operation separately:

Analysis of put: Because the order of the elements in the ArrayMap implementation of a map does not matter, we can simply add to the end of the array. Placing an element in the map therefore takes the following steps:

1. Check to see whether the key is already in the map.
2. Create a new MapElement to be added to the map.
3. Check to see whether the array is full, and if it is, resize it.
4. Insert the new MapElement where it belongs.

Checking to see whether the key is already in the map takes O(n), so the array implementation of put as a whole has a complexity of O(n).

In the hash implementation, the elements are placed by a hashing method based on the key. The steps are therefore as follows:

1. Check to see whether the key is already in the map.
2. Create a new MapElement to be added to the map.
3. Insert the MapElement where appropriate based upon the hashing function.

Because getting an element based on its key is O(1), the hash table implementation of put as a whole has a complexity of O(1).

Analysis of remove: Because the order of the elements in an array does not matter, when we remove an element the last element in the array can move to fill in the space left by the removed key-value pair. The steps required for this are:

1. Determine whether the given key exists in the map.
2. Remove the association (if it exists) from the map.

Determining whether the key exists in the map has a worst-case complexity of O(n), so the time for this operation in an array implementation of a map is, as before, O(n).
A hash implementation of a map would take the same steps, but determining whether the key exists in the map has a time complexity of O(1) thanks to the hashing function, so the remove method in a hashmap has a complexity of O(1).

Short Questions

1. How is data stored in a map?
2. Why would Social Security numbers be better map keys for data on people than, say, birthdays?
3. Why is it important that when values are removed from the map, their keys are removed also?
4. How many arrays would it take to store key-value associations if the MapElement object were not used?
5. Why does storing a count improve the efficiency of a map? Aside from returning the size of the map, in what other ways is the count useful?
6. What is the benefit of storing the key and its value together in a class used by the map?
7. Why is the indexOfKey method of ArrayMap a private method?
8. Why is the grouping of keys returned as a set?
9. What type of value should a hash code be?
10. What part of the hashmap implementation makes it more efficient than an array map?

Exercises

1. Using the expression x.put(myString.substring(0, 1), myString) and given that x is a map and myString is a string variable, what is the relationship between key and value?
2. Describe possible solutions for an imperfect mapping (where keys share multiple values). Reworking the keys to be used is not an option.
3. How would the time complexity of put and get change in an array map if there were no count variable?
4. What other data structures can you think of for a map besides an array or a hash table?
5. Consider a database to store research information for elementary school students studying countries. They need to record the name of the country, its population, and the name of the current head of state (ruler). Describe how and why this could be implemented as a map.
6. Look at the Java API and find another implementation of a map besides array and hash. Describe how it is different.
7. Why does a hashmap accept an object variable if every item needs a hashing function? What guarantees that the function will exist?
8. Use the Java API to find out what other type of collection also has a hash table implementation. (Hint: it’s in the same package as HashMap.)
9. Why will the keySet method of a map never have a time better than O(n) for any implementation of a map?

Programming Projects

1. Implement the clear method of ArrayMap based on public void clear(), shown in this chapter.
2. Implement the containsKey method of ArrayMap based on public boolean containsKey(Object key), shown in this chapter.
3. Implement the keySet method of ArrayMap based on public SetADT keySet(), shown in this chapter.
4. Consider implementing a map to store information about countries and their capital cities. Use the name of the country as the key. Write a program that uses your completed ArrayMap and stores country names and their capitals. The program should store several countries and then iterate over the set of keys in order to print a complete list.
5. Create your own implementation of a hashmap. Consider a program to store student records for a high school. The school has at most 500 students. Write a program that uses a student number (from 1 to 500) as a hash-code value in order to store it in an array. The key for the map should be an integer representation of that student number. Write a Student class as well as an ArrayHashMap class.
6. Repeat problem 4 using java.util.TreeMap as your data structure. Read the documentation for TreeMap. What is the time analysis for retrieval of any value stored in the map with a given key?
7. Repeat problem 5 using java.util.HashMap. What principles of object-oriented design made this easy to implement with your already created design?
8. Sometimes it is necessary to store more than one object at a particular key. Consider a mailbox. Items in the mailbox are in no particular order. Implement a Mail object that is able to store the name of the sender as well as the receiver of the mail. A map can be used by a post office to track the mail that comes through the system. Each mailbox has a unique key (its postal address). However, each must store multiple pieces of mail. Implement a program that uses a map, with postal addresses for keys, that stores a set of the mail currently in each mailbox. Use java.util.HashMap for your implementation of a map.