Ch 21 Notes

CS 1302 – Ch 21, Sets & Maps
We will cover Sections 1-6 in Chapter 21.
Section 21.1 – Introduction
1. In this chapter we will study the Set and Map interfaces and their common implementations.
Section 21.2 – Sets
1. Set is a sub-interface of Collection. A Set:


Doesn’t allow duplicates
Doesn’t provide random (positional) access
Java provides three common implementations:
a. HashSet – Doesn’t provide any particular ordering.
b. LinkedHashSet – Elements are ordered according to the order they were inserted.
c. TreeSet – Elements are ordered according to Comparable or Comparator.
2. Speed: Consider add, remove, contains
a. HashSet – Very fast, O(1)*
b. LinkedHashSet – Very fast, O(1)*
c. TreeSet – Fast, O(log n)*
* This is called Big ‘O’ notation. You will learn about this in CS 3410. It describes how the running time of an
algorithm grows as the number of elements, n, grows. The O(1) figures for the hash-based sets are average-case
values. We will briefly discuss the graph at the bottom of this page: http://bigocheatsheet.com/
3. What does it mean to say a Set doesn’t allow duplicates? No two elements e1 and e2 can be in the set
such that e1.equals(e2) is true.
4. The Set interface hierarchy overview:
5. The Set interface hierarchy
«interface» java.util.Collection<E>
    «interface» java.util.Set<E>
        java.util.AbstractSet<E> (implements Set<E>)
            java.util.HashSet<E>
                +HashSet()
                +HashSet(c: Collection<? extends E>)
                +HashSet(initialCapacity: int)
                +HashSet(initialCapacity: int, loadFactor: float)
                java.util.LinkedHashSet<E> (extends HashSet<E>)
                    +LinkedHashSet()
                    +LinkedHashSet(c: Collection<? extends E>)
                    +LinkedHashSet(initialCapacity: int)
                    +LinkedHashSet(initialCapacity: int, loadFactor: float)
        «interface» java.util.SortedSet<E>
            +first(): E
            +last(): E
            +headSet(toElement: E): SortedSet<E>
            +tailSet(fromElement: E): SortedSet<E>
            «interface» java.util.NavigableSet<E>
                +pollFirst(): E
                +pollLast(): E
                +lower(e: E): E
                +higher(e: E): E
                +floor(e: E): E
                +ceiling(e: E): E
                java.util.TreeSet<E> (extends AbstractSet<E>, implements NavigableSet<E>)
                    +TreeSet()
                    +TreeSet(c: Collection<? extends E>)
                    +TreeSet(comparator: Comparator<? super E>)
                    +TreeSet(s: SortedSet<E>)
Section 21.2.1 – The HashSet Class
1. How do we create a HashSet?

HashSet()
Set<String> hsCities = new HashSet<>();

HashSet(c:Collection<? extends E>)
ArrayList<String> alCities = new ArrayList<>();
Set<String> hsCities = new HashSet<>(alCities);


HashSet(initCapacity:int)
HashSet(initCapacity:int, loadFactor:float)
2. What operations can we do on a HashSet? Anything in the Collection interface. HashSet doesn’t
provide any new operations.
3. What happens if we try to add a duplicate element to a HashSet? It is not added and the return from
add is false.
4. Note: HashSet does not provide random (positional) access. In other words there is no get method nor
add or remove at a particular location.
5. How do we iterate over the elements in a HashSet? We can use a for-each loop or an iterator, but not
an indexed for loop. And, there is no guarantee of order. The order of the elements could change
during execution of the program!
6. Can we store custom objects in a HashSet? Yes, but you must override hashCode and equals. We will
not do this in this class. You will learn about what a hash code is in CS 3410.
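The points above about duplicates and iteration can be sketched as follows. This is a minimal illustration, not code from the text; the class name HashSetBasics and the helper removeStartingWith are made up for this example.

```java
import java.util.HashSet;
import java.util.Iterator;
import java.util.Set;

class HashSetBasics {
    // Hypothetical helper: removes every element that starts with the given
    // prefix, using an explicit iterator so removal during iteration is safe.
    static void removeStartingWith(Set<String> set, String prefix) {
        Iterator<String> it = set.iterator();
        while (it.hasNext()) {
            if (it.next().startsWith(prefix)) {
                it.remove();
            }
        }
    }

    public static void main(String[] args) {
        Set<String> hsCities = new HashSet<>();

        // add returns true for a new element and false for a duplicate
        System.out.println(hsCities.add("Atlanta")); // true
        System.out.println(hsCities.add("Macon"));   // true
        System.out.println(hsCities.add("Atlanta")); // false
        System.out.println(hsCities.size());         // 2

        // for-each loop: works, but the order is not guaranteed
        for (String city : hsCities) {
            System.out.println(city);
        }

        // iterator-based removal
        removeStartingWith(hsCities, "M");
        System.out.println(hsCities); // [Atlanta]
    }
}
```

Note that there is no hsCities.get(0); a set has no positions to index into.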
Section 21.2.2 – The LinkedHashSet Class
1. The LinkedHashSet class is identical to HashSet except that the order of insertion is preserved.
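To see the ordering differences among the three Set implementations from Section 21.2, a small sketch like the one below may help. The class name SetOrderDemo and the helper inOrder are assumptions made for this illustration.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

class SetOrderDemo {
    // Hypothetical helper: adds the items to the given set and returns
    // the elements in the set's own iteration order.
    static List<String> inOrder(Set<String> set, String... items) {
        for (String item : items) {
            set.add(item); // duplicates are silently rejected
        }
        return new ArrayList<>(set);
    }

    public static void main(String[] args) {
        String[] cities = {"Macon", "Atlanta", "Savannah", "Atlanta"};

        // HashSet: three elements, but in no guaranteed order
        System.out.println(inOrder(new HashSet<>(), cities));

        // LinkedHashSet: insertion order
        System.out.println(inOrder(new LinkedHashSet<>(), cities)); // [Macon, Atlanta, Savannah]

        // TreeSet: sorted order (String implements Comparable)
        System.out.println(inOrder(new TreeSet<>(), cities));       // [Atlanta, Macon, Savannah]
    }
}
```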
Section 21.2.3 – The TreeSet Class
1. How do we create a TreeSet?

TreeSet()
TreeSet<String> tsCities = new TreeSet<>();

TreeSet(c:Collection<? extends E>)
ArrayList<String> alCities = new ArrayList<>();
TreeSet<String> tsCities = new TreeSet<>(alCities);

TreeSet(comparator:Comparator<? super E>)
TreeSet<Employee> tsEmpsName = new TreeSet<>( new EmployeeNameComparator() );
2. What operations can we do on a TreeSet? Anything in the Collection interface and in SortedSet and
NavigableSet.
3. Can we store custom objects in a TreeSet? Yes, either by supplying a Comparator in the constructor or by
having the class implement Comparable.
4. How do we see if a TreeSet contains a custom object (or remove one)? As we saw in Labs 8 and 10, we
use a dummy object. For example, suppose we have a TreeSet of Employee objects:
TreeSet<Employee> tsEmployees = new TreeSet<Employee>( new EmployeeSSNumComparator() );
tsEmployees.add( new Employee("Green", "Xavier", 338290448, 45.99) );
...
and we want to remove an Employee and we only know the SSN. We can create a dummy Employee
object with the SSN and made-up values for name and salary (or provide a constructor that only takes
the SSN).
Employee dummy = new Employee("Doo", "Daa", 338290448, 0.00);
tsEmployees.remove(dummy);
5. The text considers a GeometricObjectComparator that compares different types of GeometricObjects
such as Circle, Rectangle, Triangle based on their area. Thus, we can compare a Circle and a Rectangle,
for instance.
import java.util.Comparator;

// Orders GeometricObjects by area, so different shapes can be compared.
public class GeometricObjectComparator implements Comparator<GeometricObject> {
    @Override
    public int compare(GeometricObject o1, GeometricObject o2) {
        double area1 = o1.getArea();
        double area2 = o2.getArea();
        if (area1 < area2)
            return -1;
        else if (area1 == area2)
            return 0;
        else
            return 1;
    }
}
and then some driver code:
Set<GeometricObject> set = new TreeSet<>(new GeometricObjectComparator());
set.add(new Rectangle(4, 5));
set.add(new Circle(40));
set.add(new Circle(40));   // not added: the comparator returns 0, so it counts as a duplicate
set.add(new Rectangle(4, 1));
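The TreeSet ideas in this section (ordering by a Comparator, the SortedSet/NavigableSet methods, and the dummy-object technique) can be tried out with a sketch like the one below. The Emp class is a hypothetical, stripped-down stand-in for the course's Employee class, and the names TreeSetDemo and sample are also made up for this illustration.

```java
import java.util.Comparator;
import java.util.TreeSet;

class TreeSetDemo {
    // Hypothetical stand-in for the course's Employee class.
    static class Emp {
        final String name;
        final int ssn;
        Emp(String name, int ssn) { this.name = name; this.ssn = ssn; }
        @Override public String toString() { return name + "(" + ssn + ")"; }
    }

    // Builds a TreeSet ordered by SSN only.
    static TreeSet<Emp> sample() {
        TreeSet<Emp> ts = new TreeSet<>(Comparator.comparingInt((Emp e) -> e.ssn));
        ts.add(new Emp("Green", 338290448));
        ts.add(new Emp("Lyton", 243558673));
        ts.add(new Emp("Adams", 511223344));
        return ts;
    }

    public static void main(String[] args) {
        TreeSet<Emp> ts = sample();
        System.out.println(ts.first()); // Lyton(243558673), the smallest SSN
        System.out.println(ts.last());  // Adams(511223344), the largest SSN

        // Dummy object: only the SSN matters to the comparator
        Emp dummy = new Emp("?", 338290448);
        System.out.println(ts.contains(dummy)); // true
        System.out.println(ts.higher(dummy));   // Adams(511223344)
        System.out.println(ts.remove(dummy));   // true
        System.out.println(ts.size());          // 2

        // Same SSN as Lyton: the comparator returns 0, so this is a duplicate
        System.out.println(ts.add(new Emp("Other", 243558673))); // false
    }
}
```

Notice that the TreeSet never calls equals here; whether two elements are "the same" is decided entirely by the comparator.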
Section 21.3 – Performance of Sets & Lists
1. The author compared sets and lists by making 50,000 calls to contains and remove. The running times
(in milliseconds) are shown below.
                 contains()    remove()
HashSet                  20          27
LinkedHashSet            27          26
TreeSet                  47          34
ArrayList            39,802      16,196
LinkedList           52,197      14,870
Section 21.4 – Case Study: Counting Keywords
Study this example carefully. It is very simple!
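As a rough idea of what a keyword counter looks like, here is a minimal sketch. It is not the text's program: it reads from a String rather than a file, the keyword list is deliberately incomplete, and the names KeywordCounter and countKeywords are made up for this illustration. The key point is that contains on a HashSet is fast, which is why a set is a good fit.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class KeywordCounter {
    // A small, deliberately incomplete sample of Java keywords.
    static final Set<String> KEYWORDS = new HashSet<>(Arrays.asList(
            "class", "static", "int", "for", "if", "else",
            "return", "new", "public", "void"));

    // Counts how many words in the source text are in the keyword set.
    static int countKeywords(String source) {
        int count = 0;
        // Split on runs of non-letters so punctuation is ignored
        for (String word : source.split("[^A-Za-z]+")) {
            if (KEYWORDS.contains(word)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        String code = "public static int max(int a, int b) "
                    + "{ if (a > b) return a; else return b; }";
        System.out.println(countKeywords(code)); // 9
    }
}
```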
Section 21.5 – Maps
1. A Map is a very useful data structure.

A Map stores key/value pairs. In the figure below, SSN is the key and name is the value. The key/value pair is
often called a map entry or just an entry.

The keys are like indexes used to access the values.
value = map.get(key);
A List in some sense is a simple map (though not technically, a List is a Collection, not a Map) in that it uses
an integer index to reference the items in the list.

The keys (and values) can be any objects. In the figure on the right, both the key and the value are strings.
2. In Java, the Map interface is parallel to the Collection interface (i.e. not a sub-interface) as shown in the
diagrams below. Java provides three common implementations:
a. HashMap – The map entries have no particular order.
b. LinkedHashMap – The map entries are ordered according to the order they were inserted.
c. TreeMap – The map entries are ordered according to their keys where Comparable or Comparator
is used for the ordering.
3. In a Map, the keys are a Set (so no duplicate keys are allowed) and the values are a Collection (so duplicates
could exist. In other words, different keys could map to the same value).
4. The Map Interface:
java.util.Map<K, V>
+clear(): void                           Removes all mappings from this map.
+containsKey(key: Object): boolean       Returns true if this map contains a mapping for the specified key.
+containsValue(value: Object): boolean   Returns true if this map maps one or more keys to the specified value.
+entrySet(): Set                         Returns a set consisting of the entries in this map.
+get(key: Object): V                     Returns the value for the specified key in this map.
+isEmpty(): boolean                      Returns true if this map contains no mappings.
+keySet(): Set<K>                        Returns a set consisting of the keys in this map.
+put(key: K, value: V): V                Puts a mapping in this map.
+putAll(m: Map): void                    Adds all the mappings from m to this map.
+remove(key: Object): V                  Removes the mapping for the specified key.
+size(): int                             Returns the number of mappings in this map.
+values(): Collection<V>                 Returns a collection consisting of the values in this map.
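The ordering behavior of the three Map implementations described in this section can be compared with a sketch like the following. The class name MapOrderDemo and the helper keysInOrder are assumptions made for this illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class MapOrderDemo {
    // Hypothetical helper: puts the same three entries into the given map
    // and returns the keys in the map's own iteration order.
    static List<String> keysInOrder(Map<String, String> map) {
        map.put("NY", "New York");
        map.put("WY", "Wyoming");
        map.put("CA", "California");
        return new ArrayList<>(map.keySet());
    }

    public static void main(String[] args) {
        // HashMap: no guaranteed order
        System.out.println(keysInOrder(new HashMap<>()));

        // LinkedHashMap: insertion order
        System.out.println(keysInOrder(new LinkedHashMap<>())); // [NY, WY, CA]

        // TreeMap: sorted by key (String implements Comparable)
        System.out.println(keysInOrder(new TreeMap<>()));       // [CA, NY, WY]
    }
}
```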
Section 21.5.1 – The HashMap Class
1. HashMap Example 1:
Map<String,String> hmStates = new HashMap<>();
hmStates.put("NY", "New York");
hmStates.put("WY", "Wyoming");
String state = hmStates.get("WY");
hmStates.remove("SC");
Set<String> stateAbbrevs = hmStates.keySet();
for(String key : stateAbbrevs ) {
String value = hmStates.get(key);
}
Collection<String> stateNames = hmStates.values();
boolean isKeyInMap = hmStates.containsKey("CA");
2. HashMap Example 2:
Map<Integer,Employee> hmEmployees = new HashMap<Integer,Employee>();
Employee e1 = new Employee("Lyton", "Xavier", 243558673, 77.88);
...
hmEmployees.put(e1.getSSNum(), e1);
...
Set<Integer> keys = hmEmployees.keySet();
for(int key : keys) {
Employee e = hmEmployees.get(key);
}
Collection<Employee> emps = hmEmployees.values();
3. What happens if we try to add a map entry with a duplicate key? It replaces the value at the existing
key.
4. How do we iterate over the entries in a HashMap?
a. We can get all the keys with the keySet method and iterate over them.
b. We can get all the values with the values method and iterate over them.
c. We can get all the keys with the keySet method and iterate over them, using each key to access the
corresponding value with the get method.
5. (Optional) Can we use custom objects as keys in a HashMap? Yes, but it is subject to the same issues as
storing custom objects in a HashSet, namely, you must override hashCode and equals.
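Two of the points in this section can be checked with a short sketch: put with an existing key replaces the value (and returns the old one), and the entrySet method from the Map interface gives another way to iterate, visiting each key and value together. The class name HashMapBasics and the helper describe are made up for this illustration.

```java
import java.util.HashMap;
import java.util.Map;

class HashMapBasics {
    // Hypothetical helper: builds "key=value;" text by iterating over the entries.
    static String describe(Map<String, String> map) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> entry : map.entrySet()) {
            sb.append(entry.getKey()).append('=').append(entry.getValue()).append(';');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, String> hmStates = new HashMap<>();

        // put returns the previous value for the key, or null if there was none
        System.out.println(hmStates.put("NY", "New York")); // null
        System.out.println(hmStates.put("NY", "NEW YORK")); // New York
        System.out.println(hmStates.get("NY"));             // NEW YORK
        System.out.println(hmStates.size());                // 1

        // Removing a key that is not present is harmless and returns null
        System.out.println(hmStates.remove("SC"));          // null

        System.out.println(describe(hmStates));             // NY=NEW YORK;
    }
}
```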
Section 21.5.2 – The LinkedHashMap Class
1. The LinkedHashMap class is identical to HashMap except that the order of insertion is preserved.
2. (Optional) – The entries in a LinkedHashMap can also be accessed in the order in which they were last
accessed, from least recently accessed to most recently (access order). This can be used in an LRU
(Least Recently Used) cache. The constructor is:
LinkedHashMap(initialCapacity, loadFactor, true).
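A minimal sketch of the LRU idea follows. Besides the access-order constructor shown above, it relies on LinkedHashMap's protected removeEldestEntry method, which the notes do not mention; the class name LruCache and the maxEntries field are made up for this illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries; // hypothetical capacity limit

    LruCache(int maxEntries) {
        super(16, 0.75f, true); // third argument true = access order
        this.maxEntries = maxEntries;
    }

    // Called by put after each insertion; returning true evicts the
    // eldest (least recently accessed) entry.
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;
    }

    public static void main(String[] args) {
        LruCache<String, String> cache = new LruCache<>(2);
        cache.put("NY", "New York");
        cache.put("WY", "Wyoming");
        cache.get("NY");                    // NY is now the most recently used
        cache.put("CA", "California");      // cache is over capacity, so WY is evicted
        System.out.println(cache.keySet()); // [NY, CA]
    }
}
```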
Section 21.5.3 – The TreeMap Class
1. How do we create a TreeMap?

TreeMap()

TreeMap(m:Map<? extends K, ? extends V>)

TreeMap(c:Comparator<? super K>)
2. A TreeMap inherits behaviors that are similar to TreeSet:
TreeMap (SortedMap)                     TreeSet (SortedSet)
firstKey():K                            first():E
lastKey():K                             last():E
headMap(toKey:K):SortedMap<K,V>         headSet(toElement:E):SortedSet<E>
tailMap(fromKey:K):SortedMap<K,V>       tailSet(fromElement:E):SortedSet<E>
subMap(from:K,to:K):SortedMap<K,V>      subSet(from:E, to:E):SortedSet<E>

TreeMap (NavigableMap)                  TreeSet (NavigableSet)
floorKey(key:K):K                       floor(e:E):E
lowerKey(key:K):K                       lower(e:E):E
ceilingKey(key:K):K                     ceiling(e:E):E
higherKey(key:K):K                      higher(e:E):E
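The SortedMap and NavigableMap methods listed above can be tried out with a sketch like this one. The grade-cutoff data and the names TreeMapDemo and sample are made up for this illustration.

```java
import java.util.TreeMap;

class TreeMapDemo {
    // Hypothetical data: minimum score needed for each letter grade.
    static TreeMap<Integer, String> sample() {
        TreeMap<Integer, String> grades = new TreeMap<>();
        grades.put(90, "A");
        grades.put(80, "B");
        grades.put(70, "C");
        grades.put(60, "D");
        return grades;
    }

    public static void main(String[] args) {
        TreeMap<Integer, String> grades = sample();

        System.out.println(grades.firstKey());     // 60
        System.out.println(grades.lastKey());      // 90
        System.out.println(grades.headMap(80));    // {60=D, 70=C}  (keys < 80)
        System.out.println(grades.tailMap(80));    // {80=B, 90=A}  (keys >= 80)

        System.out.println(grades.floorKey(85));   // 80   (largest key <= 85)
        System.out.println(grades.ceilingKey(85)); // 90   (smallest key >= 85)
        System.out.println(grades.lowerKey(80));   // 70   (largest key < 80)
        System.out.println(grades.higherKey(90));  // null (no key > 90)

        // A score of 85 earns the grade stored at the floor key
        System.out.println(grades.get(grades.floorKey(85))); // B
    }
}
```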
Section 21.6 – Case Study: Occurrences of Words
Study the example from the text carefully.
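As a rough idea of how a word-occurrence counter can be built on a map, here is a minimal sketch. It is not the text's program; the names WordCount and countWords and the sample sentence are made up for this illustration. A TreeMap is used so the words come out in alphabetical order.

```java
import java.util.Map;
import java.util.TreeMap;

class WordCount {
    // Maps each word (lowercased) to the number of times it occurs in the text.
    static Map<String, Integer> countWords(String text) {
        Map<String, Integer> counts = new TreeMap<>();
        // Split on runs of non-letters so punctuation is ignored
        for (String word : text.toLowerCase().split("[^a-z]+")) {
            if (word.isEmpty()) {
                continue; // split can produce an empty first token
            }
            Integer old = counts.get(word);
            counts.put(word, old == null ? 1 : old + 1);
        }
        return counts;
    }

    public static void main(String[] args) {
        String text = "Good morning. Have a good class. Have a good visit.";
        System.out.println(countWords(text));
        // {a=2, class=1, good=3, have=2, morning=1, visit=1}
    }
}
```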
Section 21.7 – Singleton & Unmodifiable Collections & Maps
Omit
Ch 20 & 21, JCF Summary
HashSet & HashMap with Custom Objects (Advanced)
1. If you want to use HashSet with custom objects then the custom class must override both the equals and hashCode methods. The Java API
says that if two objects are to be considered equal (a.equals(b)==true) then their hashcodes must be equal (a.hashCode()==b.hashCode()).
Note, this does not say that two objects that have the same hashcode are equal. The signatures for the two methods are:
public boolean equals( Object o )
public int hashCode()
We already know that the equals method is used to enforce logical equality. For instance, two Employees might be considered equal if and
only if they have the same SSN.
2. The hashCode method is a bit more involved to explain. You will study hashing in detail in the next course (CS 3410). For now, we say that
the hashCode method returns an integer that the HashSet uses to decide where to store the object. The HashSet class uses the hashcode to find
an object in a set. The operation works something like this: in the course of doing certain operations (add, contains, remove), if the HashSet
class determines that two objects have the same hashcode, then it calls equals to see if the objects are really the same.
3. For instance, suppose we are using a HashSet to hold Employee objects. When we add an Employee to the HashSet, the Employee’s
hashcode is used to store the object, like an index in an array. For example, an Employee object may have a hashcode of 48, so, conceptually,
the HashSet add method would put this object in the 48th position in an array (in practice the hashcode is first reduced to fit the size of the
array). Later, when you call the contains or remove method with that Employee object, the code computes the hashcode (48) for the object and
then looks for the object in position 48. The situation is identical with HashMap if we want to use custom objects for the keys. If we want to
use the containsValue method when the values are custom objects, those objects need a suitable equals method.
4. So, how do we override the hashCode method? We will show a simple approach here and again in the HashMap section. You will consider
this in more detail in CS 3410. Overriding hashCode effectively takes a much deeper understanding. Some further explanation and guidance
can be found here:
http://tutorials.jenkov.com/java-collections/hashcode-equals.html
http://www.ibm.com/developerworks/library/j-jtp05273/
http://www.javaworld.com/article/2076332/java-se/how-to-avoid-traps-and-correctly-override-methods-from-java-langobject.html?page=2
http://www.javamex.com/tutorials/collections/hash_function_guidelines.shtml
5. Example – The class below only implements the equals method, but not the hashCode method.
Code – Incorrect, only overrides the equals method.
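class EmployeeIncorrect
{
    private String lName;
    private String fName;

    public EmployeeIncorrect( String lName, String fName )
    {
        this.lName = lName;
        this.fName = fName;
    }
    public String getLastName() { return lName; }
    public String getFirstName() { return fName; }

    // Needed so that printing the set shows the names
    public String toString()
    {
        return String.format("%s %s", getFirstName(), getLastName() );
    }

    // equals is overridden, but hashCode is not: this is the mistake
    public boolean equals(Object o)
    {
        if ( !(o instanceof EmployeeIncorrect) ) return false;
        EmployeeIncorrect e = (EmployeeIncorrect)o;
        return ( this.getLastName().equals(e.getLastName()) ) &&
               ( this.getFirstName().equals(e.getFirstName()) );
    }
}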
So, we create two Employees that are equal, but they both get added to the set. This is because the add method calls the hashCode method
on each object and the two hashcodes are different, so equals never gets called. The hashCode method that is called is the one inherited from
the Object class, which typically derives the hashcode from the object's identity (often described as its address). It is usually different
for distinct objects.
Code
Set<EmployeeIncorrect> set = new HashSet<EmployeeIncorrect>();
EmployeeIncorrect e1 = new EmployeeIncorrect("Jones", "Hugh" );
EmployeeIncorrect e2 = new EmployeeIncorrect("Jones", "Hugh" );
System.out.println( set.add( e1 ) );
System.out.println( set.add( e2 ) );
System.out.println( set );
Output
true
true
[Hugh Jones, Hugh Jones]
6. Example – Here, we also override the hashCode method in the Employee class.
Code
class Employee
{
    private String lName;
    private String fName;

    public Employee( String lName, String fName )
    {
        this.lName = lName;
        this.fName = fName;
    }
    public String getLastName() { return lName; }
    public String getFirstName() { return fName; }

    @Override
    public String toString()
    {
        return String.format("%s %s", getFirstName(), getLastName() );
    }

    // Two Employees are equal if both names match
    @Override
    public boolean equals(Object o)
    {
        if ( !(o instanceof Employee) ) return false;
        Employee e = (Employee)o;
        return ( this.getLastName().equals(e.getLastName()) ) &&
               ( this.getFirstName().equals(e.getFirstName()) );
    }

    // Built from the same fields that equals uses, so equal objects get equal hashcodes
    @Override
    public int hashCode()
    {
        int x = lName.hashCode();
        int y = fName.hashCode();
        int z = 23*x + 7*y;
        return z;
    }
}
Some test code now shows that the duplicate object will not be added:
Code
Set<Employee> set2 = new HashSet<Employee>();
Employee e3 = new Employee("Jones", "Hugh" );
Employee e4 = new Employee("Jones", "Hugh" );
System.out.println( set2.add( e3 ) );
System.out.println( set2.add( e4 ) );
System.out.println( set2 );
Output
true
false
[Hugh Jones]
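The same requirement applies when custom objects are used as HashMap keys. The sketch below is one possible way to write it, using the standard library's Objects.hash to combine the fields instead of the hand-written formula above. The Name class and the salary data are hypothetical, made up for this illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

class HashKeyDemo {
    // Hypothetical key class: equal when both names match.
    static final class Name {
        private final String lName;
        private final String fName;

        Name(String lName, String fName) {
            this.lName = lName;
            this.fName = fName;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof Name)) return false;
            Name n = (Name) o;
            return lName.equals(n.lName) && fName.equals(n.fName);
        }

        // Uses the same fields as equals, so equal keys get equal hashcodes
        @Override
        public int hashCode() {
            return Objects.hash(lName, fName);
        }
    }

    public static void main(String[] args) {
        Map<Name, Double> salaries = new HashMap<>();
        salaries.put(new Name("Jones", "Hugh"), 45.99);

        // A different but equal key object finds the same entry
        System.out.println(salaries.get(new Name("Jones", "Hugh"))); // 45.99

        // Putting with an equal key replaces the value instead of adding an entry
        salaries.put(new Name("Jones", "Hugh"), 50.0);
        System.out.println(salaries.size());                         // 1
    }
}
```

If hashCode were left out of Name, the get call would most likely return null, for the same reason the duplicate EmployeeIncorrect was added to the set.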