Collections

advertisement
Data Structures and Collections
• Principles revisited
• .NET:
– Two libraries:
• System.Collections
• System.Collections.Generics
FEN 2012
UCN Technology - Computer
Science
1
Data Structures
and Collections
Choose and
use an adt,
e.g. Map
application
Read and write
(use)
specifications
ADT
class:
class Appl{
----
interface:
Map m;
(e.g. Map)
HashMap
Specification
TreeMap
-----
----
m= new XXXMap();
Choose and use a
data structure, e.g.
TreeMap
FEN 2012
Data structure
and algorithms
UCN Technology - Computer
Science
Know about
2
Collections Library
System.Collections
•
•
•
•
•
•
Data structures in .NET are normally called Collections
Are found in namespace System.Collections
Compiled into mscorlib.dll assembly
Uses object and polymorphism for generic containers.
Deprecated!
Classes:
– Array
–
–
–
–
ArrayList
Hashtable
Stack
Queue
FEN 2012
UCN Technology - Computer
Science
3
Collection Interfaces
• System.Collections implements a range of different interfaces in
order to provide standard usage of different containers
– Classes that implements the same interface provides the same
services
– Makes it easier to learn and to use the library
– Makes it possible to write generic code towards the interface
• Interfaces:
–
–
–
–
–
–
ICollection
IEnumerable
IEnumerator
IList
IComparer
IComparable
FEN 2012
UCN Technology - Computer
Science
4
ArrayList
• ArrayList stores sequences of elements.
– duplicate values are ok – position- (index-) based
– Elements are stored in an resizable array.
– Implements the IList interface
public class ArrayList : IList, IEnumerable, ...
{
// IList services
...
// additional services
int Capacity { get... set... }
void TrimToSize()
control of memory
in underlying array
int BinarySearch(object value)
int IndexOf
(object value, int startIndex)
int LastIndexOf (object value, int startIndex)
...
searching
}
FEN 2012
UCN Technology - Computer
Science
5
IList Interface
• IList defineres sequences of elements
– Access through index
add new elements
public interface IList : ICollection
{
int Add
(object value);
void Insert(int index, object value);
remove
void Remove (object value);
void RemoveAt(int
index);
void Clear
();
containment testing
bool Contains(object value);
int IndexOf (object value);
object this[int index] { get; set; }
read/write existing element
(see comment)
bool IsReadOnly { get; }
bool IsFixedSize { get; }
structural properties
}
FEN 2012
UCN Technology - Computer
Science
6
Hashtable
• Hashtable supports collections of key/value pairs
– keys must be unique, values holds any data
– stores object references at key and value
– GetHashCode method on key determine position in the table.
create
add
Hashtable ages = new Hashtable();
ages["Ann"] = 27;
ages["Bob"] = 32;
ages.Add("Tom", 15);
update
ages["Ann"] = 28;
retrieve
int a = (int) ages["Ann"];
FEN 2012
UCN Technology - Computer
Science
7
Hashtable Traversal
• Traversal of Hashtable
– each element is of type DictionaryEntry (struct)
– data is accessed using the Key and Value properties
Hashtable ages = new Hashtable();
ages["Ann"] = 27;
ages["Bob"] = 32;
ages["Tom"] = 15;
enumerate entries
get key and value
FEN 2012
foreach (DictionaryEntry entry in ages)
{
string name = (string) entry.Key;
int
age = (int)
entry.Value;
...
}
UCN Technology - Computer
Science
8
.NET 2:
System.Collections.Generics
(key, value) -pair
ICollection<T>
IList<T>
LinkedList<T>
IDictionary<TKey, TValue>
List<T>
SortedDictionary
<TKey, TValue>
Index able
Array-based
FEN 2012
Balanced
search tree
UCN Technology - Computer
Science
Dictionary
<TKey, TValue>
Hashtabel
9
Demos
• Lists
• Maps
• LinkedList in C#
FEN 2012
UCN Technology - Computer
Science
10
How does they work?
Count
• Array-based list
used
Free (waste)
• Linked list
FEN 2012
UCN Technology - Computer
Science
11
Dynamic vs. Static Data Structures
• Array-Based Lists:
– Fixed (static) size (waste of memory).
– May be able to grown and shrink (ArrayList), but this is very
expensive in running time (O(n))
– Provides direct access to elements from index (O(1))
• Linked List Implementations:
– Uses only the necessary space (grows and shrinks as
needed).
– Overhead to references and memory allocation
– Only sequential access: access by index requires searching
(expensive: O(n))
FEN 2012
UCN Technology - Computer Science
12
Hashing
• Keys are converted to
indices in an array.
• A hash function, h maps a
key to an integer, the hash
code.
• The hash code is divided
by the array size and the
remainder is used as index
• If two or more keys gives
the same index, we have a
collision.
FEN 2012
UCN Technology - Computer
Science
13
Chaining
• The array doesn’t hold the element itself, but a reference to a
collection (a linked list for instance) of all colliding elements.
• On search that list must be traversed
FEN 2012
UCN Technology - Computer
Science
14
Efficiency of Hashing
• Worst case (maximum collisions):
– retrieve, insert, delete all O(n)
• Average number of collisions depends on the load
factor, λ, not on table size
λ = (number of used entries)/(table size)
– But not on n.
• Typically (linear probing):
numberOfCollisionsavg = 1/(1 - λ)
• Example: 75% of the table entries in use:
– λ = 0.75:
1/(1-0.75) = 4 collisions in average
(independent of the table size).
FEN 2012
UCN Technology - Computer
Science
15
When Hashing Is Inefficient
• Traversing in key order.
• Find smallest/largest key.
• Range-search (Find all keys
between high and low).
• Searching on something else than
the designated primary key.
FEN 2012
UCN Technology - Computer
Science
16
(Binary) Search Trees
• Value based container:
– The search tree property:
• For any internal node: the value is greater than the value
in the left child
• For any internal node: the value is less than the value in
the right child
– Note the recursive nature of this definition:
• It implies that all sub trees themselves are search trees
• Every operation must ensure that the search tree
property is maintained
FEN 2012
UCN Technology - Computer
Science
17
Example:
A Binary Search Tree Holding Names
FEN 2012
UCN Technology - Computer
Science
18
InOrder:
Traversal Visits Nodes in Sorted Order
FEN 2012
UCN Technology - Computer
Science
19
Efficiency
• insert
• retrieve
• delete
– All operations depend on the depth of the tree
– If balanced: O(log n)
• Most libraries use a balanced version, for
instance Red-Black Trees
FEN 2012
UCN Technology - Computer
Science
20
Download