Lists

advertisement
Implementing an ADT
• Having the JCFs is great – most of the significant
data structures are already available
• But you still may need to implement your own
– the JCF implementation may not contain the methods you
want
– the JCF implementation may not be as efficient as what
you can create
– the JCF may not restrict access like you would like
• for instance, Queue is an interface, the implementations of Queue
are LinkedList and PriorityQueue, neither of which presents a
real Queue
– or you are working in a language other than Java?
• As a CS student, you need to not only learn the JCFs
but how to implement your own ADT – here, we
focus on the List as an ADT
More
• It is important to understand how to use the JCFs
because using them will save you work, why reinvent them?
– It is also important to understand how to implement ADTs
on your own for the reasons stated on the previous slide
but is also furthers your understanding of OOP,
implementing proper interfaces (not interface classes, but
the interface of a class with a user class) and will also let
you think about computational and space efficiency of a
data structure
• There are usually 2 implementation ways to
implement a given ADT, an array and a linked list
– The array is already known by you
– The linked list means that you create a node for each
datum to be stored in the container and you link the nodes
together
What is a Collection?
• Generically, a collection is just a group of objects that
share the same type, each object is known as an
element of the collection
– In Java, the Collection is an Interface with three subtypes
(Set, List, Queue) each of which is an Interface
– Some of these collection types do not allow duplicate
elements
– Some collections allow you (or force you) to access
elements based on a key value (e.g., access by ID number)
– Some collections restrict access
•
•
•
•
Location to insert
Location to remove
Inability to search the collection for a particular entry
Inability to traverse the collection (without removing elements)
• We generally think of a Collection as a list of items –
the question is how we access the list
Ordered List ADT
• The List JCF classes allow inserting anywhere
• Instead, we want an Ordered List ADT
– insert only in the appropriate place based on ordering
• You can remove any element as long as its removal
maintains the ordering of the list
– removal will be done by item, not index
• Other methods are
– traversal of the list (e.g., print all elements)
– search for a particular element in the list – returning the
element itself
– destroy the list
– obtain the list’s size
– determine if the list is empty
– determine if the list is full (for the array version)
• As stated earlier, we can implement this using an array or a
linked list, we will examine both
Unordered List ADT
• Another way to represent a List is without ordering
– Why? It is easy (less computationally expensive) to add
elements to an unordered list
– Also, some lists do not necessarily require ordering
(consider a shopping list, you usually do not alphabetize
the items)
• The problem with the unordered list is that search is
expensive
– We will see why later
• The unordered list ADT has the same methods as the
ordered list but may be implemented differently:
– insert (always insert at the beginning or end for
convenience), search, remove, traverse, destroy, size,
isEmpty, isFull
Array Implementation of Unordered List
• Although we might want to use Generics to
implement our lists, we cannot if we are
implementing an array
– for simplicity, we will assume an array of int values
• We use the following three instance data
– int[] list;
– int size;
– int number;
// initialized to new int[100]
// set to 100
// initialized to 0
• And the following methods
• public boolean isFull( ) { return size==number;}
• public boolean isEmpty( ) {return number==0;}
• public void traverse( ) { for(int
i=0;i<number;i++) System.out.println(list[i]); }
• public void destroy( )
{number=0;size=0;list=null;}
Insert
• Since this is an unordered list, we can insert the new value
anywhere
• Where can we insert it that requires the least amount of
work?
• If we insert it anywhere from 0 to number-1 (that is, in a
position already occupied), we have to move elements
around
• Instead, we will always insert at the end of the array at
location number, and then increment number
public void insert(int x)
{
if(number<size) {
list[number]=x;
If the array is full, we can output an error
number++;
message here, throw an exception (we would
}
want to add throws ExceptionType to the
else …
header), or do array doubling
}
Search and Delete
• These methods will unfortunately require that we find the
item through a sequential search
• We will implement search first and use it for delete
public int search(int x)
{
int location=-1;
// indicates not found
for(int i=0;i<number&&location==-1;i++)
if(list[i]==x) location=I;
return location;
}
public void delete(int x)
We delete by replacing the item at index
{
with the last item in the array (at number-1)
int index=search(x);
and then decrement number
if(index!=-1)
{
list[index]=list[number-1];
number--;
}
If not found, output error message or throw
else …
Exception
}
The Cost Of Sequential Search
• As the item we are searching for can be anywhere in
the array (or not in the array at all), we have to look
at every value until we find it
– We will search from the beginning onward (or we could
search from the end inward)
– Once we find the item, we stop searching
– In the worst case, the item we are looking for is literally
the last item in the list based on the direction of our search,
or is not in the array at all
– If we have an array of n items
• in the best case, it takes 1 search
• in the worst case, it takes n searches
• in the average case, it takes (n+1)/2 searches, or roughly n/2
searches
– Keep this in mind for later…
Array Implementation of Ordered List
• We will use the same instance data as with our
Unordered List and the same initial methods
– int[] list;
– int size;
– int number;
// initialized to new int[100];
// set by constructor to 100
// initialized to 0
• public boolean isFull( ) { return
size==number;}
• public boolean isEmpty( ) {return number==0;}
• public void traverse( ) { for(int
i=0;i<number;i++)
System.out.println(list[i]); }
• public void destroy( )
{number=0;size=0;list=null;}
public void insert(int x) {
boolean done=false;
for(int i=number;i>0&&!done;i--)
if(x<list[i-1]) list[i]=list[i-1];
else {
We insert x into its proper location by
list[i]=x;
finding the first element > x and move that
number++;
and all other elements down by 1
done=true;
}
if(!done) {
Notice that we actually start from the right
list[0]=x;
end of the array and shift while we search
number++;
backward for the proper location
}
}
B efo re inserting
e at insertio n p o in t i
0
1
e0
e1
e
A fter in serting
e at insertio n p o in t i,
list size is
incre m e nted b y 1
…
…
i
i+ 1
…
k-1
e i-1 e i
e i+ 1
…
e k-1 e k
1
e0
e1
…
…
i
e i-1 e
d ata.length -1
… sh ift…
Insertio n p o int
0
k
i+ 1 i+ 2
…
k
ei
…
e k-1 e k
e inserted here
e i+ 1
k+ 1
d ata.length -1
Searching the Ordered List
public int bsearch(int item) {
int high=number-1,low=0,mid=(high+low)/2;
while(low<=high&&list[mid]!=item) {
We use binary search
if(list[mid]>item) high=mid-1;
else if(list[mid]<item) low=mid+1; for searching which
is much more efficient
mid=(high+low)/2;
}
than sequential search
if(list[mid]==item) return mid;
else return -9999;
// use this to indicate item not found
}
Why binary search? We could do sequential search which means on
average we have to look at n/2 items to find what we are looking for
Binary search will require looking at no more than log n items to find
the item we are looking for (or determine it isn’t in the list)
Is the difference between n/2 and log n significant? Depends on n
If n = 100, then it is 50 versus 7, not a big difference, but if n =
1,000,000 then the difference is 500,000 versus 20!
public void delete(int x) {
int temp=bsearch(x);
if(temp==-9999)
System.out.println("Item not
found!");
else {
for(int i=temp;i<number-;i++)
list[i]=list[i+1];
number--;
}
}
B efo re d eleting the
ele m e nt at ind e x i
0
1
e0
e1
…
…
0
1
e0
e1
…
…
We call upon binary search to
find the location of the item to
be deleted so that we don’t have
to repeat the code here
Deletion requires shifting to
“cover up” the deleted item
i
i+ 1
…
k-1
e i-1 e i
e i+ 1
…
e k-1 e k
D elete this ele m e nt
A fter d eletin g the
ele m e nt, list size is
d ecrem e nted b y 1
Deletion
i
e i-1 e i+ 1
… sh ift…
…
…
k
d ata.length -1
k-2 k-1
e k-1 e k
d ata.length -1
Using Search for Delete
• Notice in both of our implementations (ordered and
unordered array list) we used the search method to obtain the
index of an item we want to delete
• Why should we use search for delete?
– We don’t have to, but it saves us from having to re-implement
the same code in another method
• Could we also use the search method to identify the location
to insert into (for the ordered list)?
– We wouldn’t need to do this for the unordered list because we are
always adding at the end
• Notice in the ordered insert, we are searching from the end
of the array forward until we find the proper insertion
location
– While we are searching, we are also shifting elements up the
array to create a vacant spot
– If we use the binary search to find a location for insert, we would
still have to do all that shifting, so we may as well combine
Conclusions on Array Implementations
• We are familiar with arrays so it shouldn’t be
challenging to implement
• The unordered list’s insert and delete are easy but the
ordered insert and delete, requiring shifting, is harder
• Array doubling is necessary if we don’t want to
arbitrarily limit the size of the collection
– Array doubling is computationally expensive if you have to
do it often
– Without array doubling, we either have to resort to using a
very large collection with the potential to waste space or
limit the user program to a size that may be too small
• As we will see, an array for an unordered list is about
the same computationally as an unordered linked list
• The savings in using an array over a linked list appears
when we implement an ordered list
Linked Lists
• A linked list consists of nodes, one node per datum
– Unlike the array where there may be many empty elements
• A linked list node will be defined as some (1 or more) data
member(s) and a next member which will store the next node in
the list
– The next field will be a reference variable (pointer)
– The data member(s) can be Generic, unlike our array implementation
• The linked list starts with a single front pointer (called head
below)
– We may or may not have a tail pointer
• The linked list will require that we create a new Node to insert
into the list and attach the Node via references
head
N ode 1
N ode 2
ele m e nt
ele m e nt
ne xt
ne xt
N ode n
…
ele m e nt
null
tail
The Node
• Before we define the Linked List more formally, let’s
look at the Node
– The Node will be defined as its own class (whether a standalone or nested class)
• Any Node needs at least two instance data
– The value that the Node is storing
• a primitive datum like an int, an object like a String, an object that
itself contains multiple values like a Student object, multiple data
like a name, a major, a GPA, or a Generic type of Object
– A reference to the next Node in the list
• we will usually refer to this member as next and it will be of the
same type as the Node itself, e.g., Node next;
• this member is a pointer to the next Node but we don’t call them
pointers in Java, so it is a reference to the next Node
Why a Linked List
• We will find that dealing with reference variables to insert
and delete items from a list can be a challenge, so why
bother?
– We want an array of Integers, we declare Integer[ ] foo;
– Later we do foo=new Integer[100];
• what happens if we need 101 Integers?
• what happens if we only use 5 Integers?
– The array stores a preset number of elements while the linked
list contains 1 node for each element in the list so the linked list
is more memory efficient (lower space complexity)
– But the linked list suffers in search efficiency (time complexity)
especially when we have a large list of items that are sorted
• Another advantage of the Linked List is that the Node can
be defined as Generic while an array cannot (although we
could use an ArrayList to implement a Generic OrderedList
class)
Implementation Details of the LL
• First we implement the Node
– This will either be a separate class or an inner class
– We will need to implement getters and setters for the data and
the next members
– We need to get and set next when we are dealing with getting
the next node pointer or changing pointers
• Note the use of <E> to make this generic
– We must include in our outer class <E extends
Comparable<E>>
public class Node<E> {
private E datum;
private Node<E> next;
public Node(E x, Node n) {datum=x;next=n;}
public Node getNext() {return next;}
public E getDatum() {return datum;}
public void setNext(Node<E> n) {next=n;}
public void setDatum(E x) {datum=x;}
}
Implementation Details of the LL
• To create a linked list, set front to null (also tail to null if we have tail)
– If we are keeping track of the size, set size to 0
– A list is empty if size is 0 (or front==null), the list is never full
• To search an existing LL for some item temp, start at front and move
through the list until we either find temp or reach null
– Remember that E is Comparable
public Node<E> search(E x) {
Node<E> temp=front;
while(temp!=null&&temp.getDatum().compareTo(x)!=0)
temp=temp.getNext();
return temp;
Once found, we can either
}
return the Node (search) or
the
index (search2)
public int search2(E x) {
int index=0;
Node<E> temp=front;
while(temp!=null&&temp.getDatum().compareTo(x)!=0) {
temp=temp->next; index++; }
if(temp==null) return -1; else return index;
}
Implementation Details of the LL
public void insertFront(E x) {
Node<E> temp=new Node(x,front);
front=temp;
h ead
}
Insert at the front of
the list – we use this
approach for our
unordered list ADT
only
Traverse/print the list
e0
tail
…
n ext
A n ew n od e
to b e in serted
h ere
ei
e i+ 1
n ext
n ext
…
ek
n u ll
elem en t
n ext
(a) B efore a n ew n od e is in serted .
h ead
tail
elem en t
e0
n ext
n ext
…
ei
e i+ 1
n ext
n ext
N ew n od e in serted h ere
(b ) A fter a n ew n od e is in serted .
public void traverse() {
Node<E> temp=front;
while(temp!=null) {
System.out.println(temp.getDatum());
temp=temp.getNext();
}
}
…
ek
n u ll
Implementation Details of the LL
• Inserting in order requires finding the proper
location to place the new Node
• 3 cases
– Empty list – create a new node and set front to it
– Insert at the beginning – see previous slide
– Anywhere else – search the list for the first Node
whose datum > new value
• For the third case, let’s assume we use the
following code, is it correct?
Node temp2=front;
While(temp2!=null&&temp2.getDatum().compareTo(temp)<0)
temp2=temp2.getNext();
temp.setNext(temp2);
// but how do we attach temp to the list?
Let’s Take a Closer Look
• temp2 is pointing to the first Node whose datum > temp’s datum,
so we know to insert temp before temp2
– We set temp’s next field to point at temp2, but we do not have a
reference to temp2’s preceding node, how do we make the Node
storing 63 point at temp?
• We use 2 pointers to traverse the list, we will call them, current
and previous
– before moving current onto the next node, we set previous to be
current
– now when we reach 70, current is pointing at 70 but previous is
pointing at 63, so we can modify 63’s next pointer to point at temp
Implementing Ordered Insert
public void insert(E x) {
Node<E> temp=new Node<>(x,null);
if(front==null) front=temp;
else if(front.getDatum().compareTo(x)>0)
{
temp.setNext(front);
Case 1: empty list
front=temp;
}
Case 2: insert at beginning
else {
Node<E> current=front, previous=null;
while(current!=null&&current.getDatum().compareTo(x)<0) {
previous=current;
current=current.getNext();
}
Case 3: search for proper
previous.setNext(temp);
location to insert temp
temp.setNext(current);
}
}
previous.setNext(temp);
temp.setNext(current);
What if current is null?
This occurs if the place to
insert is after the last Node in
the list – this is not a special
case, the same two instructions
work
Implementing Remove (Delete)
• We have several versions of a deletion
– Delete first element
• if(front!=null)
{front=front.getNext( );size--;}
• else … // output a message, throw an exception
– Delete by index
for(current=front,i=0;current!=null&&i<n-1;i++)
current=current.getNext( );
if(current!=null)
current.setNext(current.getNext( ));
– Delete a specific element, x
• covered on the next slide
Implementing Remove (Delete)
• Again, we have three cases
– If empty list, throw Exception or output error message
– If item to be deleted is the first item in the list, redirect
front to point at the second item
– Otherwise, search the list using two pointers (we need to
have a pointer to the previous Node)
public void delete(E x) {
if(front==null) // throw Exception or output message
else if(front.getDatum().compareTo(x)==0)
front=front.getNext(); // the first Node is no longer
else {
// pointed at, it will be
Node<E> current=front;
// garbage collected
Node<E> previous=null;
while(current!=null&&current.getDatum().compareTo(x)!=0)
previous=current;
current=current.getNext();
}
if(current!=null) previous.setNext(current.getNext());
}
}
{
Comments
• Notice the last if statement, we use this because if the
item isn’t found, current will be null and we can’t
perform a deletion
– in the case of item not found or empty list, if we throw an
Exception, we have to add throws Exception or use a trycatch block to handle the Exception
• We can improve the efficiency by using
current.getDatum().compareTo(x)<0
– how does this help?
– if the item isn’t found, we stop searching as soon as we
find the first datum greater than x
– imagine we have the list 10, 20, 30, 40, 50, 60, 70, 80 and
are looking to delete 15, we can discover 15 is not in the
list as soon as we hit the Node with 20, but using the code
on the previous slide will have us continue searching
through the entire list
Why Linked List Revisited
• We cannot use binary search on a linked list
– To find some datum in an ordered array takes O(log n)
operations (worst case) while in the linked list, O(n)
– We already saw that this was a significant difference when
we compared the ordered and unordered lists in an array
• To insert and delete in an ordered array, we have to
shift elements up or down, as many as n elements
• In the linked list, we have to find the proper location to
insert or delete, as many as n elements
– So we see that for an ordered list, the array and linked list
versions give us the same worst case performances
• The linked list gives us superior memory performance
over the array because with the linked list, we do not
need to use array doubling, nor do we waste memory
space
Which Should We Use?
• If we know something about the use of the data, it
might help us decide
– How often are we inserting/deleting versus searching?
– If searches are a minimum, the linked list might be better
• consider a list where we spend much time adding and removing
items, and traversing the entire list, but we rarely actually have to
search for a single element in the list, this might be the case for
instance for a teacher maintaining a list of students
– On the other hand, what if we need to create a list and once
created, rarely change it but spend a lot of time searching
for particular elements?
• You will learn alternate forms of lists in 364, notably
trees and graphs
– A tree typically is implemented using a linked structure
and can be more efficient than either the ordered list as an
array or linked list
Linked List Variations
• Dummy node
– Create a linked list with a special dummy node (thus the list is
never empty) where the dummy node is always first and
contains a dummy value which will always be less than anything
in the list
– We will never insert before the first node so there are no special
cases
• Circular linked list
– Only need one pointer, tail
– To get to head, use tail.getNext( )
• Doubly linked list
– Each node has two pointers, a previous and a next pointer
– This requires modifying not only the linked list class but also the
Node class
– We no longer need a previous and a current pointer to work
through the list, just current (we can get to previous by
current.getPrevious( )) but it does require
manipulating more pointers upon insert and delete
Variations Illustrated
head
head
N ode 1
N ode 2
ele m e nt
ele m e nt
ne xt
ne xt
ne xt
N ode 1
N ode 2
N ode n
ele m e nt
ele m e nt
ne xt
ne xt
null
null
pre vio us
pre vio us
N ode n
…
…
ele m e nt
ele m e nt
tail
tail
Comparing Ordered Linked List
vs Ordered Array
Methods
MyArrayList/ArrayList
MyLinkedList/LinkedList
add(index: int, e: E)
O (1)
O (n )
O (1)
O (n )
clear()
O (1)
O (1)
contains(e: E)
O (n )
get(index: int)
O (n ) O(log n)
O(1)
indexOf(e: E)
O (n )
O (n )
isEmpty()
O (1)
O (1)
lastIndexOf(e: E)
O (n )
O (n )
remove(e: E)
O (n )
O (n )
size()
O (1)
O (1)
remove(index: int)
O(n)
O (n )
set(index: int, e: E)
O(n)
O (n )
addFirst(e: E)
O(n)
O (1)
removeFirst()
O(n)
O (1)
add(e: E)
O(log n)
O (n )
Download