Standard Template Library - An Introduction One of the later editions to the C++ standard is the Standard Template Library (STL). The STL is a set of abstract datatypes, functions, and algorithms designed to handle user-specified datatypes. Each of the abstract datatypes also contains useful functions, including overloaded operators, to access them. The spirit of the Standard Template Library is the idea of generic programming - the implementation of algorithms or data structures without being dependent on the type of data being handled. For instance, you can use the STL vector container to store a vector (think of it as a resizable array) of any object you desire. In C and non-STL C++, you can use arrays to get a similar feature, but arrays are limited to a single type of data structure. Moreover, the STL provides some nice features such as handling memory for you (no memory leaks), and it is also safer (no buffer overflow issues when using vectors or similar data structures). Note: If you don't know what templates are, you might want to read the templates tutorial for an overview. Although you don't need to know all the implementation details, you will need to know how to use templates to take advantage of the STL. The STL offers the programmer several advantages in addition to safety and memory management. First, having the predefined types simplifies program design; no longer does a programmer have to write his own class to handle vectors, queues, lists, or associative arrays. Second, it offers powerful type-independent algorithms, including the obvious sorting and searching. The STL does so at very low cost to program performance, no more than any other templated class or function, and it's less likely that using a library function will lead to bugs than using your own code. Finally, though not part of the STL, the standard library includes a string class, which will be covered in this set of tutorials. For those of you who have used the C string.h functions, it is a welcome relief to have access to a simplified string manipulation interface. The datatypes and functions in the STL are not limited to strings, but include vectors, linked lists, queues, stacks, as well as sorting, searching, numeric, permutation, and sequence operations. The scope of the STL is wide. This tutorial will cover various data structures ("container classes") available, the use of iterators to access those data structures, templated algorithms, and a comparison between the various container classes provided by the STL. 1 The STL Vector Class One of the basic classes implemented by the Standard Template Library is the vector class. A vector is, essentially, a resizeable array; the vector class allows random access via the [] operator, but adding an element anywhere but to the end of a vector causes some overhead as all of the elements are shuffled around to fit them correctly into memory. Fortunately, the memory requirements are equivalent to those of a normal array. The header file for the STL vector library is vector. (Note that when using C++, header files drop the .h; for C header files - e.g. stdlib.h - you should still include the .h.) Moreover, the vector class is part of the std namespace, so you must either prefix all references to the vector template with std:: or include "using namespace std;" at the top of your program. Vectors are more powerful than arrays because the number of functions that are available for accessing and modifying vectors. Unfortunately, the [] operator still does not provide bounds checking. There is an alternative way of accessing the vector, using the function at, which does provide bounds checking at an additional cost. Let's take a look at several functions provided by the vector class: unsigned int size(); elements in a vector push_back(type element); end of a vector bool empty(); vector is empty void clear(); the vector type at(int n); index n, with bounds checking Returns the number of Adds an element to the Returns true if the Erases all elements of Returns the element at also, there are several basic operators defined for the vector class: = Assignment replaces a vector's contents with the contents of another == An element by element comparison of two vectors [] Random access to an element of a vector (usage is similar to that of the operator with arrays.) Keep in mind that it does not provide bounds checking. Let's take a look at an example program using the vector class: 2 #include <iostream> #include <vector> using namespace std; int main() { vector <int> example; //Vector to store integers example.push_back(3); //Add 3 onto the vector example.push_back(10); //Add 10 to the end example.push_back(33); //Add 33 to the end for(int x=0; x<example.size(); x++) { cout<<example[x]<<" "; //Should output: 3 10 33 } if(!example.empty()) //Checks if empty example.clear(); //Clears vector vector <int> another_vector; //Creates another vector to store integers another_vector.push_back(10); //Adds to end of vector example.push_back(10); //Same if(example==another_vector) //To show testing equality { example.push_back(20); } for(int y=0; y<example.size(); y++) { cout<<example[y]<<" "; //Should output 10 20 } return 0; } Summary of Vector Benefits Vectors are somewhat easier to use than regular arrays. At the very least, they get around having to be resized constantly using new and delete. Furthermore, their immense flexibility - support for any datatype and support for automatic resizing when adding elements - and the other helpful included functions give them clear advantages to arrays. Another argument for using vectors are that they help avoid memory leaks--you don't have to remember to free vectors, or worry about how to handle freeing a vector in the case of an exception. This simplifies program flow and helps you write tighter code. Finally, if you use the at() function to access the vector, you get bounds checking at the cost of a slight performance penalty. 3 C++ vector is a container template available with Standard Template Library pack. This C++ vector can be used to store similar typed objects sequentially, which can be accessed at a later point of time. As this is a template, all types of data including user defined data types like struct and class can be stored in this container. This article explains briefly about how to insert, delete, access the data with respect to the C++ vector. The type stl::string is used for the purposes of explaining the sample code. Using stl::string is only for sample purposes. Any other type can be used with the C++ vector. Inserting a stl::string into the C++ Vector:<STR; The values can be added to the c++ vector at the end of the sequence. The function push_back should be used for this purpose. The <vector> header file should be included in all the header files in order to access the C++ vector and its functions. #include <vector> #include <string> #include <iostream.h> void main() { //Declaration for the string data std::string strData = "One"; //Declaration for C++ vector std:: vector <std::string> str_Vector; str_Vector.push_back(strData); strData = "Two"; str_Vector.push_back(strData); strData = "Three"; str_Vector.push_back(strData); strData = "Four"; str_Vector.push_back(strData); } The above code adds 4 strings of std::string type to the str_Vector. This uses std:: vector.push_back function. This function takes 1 parameter of the designated type and adds it to the end of the c++ vector. Accessing Elements of C++ Vector: The elements after being added can be accessed in two ways. One way is our legacy way of accessing the data with vector.at(position_in_integer) function. The position of the data element is passed as the single parameter to access the data. Using our normal logic for accessing elements stored in C++ Vector: If we want to access all the data, we can use a for loop and display them as follows. 4 for(int i=0;i < str_Vector.size(); i++) { std::string strd = str_Vector.at(i); cout<<strd.c_str()<<endl; } The std:: vector .size() function returns the number of elements stored in the vector. Using C++ vector iterator provided by STL: The next way is to use the iterator object provided by the STL. These iterators are general purpose pointers allowing c++ programmers to forget the intricacies of object types and access the data of any type. std::vector<std::string>::iterator itVectorData; for(itVectorData = str_Vector.begin(); itVectorData != str_Vector.end(); itVectorData++) { std::string strD = *(itVectorData); } Removng elements from C++ vector: Removing elements one by one from specified positions in c++ vector is achieved with the erase function. The following code demonstrates how to use the erase function to remove an element from position 2 in the str_Vector from our sample. str_Vector.erase(str_Vector.begin()+1,str_Vector.begin()+2); The following sample demonstrates how to use the erase() function for removing elements 2,3 in the str_Vector used in the above sample. str_Vector.erase(str_Vector.begin()+1,str_Vector.begin()+3); If one wants to remove all the elements at once from the c++ vector, the vector.clear() function can be used. STL Iterators The concept of an iterator is fundamental to understanding the C++ Standard Template Library (STL) because iterators provide a means for accessing data stored in container classes such a vector, map, list, etc. You can think of an iterator as pointing to an item that is part of a larger 5 container of items. For intance, all containers support a function called begin, which will return an iterator pointing to the beginning of the container (the first element) and function, end, that returns an iterator corresponding to having reached the end of the container. In fact, you can access the element by "dereferencing" the iterator with a *, just as you would dereference a pointer. To request an iterator appropriate for a particular STL templated class, you use the syntax std::class_name<template_parameters>::iterator name where name is the name of the iterator variable you wish to create and the class_name is the name of the STL container you are using, and the template_paramters are the parameters to the template used to declare objects that will work with this iterator. Note that because the STL classes are part of the std namespace, you will need to either prefix every container class type with "std::", as in the example, or include "using namespace std;" at the top of your program. For instance, if you had an STL vector storing integers, you could create an iterator for it as follows: std::vector<int> myIntVector; std::vector<int>::iterator myIntVectorIterator; Different operations and containers support different types of iterator behavior. In fact, there are several different classes of iterators, each with slightly different properties. First, iterators are distinguished by whether you can use them for reading or writing data in the container. Some types of iterators allow for both reading and writing behavior, though not necessarily at the same time. Some of the most important are the forward, backward and the bidirectional iterators. Both of these iterators can be used as either input or output iterators, meaning you can use them for either writing or reading. The forward iterator only allows movement one way -- from the front of the container to the back. To move from one element to the next, the increment operator, ++, can be used. For instance, if you want to access the elements of an STL vector, it's best to use an iterator instead of the traditional C-style code. The strategy is fairly straightforward: call the container's begin function to get an iterator, use ++ to step through the objects in the container, access each object with the * operator ("*iterator") similar to the way you would access an object by dereferencing a pointer, and stop iterating when the iterator equals the container's end iterator. You can compare iterators using != to check for inequality, == to check for equality. (This only works for one twhen the iterators are operating on the same container!) The old approach (avoid) using namespace std; 6 vector<int> myIntVector; // Add some elements to myIntVector myIntVector.push_back(1); myIntVector.push_back(4); myIntVector.push_back(8); for(int y=0; y<myIntVector.size(); y++) { cout<<myIntVector[y]<<" "; //Should output 1 4 8 } The STL approach (use this) using namespace std; vector<int> myIntVector; vector<int>::iterator myIntVectorIterator; // Add some elements to myIntVector myIntVector.push_back(1); myIntVector.push_back(4); myIntVector.push_back(8); for(myIntVectorIterator = myIntVector.begin(); myIntVectorIterator != myIntVector.end(); myIntVectorIterator++) { cout<<*myIntVectorIterator<<" "; //Should output 1 4 8 } As you might imagine, you can use the decrement operator, --, when working with a bidirectional iterator or a backward operator. Iterators are often handy for specifying a particular range of things to operate on. For instance, the range item.begin(), item.end() is the entire container, but smaller slices can be used. This is particularly easy with one other, extremely general class of iterator, the random access iterator, which is functionally equivalent to a pointer in C or C++ in the sense that you can not only increment or decrement but also move an arbitrary distance in constant time (for instance, jump multiple elements down a vector). For instance, the iterators associated with vectors are random access iterators so you could use arithmetic of the form iterator + n where n is an integer. The result will be the element corresponding to the nth item after the item pointed to be the current iterator. This can be a 7 problem if you happen to exceed the bounds of your iterator by stepping forward (or backward) by too many elements. The following code demonstrates both the use of random access iterators and exceeding the bounds of the array (don't run it!): vector<int> myIntVector; vector<int>::iterator myIntVectorIterator; myIntVectorIterator = myIntVector.begin() + 2; You can also use the standard arithmetic shortcuts for addition and subtraction, += and -=, with random access iterators. Moreover, with random access iterators you can use <, >, <=, and >= to compare iterator positions within the container. Iterators are also useful for some functions that belong to container classes that require operating on a range of values. A simple but useful example is the erase function. The vector template supports this function, which takes a range as specified by two iterators -- every element in the range is erased. For instance, to erase an entire vector: vector<int>::iterator myIntVectorIterator; myIntVector.erase(myIntVectorIterator.begin(), myIntVectorIterator.end()); which would delete all elements in the vector. If you only wanted to delete the first two elements, you could use myIntVector.erase(myIntVectorIterator.begin(), myIntVectorIterator.begin()+2); Note that various container class support different types of iterators -- the vector class, which has served as our model for iterators, supports a random access iterator, the most general kind. Another container, the list container (to be discussed later), only supports bidirectional iterators. So why use iterators? First, they're a flexible way to access the data in containers that don't have obvious means of accessing all of the data (for instance, maps [to be discussed later]). They're also quite flexible -- if you change the underlying container, it's easy to change the associated iterator so long as you only use features associated with the iterator supported by both classes. Finally, the STL algorithms defined in <algorithm> (to be discussed later) use iterators. Summary The Good 8 The STL provides iterators as a convenient abstraction for accessing many different types of containers. Iterators for templated classes are generated inside the class scope with the syntax class_name<parameters>::iterator Iterators can be thought of as limited pointers (or, in the case of random access iterators, as nearly equivalent to pointers) The Gotchas Iterators do not provide bounds checking; it is possible to overstep the bounds of a container, resulting in segmentation faults Different containers support different iterators, so it is not always possible to change the underlying container type without making changes to your code Iterators can be invalidated if the underlying container (the container being iterated over) is changed significantly STL Maps -- Associative Arrays Suppose that you're working with some data that has values associated with strings -- for instance, you might have student usernames and you want to assign them grades. How would you go about storing this in C++? One option would be to write your own hash table. This will require writing a hash function and handling collisions, and lots of testing to make sure you got it right. On the other hand, the standard template library (STL) includes a templated class to handle just this sort of situation: the STL map class, which conceptually you can think of as an "associative array" - key names are associated with particular values (e.g., you might use a student name as a key, and the student's grade as the data). In fact, the STL's map class allows you to store data by any type of key instead of simply by a numerical key, the way you must access an array or vector. So instead of having to compute a hash function and then access an array, you can just let the map class do it for you. To use the map class, you will need to include <map> and maps are part of the std namespace. Maps require two, and possibly three, types for the template: std::map <key_type, data_type, [comparison_function]> Notice that the comparison function is in brackets, indicating that it is optional so long as your key_type has the less-than operator, <, defined - so you don't need to specify a comparision function for so-called primitive types such as int, char, or even for the string class. Moreover, if you overloaded the < operator for your own class, this won't be necessary. The reason that the key type needs the less-than operator is that the keys will be stored in sorted order -- this means that if you want to retrieve every key, value pair stored in the map, you can retrieve them in the order of the keys. Let's go back to the example of storing student grades. Here's how you 9 could declare a variable called grade_list that associates strings (student names) with characters (grades -- no + or - allowed!). std::map <string, char> grade_list; Now, to actually store or access data in the map, all you need to do is treat it like a normal array and use brackets. The only difference is that you no longer use integers for the index -- you use whatever data type was used for the key. For instance, following the previous example, if we wanted to assign a grade of 'B' to a student named "John", we could do the following: grade_list["John"] = 'B'; If we later decided that John's grades had improved, we could change his grade in the same fashion: grade_list["John"] = 'B'; // John's grade improves grade_list["John"] = 'A'; So adding keys to a map can be done without doing anything special -- we just need to use the key and it will be added automatically along with the corresponding data item. On the other hand, getting rid of an element requires calling the function erase, which is a member of the map class: erase(key_type key_value); For instance, to erase "John" from our map, we could do the following: grade_list.erase("John"); If we're curious, we could also check to see how many values the map contains by using the size function, which is also a member of the map class and other containers: int size(); For instance, to find the size of our hypothetical class, we could call the size function on the grade list: std::cout<<"The class is size "<<grade_list.size()<<std::endl; If we're only interested in whether the map is empty, we can just use the map member function empty: bool empty(); empty returns true if the map is empty; it returns false otherwise. If you want guarantee that the map is empty, you can use the clear function. For instance: grade_list.clear(); 10 would remove all students from the grade_list. What if you just wanted to check to see whether a particular key was in the map to begin with (e.g., if you wanted to know whether a student was in the class). You might think you could do something like this using the bracket notation, but then what will that return if the specified key has no associated value? Instead, you need to use the find function, which will return an iterator pointing to the value of the element of the map associated with the particular key, or if the key isn't found, a pointer to the iterator map_name.end()! The following code demonstrates the general method for checking whether a key has an associated value in a map. std::map <string, char> grade_list; grade_list["John"] = 'A'; if(grade_list.find("Tim") == grade_list.end()) { std::cout<<"Tim is not in the map!"<<endl; } Iterators can also be used as a general means for accessing the data stored in a map; you can use the basic technique from before of getting an iterator: std::map<parameters>::iterator iterator_name; where parameters are the parameters used for the container class this iterator will be used for. This iterator can be used to iterate in sequence over the keys in the container. Ah, you ask, but how do I know the key stored in the container? Or, even better, what exactly does it mean to dereference an iterator over the map container? The answer is that when you dereference an iterator over the map class, you get a "pair", which essentially has two members, first and second. First corresponds to the key, second to the value. Note that because an iterator is treated like a pointer, to access the member variables, you need to use the arrow operator, ->, to "dereference" the iterator. For instance, the following sample shows the use of an iterator (pointing to the beginning of a map) to access the key and value. std::map <string, char> grade_list; grade_list["John"] = 'A'; // Should be John std::cout<<grade_list.begin()->first<<endl; // Should be A std::cout<<grade_list.begin()->second<<endl; Finally, you might wonder what the cost of using a map is. One issue to keep in mind is that insertion of a new key (and associated value) in a map, or lookup of the data associated with a key in a map, can take up to O(log(n)) time, where n is the size of the current map. This is potentially a bit slower than some hash tables with a good hashing function, and is 11 due to the fact that the map keys are stored in sorted order for use by iterators. Summary The Good Maps provide a way of using "associative arrays" that allow you to store data indexed by keys of any type you desire, instead of just integers as with arrays. Maps can be accessed using iterators that are essentially pointers to templated objects of base type pair, which has two members, first and second. First corresponds to the key, second to the value associated with the key. Maps are fast, guaranteeing O(log(n)) insertion and lookup time. The Gotchas Checking for whether an element is in a map requires using the find function and comparing the returned iterator with the iterator associated with the end of the map. Since maps associate a key with only one value, you need to use a multimap (or make the type of the data a container) if you must associate multiple pieces of data with one key. STL List Container The Standard Template Library's list container is implemented as a doubly linked list. You might wonder why there are both list and vector containers in the STL -- the reason is that the underlying representations are different, and each representation has its own costs and benefits. The vector has relatively costly insertions into the middle of the vector, but fast random access, whereas the list allows cheap insertions, but slow access (because the list has to be traversed to reach any item). Some algorithms, such as merge sort, even have different requirements when applied to lists instead of vectors. For instance, merge sort on vectors requires a scratch vector, whereas merge sort on lists does not. Using the STL list class is about as simple as the STL vector container. In fact, to declare a list, all you need is to give the type of data to be stored in the list. For instance, the following code declares a list storing integers: std::list<int> integer_list; Like the vector class, the list class includes the push_back and push_front functions, which add new elements to the front or back of the list respectively. For instance, 12 std::list<int> integer_list; integer_list.push_front(1); integer_list.push_front(2); creates a list with the element 2 followed by the element 1. Correspondingly, it is possible to remove the front or back element using the pop_front() or pop_back() functions. Using these functions alone, it would be possible to create a queue or stack data structure based on the list class. What about adding elements into the middle of the list -- that is, after all, one of the advantages of a list. The insert function can be used to do so: insert requires an iterator pointing to the position into which the element should be inserted (the new element will be inserted right before the element currently being pointed to will). The insert function looks like this: iterator insert(iterator position, const T& element_to_insert); Fortunately, the list container supports both the begin -- returning an iterator to the beginning of the list -- and end -- returning an iterator past the last element of the list -- iterator functions, and you can declare iterators as with any other container, in the following manner: list<type>::iterator iterator_name; Note that the STL list container supports iterating both forward and in reverse (because it is implemented as a doubly linked list). Using insert and the function end, the functionality of push_back, which adds an element to the end of the list, could also be implemented as std::list<int> integer_list; integer_list.insert(integer_list.end(), item); The list class also includes the standard functions size and empty. There is one caveat to be aware of: the size function may take O(n) time, so if you want to do something simple such as test for an empty list, use the empty function instead of checking for a size of zero. If you want to guarantee that the list is empty, you can always use the clear function. std::list<int> integer_list; \\ bad idea if(integer_list.size() == 0) ... \\ good idea if(integer_list.empty()) ... 13 integer_list.clear(); \\ guaranteed to be empty now Lists can also be sorted using the sort function, which is guaranteed to take time on the order of O(n*log(n)). Note that the sort algorithm provided by the STL works only for random access iterators, which are not provided by the list container, which necessitates the sort member function: instance_name.sort(); Lists can be reversed using instance_name.reverse(); One feature of using the reverse member function instead of the STL algorithm reverse (to be discussed later) is that it will not affect the values that any other iterators in use are pointing to. Another potentially useful list function is the member function called unique; unique converts a string of equal elements into a single element by removing all but the first element in the sequence. For instance, if you had a list consisting of 1 1 8 9 7 8 2 3 3 the calling unique would result in the following output: 1 8 9 7 8 2 3 Notice that there are still two 8's in the above example: only sequential equal elements are removed! Unique will take time proportional to the number of elements in the list. If you want each element to show up once, and only once, you need to sort the list first! Try the following code to see how this works and see many of the previous functions in action! std::list<int> int_list; int_list.push_back(1); int_list.push_back(1); int_list.push_back(8); int_list.push_back(9); int_list.push_back(7); int_list.push_back(8); int_list.push_back(2); int_list.push_back(3); int_list.push_back(3); int_list.sort(); int_list.unique(); for(std::list<int>::iterator list_iter = int_list.begin(); list_iter != int_list.end(); list_iter++) 14 { std::cout<<list_iter<<endl; } Summary The Good Lists provide fast insertions (in amortized constant time) at the expensive of lookups Lists support bidirectional iterators, but not random access iterators Iterators on lists tend to handle the removal and insertion of surrounding elements well The Gotchas 15 Lists are slow to search, and using the size function will take O(n) time Searching for an element in a list will require O(n) time because it lacks support for random access