Ensembles of Data:
We often want to group data together, or to organize it, for different purposes.
How do these ways of organizing it differ?
* Sequential access or access in any order?
To see the difference, think about a cassette and a CD: this is the difference
between sequential access and "random" access.
The question is whether all elements are equally efficient to access, or whether it is
most efficient to access them in sequence.
* If sequential, is it efficient only to go from front to back, or is it also efficient to go
from back to front (one-way or two-way; unidirectional or bidirectional)?
* Is it read-only or read-write?
* Is it static (fixed sized) or dynamic (it can grow and shrink)?
Static data structures are usually more efficient, but you have to know in advance
how big to make them.
* Does the order of the data matter?
* Can you change the structure (rearrange the data)?
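The access questions in the list above can be sketched with three standard-library containers. This is only an illustration: the function names are made up for these notes, and each function assumes that the container is big enough for the requested element.

```cpp
#include <cstddef>
#include <forward_list>
#include <list>
#include <vector>

// Random access (like a CD): jump straight to element k.
// Assumes k < v.size().
int kth_random(const std::vector<int>& v, std::size_t k) {
    return v[k];
}

// One-way sequential access (like a cassette): the only way to reach
// element k is to start at the front and step forward k times.
// Assumes the list holds more than k elements.
int kth_sequential(const std::forward_list<int>& fl, std::size_t k) {
    auto it = fl.begin();
    for (std::size_t i = 0; i < k; ++i) {
        ++it;  // one step toward the back
    }
    return *it;
}

// Two-way sequential access: a std::list can also be walked from
// back to front, here by using its reverse iterators.
std::vector<int> back_to_front(const std::list<int>& l) {
    return std::vector<int>(l.rbegin(), l.rend());
}
```

A std::forward_list has no reverse iterators at all, which is what "one-way" means in practice.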
So it's a typical engineering decision to pick a data structure.
"If you pick the right data structure, the algorithm will write itself."
In computing we use many different data structures, including vectors, arrays, sets,
sequences, stacks (LIFOs), queues (FIFOs), linked lists, trees, hash tables, ....
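To make the static versus dynamic question concrete, here is a minimal sketch that uses std::array for a fixed-size structure and std::vector for one that grows and shrinks. The function names and the rainfall values are invented for the illustration.

```cpp
#include <array>
#include <cstddef>
#include <vector>

// Static: the size (7) is part of the type, so it has to be known
// in advance and can never change.
std::size_t static_size() {
    std::array<double, 7> week_rainfall{};  // 7 elements, all 0.0
    return week_rainfall.size();
}

// Dynamic: the vector starts empty, grows as readings arrive,
// and can shrink again.
std::size_t dynamic_size(int readings) {
    std::vector<double> rainfall;
    for (int i = 0; i < readings; ++i) {
        rainfall.push_back(0.5 * i);  // grow by one element
    }
    if (!rainfall.empty()) {
        rainfall.pop_back();          // shrink by one element
    }
    return rainfall.size();
}
```

With these definitions, static_size() is always 7, while dynamic_size(10) is 9: ten elements added, one removed.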
In C++, data structures correspond to classes, which contain member functions,
which determine what you can do with the data structure.
(For example, the scribbler class is not a data structure, but it has member functions
like stop(), forward(), ... that determine what you can do with members of the class
scribbler.)
For example, a stack might only respond to the member functions push() and pop().
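As a small sketch of this idea, the standard library's std::stack offers push() and pop(), plus top() to look at the most recently pushed element. The function name push_then_pop is made up for this example.

```cpp
#include <stack>
#include <vector>

// Push 1, 2, 3 onto a stack, then pop everything off again.
// Returns the values in the order in which they came off the stack.
std::vector<int> push_then_pop() {
    std::stack<int> s;
    s.push(1);
    s.push(2);
    s.push(3);

    std::vector<int> popped;
    while (!s.empty()) {
        popped.push_back(s.top());  // read the top element
        s.pop();                    // pop() removes it; it returns nothing
    }
    return popped;
}
```

The values come back as 3, 2, 1: last in, first out. Note that std::stack gives you no way to index into the middle, which is exactly the point of restricting the member functions.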
C++ provides a number of built-in data structures in the core language. It also
provides many useful data structures in the Standard Template Library (STL). These
are efficient, professionally implemented data structures that you can use.
Vectors:
C++ provides a vector data structure as part of the STL.
You know about 2D vectors, (x, y), and 3D vectors, (x, y, z), and
higher-dimensional vectors.
Notice that the order of the elements matters: (2,3) is not the same vector as (3,2).
This is different from sets, e.g., {2,3} = {3,2}, for which order doesn't matter.
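The same distinction shows up if you compare std::vector with std::set. A minimal sketch (the function names are made up for the example):

```cpp
#include <set>
#include <vector>

// Vectors compare element by element, so order matters.
bool vectors_equal() {
    std::vector<int> a{2, 3};
    std::vector<int> b{3, 2};
    return a == b;  // false: (2,3) is not (3,2)
}

// Sets only care about which elements are present.
bool sets_equal() {
    std::set<int> a{2, 3};
    std::set<int> b{3, 2};
    return a == b;  // true: {2,3} = {3,2}
}
```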
C++ generalizes the common mathematical definition of a vector in several ways:
(1) A vector can have any number of elements. For example we might have a
vector of 365 elements containing the rainfall for each day of the year.
(2) The elements of a C++ vector can be of any type, but they all have to be the
same type. You can have a vector of ints, a vector of doubles, a vector of strings, a
vector of bools, a vector of chars, a vector of scribblers, a vector of vectors, or a
vector of vectors of vectors, ...
(3) You can read or write (assign to) any element of a vector; they are randomly
addressable.
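The three points above can be sketched as follows. The function names and the values stored are invented for the illustration.

```cpp
#include <string>
#include <vector>

// (1) Any number of elements: one rainfall reading per day of the year.
// (3) Any element can be written (or read) directly.
std::vector<double> make_rainfall() {
    std::vector<double> rainfall(365, 0.0);  // 365 elements, all 0.0
    rainfall[41] = 1.25;                     // assign to one element
    return rainfall;
}

// (2) Any element type, as long as all elements share it.
std::vector<std::string> make_names() {
    return {"ada", "grace"};
}

// (2) The element type can itself be a vector: a 2-by-3 grid of ints.
std::vector<std::vector<int>> make_grid() {
    std::vector<std::vector<int>> grid(2, std::vector<int>(3, 0));
    grid[1][2] = 7;  // row 1, column 2
    return grid;
}
```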
In math, if V = (3, 9, 27),
then V_1 (V sub 1) = 3,
V_2 = 9, and V_3 = 27.
C++ is a little different: it counts from 0 instead of from 1.
So the first element is V[0], the second V[1], and the last V[2].
C++ uses "0-origin indexing" (as opposed to 1-origin indexing).
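Here is the same vector V in C++, as a short sketch. The helper functions are made up for the example; last_element assumes the vector is not empty.

```cpp
#include <cstddef>
#include <vector>

// The vector V = (3, 9, 27) from the text.
std::vector<int> make_V() {
    return {3, 9, 27};
}

// Valid indices run from 0 to size() - 1, so the last element
// is at index size() - 1. Assumes v is not empty.
int last_element(const std::vector<int>& v) {
    return v[v.size() - 1];
}

// A typical loop over a vector starts at 0 and stops before size().
int sum(const std::vector<int>& v) {
    int total = 0;
    for (std::size_t i = 0; i < v.size(); ++i) {
        total += v[i];
    }
    return total;
}
```

Be careful: V[3] does not exist for this vector, and operator[] does not check the index for you.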
Why do all the elements of a vector have to be the same type?
The answer is: so that every element takes up the same amount of memory, and
therefore, given the index, you can calculate where the element is in memory. In
other words, if you know where V starts in memory, then you can calculate where
V[k] is, for any valid index k. This means you have constant-time access to any
element of the vector.
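You can check this calculation yourself. The sketch below computes the byte address of v[k] by hand as start + k * (size of one element) and compares it with the address that C++ reports. The function names are made up, and treating addresses as plain integers like this assumes an ordinary machine with a flat memory layout.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Byte address of v[k], computed by hand:
// (where v starts) + k * (size of one element).
// Assumes k < v.size().
std::uintptr_t computed_address(const std::vector<double>& v, std::size_t k) {
    std::uintptr_t start = reinterpret_cast<std::uintptr_t>(v.data());
    return start + k * sizeof(double);
}

// Byte address of v[k], as reported by the language itself.
// Assumes k < v.size().
std::uintptr_t actual_address(const std::vector<double>& v, std::size_t k) {
    return reinterpret_cast<std::uintptr_t>(&v[k]);
}
```

The calculation is one multiplication and one addition no matter how large k is, which is why the access time is constant.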