Data Structures:
Simple (atomic) data structures have no parts. Examples: int, double, bool,
char, ...
These are the basic building blocks. We can use them to assemble more complex
compound data structures.
Compound data structures are either:
homogeneous = all their parts are the same type, or
heterogeneous = their parts could be of different types.
Examples of homogeneous data structures include vectors and strings.
Homogeneous data structures are more restricted in that all the components have
to be the same type.
We have them because that restriction buys flexibility in how components are
selected. In particular, you can calculate the location of the component you want
at runtime. For example, in V[k+1] we are using the computed value k+1 to determine
which component of the vector or string we want. This works because all the
elements of a vector are the same type and therefore the same size. So the
generated code just has to multiply the index by the size of an element, and add
the result to the base address of the vector to get the address of the element.
A simple, efficient operation.
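The address calculation described above can be sketched by hand. This is only an
illustration under assumptions: element_address and demo_indexing are hypothetical
names, not part of the notes or of any library, and in real code you would simply
write V[k+1].

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: computes the address of V[k] by hand as
// base address + index * size of one element.
const double* element_address(const std::vector<double>& V, std::size_t k) {
    // View the base address as a byte address so the arithmetic is in bytes.
    const char* base = reinterpret_cast<const char*>(V.data());
    return reinterpret_cast<const double*>(base + k * sizeof(double));
}

// Checks that the hand-computed address is the same one V[k+1] refers to.
void demo_indexing() {
    std::vector<double> V = {1.5, 2.5, 3.5, 4.5};
    std::size_t k = 1;
    assert(element_address(V, k + 1) == &V[k + 1]);
    assert(*element_address(V, k + 1) == 3.5);
}
```

The multiplication by sizeof(double) is only valid because every element has the
same size, which is the point of the homogeneity restriction.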
If the elements are of different types, they can be different sizes, so you can't
calculate the address of an arbitrary component at runtime (at least not
efficiently). So for heterogeneous structures, the components you can select are
fixed in your program: you select each one by name, and you can't compute which
one you want at runtime.
In C++ there are two primary heterogeneous data structures: the struct and the
class. They are very closely related.
Both allow you to group together data, functions, variables, and other named things
that belong together as part of some object.
Example: employee record.
Structures should represent meaningful "things" (e.g., employees, dates).
Both structs and classes are ways of grouping together related named things (e.g.,
variables, functions, and other structs and classes). These named things are called
member variables, member functions, etc.
The only difference is in the default visibility: by default, all the members of a
struct are public (visible everywhere the struct/class is visible), and if you want
any of them to be private, you have to declare them private. Classes are just the
opposite: all the members are private unless they are explicitly declared public.
C++ really doesn't need both structs and classes; it has both mostly for
compatibility with C, which has structs but no classes.
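The difference in default visibility can be seen in a small sketch. The type and
member names here (EmployeeRecord, EmployeeAccount, salary, and so on) are
hypothetical, chosen only to illustrate the rule.

```cpp
#include <cassert>
#include <string>

// A struct: members are public unless declared otherwise.
struct EmployeeRecord {
    std::string name;
    double salary = 0.0;
};

// A class: members are private unless declared otherwise.
class EmployeeAccount {
    double salary = 0.0;                 // private by default
public:
    void set_salary(double s) { salary = s; }
    double get_salary() const { return salary; }
};

void demo_access() {
    EmployeeRecord r;
    r.salary = 50000.0;                  // OK: salary is public in the struct

    EmployeeAccount a;
    // a.salary = 50000.0;               // would not compile: salary is private
    a.set_salary(r.salary);              // must go through a public member function
    assert(a.get_salary() == 50000.0);
}
```

Writing "private:" at the top of the struct, or "public:" at the top of the class,
would make the two behave identically.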
If we declare a struct (or class) "employee", we think of that as a class or kind of
object, in this case representing an employee.
When we declare:
    employee me;
    employee boss;
that creates instances of that class or structure. Often (not always) we have many
instances of any given class/struct. We can even create instances on the fly while
the program is running.
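As a rough sketch of creating instances, including some created on the fly at
runtime: the members name and id, and the function demo_instances, are assumptions
made for this example, since the notes do not say what an employee contains.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical employee record; the notes do not specify its members.
struct employee {
    std::string name;
    int id = 0;
};

// Creates two named instances, then how_many more while the program runs.
std::size_t demo_instances(int how_many) {
    employee me;                     // one instance
    employee boss;                   // another instance of the same struct
    me.name = "me";
    boss.name = "boss";

    std::vector<employee> staff;     // a homogeneous collection of employees
    staff.push_back(me);
    staff.push_back(boss);
    for (int i = 0; i < how_many; ++i) {
        employee e;                  // a fresh instance on each pass
        e.id = i + 1;
        staff.push_back(e);
    }
    assert(staff.size() == static_cast<std::size_t>(how_many) + 2);
    return staff.size();
}
```

Note that the vector itself is homogeneous (every element is an employee), while
each employee is heterogeneous (a string and an int).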
Reference Parameters:
When you declare a parameter in the usual way, it tells the compiler to pass the
parameter by value. For example,
    double abs (double x) {
        if (x<0) {
            return -x;
        } else {
            return x;
        }
    }
When I write an invocation of abs, such as:
    double a;
    cin >> a;
    cout << abs (a) << endl;
the compiler takes the value of a, and copies it into a local variable x in the abs()
function. It passes the value of the argument to the function.
This is just what you want for many purposes. But there are two problems:
(1) If the argument is very large (e.g., a big vector), then
(a) it takes up a lot of extra space in the function, and
(b) it takes time to copy it.
[efficiency problem]
(2) You can't change the argument from within the function.
[logic problem]
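The logic problem (2) can be shown with a small sketch. The function names
try_to_double and demo_by_value are hypothetical; the point is only that the
parameter is a copy.

```cpp
#include <cassert>

// Tries to double its argument, but x is a local copy of the caller's value.
void try_to_double(double x) {
    x = x * 2;            // changes only the copy
}

void demo_by_value() {
    double a = 3.0;
    try_to_double(a);
    assert(a == 3.0);     // the caller's variable is unchanged
}
```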
When you put an ampersand in the parameter declaration, it means to pass that
parameter by reference.
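Approximate synonyms in programming languages: reference = pointer = address.
(In C++ itself, references and pointers are distinct language features, but the
underlying idea is the same.) They are all "names" (i.e., addresses) of locations in
the computer's memory. So if I declare: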
    void increment (int& x) {
        x++;
    }
Then when I invoke: increment(N)
the address of N gets passed to increment, rather than the value of N. Or we say, "a
reference to N gets passed to increment."
So if we want a function to be able to change a variable, we need to pass that
variable by reference.
Note: this only makes sense if the argument is a variable of some sort (something
that has an address in memory).
increment(12); // makes no sense (and it's illegal)
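Putting the pieces together, a minimal sketch of calling increment on a variable.
The helper refers_to and the function demo_increment are hypothetical additions
used only to make the aliasing visible.

```cpp
#include <cassert>

void increment(int& x) {
    x++;                          // x is another name for the caller's variable
}

// Hypothetical helper: true when the reference x names the object at p.
bool refers_to(int& x, int* p) {
    return &x == p;
}

void demo_increment() {
    int N = 5;
    increment(N);
    assert(N == 6);               // the caller's variable really changed
    assert(refers_to(N, &N));     // the parameter and N are the same location
    // increment(12);             // would not compile: 12 is not a variable
}
```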
Efficiency: If I pass a big argument by reference, only the address of the
argument is passed, which may be a lot smaller. That takes less time. Also, the
function only has to set aside space for the address of the argument, not for a
whole fresh copy of it.
The only problem: writing it this way suggests that the function may alter the
argument, even if it doesn't. So it can be misleading to the reader. (There is a way
around this called "constant reference parameters.")
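A constant reference parameter might look like the following sketch. The function
names total_of and demo_const_ref are hypothetical; the const in the parameter
declaration is what tells both the compiler and the reader that the argument will
not be changed.

```cpp
#include <cassert>
#include <vector>

// v is passed by reference (no copy), and const forbids modifying it.
double total_of(const std::vector<double>& v) {
    double total = 0.0;
    for (double x : v) {
        total += x;
    }
    // v.push_back(1.0);          // would not compile: v is const here
    return total;
}

void demo_const_ref() {
    std::vector<double> big(1000, 0.5);   // 1000 elements, each 0.5
    assert(total_of(big) == 500.0);
    assert(big.size() == 1000);           // the argument is untouched
}
```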