Document

advertisement
Computer Organization and Design
Pointers, Arrays and Strings in C
Montek Singh
Feb 5, 2016
Lab 4 supplement
C vs. Java
 For our purposes C is almost identical to Java except:
 C has “functions”, JAVA has “methods”
 function == method without “class”
 i.e., a global method
 C has “pointers” explicitly
 Java has them (called “references”) but hides them under the
covers
 JVM takes care of handling pointers, so the programmer doesn’t
have to
 C++ is sort of in-between C and Java
In this class, we will see how pointers/references are
implemented in C
What is a “pointer” in C?
 A pointer is an explicit
memory address
 Example
 int i
Memory
address
i
1056
1060
1064
1068
 i is an integer variable in memory
 located at, say, address 1056
 int *p
 p is a variable that “points to” an
integer
 p is located at, say, address 2004
 p = &i
 the value in p is now equal to
the address of variable i
 i.e., the value stored in Mem[2004]
is 1056 (the location of i)
p
1056
2000
2004
Referencing and Dereferencing
 Referencing an object means
 … taking its address and assigning it
to a pointer variable (e.g., p = &i)
 Dereferencing a pointer means
 … going to the memory address
pointed to by the pointer, and
accessing the value there (e.g., *p)
Memory
address
i
5
1056
1060
1064
1068
 Example
int i;
int *p;
p = &i;
*p = 5;
//
//
//
//
//
//
i is an int variable
p is a pointer to int
referencing i
p is assigned 1056
dereference p
i assigned 5
p
1056
2000
2004
Pointer expressions and arrays
 Dereferencing could be done to an expression
 So, not just *p, but can also write *(p+400)
 accesses memory location that is 400th int after i
 Arrays in C are really pointers underneath!
 int a[10];
// array of integers
 a itself simply refers to the address of the
 a is the same as &a[0]
 a is a constant of type “int *”
 a[0] is the same as *a
 a[1] is the same as *(a+1)
 a[k] is the same as *(a+k)
 a[j] = a[k]; is the same as
*(a+j) = *(a+k);
start of the array
// address of a[0]
Pointer arithmetic and object size
 IMPORTANT: Pointer
expressions automatically
account for the size of object
pointed to
Memory
address
i
1056
1060
1064
1068
 Example 1
 if p is of type “int *”
 and an int is 4 bytes long
 if p points to address 1056,
(p+2) will point to address 1064
 C compiler automatically does the
multiply-by-4
 BUT… in machine language, we will
have to explicitly do the multiply-by-4
 Example 2
 char *q; // char is 1 byte
 q++;
// really does add 1
p
1056
1064
2000
2004
Pointer examples
int i;
int a[10];
int *p;
// simple integer variable
// array of integers
// pointer to integer
p = &i;
p = a;
p = &a[5];
*p
*p = 1;
*(p+1) = 1;
p[1] = 1;
p++;
//
//
//
//
//
//
//
//
& means address of
a means &a[0]
address of 6th element of a
value at location pointed by p
change value at that location
change value at next location
exactly the same as above
step pointer to the next element
Pointer pitfalls
int i;
// simple integer variable
int a[10];
// array of integers
int *p; // pointer to integer(s)
So what happens when
p = &i;
What is value of p[0]?
What is value of p[1]?
 Very easy to exceed bounds
 C has no bounds checking!
Iterating through an array
 2 ways to iterate through an array
 using array indices
void clear1(int array[], int size) {
for(int i=0; i<size; i++)
array[i] = 0;
}
 using pointers
void clear2(int *array, int size) {
for(int *p = &array[0]; p < &array[size]; p++)
*p = 0;
}
 or, also using pointers, but more concise (more cryptic!)
void clear3(int *array, int size) {
int *arrayend = array + size;
while(array < arrayend) *array++ = 0;
}
Pointer summary
 In the “C” world and in the “machine” world:
 a pointer is just the address of an object in memory
 size of pointer itself is fixed regardless of size of object
 to get to the next object:
 in machine code: increment pointer by the object’s size in bytes
 in C: increment pointer by 1
 to get the ith object:
 in machine code: add i*sizeof(object) to pointer
 in C: add i to pointer
 Examples:
 int R[5]; // 20 bytes storage
 R[i] is same as *(R+i)
 int *p = &R[3] is same as p = (R+3)
(p points 12 bytes after start of R)
Strings in C
 There is no string type in C 😞 What?!
 Only low-level support for strings as character arrays
 char s[]
 char *s
// array of characters
// pointer to the beginning of a string
 But a rich library of string processing functions (string.h)
 reading: scanf(), fgets(), etc.
 printing: printf(), puts(), etc.
 processing: strcpy(), strlen(), strcat(), strcmp()
Strings in C: NULL termination
 All C functions assume string terminates with a NULL
 NULL = ASCII 0 character
 also written as ‘\0’
 Example:
 “hello” is actually { ‘h’ ‘e’ ‘l’ ‘l’ ‘o’ ‘\0’ }
 uses 6 characters, not 5
 so, for a string with N real characters, declare it as char S[N+1]
 but C functions define its length as 5
 strlen(“hello”) returns 5
 fputs(), printf(), etc. will print the string until they see a ‘\0’
 Be mindful of the terminating character!
Declaring strings statically
 Statically (if size is known at compile time)
 char S[6] = “hello”;
 char S[] = “hello”;
// compiler puts in the size
 char string_array[5][10];
 an array of 5 strings, each up to 9 characters long (plus NULL)
 string_array[0] is the 0-th (very first) string
 string_array[4] is the 4th (last) string
 string_array[0][4] is the 4th character of the first string
 etc.
Declaring strings dynamically
 Dynamically (if size is known only at run time)
1. declare a pointer

2.
determine size (at run time), e.g., 99 real characters + NULL

3.
char *s;
size is 100
allocate space in memory

s = malloc(100);

BUT remember: there is no bounds checking, so do not put a string with
more than 99 real characters in s…
 Comparison to Java
 char *s = malloc(100);
 char[] s = new char[100];
// in C
// in Java
 When finished with string
 free(s);
// free up the 100 bytes
 Java: no need to free; JVM does garbage collection
Declaring strings dynamically
 Examples
 Declare an array of NUM strings, each at most LEN long (incl. the
terminating NULL), where NUM and LEN are known at compile time
 char string_array[NUM][LEN];
 Declare an array of NUM strings, where only NUM is known at compile
time
 char *string_array[NUM];
 string_array[0] = malloc(length of string 0…);
 string_array[1] = malloc(length of string 1…);
 etc.
 Declare an array of strings, where we don’t know how many strings there
will be, and how long each will be, at compile time




char **string_array;
string_array = malloc(how many strings … * sizeof(char *));
string_array[0] = malloc(length of string 0…);
string_array[1] = malloc(length of string 1…);
 etc.
An array of strings
 Declare an array of NUM strings, where only NUM is known at
compile time
 char *string_array[NUM];
 string_array[0] = malloc(length of string 0…);
 string_array[1] = malloc(length of string 1…);
 etc.
 For sorting: Swap strings by swapping pointers, not by
copying contents!
 char *temp;
// pointer to char
 temp = string_array[i];
// swap pointers!
 string_array[i] = string_array[j];
// no copying chars
 string_array[j] = temp;
One more thing…
Passing pointers as arguments to functions
Passing arguments by reference
 Example 1:
 reading values into variables
int x, y, z;
scanf(“%d%d%d”, &x, &y, &z);
 scanf() changes the values of x, y and z! How?
 we provide it the addresses where x, y and z are located
 Example 2:
 a function to double the value given to it
void double_it(int x) {
x = x*2;
}
 doesn’t work… because x is “passed by value”
 a copy of x is modified inside the function, not the original
Passing arguments by reference
 Example 2 take 2:
 a function to double the value given to it
void double_it(int *x) {
*x = (*x) * 2;
}
int main() {
int y = 10;
double_it(&y);
//
}
y is now 20
 works! Because y is “passed by reference”
 the address in memory where y is stored is sent to the function,
which modifies it correctly at that location
Passing arguments: C vs. Java
 Which of these will work in Java? WHY?
1. Modify an integer
double_it(int x) { x = x*2; }
2. Modify an object
double_it(Point p) { p.x = p.x*2; }
3. Swap two objects
swap(Point p1, Point p2) {
Point temp = p1;
p1 = p2;
p2 = temp;
}
Passing arguments: C vs. Java
 Java:
 passes arguments by value for all the primitive data types
 int, double, float, etc.
double_it(int x) { x = x*2; } // won’t work
 passes objects by reference
 so member data of an object can be modified
double_it(Point p) { p.x = p.x*2; }
// works!
 BUT: references are passed by value to methods…
 so cannot swap two objects
swap(Point p1, Point p2) {
Point temp = p1;
p1 = p2;
p2 = temp;
}
// won’t work!
 Understanding C pointers will demystify everything!
Download