CS1061 C Programming
Lecture 18: Sequential File Processing
A. O’Riordan, 2004, updated 2007

The FILE structure
C views each file as a sequence of bytes.
Files end with the end-of-file marker (EOF).
A stream is created when a file is opened.
Opening a file returns a pointer to a FILE structure.

The FILE structure (continued)
File descriptor: an index into an operating system array called the open file table.
File Control Block (FCB): found in every array element; the system uses it to administer the file.

Creating a File
FILE *myPtr;
  Creates a FILE pointer called myPtr.
myPtr = fopen("myFile.dat", openmode);
  Function fopen() returns a FILE pointer and takes two arguments: the file to open and the file open mode. If the open fails, NULL is returned.
fclose(FILE pointer)
  Function fclose() closes the specified file. Although this is performed automatically when the program ends normally, it is good practice to close files explicitly.

Reading a sequential access file
A sequential access file is written and read from the start to the end of the file using a file pointer.
Create a FILE pointer and tie it to the file to read:
    myPtr = fopen("myFile.dat", "r");
Use fscanf() to read from the file. It is like scanf(), except that its first argument is a FILE pointer, e.g.:
    fscanf(myPtr, "%d%s%f", &myInt, myString, &myFloat);

fopen()
Every call to fopen() will typically be followed by a test, like this:
    ifp = fopen("input.dat", "r");
    if(ifp == NULL){
        printf("can't open file\n");
        //exit or return
    }
or
    if((ifp = fopen("input.dat", "r")) == NULL){
        printf("can't open file\n");
        //exit or return
    }

File Modes
Files are opened in a certain mode.

    MODE   USED FOR            CREATED IF MISSING?   EXISTING FILE
    "a"    Appending           Yes                   Appended to
    "a+"   Reading/appending   Yes                   Appended to
    "r"    Read only           No                    Must exist
    "r+"   Reading/writing     No                    Must exist
    "w"    Write only          Yes                   Destroyed
    "w+"   Reading/writing     Yes                   Destroyed

The number of streams that may be open simultaneously is specified by the macro FOPEN_MAX (in stdio.h). FOPEN_MAX must be at least eight in ISO C.
fscanf() and fprintf()
fscanf() and fprintf() are the file processing equivalents of scanf() and printf().
Prototypes (as in stdio.h):
    int fprintf(FILE *stream, const char *format, ...);
    int fscanf(FILE *stream, const char *format, ...);
fprintf(stdout, "Hello") is equivalent to printf("Hello")
fscanf(stdin, "%d", &i) is equivalent to scanf("%d", &i)
The following example outputs a table of exam scores to a file named scores.txt.

fprintf() example
    #include <stdio.h>
    #include <stdlib.h>

    int main(){
        int i, score[10];
        FILE *HiScores;

        for(i=0; i<10; i++)
            score[i] = 100 - i;

        /* open for writing; give up if the file cannot be created */
        if((HiScores = fopen("scores.txt", "w")) == NULL){
            printf("can't open file\n");
            return EXIT_FAILURE;
        }
        for(i=0; i<10; i++)
            fprintf(HiScores, "Number %d:\t %d\n", i+1, score[i]);
        fclose(HiScores);
        return 0;
    }

fscanf() example
    #include <stdio.h>

    int main(){
        int n, score;
        FILE *fp;

        if((fp = fopen("scores.txt", "r")) == NULL){
            printf("can't open file\n");
            return 1;
        }
        /* each line of scores.txt looks like "Number 1:\t 100", so the
           format must match the literal text as well as the numbers;
           the loop stops when fscanf() cannot convert both values,
           which includes reaching end-of-file */
        while(fscanf(fp, " Number %d: %d", &n, &score) == 2)
            printf("%d\n", score);
        fclose(fp);
        return 0;
    }

fgetc() and fputc()
fgetc() reads one character from a file. It takes a FILE pointer as its argument.
fgetc() and getc() are identical, except that getc() is usually implemented as a macro for efficiency.
fputc() writes one character to a file. It takes the character to write and a FILE pointer as arguments, in that order.
Likewise, fputc() and putc() are identical, except that putc() is usually implemented as a macro.

fgetline()
fgetline() is not a standard library function; it is a helper we write ourselves using fgetc(), and its definition is given in the fgetc() example.
It reads one line from a file, copying it to the line array (not more than max chars).
It does not place the terminating \n in the line array.
It returns the line length, or 0 for an empty line, or EOF for end-of-file.
Now we could read one line from data.txt through ifp by calling:
    #define MAXLINE 100
    char line[MAXLINE];
    ifp = fopen("data.txt", "r");
    ...
    fgetline(ifp, line, MAXLINE);
EOF indicates the end of file. It usually has the value -1. (The ISO C standard requires only that it is a negative integer.)
fgetc() example
    int fgetline(FILE *fp, char line[], int max){
        int nch = 0, c;           /* c is an int so it can hold EOF */
        max = max - 1;            /* leave room for '\0' */

        while((c = fgetc(fp)) != EOF){
            if(c == '\n')
                break;
            if(nch < max){        /* extra characters are read but not stored */
                line[nch] = c;
                nch = nch + 1;
            }
        }

        if((c == EOF) && (nch == 0))
            return EOF;

        line[nch] = '\0';
        return nch;
    }

Caution
A sequential access file cannot be modified without the risk of destroying other data.
Fields can vary in size: the representation in files and on screen differs from the internal representation.
1, 34 and -890 are all ints, but they have different sizes on disk (1, 2 and 4 characters when written as text).
Because of these limitations, C also has random access file processing features. See next lecture.