CSC 211 Data Structures
Lecture 32
Dr. Iftikhar Azim Niaz
ianiaz@comsats.edu.pk

Last Lecture Summary
Hash Function
Properties of a Good Hash Function
Hash Function Methods
File
Text and Binary Files
Operations on Files
File Access Methods
  Sequential Files
  Indexed Files
  Hashed Files

Objectives Overview
File Implementation in C Language
Basic File Operations
  Opening a file
  Reading data from a file
  Writing data to a file
  Closing a file
File operations on Text Files
File operations on Sequential Binary Files
Revision of the Course: Lecture 1 to Lecture 31

Files - Implementation
Files are places where data can be stored permanently.
Some programs expect the same set of data as input every time they run. Entering it each time is cumbersome; it is better to keep the data in a file and have the program read from the file.
Some programs generate large volumes of output that are difficult to view on the screen. It is better to store that output in a file for later viewing or processing.

Text Data Files
When you use a file to store data for use by a program, that file usually consists of text (alphanumeric data) and is therefore called a text file.
Text files can be created, updated, and processed by C programs.
They are used for permanent storage of large amounts of data.
Storage of data in variables and arrays is only temporary.

Basic File Operations
Opening a file
Reading data from a file
Writing data to a file
Closing a file

Opening a File
A file must be opened before it can be used.
    FILE *fp;
    :
    fp = fopen(filename, mode);
fp is declared as a pointer to the data type FILE.
filename is a string that specifies the name of the file.
fopen returns a pointer to the file, which is used in all subsequent file operations.
mode is a string that specifies the purpose of opening the file:
  "r" :: open the file for reading only
  "w" :: open the file for writing only
  "a" :: open the file for appending data to it

File Modes
r - open a file in read-mode, set the pointer to the beginning of the file.
w - open a file in write-mode, set the pointer to the beginning of the file.
a - open a file in write-mode, set the pointer to the end of the file.
rb - open a binary-file in read-mode, set the pointer to beginning of file.
wb - open a binary-file in write-mode, set the pointer to beginning of file.
ab - open a binary-file in write-mode, set the pointer to the end of the file.
r+ - open a file in read/write-mode, if file does not exist, it will not be created.
w+ - open a file in read/write-mode, set the pointer to the beginning of file.
a+ - open a file in read/append mode.
r+b - open a binary-file in read/write-mode, if the file does not exist, it will not be created.
w+b - open a binary-file in read/write-mode, set pointer to beginning of file.
a+b - open a binary-file in read/append mode.

File Modes
Points to note:
Several files may be opened at the same time.
For the "w" and "a" modes, if the named file does not exist, it is automatically created.
For the "w" mode, if the named file exists, its contents will be overwritten.

Opening a File
    FILE *in, *out;
    in = fopen("mydata.dat", "r");
    out = fopen("result.dat", "w");

    FILE *empl;
    char filename[25];
    scanf("%24s", filename);    /* width limit keeps the input inside filename[25] */
    empl = fopen(filename, "r");

Closing a File
After all operations on a file have been completed, it must be closed.
Closing ensures that all file data held in memory buffers is properly written to the file.
General format: fclose(file_pointer);
    FILE *xyz;
    xyz = fopen("test.txt", "w");
    ...
    fclose(xyz);

fclose( FILE pointer )
Closes the specified file.
Performed automatically when the program ends normally.
It is good practice to close files explicitly, so that system resources are freed.
Also, data that you have written to the file may not actually reach the disk until the file is closed.
feof( FILE pointer )
Returns true if the end-of-file indicator (no more data to process) is set for the specified file.

Read/Write Operations on Text Files
The simplest file input-output (I/O) functions are getc and putc.
getc reads a character from a file and returns it.
    int ch;
    FILE *fp;
    ch = getc(fp);
getc returns the end-of-file marker EOF when the end of the file has been reached. Store the result in an int, not a char, so that EOF can be told apart from every valid character.
putc writes a character to a file.
    char ch;
    FILE *fp;
    putc(ch, fp);

Text File - Example
Convert a text file to all UPPERCASE
    #include <stdio.h>
    #include <ctype.h>
    int main(void)
    {
        FILE *in, *out;
        int c;    /* int, not char, so that EOF is detected reliably */
        in = fopen("infile.dat", "r");
        out = fopen("outfile.dat", "w");
        if (in == NULL || out == NULL)
            return 1;    /* a file could not be opened */
        while ((c = getc(in)) != EOF)
            putc(toupper(c), out);
        fclose(in);
        fclose(out);
        return 0;
    }

Read/Write Operations on Text Files
We can also use the file versions of scanf and printf, called fscanf and fprintf.
General format:
    fscanf(file_pointer, control_string, list);
    fprintf(file_pointer, control_string, list);
Examples:
    fscanf(fp, "%d %s %f", &roll, dept_code, &cgpa);
    fprintf(out, "\nThe result is: %d", xyz);
fprintf is used to print to a file. It is like printf, except that the first argument is a FILE pointer (a pointer to the file you want to print to).

Some Points
How to check the EOF condition when using fscanf? Use the function feof, which becomes true only after a read has hit the end of the file.
    if (feof(fp))
        printf("\n Reached end of file");
How to check for a successful open? For opening in "r" mode, the file must exist; fopen returns NULL on failure.
    if (fp == NULL)
        printf("\n Unable to open file");

Example : Merge Two Text Files
    #include <stdio.h>

    int main()
    {
        FILE *fileA,   /* first input file */
             *fileB,   /* second input file */
             *fileC;   /* output file to be created */
        int num1,      /* number to be read from first file */
            num2;      /* number to be read from second file */
        int f1, f2;    /* results of the most recent fscanf calls */

        /* Open files for processing */
        fileA = fopen("class1.txt", "r");
        fileB = fopen("class2.txt", "r");
        fileC = fopen("class.txt", "w");

        /* As long as there are numbers in both files, read and compare numbers one by one.
           Write the smaller number to the output file and read the next number
           in the file from which the smaller number is read.
           fscanf returns 1 when a number was read, so testing for 1 also stops on bad input. */
        f1 = fscanf(fileA, "%d", &num1);
        f2 = fscanf(fileB, "%d", &num2);
        while ((f1 == 1) && (f2 == 1)) {
            if (num1 < num2) {
                fprintf(fileC, "%d\n", num1);
                f1 = fscanf(fileA, "%d", &num1);
            }
            else if (num2 < num1) {
                fprintf(fileC, "%d\n", num2);
                f2 = fscanf(fileB, "%d", &num2);
            }
            else {   /* numbers are equal: read from both files */
                fprintf(fileC, "%d\n", num1);
                f1 = fscanf(fileA, "%d", &num1);
                f2 = fscanf(fileB, "%d", &num2);
            }
        }

        /* if reached end of second file, read the remaining numbers
           from first file and write to output file */
        while (f1 == 1) {
            fprintf(fileC, "%d\n", num1);
            f1 = fscanf(fileA, "%d", &num1);
        }

        /* if reached the end of first file, read the remaining numbers
           from second file and write to output file */
        while (f2 == 1) {
            fprintf(fileC, "%d\n", num2);
            f2 = fscanf(fileB, "%d", &num2);
        }

        /* close files */
        fclose(fileA);
        fclose(fileB);
        fclose(fileC);
        return 0;
    } /* end of main */

Files and Streams
C views each file as a sequence of bytes.
A file ends with the end-of-file marker.
A stream is created when a file is opened.
  Streams provide a communication channel between files and programs.
  Opening a file returns a pointer to a FILE structure.
  Example file pointers:
    stdin - standard input (keyboard)
    stdout - standard output (screen)
    stderr - standard error (screen)
FILE structure
  File descriptor: an index into an operating system array called the open file table.
  File Control Block (FCB): found in every array element; the system uses it to administer the file.

Read/write functions in the standard library
fgetc reads one character from a file.
  Takes a FILE pointer as an argument.
  fgetc( stdin ) is equivalent to getchar().
fputc writes one character to a file.
  Takes a FILE pointer and a character to write as arguments.
  fputc( 'a', stdout ) is equivalent to putchar( 'a' ).
fgets reads a line (string) from a file.
fputs writes a line (string) to a file.
fscanf / fprintf are the file processing equivalents of scanf and printf.

Creating a Sequential Access File
C imposes no file structure.
  There is no notion of records in a file.
  The programmer must provide the file structure.
Creating a file
    FILE *myPtr;
    myPtr = fopen("myFile.dat", openmode);
  This creates a FILE pointer called myPtr.
  The function fopen returns a FILE pointer to the file specified.
  It takes two arguments: the file to open and the file open mode.
  If the open fails, NULL is returned.
fprintf
  Used to print to a file.
  Like printf, except that the first argument is a FILE pointer (a pointer to the file you want to print to).
feof( FILE pointer )
  Returns true if the end-of-file indicator (no more data to process) is set for the specified file.
fclose( FILE pointer )
  Closes the specified file.
  Performed automatically when the program ends normally.
  It is good practice to close files explicitly.
Details
  Programs may process no files, one file, or many files.
  Each file must have a unique name and should have its own pointer.

Outline: 1. Initialize variables and FILE pointer; 1.1 Link the pointer to a file; 2. Input data; 2.1 Write to file (fprintf); 3. Close file
    /* Fig. 11.3: fig11_03.c
       Create a sequential file */
    #include <stdio.h>

    int main()
    {
        int account;
        char name[ 30 ];
        double balance;
        FILE *cfPtr;   /* cfPtr = clients.dat file pointer */

        if ( ( cfPtr = fopen( "clients.dat", "w" ) ) == NULL )
            printf( "File could not be opened\n" );
        else {
            printf( "Enter the account, name, and balance.\n" );
            printf( "Enter EOF to end input.\n" );
            printf( "? " );
            /* %29s keeps the name inside name[ 30 ] */
            scanf( "%d%29s%lf", &account, name, &balance );

            while ( !feof( stdin ) ) {
                fprintf( cfPtr, "%d %s %.2f\n",
                         account, name, balance );
                printf( "? " );
                scanf( "%d%29s%lf", &account, name, &balance );
            }

            fclose( cfPtr );
        }

        return 0;
    }

Program Output
    Enter the account, name, and balance.
    Enter EOF to end input.
    ? 100 Jones 24.98
    ? 200 Doe 345.67
    ? 300 White 0.00
    ? 400 Stone -42.16
    ? 500 Rich 224.62
    ?

Reading Data from a Sequential Access File
Reading a sequential access file
  Create a FILE pointer and link it to the file to read:
    myPtr = fopen( "myFile.dat", "r" );
  Use fscanf to read from the file. It is like scanf, except that the first argument is a FILE pointer:
    fscanf( myPtr, "%d%s%f", &myInt, myString, &myFloat );
  myString is a character array, so it is passed without the & operator.
  Data is read from beginning to end.
File position pointer
  Indicates the number of the next byte to be read or written.
  Not really a pointer, but an integer value (it specifies a byte location).
  Also called the byte offset.
rewind( myPtr )
  Repositions the file position pointer to the beginning of the file (byte 0).

Outline: 1. Initialize variables; 1.1 Link pointer to file; 2. Read data (fscanf); 2.1 Print; 3. Close file
    /* Fig. 11.7: fig11_07.c
       Reading and printing a sequential file */
    #include <stdio.h>

    int main()
    {
        int account;
        char name[ 30 ];
        double balance;
        FILE *cfPtr;   /* cfPtr = clients.dat file pointer */

        if ( ( cfPtr = fopen( "clients.dat", "r" ) ) == NULL )
            printf( "File could not be opened\n" );
        else {
            printf( "%-10s%-13s%s\n", "Account", "Name", "Balance" );
            fscanf( cfPtr, "%d%29s%lf", &account, name, &balance );

            while ( !feof( cfPtr ) ) {
                printf( "%-10d%-13s%7.2f\n", account, name, balance );
                fscanf( cfPtr, "%d%29s%lf", &account, name, &balance );
            }

            fclose( cfPtr );
        }

        return 0;
    }

Program Output
    Account   Name         Balance
    100       Jones          24.98
    200       Doe           345.67
    300       White           0.00
    400       Stone         -42.16
    500       Rich          224.62

Outline: 1. Initialize variables; 2. Open file; 2.1 Input choice; 2.2 Scan files; 3. Print; 3.1 Close file
    /* Fig. 11.8: fig11_08.c
       Credit inquiry program */
    #include <stdio.h>

    int main()
    {
        int request, account;
        double balance;
        char name[ 30 ];
        FILE *cfPtr;

        if ( ( cfPtr = fopen( "clients.dat", "r" ) ) == NULL )
            printf( "File could not be opened\n" );
        else {
            printf( "Enter request\n"
                    " 1 - List accounts with zero balances\n"
                    " 2 - List accounts with credit balances\n"
                    " 3 - List accounts with debit balances\n"
                    " 4 - End of run\n? " );
            scanf( "%d", &request );

            while ( request != 4 ) {
                fscanf( cfPtr, "%d%29s%lf", &account, name, &balance );

                switch ( request ) {
                    case 1:
                        printf( "\nAccounts with zero "
                                "balances:\n" );

                        while ( !feof( cfPtr ) ) {

                            if ( balance == 0 )
                                printf( "%-10d%-13s%7.2f\n",
                                        account, name, balance );

                            fscanf( cfPtr, "%d%29s%lf",
                                    &account, name, &balance );
                        }

                        break;
                    case 2:
                        printf( "\nAccounts with credit "
                                "balances:\n" );

                        while ( !feof( cfPtr ) ) {

                            if ( balance < 0 )
                                printf( "%-10d%-13s%7.2f\n",
                                        account, name, balance );

                            fscanf( cfPtr, "%d%29s%lf",
                                    &account, name, &balance );
                        }

                        break;
                    case 3:
                        printf( "\nAccounts with debit "
                                "balances:\n" );

                        while ( !feof( cfPtr ) ) {

                            if ( balance > 0 )
                                printf( "%-10d%-13s%7.2f\n",
                                        account, name, balance );

                            fscanf( cfPtr, "%d%29s%lf",
                                    &account, name, &balance );
                        }

                        break;
                }

                rewind( cfPtr );   /* back to byte 0 for the next request */
                printf( "\n? " );
                scanf( "%d", &request );
            }

            printf( "End of run.\n" );
            fclose( cfPtr );
        }

        return 0;
    }

Program Output
    Enter request
     1 - List accounts with zero balances
     2 - List accounts with credit balances
     3 - List accounts with debit balances
     4 - End of run
    ? 1

    Accounts with zero balances:
    300       White           0.00

    ? 2

    Accounts with credit balances:
    400       Stone         -42.16

    ? 3

    Accounts with debit balances:
    100       Jones          24.98
    200       Doe           345.67
    500       Rich          224.62

    ? 4
    End of run.

Reading Data from a Sequential Access File
Sequential access file
  Cannot be modified without the risk of destroying other data.
  Fields can vary in size.
    The representation in files and on screen differs from the internal representation.
    1, 34, -890 are all ints, but have different sizes on disk.
  Old data in file:
    300 White 0.00 400 Jones 32.87
  If we want to change White's name to Worthington, the new record
    300 Worthington 0.00
  is written over
    300 White 0.00 400 Jones 32.87
  and, because the new record is longer, data gets overwritten:
    300 Worthington 0.00ones 32.87

Read and Write for Binary Files
    size_t fread(void *buffer, size_t numbytes, size_t count, FILE *a_file);
    size_t fwrite(const void *buffer, size_t numbytes, size_t count, FILE *a_file);
buffer in fread is a pointer to a region of memory that will receive the data from the file.
buffer in fwrite is a pointer to the information that will be written to the file.
The second argument, numbytes, is the size of one element in bytes. size_t is an unsigned integer type.
For example, if you have an array of characters, you would read it in one-byte chunks, so numbytes is 1.
You can use the sizeof operator to get the size of the various data types; for example, given a variable int x; you can get the size of x with sizeof(x).
The third argument, count, is how many elements you want to read or write; for example, 100 if you pass a 100-element array.
The final argument is the file pointer.
fread returns the number of items read and fwrite returns the number of items written.
To check whether the end of the file was reached, use the feof function, which accepts a FILE pointer and returns true if the end of the file has been reached.
Sample Program
    /* a simple example of using fread and fwrite to read and write an array of structures */
    #include <stdio.h>

    int main()
    {
        FILE *fp;                       // file pointer
        struct prod {                   // declaring record
            int cat_num;
            float cost;
        };
        typedef struct prod product;    // type definition
        product a[3] = {{2, 20.1f}, {4, 40.1f}, {6, 60.1f}};   // array of records
        product k, *p = &k;
        int i;                          // loop counter

        // open the binary file in read/write mode; the backslash in the path must be escaped
        fp = fopen("c:\\fread1.dat", "w+b");
        if (fp == NULL) {
            printf("File could not be opened\n");
            return 1;
        }
        // write the entire array into the file pointed to by fp
        fwrite(a, sizeof(product), 3, fp);
        // prepare for reading from the beginning of the file
        rewind(fp);
        // read from the file one product at a time
        for (i = 0; i < 3; i++) {
            fread(p, sizeof(product), 1, fp);
            printf(" product %d, cat_num=%d, cost=%f\n", i, p->cat_num, p->cost);
        } // end of for loop
        fclose(fp);
        getchar();                      // wait for Enter (standard replacement for getch)
        return 0;
    } // end of main program

Revision: Lectures 1 to 4
Lecture 1: Course Outline, Programming and problem solving, Software development method, Program control structures, Algorithm, Pseudocode and Flow chart
Lecture 2: System Development Life cycle and its phases, Program Development Life cycle and its phases
Lecture 3: Generation of programming languages, Compiler, Interpreter, Procedural and modular programming, Structure of a C Program, Data and data structure, Abstract Data Type (ADT)
Lecture 4: Data types in C Language, Arrays, Declaration, operations, Array and functions, Pointers, Arrays and pointers

Revision: Lectures 5 to 8
Lecture 5: Concept of Pointer, Pointer operators, Address and Indirection, Pointer Arithmetic, Pointer and functions, Pass by Value, Pass by Reference
Lecture 6: Dynamic Memory Management with Pointers, Structures, Unions, Strings, Multidimensional Arrays
Lecture 7: Need for Data Structures, Selecting a data structure, Data structure philosophy, Classification, Common Operations, Arrays and Lists, List Operations
Lecture 8: Algorithm Analysis, Time and Space Complexity, Complexity of Algorithms, Measuring Efficiency, Big O Notation, Standard Analysis Techniques

Revision: Lectures 9 to 12
Lecture 9: Algorithms and Complexity, Criteria for Algorithm Analysis, Complexity Analysis, Various Complexity Functions, Properties of Big O Notation
Lecture 10: Data structure Operations, Array-based and Pointer based List ADT, Linked List, Linked List Operations
Lecture 11: Dynamic Representation, Allocation from Dynamic Storage, Returning unused storage back to dynamic storage, Linked List insert and delete Operations
Lecture 12: Cursor-based Implementation of List, Search Operation, Sequential Search, Concept, Algorithm and Implementation, Complexity of Sequential Search

Revision: Lectures 13 to 16
Lecture 13: Binary Search, Concept, Algorithm and implementation, Binary search complexity, Searching Unordered and Ordered Linked Lists
Lecture 14: Sorting, Concept, Terminology, Classification, Stability of Key, Bubble Sort, Concept, Algorithm and implementation
Lecture 15: Bubble Sort Complexity, Selection Sort, Concept, Algorithm, Implementation, Complexity, Insertion Sort, Concept, Algorithm, Implementation, Complexity
Lecture 16: Comparison of Bubble, Selection and Insertion Sort, Recursion, Concept, Example and Implementation

Revision: Lectures 17 to 20
Lecture 17: Recursive Search Algorithms, Recursion with Linked Lists, Recursion and Iteration, Analysis of Recursion
Lecture 18: Merge Sort, Concept, Algorithm and Implementation, Complexity of Merge Sort
Lecture 19: Quick Sort, Concept, Algorithm and Implementation, Complexity of Quick Sort
Lecture 20: Comparison of Merge and Quick Sort, Shell Sort, Radix sort, Bucket Sort, Sorting techniques comparison

Revision: Lectures 21 to 24
Lecture 21: Doubly Linked List, Operations, Algorithm and Implementation Code, Doubly Linked List with Two Pointers
Lecture 22: Queues, Operations, Algorithm and Implementation, Circular Queue and Deque operations
Lecture 23: Stacks, Operations, Algorithm and Implementation, Stack Applications
Lecture 24: Trees, Concept, Examples and Applications, Tree Terminology, Types of Trees, General Trees, Representation and Traversal, Binary Tree

Revision: Lectures 25 to 28
Lecture 25: Binary Tree Operations and Traversal, Binary Search tree and its Operations
Lecture 26: Complete Binary Tree, Heaps, Heap Operations, Applications of Heaps, Priority Queue, Heap Sort Concept, Algorithm and Implementation, Complexity, Comparison with Quick and Merge Sort
Lecture 27: Types of Binary trees, Expression Tree, Threaded Binary Tree, AVL Tree, Red-Black, Splay, Insertion and Deletion Operations, Time Complexity, B-trees
Lecture 28: Graphs, Terminology and Representation of Graphs, Operations on Graphs, Graph Traversals, Breadth First Search (BFS), Depth First Search (DFS)

Revision: Lectures 29 to 32
Lecture 29: Shortest Path Problem, Dijkstra's Algorithm, Bellman Ford Algorithm, Spanning Tree, Minimum Spanning Tree, Kruskal and Prim Algorithm
Lecture 30: Dictionaries, Table, Concept, Operations and Implementation, Hash Table, Hashing and Hash Function, Hash Tables Implementation, Applications
Lecture 31: Hash Function, Properties of a Good Hash Function, Hash Function Methods, File, Text and Binary Files, Operations on Files, File Access Methods, Sequential Files, Indexed Files, Hashed Files
Lecture 32: File operations implementation in C, File operations on Text and Sequential Binary Files, Revision

Summary
File Implementation in C Language
Basic File Operations
  Opening a file
  Reading data from a file
  Writing data to a file
  Closing a file
File operations on Text Files
File operations on Sequential Binary Files
Revision of the Course: Lecture 1 to Lecture 31