Introduction to Programming in C

Douglas C. Schmidt
University of California, Irvine

Based on material prepared by Robert C. Carden IV, Ph.D.
Unisys Corporation, Mission Viejo, California

Copyright (c) 1998 by Robert C. Carden IV, Ph.D. 2/12/2016

Required textbooks
  David R. Brooks, C Programming: The Essentials for Engineers and Scientists, Springer, New York.
  Al Kelley and Ira Pohl, A Book on C, fourth edition, Addison-Wesley Publishing Company, Reading, Massachusetts.
  Gamma, Helm, Johnson, and Vlissides, Design Patterns: Elements of Reusable Object-Oriented Software, Addison-Wesley Publishing Company, Reading, Massachusetts.

Recommended textbook
  Brian Kernighan and Dennis Ritchie, The C Programming Language, second edition, Prentice Hall, New Jersey.

Recommended books on the C language
  Kamal B. Rojiani, Programming in C with Numerical Methods for Engineers, Prentice Hall, New Jersey.
  A-Series C Manual (Unisys Corporation), reference number 3950 8775-000, New Jersey.
  ANSI Draft on the C Programming Language, X3.159-1989.

Recommended books covering advanced C language topics
  Narain Gehani, An Advanced C Introduction: ANSI C Edition, Computer Science Press, Rockville, Maryland.
  Andrew Koenig, C Traps and Pitfalls, Addison-Wesley Publishing Company, Reading, Massachusetts.
  Steven R. Lerman, Problem Solving and Computation for Scientists and Engineers: An Introduction Using C, Prentice Hall, New Jersey.
  W. Richard Stevens, Advanced Programming in the UNIX Environment, Addison-Wesley Publishing Company, Reading, Massachusetts.
  W. Richard Stevens, UNIX Network Programming, Addison-Wesley Publishing Company, Reading, Massachusetts.
  Robert Sedgewick, Algorithms in C, Addison-Wesley Publishing Company, Reading, Massachusetts.

Additional references
  Mark Allen Weiss, Data Structures and Algorithm Analysis, The Benjamin/Cummings Publishing Company, Redwood City, CA.
  Paul Wang, An Introduction to Berkeley Unix, Wadsworth Publishing Company, Belmont, CA.
  Peter A. Darnell and Philip E. Margolis, Software Engineering in C, Springer-Verlag, New York, NY.
  P.J. Plauger, The Standard C Library, Prentice Hall, Englewood Cliffs, NJ.

Class Objectives
  The objectives of this class are as follows:
    Teach you how to read C programs
    Teach you how to write C programs
    Teach you how to debug C programs
    Teach you how to design good C programs
  This is a class for developers, not managers
    Managers are certainly invited to attend, but they must plan to do real work
    If you are taking this class to see what C is all about without planning on really learning it and programming in it, you are in the wrong class!

How to Succeed in this Class
  Attend class
    "90 percent of life is just showing up." --Woody Allen
    Exams are heavily based on material presented in class…
    Never miss more than one week of classes in a row
    If you just have to take that two week vacation, plan it so that you only miss one week of classes -- really!
    Missing even one class will severely limit your ability to keep up, particularly with respect to the exams
  Read ahead
  Do the homework
    Students with a previous programming (e.g.
    Pascal or C++) background should plan on spending 5-10 hours per week
    Students with no programming background who are computer literate should plan on 10-20 hours per week
    Students with no programming background and no computer experience should plan on 20-30 hours per week... you will be spending a lot of time struggling with the tools as well as learning how to program
    The best plan of action is to do a little bit of the lab assignment work each day, particularly if you have a home computer
    The worst plan of action is to do everything the night before the homework is due
  Ask questions

How to Fail in this Class
  Miss two consecutive weeks of class sessions
  Try to survive without access to a C compiler
    You learn C by doing, i.e., you must write C programs, compile them, and run them
  Never spend any time on this outside of class
  Do not attempt any of the homework
  Habitually show up late for class or habitually leave early
  The bottom line is that you will get out of this class what you put into it!!!
    To succeed, you need to spend at least 5 to 10 hours per week on your own

Overview of a Computer - Hardware
  Reference: Brooks, Chapter 1; KP, Chapter 1
  The Central Processing Unit (CPU) controls the flow of instructions and data and performs the necessary manipulation of data
  Primary storage (memory) is used to store information for immediate access by the CPU
    Note that several levels of cache typically sit between the CPU and primary storage
  Secondary storage devices (e.g., the hard drive, CD-ROM drives, tapes, etc.)
  provide permanent storage of large amounts of data, but are much slower than primary storage
  Input and output devices provide interfaces between the computer and the user

  [Figure: block diagram showing Input Devices, Output Devices, Primary Storage, Secondary Storage, and the Central Processing Unit (CPU), which comprises the Control Unit and the Arithmetic Logic Unit]

Overview of a Computer - Data Representation
  The first computers used vacuum tubes to hold data
    Vacuum tubes have two states - ON and OFF
    An ON state represents a 1
    An OFF state represents a 0
  Eight vacuum tubes strung together can represent an 8 digit string of 0s and 1s
    Put another way, this string is an 8 digit binary (base 2) number
  We use decimal (base 10) numbers in our daily life
    A decimal number is a string of digits whose values are drawn from the set {0,1,2,3,4,5,6,7,8,9}
  In general, a number system is simply a way of representing numbers
    A number system has a base (the number of distinct digits used in the number system)
    Consider a number in a base b number system: $a_n a_{n-1} \cdots a_1 a_0$
    The value of this number is: $a_n b^n + a_{n-1} b^{n-1} + \cdots + a_1 b^1 + a_0 b^0$
  A binary number has a base of 2 where the valid digits are 0 and 1
    E.g., 1001 binary == 9 decimal (1*8 + 0*4 + 0*2 + 1)
  An octal number has a base of 8 where the valid digits are 0 through 7
    E.g., 031 octal == 25 decimal (3*8 + 1)
  A decimal number has a base of 10 where the valid digits are 0 through 9
    E.g., 2000 decimal == 2000 (Y2K bug ;-))
  A hexadecimal number has a base of 16 where the valid digits are 0 through F, i.e.
    {0,1,2,3,4,5,6,7,8,9,A,B,C,D,E,F}
    E.g., 0xABBA hex == 43962 decimal (10*4096 + 11*256 + 11*16 + 10)

Overview of a Computer - Data Representation (2)
  Powers of 2

    Power   Value        Short form
    2^0     1
    2^1     2
    2^2     4
    2^3     8
    2^4     16
    2^5     32
    2^6     64
    2^7     128
    2^8     256
    2^9     512
    2^10    1024         1K
    2^11    2048         2K
    2^12    4096         4K
    2^13    8192         8K
    2^14    16,384       16K
    2^15    32,768       32K
    2^16    65,536       64K
    2^17    131,072      128K
    2^18    262,144      256K
    2^19    524,288      512K
    2^20    1,048,576    1M
    2^21    2,097,152    2M

  1 KILO = 2^10 = 1024
  1 MEG = 2^20 = 1024*1024 = 1,048,576
  1 GIGA = 2^30 = 1024*1024*1024 = 1,073,741,824
  To evaluate a binary number, say 101101, simply add up the corresponding powers of 2:
    $101101_2 = 1 \cdot 2^5 + 0 \cdot 2^4 + 1 \cdot 2^3 + 1 \cdot 2^2 + 0 \cdot 2^1 + 1 \cdot 2^0 = 32 + 0 + 8 + 4 + 0 + 1 = 45$

Overview of a Computer - Hexadecimal Numbers
  A hexadecimal number is a string of hexadecimal digits
    Digits A, B, C, D, E, F represent the numbers 10, 11, 12, 13, 14, and 15
  Hexadecimal is popular in the computer field because it can be used to concisely represent a long string of binary digits
    Consider a 16 digit binary number: 1011000111000101
    Break this up into groups of 4: 1011 0001 1100 0101
    Convert each group of 4 into decimal: 11 1 12 5
    Then convert each decimal number into hex: B1C5
    And you now have the number: Baker 1 Charlie 5

    Able    = A = 10 = 1010
    Baker   = B = 11 = 1011
    Charlie = C = 12 = 1100
    Dog     = D = 13 = 1101
    Easy    = E = 14 = 1110
    Fox     = F = 15 = 1111

  By the same token, the HEX number 3F2C1596 represents the binary string:
    0011 1111 0010 1100 0001 0101 1001 0110

Overview of a Computer - Primary Storage
  Primary storage, also known as main memory or RAM (random access memory), is used to store information for immediate access by the Central Processing Unit (CPU)
  Memory can be viewed as a series of memory cells with each cell having its own individual address
    E.g., think of a bank of mailboxes at a post office
  The information contained in a memory cell is called the contents of that cell
  Memory
  cells can be used to store data, such as characters or numbers
    Internally, of course, they are all numbers, but we can choose to interpret some numbers as characters
  They can also be used to store program instructions
    These are “special” numbers that are meaningful to a CPU!
  The smallest unit of computer storage is a bit, as in binary digit
  Most computers group bits together to form larger entities
    E.g., 8 consecutive bits often form a byte and 32 consecutive bits often form a word
    A word on an Intel 286 computer is 16 bits or 2 bytes
    A word on Intel 386, 486, and Pentium computers is 32 bits or 4 bytes
    The next generation of Intel computers (e.g. the Merced) will use 64 bit words, i.e. 8 bytes
    The DEC Alpha computer is currently using 64 bit words
    Many mainframe computers, such as the Unisys A-Series, use 48 bit words
  Memory cells in a computer are typically one byte or one word in size
    A word is a unit of information that can be transferred to and from memory
  A kilobyte of memory, as in 1K, is 1024 bytes of memory
  A megabyte of memory, as in 1M, is 1024K of memory, or 2^20 bytes
  A gigabyte of memory, as in 1G, is 1024M of memory, or 2^30 bytes
  Note that a 32 bit word can represent 2^32 different possible values

Programming Languages - Low Level Languages
  Computers only do what they are told to do
    Except in Hollywood movies, e.g., 2001, Terminator 2, the Matrix, etc. ;-)
  In order for a computer to perform a task, it must be given a series of specific instructions in a language it can understand
  The fundamental language of any computer is its machine language
    This is typically sequences of zeroes and ones
    In the very early days of computers, this was the only way one could write programs!!!
  To relieve the suffering of these early programmers, a higher level language called assembly language was developed
    Assembly language contains mnemonic words and symbols for the binary machine instructions
    An assembler maps assembly language instructions into machine language instructions
  Assembly language programming is indeed a significant improvement over machine language programming
  However, it has the following drawbacks:
    Machine dependent - each computer architecture has its own unique assembly language
    Low level instructions - writing programs is very time consuming, tedious, and error-prone

Programming Languages - High Level Languages
  A general trend in computing over the past 4 decades is to elevate programming from low level to higher level languages
    I.e., high level languages are geared more toward the people writing the programs than toward the computer
    Assembly language instructions map directly to machine instructions
    High level language instructions must be translated/compiled into machine instructions
  High level languages are more “problem-oriented” than assembly/machine languages
    E.g., they require little or no knowledge of the underlying computer architecture
  Learning how to write/debug programs in high level languages is much easier and less error-prone than learning how to write/debug equivalent programs in assembler
    E.g., high level languages require fewer statements to do the same thing as assembler
  Programs written in high level languages can be ported much more easily to different computer architectures
    E.g., the compiler encapsulates the machine-dependent details of the target assembly language
  A special program called a compiler is needed to translate a program written in a high level language into assembly code (which is then transformed into native machine code by an assembler)
    The statements written in the high level language are called source code
    The compiler/assembler's output is called object code

  [Figure: Source Code -> Compiler -> Assembler -> Object Code]

Language Taxonomy
  [Figure: language family diagram showing FORTRAN, COBOL, ALGOL, ASSEMBLER, PL/1, FORTRAN IV, B, BCPL, BURROUGHS EXTENDED ALGOL, FORTRAN77, SMALLTALK, APL, ALGOL68, COBOL85, PROLOG, LISP, C, PASCAL, ANSI C, C++, MODULA-2, E, ADA]

History of C
  1970's: C
  Early 1980's: Traditional C
    Add void type, enumeration, struct as parameter, other improvements
  Late 1980's: ANSI C (or Standard C)
    Add void *, function prototypes, new function definition syntax, minimum standard library, more functionality to preprocessor
    Add const and volatile
  American National Standards Institute X3J11 Committee

Significant Points About C
  Small number of keywords
  Widely available, particularly on personal computers and workstations
  Can be used very portably
    Standard library
    Preprocessor may be used to isolate machine dependent code
    Unlike Pascal, which has many dialects
  Native language of UNIX (tm) and Windows NT
  Terse
    Powerful set of operators
    Statements can be very powerful
    Some are bit level operators
  Designed to be implemented efficiently on many machines
  Modular -- functions
    Parameters are typically passed 'by value'
    No nested functions
  Syntax is complicated
  Semantics of certain features are complex and error-prone

A Comparison of Programming Language Philosophy
  Pascal
    Strict parent (a “bondage and discipline” language ;-))
    Restricts programmer for his/her own good
    A white, automatic transmission automobile with lots of safety features (e.g., air bags, controls that limit speed to 55 miles per hour and prohibit leaving the lights on or locking the keys in the car)
  C
    A permissive, easy going parent (a ‘laissez-faire’ language ;-))
    Assumes that the programmer knows what he/she is doing and will assume responsibility for his/her actions.
    (Some describe it as a “gun with which you can shoot yourself in the foot.”)
    A bright red ’65 Corvette with a big block engine, manual transmission, optional seat belt, and fuzzy dice hanging from the rear view mirror.
  C++
    A less permissive, yet open minded parent (e.g., ‘Thomas Huxtable’ ;-))
    Assumes that the programmer generally knows what he/she is doing, but provides more checking by default.
    (“With C++ it’s harder to shoot yourself in the foot, but when you do, you’ll blow off both of your legs” -- Bjarne Stroustrup)
    A bright red 2000 Corvette with a 6 speed manual transmission, air bag, heads up display, and many on board computers

Example C Program

    /* File: hello.c
       Description: Prints a greeting to stdout.
       Author: Douglas C. Schmidt <schmidt@uci.edu>
    */
    #include <stdio.h>

    int main (void) /* execution starts in main */
    {
      printf ("Hello world.\n");
      return 0;
    }

  Every C program must have a function called main
    Execution starts in function main
  C is case sensitive!
  Comments start with /* and end with */.
    Comments may span many lines.
  C is a “free format” language.
  The #include <stdio.h> directive instructs the C preprocessor to insert the entire contents of file stdio.h in its place; the compiler then compiles the resulting file.

Compiling a C Program
  On a personal computer using Visual Studio, you will create a .c file, e.g. hello.c (as in the previous example).
  You will then compile it to produce a .exe file (e.g., hello.exe) and perhaps a .obj file (hello.obj).
  Consult your compiler documentation (each one is different).

  In Visual Studio, do a File|New and you will see the following dialog...

  Here, you must first select Win32 Console Application.
  Then press the ... button that appears to the right of the Location field.

  Now choose the location.
  You may need to create a folder on your file system.
  In this case, I brought up the Windows NT explorer and created the new folder Ece11s98.
  Whatever you do, place the project somewhere you can easily find it later.

  Now that you have specified a Location, i.e. a folder where you want your project created, you should see the following...

  Now, specify the name of your project.
  Visual Studio will create a folder by that name underneath the one you specified for Location.
  This will also become the name of your executable...
  Finally, click on the OK button to record your selection.
  It is very important that you do each of these initial steps in the exact order that I have described above.
  That is, first specify that you want a Console Application, then specify the location, then specify the project name.
  Do not forget any of these steps. In other words, pay attention!!!

  As a result, you will see the following in Visual Studio
  Notice that the Workspace window now has three tabs: a ClassView tab, a FileView tab, and an InfoView tab.
  Select (point the mouse to) the FileView tab.

  After selecting the FileView tab, you should see the following.
  Notice what appears in the Workspace window...

  Now we need to create a source file and include it in the project.
  If we do the correct sequence of steps, Visual Studio will automatically include the new file in the project.
  Do a File|New and select Text File and then specify a File name.
    Do not accept the default choice, Active Server Page (or whatever happens to be the default).
    Do not select C/C++ Header File.
    Do not select C++ Source File.
    Select Text File.
  Do not forget to type in the file name, i.e. hello.c (or whatever you wish to call it), but it must end with .c !!!
  The file suffix tells Visual Studio what type of file this is so it knows which compiler to invoke on it.
  Finally, click OK to select your choice.

  As a result of your efforts, you should see the following.

  I usually move the windows around at this point, adjusting the sizes.
  Type in the program.

  If you click on the + by Hello files under Workspace, you will see a list of all of the files in your project
  Build your program by selecting the Build|Build menu option

  Run it (Build | Execute)
  Now let's look at the files:
    Hello.dsw: this is your WORKSPACE
    Hello.dsp: this is the project build file
  These two files, along with hello.c, are worth saving.

  The program we just built is HELLO.EXE under the Debug directory
  Because it is a CONSOLE application, we can run it directly from our Windows NT command (or Windows 95 DOS) prompt.
  When I typed hello below at the DOS prompt, Windows ran the program hello.exe.
  Below you can see the results of this run.

Compiling Under UNIX
  To compile a C program under UNIX, one may do it in the following ways:
    % cc hello.c
      This compiles the file and creates an executable named a.out
    % cc hello.c -o hello
      This compiles the file and names the executable hello
    % cc -c hello.c
      This compiles the file but does not link it, thus producing an object module which may be linked later on. That file, by default, is hello.o
    % cc hello.o -o hello
      This links the object file hello.o to create the executable hello
  Under UNIX, suffixes are important (i.e., .c versus .o). They tell the compiler what type of file it is given, i.e. a C program file versus an object module.
    % cc foo.o bar.o -o fubar
      This links two separately compiled modules into an executable named fubar.

The compiling process -- overview
  C Program -- foo.c
    % cc -c foo.c
  cpp -- C preprocessor
    Handles #-directives; removes comments
    Output: foo.E
  ccom -- C compiler
    Compiles program
    C optimizer (optional)
    Output: foo.s
  as -- assembler
    Output: foo.o

Compiling process -- translation phases (ANSI)
  1. Physical source file characters are mapped to the source character set (including new-line characters and end-of-file indicators) if necessary. Trigraph sequences are replaced by corresponding single-character internal representations.
  2. Each instance of a new-line character and an immediately preceding backslash character (\) is deleted, splicing physical source lines to form logical source lines.
  3. The source file is decomposed into preprocessing tokens and sequences of whitespace characters (including comments). A source file shall not end in a partial preprocessing token or comment. Each comment is replaced by one space character. New-line characters are retained.
  4. Preprocessing directives are executed and macro invocations are expanded. A #include preprocessing directive causes the named header or source file to be processed from phase 1 through phase 4 recursively.
  5. Each source character set member and escape sequence in character constants and string literals is converted to a member of the execution character set.
  6. Adjacent character string literal tokens are concatenated and adjacent wide string literal tokens are concatenated.
  7. White-space characters separating tokens are no longer significant. Each preprocessing token is converted into a token. The resulting tokens are syntactically and semantically analyzed and translated.
  8. All external object and function references are resolved. Library components are linked to satisfy external references to functions and objects not defined in the current translation.
     All such translation output is collected into a program image which contains information needed for execution in its execution environment.