Second Year C and R Programming Workshop

Prof. R. Willingale
October 12, 2015

Department of Physics and Astronomy
University of Leicester
University Road
Leicester LE1 7RH

Telephone: +44-116-252-3556
Internet: http://www.star.le.ac.uk/zrw
Email: zrw@le.ac.uk

Document: CR1  Issue: 1.0  Date: October 12, 2015

Contents

1 Introduction
  1.1 Getting started with SPECTRE
  1.2 Logging on to SPECTRE remotely
2 Linux shell commands
  2.1 NEDIT and EMACS text editors
  2.2 Editing text and the command-line - cut and paste
  2.3 Adding module load R to your bash profile
  2.4 Summary of a few Linux shell commands
  2.5 Exercise 1 - Creating a C source file
3 C programming
  3.1 Compiling and running a C program
  3.2 Exercise 2 - Your first C program
  3.3 Anatomy of the main C program
  3.4 Variable names and numeric data types
  3.5 Printing output and formatting numbers
  3.6 Arithmetic expressions and mathematical functions
  3.7 Symbolic constants and the pre-processor
  3.8 Exercise 3 - Variable types, functions and precision
  3.9 Repeating instructions
  3.10 Logical expressions and relational operators
  3.11 Exercise 4 - A program to calculate a factorial
  3.12 The for-loop
  3.13 Compound statements
  3.14 Exercise 5 - More repetition
  3.15 Entering numbers into a program
  3.16 Exercise 6 - Requesting values from the terminal
  3.17 Subscripted variables (arrays)
  3.18 Exercise 7 - Using arrays
  3.19 Return values from functions
  3.20 Input/output using files
  3.21 Exercise 8 - Reading and writing text files
  3.22 Conditional execution (the if-statement) and error checking
  3.23 Indefinite loops
  3.24 Character strings
  3.25 Structured data types
  3.26 Reading in a table from a file
  3.27 Exercise 9 - Reading a data table and performing linear regression
  3.28 User-defined functions
    3.28.1 Defining new functions
    3.28.2 Function prototypes
  3.29 Prototypes of standard library functions
  3.30 Exercise 10 - Tabulation of a user-defined function
  3.31 Pointers
    3.31.1 Assigning locations to pointer variables
    3.31.2 De-referencing pointers
    3.31.3 Using pointers
  3.32 Exercise 11 - Using pointers to return values from a function
  3.33 Passing arrays to functions
  3.34 Exercise 12 - Testing the final sawtooth function
4 Programming with C and R together
  4.1 Exercise 13 - Using R to repeat Exercise 9
  4.2 Exercise 14 - Plotting the sawtooth function
  4.3 Calling a C function from R
  4.4 Exercise 15 - calling the sawtooth C function from R
  4.5 Using C functions in R
  4.6 Handling 2-D arrays in C and R
  4.7 Exercise 16 - Investigating thermal spectra
5 Programming unaided
  5.1 Exercise 17 - The gravitational potential of the Earth-Moon system
6 Quick C Reference
  6.1 Variables, types and declarations
  6.2 Constants
  6.3 Reserved identifiers
  6.4 Input and output
  6.5 Operators
  6.6 Compiler directives
  6.7 Conditional statement blocks
  6.8 Definite loops
  6.9 Functions and header files
  6.10 List of mathematical functions in C

1 Introduction

This 2nd year Workshop is a direct follow-up to the introduction to computing using R given in the 1st year Workshop. In the 1st year workshop you used the UoL IT Windows machines to run the R environment. In this workshop you will be using Linux running on the UoL IT SPECTRE system. The workshop is divided into three sections. The first section shows you how to use SPECTRE and the Linux operating system. The second section is an introduction to C programming. The third section is about using C and R in combination: transferring data between programs written in C and the R environment, and writing C functions which can be called from within R. The total contact time is 12 hours.

This script is a PDF file and can be found on Blackboard under course PA2900 or at

    http://www.star.le.ac.uk/zrw/compshop/C_R_workshop_2nd.pdf

We suggest you download the file to your Desktop for ease of use.

1.1 Getting started with SPECTRE

You can access SPECTRE using an X-Terminal client called NX running on the UoL IT Windows system. So first you must log on to a UoL IT Windows machine. If you have not yet used NX on your Windows account then you must install it.

    Programs-->Program Installer-->NX Client (click)

Once installed you can start NX for a SPECTRE session.

    Programs-->NX Client (click)
    SPECTRE (click)

In order to log on to SPECTRE you must supply your UoL username and password.
NX will open a window for SPECTRE. In order to do the workshop you will need to start a terminal window for SPECTRE.

    Applications-->System Tools-->Terminal (click)

The terminal window will open within the NX window and the prompt within the terminal window will look something like:

    [zrw@spectre02 ~]$

This indicates the username (zrw in my case), the machine (spectre02 for this particular login) and the directory (~ is the home directory of the user). This terminal window serves the same function as the R Console. You type commands against the prompt and the system will execute these commands. Under Linux (or more generally UNIX) the prompt is produced by what is called a “shell”. This is a program which runs within the terminal and acts as an interface or wrapper around the system giving the user access to all aspects of the system.

One application you are already familiar with is R. Before you can use R on SPECTRE you must load the module using the following command.

    [zrw@spectre03 under]$ module load R

You must do this at the start of each SPECTRE session unless you put this command line into your .bash_profile file. We will do this later. Once the R module is loaded you can start R by simply typing R followed by the return key in response to the prompt in the terminal window (note the command is upper-case; on SPECTRE a lower-case r will not work).

    [zrw@spectre03 under]$ R

    R version 2.15.1 (2012-06-22) -- "Roasted Marshmallows"
    Copyright (C) 2012 The R Foundation for Statistical Computing
    ISBN 3-900051-07-0
    Platform: x86_64-unknown-linux-gnu (64-bit)

    R is free software and comes with ABSOLUTELY NO WARRANTY.
    You are welcome to redistribute it under certain conditions.
    Type 'license()' or 'licence()' for distribution details.

      Natural language support but running in an English locale

    R is a collaborative project with many contributors.
    Type 'contributors()' for more information and
    'citation()' on how to cite R or R packages in publications.

    Type 'demo()' for some demos, 'help()' for on-line help, or
    'help.start()' for an HTML browser interface to help.
    Type 'q()' to quit R.

    >

The above should be familiar. R is using the terminal window as a Console. You could now go ahead and redo the 1st year workshop using SPECTRE. You might like to try a few of the R commands you learnt in the 1st year to try it out.

1.2 Logging on to SPECTRE remotely

You can log on and use SPECTRE from your own computer/laptop on or off Campus providing you have an internet connection. Details about remotely connecting to SPECTRE can be found at:

    http://www2.le.ac.uk/offices/itservices/ithelp/services/hpc/spectre/access/nx

If you are using a Windows machine then you can download NX (free) from the following link.

    http://www.star.le.ac.uk/zrw/compshop/nx/

Download all the files onto your Desktop. If you execute the *.exe files they will install NX and the required fonts. The files SPECTRE.nxs and SPECTREFullScreen.nxs can be used to start up NX for login to SPECTRE. If you start up NX without using the *.nxs configuration files you will need to set the configuration by hand:

    host:        spectre.le.ac.uk
    port:        22
    Desktop:     Unix KDE (or your choice)
    Connection:  ADSL or LAN etc. depending on your internet connection

If you are using a Mac then you should use the native OS X services. To get full functionality, including plotting, you need to install X-Windows software.
This is available with the Mac application Xcode which you can download free.

    https://developer.apple.com/xcode/

The most recent versions of Mac OS X and Xcode do not include the X11 application. You can download this free from:

    http://xquartz.macosforge.org/landing/

Xcode will also give you a local C compiler (gcc) so that you will be able to do the workshop on your local machine if you so desire. When you have Xcode and X11, start up a Terminal window

    Applications-->Utilities-->Terminal

This Terminal window will be running the bash shell (see details about shell commands below). To connect to SPECTRE use the following command, substituting your own UoL username. Note the dollar sign is the shell prompt which will appear in the terminal window.

    $ ssh -X username@spectre.le.ac.uk

The ssh command sets up a secure shell connection to the target machine which in this case is SPECTRE. The -X switch enables the X-protocol so that you can send plotting commands and similar from SPECTRE to your local Mac.

2 Linux shell commands

The NX window which opens when you log on to SPECTRE provides a Windows-like interface to the Linux operating system. This includes pull-down menus for starting applications and so forth. However, the Terminal application which was described in the introduction above provides a more basic interface using shell commands. Anything you can do using the pull-down menus and clicking with the mouse can also be done using shell commands and there are many things which are a lot easier to do this way. In the following text we will use a $ at the start of a line to represent the shell prompt (similar to the > used by R in the R console).

An important concept when using Linux is the current “working directory”.
This is a directory file (or folder in Windows speak) which will be used by applications as the default for writing new files or reading old files. It is very similar to the same concept used in the R environment. In fact, when you start R from a Terminal window the R working directory will be the same as the Linux working directory. You can find out what the current working directory is using the command:

    $ pwd
    /home/z/zrw

In the above example the command pwd has listed my home directory on SPECTRE. This is performing the same function as the R command getwd() you are familiar with from the 1st year workshop. You can list the files in the current working directory using the command ls so

    $ ls
    bin chaos...

You should have a file called bin in your home directory but you won’t have a file called chaos! You can create a new directory within the current working directory using the following

    $ mkdir Cwork

This will create the directory file Cwork. You can then change the current working directory using

    $ cd Cwork

Here cd assumes that Cwork is within the current working directory. You can override this by giving a complete path specification.

    $ cd /home/z/zrw/Cwork

If you want to move back one step up the directory tree then use

    $ cd ..

If you want to delete a file then use the rm command.

    $ rm filename

This will not work if the file is a directory. To delete a directory use the command rmdir (the directory must be empty).

    $ rmdir dirfilename

Use the commands rm and rmdir carefully. Once you have deleted a file it can’t be recovered. You can copy a file to create another file using the cp command.

    $ cp oldfilename newfilename

After this there will be two files, oldfilename and newfilename. If you want to change the name of an existing file you use the mv command.

    $ mv oldfilename newfilename

After this operation oldfilename will have disappeared and will have been replaced by newfilename.

2.1 NEDIT and EMACS text editors

You can create or edit a text file using a text editor. You are already familiar with using such an editor within the R environment. On Linux systems there is a choice of editors. All UNIX and Linux systems offer the vi editor (the name stands for visual for some reason) but most users find vi strange and awkward to use (I don’t, I think it’s wonderful and I’m using it to type this document). nedit and emacs are more conventional. To invoke the editor use

    $ nedit filename &

or

    $ emacs filename &

Both of these commands open up an editor window. The & at the end of the command tells the shell to run the editor as a “background job”. This means that you will be able to switch focus between the editor window and the original terminal window keeping both windows active and accessible. If filename exists in the current working directory then the contents of the file (or at least some lines at the start) will be displayed within the window. If filename doesn’t exist the window will be blank and the editor will be creating a new file. When you have finished entering or editing text in the editor window you can save the file using the File menu at the top left of the editor screen. You don’t need to close the editor window; in fact it is best not to if you intend to make further edits to the file.

A simple way to list a file at the terminal is to use the more command.

    $ more filename

This prints out the file one screen at a time. Press the return key to scroll on a line at a time, or the space bar to move on a full screen.

2.2 Editing text and the command-line - cut and paste

When using NX in Windows you can cut and paste text from one place to another, for example from the Acrobat Reader window containing this document to the nedit window used to edit a text file. Select the text with the mouse (hold down the left button and sweep across text). Use ctrl-c to copy the highlighted text to the clipboard. Move to the nedit window and use ctrl-v to paste the text from the clipboard into the file. You will be able to lift the source code of programs listed in this document into source files on SPECTRE using this method to save typing time. If you are logged on to SPECTRE from another Linux machine running X-Windows or a Mac this cutting and pasting between windows is still possible but the method is a little different.

When using the Terminal window you can recall and edit previous lines you have typed using control key strokes in the same way that you can in the R Console window.

    ctrl-p    go to previous line
    ctrl-n    go to next line
    ctrl-a    go to start of current line
    ctrl-e    go to end of current line
    ctrl-b    move 1 character back along current line
    ctrl-f    move 1 character forward along current line

2.3 Adding module load R to your bash profile

Typing module load R at the start of each SPECTRE session is tedious. You can avoid this by putting this line into your bash (shell) profile file. Use emacs or nedit to add this line to your .bash_profile file. This file exists in your home directory so make sure you are in your home directory (using the cd command without specifying any file) before trying to edit the file.

    $ cd
    $ nedit .bash_profile &

Add the line module load R at the end of the file and save. Next time you start a terminal screen the command R will be available without the need to load the module.

2.4 Summary of a few Linux shell commands

To summarize here is a list of the Linux shell commands we have introduced so far.

    pwd                   print the current working directory
    ls                    list files in the current working directory
    mkdir                 create a new directory in current working directory
    cd dirfilename        change directory to file in current working directory
    cd /full/path/file    change directory using a complete path specification
    cd ..                 move one step up directory tree
    nedit filename &      start editing a file
    more filename         list a file at the terminal
    rm filename           delete file from working directory
    rmdir dirfilename     delete directory file
    cp oldfile newfile    create a copy of a file
    mv oldfile newfile    change the name of a file

More details about these commands and other aspects of the UNIX/Linux operating system can be found at the link UNIXhelp for Users in

    http://www.star.le.ac.uk/zrw/compshop

2.5 Exercise 1 - Creating a C source file

(i) Create a directory Cwork in your home directory on SPECTRE.

(ii) Change the working directory to Cwork.

(iii) Use nedit to create a C source file called prog1.c containing the following short C program. Note that the file name extension .c indicates a C source file.

    /* prog1 - a simple first program in C */
    #include <stdio.h>
    #include <stdlib.h>
    #include <math.h>

    int main()
    {
       /* Declare variables */
       float a, b, sum;
       /* Assign values to variables */
       a = 10.0;
       b = 2.0;
       /* Calculate the sum, print it out */
       sum = a + b;
       printf("The sum is %f\n", sum);
    }

(iv) Save the file and move back to the terminal window. Use the more command to check that the file prog1.c contains the text you typed in.

3 C programming

3.1 Compiling and running a C program

Before a C program can be run it must be translated into the simple low-level instructions which are understood by the Central Processing Unit (CPU). This translation step is called compilation, and is achieved by a program called a compiler. In addition, the code must be linked to all the system routines which are required to run the program. The shell commands to compile, link and run the program you have just typed in are

    $ cc prog1.c -lm -o prog1
    $ ./prog1

The cc command will create a new file that contains the translated version of the program, expressed as instructions which can be understood by the CPU. We call such a file an executable, as it is capable of being executed by the CPU (which your original source code was not). cc is the shell command which runs the C compiler. The -o is called a command switch (or sometimes a command qualifier) and tells the compiler what filename to give the executable file (prog1 in this case). The -lm switch allows us to use mathematical functions in our programs. To run this newly created program, you just use a command which is the filename of the executable (adding ./ before the file name tells the computer to look in the current directory for the file to run).

Developing a computer program in C (or any other compiled language) involves the cycle edit-save-compile-link-run-edit-save-compile-link-run... very similar to the process you used in the 1st year to develop R scripts but including the additional compile-link stage which is performed by the C compiler.

3.2 Exercise 2 - Your first C program

(i) Compile and run prog1. If it fails you will have to edit the source file, save the new version of the file, re-compile the program and run it again.

(ii) Copy the file prog1.c to prog2.c and use the editor to change the value of the variables being added together in prog2.c. Compile and run the program prog2 to check it gets the correct answer.

3.3 Anatomy of the main C program

The prog1 program introduces several very important concepts in computer programming. Ignore the three #include lines near the top for now. They are necessary but are easier to understand when you know a little more about C and will be described in Section 3.7. The line

    int main()

declares that this is a main C program (function). The function main() is special and acts as an entry point or starting point when the program is run. The int specifies that on completion the program will return an integer value that can be used as a flag to indicate whether the program ran correctly or not (see Section 3.19). You can ignore this for now and just concentrate on the lines between the curly brackets ({ and }).

The program is basically a short list of instructions (called statements in C). In the C language each statement is terminated by a semi-colon (;). Be very careful to check that you place semi-colons where they are needed; missing one out will generate a lot of error messages.

The program also contains comments, which are enclosed (delimited) by /* and */. The comments are ignored by the compiler, so you can put whatever you like in the comments (except other comments, i.e. comments cannot be nested). The point of a comment is to aid readability, to explain to another human reading the program what the program is doing (or supposed to do) at a given point. Learning to make liberal use of comments is an important part of learning to program a computer. A program is as good as useless if no-one can tell what it does!

The first thing the program does is to declare three “variables”.
A variable is a named entity that holds a value. The value of a variable can be updated as a program runs (hence the name). Every variable must have a type. The C language supports both numeric and non-numeric types; we will leave non-numeric types until Section 3.24. Computers use two different types of numbers: integers and real numbers (commonly called floating-point numbers). The line

    float a, b, sum;

declares three variables named a, b and sum, and states that they contain floating-point (real) values. If the word float was replaced by int then the variables would be declared as integers.

The next action the program takes is to assign values to variables. This is called defining the variable value. Before these lines all three variables will contain an undefined value. It is a common mistake to declare a variable but forget to give it an initial value. Most modern compilers will warn you if you declare a variable and never use it, or use it before you assign it a value (though it is very bad practice to rely on this).

After the variables a and b are assigned values they can be used in an arithmetic expression. The assignment bears a passing resemblance to an equation in algebra (though it is not an algebraic equation):

    sum = a + b;

What this line does is add the values held in variables a and b, and assign this value to the variable named sum. (A CPU would execute this using simple instructions like “get this number from there”, “add this number to that number” and “put that number there”.)

The last thing the program does is print out the result. After all, a program which keeps its answers to itself is pretty useless! More details about printf() are given in Section 3.5.

3.4 Variable names and numeric data types

Variables in C must be given a name.
The variable name can be made up from upper-case or lower-case letters (A-Z or a-z), numeric characters (0-9) or an underscore (_). There is one important restriction though: a variable name cannot start with a numeric character. Variable names are case-sensitive in C, in other words value and Value are different variables. It is a bad idea to define variables with names that are distinguished only by their case, as this makes code difficult to interpret, and can lead to mistakes that are very hard to track down.

Most implementations of C allow variables to be given long names (at least 32 characters). You should choose names for your variables that are reasonably descriptive of what the variable represents (to help other programmers read your code, for example nvalues) without being unnecessarily long-winded or difficult to type (eg. number_of_values_in_the_file_that_I_am_reading_on_Tuesday).

Numeric variables in computer programs are not capable of representing every possible real or integer number; they have limited precision and can only represent a limited range of numbers. There is a trade-off between precision/range and the amount of storage space required to represent the number held within the variable. For example the int type typically has a range -2147483648 to 2147483647 and occupies 4 bytes (each byte is 8 bits).

The float data type is often referred to as single-precision, and the double type as double-precision. Historically single-precision mathematical operations were faster than double-precision, however with modern CPUs this is generally no longer the case. So the important differences between single and double precision are the amount of memory or disc space required to store them and the numerical precision they provide.
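
The range and precision limits described above can be checked on whatever machine you are using. The following is a minimal sketch, not part of the workshop programs: the function names third_error and print_numeric_limits are made up for this illustration, and the header files limits.h and float.h (which supply INT_MIN, INT_MAX, FLT_DIG and DBL_DIG) are standard C headers that are not otherwise used in this document. The code is written as functions (see Section 3.28), so it needs to be called from a main() to run.

```c
/* Sketch only: function names are illustrative, not from the workshop */
#include <stdio.h>
#include <limits.h>   /* INT_MIN and INT_MAX */
#include <float.h>    /* FLT_DIG and DBL_DIG */

/* Difference between 1/3 held as a float and 1/3 held as a double */
double third_error(void)
{
   float  third_f = 1.0f / 3.0f;   /* single precision */
   double third_d = 1.0 / 3.0;     /* double precision */
   return (double) third_f - third_d;
}

/* Prints the storage size, range and precision of the numeric types */
void print_numeric_limits(void)
{
   printf("int:    %d bytes, range %d to %d\n",
          (int) sizeof(int), INT_MIN, INT_MAX);
   printf("float:  %d bytes, about %d decimal digits\n",
          (int) sizeof(float), FLT_DIG);
   printf("double: %d bytes, about %d decimal digits\n",
          (int) sizeof(double), DBL_DIG);
   printf("float 1/3 minus double 1/3 = %.20f\n", third_error());
}
```

To try it, add a main() that calls print_numeric_limits() and compile as described in Section 3.1. The exact numbers printed depend on the machine and compiler, but the float value of 1/3 will typically agree with the double value to only about seven significant digits.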

3.5 Printing output and formatting numbers

The printf statement is used to print output to the screen. printf() is an example of a function in C. A function is a separate piece of code that you can call from your program to perform a specific task. When calling a function you can supply arguments, which are variables or values which the function will need to perform the task you want it to. These arguments follow the function name and are enclosed in brackets. Functions are a very important part of programming in C; a large number of powerful functions are provided for you to use, and you can write your own functions. We will come back to functions in more detail later on.

Fortunately you do not need to know how the printf() function works internally; you only need to know what arguments to give it to print the output you want. Your program does not need to concern itself with the mechanics of formatting numbers and printing them out; this has all been done for you.

The printf function accepts one or more arguments. The first argument is a format specifier, which is basically a template for what you want printed out. Take the example in the prog1 program:

    printf("The sum is %f\n", sum);

In this case the format specifier is the string "The sum is %f\n". What the printf() function does is to work through the format specifier string and replace the format codes (which look like %f) with the formatted representation of the values of the subsequent arguments in order. The format code %f is used to format real numbers, the format code %d is used for integer values. Be very careful to ensure that the format codes are correct for the data types of the arguments, otherwise you will quite probably get garbage!
The full width and precision can be specified by including them directly after the % character, in the form %n.mf:

    %10d       format integer with full width 10 characters
    %15.10f    format real with full width 15, 10 digits after decimal point

A complete list of format specifiers is given in Section 6.4.

The \n character sequence stands for newline, and asks printf() to start a new line on the screen. The example below shows both integer and real numbers in the same line of output:

    int first;
    float second;
    first = 10;
    second = 11.0;
    printf("The value of first is %d, the value of second is %f\n", first, second);

would print

    The value of first is 10, the value of second is 11.000000

3.6 Arithmetic expressions and mathematical functions

The C language supports the following operators for use in arithmetic expressions:

    +    addition
    -    subtraction
    *    multiplication
    /    division

In complicated arithmetic expressions the compiler will use rules of operator precedence to work out the order in which the operations should be performed. Operator precedence can be tricky to get to grips with. A complete list of operators in precedence order is given in Section 6.5. For now it is enough to know that multiplication and division have higher precedence than addition and subtraction, for example:

    a * b + c * d

will evaluate a * b, then c * d, then add together the two intermediate results. If you are in doubt how an expression might be interpreted by the compiler, it is best to use parentheses (brackets) to make sure it understands what you really mean. So you might use

    a * (b + c) * d

in which case b + c will be evaluated first and then this intermediate result will be multiplied by a and d.
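
The two expressions above can be compared with actual numbers. The following is a small sketch under stated assumptions: the function names products_then_sum, sum_then_products and show_precedence are invented for this illustration, and the values 2, 3, 4 and 5 are arbitrary. The code is written as functions (see Section 3.28), so it needs to be called from a main() to run.

```c
/* Sketch only: function names and values are illustrative */
#include <stdio.h>

/* a * b + c * d : both products are formed first, then added */
double products_then_sum(double a, double b, double c, double d)
{
   return a * b + c * d;
}

/* a * (b + c) * d : the bracketed sum is formed first */
double sum_then_products(double a, double b, double c, double d)
{
   return a * (b + c) * d;
}

/* Prints both results for one set of example values */
void show_precedence(void)
{
   double a = 2.0, b = 3.0, c = 4.0, d = 5.0;

   printf("a * b + c * d   = %f\n", products_then_sum(a, b, c, d));
   printf("a * (b + c) * d = %f\n", sum_then_products(a, b, c, d));
}
```

With a = 2, b = 3, c = 4 and d = 5 the first expression gives 2*3 + 4*5 = 26 and the second gives 2*(3+4)*5 = 70, so the brackets change the answer even though the variables and operators are the same.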
It is possible to mix numerical types in an expression, for example to multiply an integer (int) by a single-precision floating-point number (float). When such an expression is evaluated the values of the variables will be automatically converted to the same type as the highest precision variable used in the expression (for example int * float becomes a float, int * float * double becomes a double). It is also legal to assign values of one numeric type to variables of another type; however, in some cases this will lead to loss of precision, so care should be taken when doing this. When real numbers are assigned to integer variables, the values are truncated (not rounded, i.e. 67.9 becomes 67, not 68).

Many mathematical functions are available. For example

    b = sin(a);

Here b will be assigned the value of sin(a) where a is in radians. The full list of mathematical functions is given in Section 6.10.

3.7 Symbolic constants and the pre-processor

We have seen how the C language provides variables to represent values that can change throughout the program. Sometimes you will find yourself wanting to use constant values in your program. Many constants will have a special meaning, for example π or the conversion factor from degrees to radians. You could type these numbers in every time you need them, but this is tedious, error-prone and doesn't make for very readable code. A better solution is to use a symbolic constant, e.g. PI or DEG_TO_RAD. The C language (unlike FORTRAN or C++) does not intrinsically provide support for such named constants; however, there is a simple way to get the same effect by using the C pre-processor.

Before a C program undergoes compilation it passes through a pre-processor phase. You have already seen pre-processor instructions (called directives) in all of the example programs above; they begin with a hash sign, #, for example #define or #include. The #define directive is very useful for defining constants.
In the program above we use #define like so:

    #define MAXVALS 100

Whenever the C pre-processor sees the character sequence MAXVALS it will replace it with the sequence 100. This means that the line of code which appears:

    float values[MAXVALS];

will be compiled as if it had been written

    float values[100];

Defining constants like this can be very useful. As the name suggests, it is not possible to change the value of a constant. If we assign the value of π to a variable, there is a risk that it might inadvertently be changed, which can lead to some very hard-to-find bugs.

Another important pre-processor directive that you have already encountered is #include. The #include directive informs the compiler that you wish to use some of the standard library of functions which are available with the C compiler. For example if you want to use input/output functions (such as printf, fprintf, or scanf), you must ensure you have the line

    #include <stdio.h>

close to the start of the file. To use mathematical functions, you should include the math.h header file, thus

    #include <math.h>

In the example programs you have seen so far we have already included three standard header files (stdlib.h, stdio.h and math.h) that cover all of the requirements of those programs. The pre-processor has a number of other useful features, for example macros and conditional compilation, but those are beyond the scope of this workshop.

3.8 Exercise 3 - Variable types, functions and precision

(i) Make a copy of the source file prog2.c and call it prog3.c

(ii) Edit, compile and run prog3.c to do the following: Calculate an estimate of the constant π using the inverse function asin().
Save the result as an integer (int), a single precision real (float) and a double precision real (double). Print out the results using the format specifiers %d and %f. How many significant figures are produced in each case?

(iii) Add lines to divide two integers and assign the result to an integer variable. What can you say about the results? What happens if you assign the result to a float? What happens if you assign the two integers to real variables and then calculate the ratio using the real numbers, assigning the result to a real?

3.9 Repeating instructions

How might you modify prog1 to calculate and print, say, the factorial of the variable a? Remember that a factorial is defined as:

    a! = 1 × 2 × 3 × 4 × ... × a

so basically you must multiply a by all the positive integers less than a. An obvious way to do this might be to modify the program to explicitly perform the multiplications required as follows:

    fact = 1.0;
    fact = fact * 2.0;
    fact = fact * 3.0;
    ...
    fact = fact * a;

however this is not very satisfactory because firstly it involves a lot of typing, and secondly the program is difficult to modify if you want to calculate the factorial of a different number. A more elegant (and the correct) solution is a loop. What a loop does in a computer program is to execute a group of instructions more than once. A loop will continue to execute the enclosed instructions until some condition (defined by the programmer) is satisfied. The following program will calculate the factorial of a number:

    /* prog4 - calculate a factorial */
    #include <stdio.h>
    #include <stdlib.h>
    #include <math.h>

    int main()
    {
        int i, num;
        float fact;

        /* What number do we want to calculate the factorial of? */
        num = 10;

        /* Initialise the factorial */
        fact = 1.0;

        /* Count from 2 to num, multiply fact by counter each time */
        i = 2;
        while ( i <= num ) {
            fact = fact * i;
            i = i + 1;
        }

        /* Print the result */
        printf("The factorial of %d is %f\n", num, fact);
    }

This program makes use of a while-loop. The basic syntax of a while loop is:

    while ( condition ) statement

in other words it repeatedly executes statement while condition is satisfied. The program uses the variable i as a counter, which starts with a value 2 and is repeatedly incremented. Each time round the loop a second variable, fact, is multiplied by the counter before the counter is incremented. This continues while the value of i is less-than-or-equal-to the value which we are trying to calculate the factorial of (num).

3.10 Logical expressions and relational operators

In an earlier section we covered arithmetic expressions which are used to calculate numerical values (which are typically then assigned to a variable or passed as an argument to a function). The C language also supports a different type of expression, the logical expression, which can be used to test whether a certain condition is true. In the example program above the while loop has an associated logical expression

    while ( i <= num )

In this case the logical expression i <= num evaluates to TRUE so long as i is less-than-or-equal-to num. The following relational operators are available in C:

    ==   equal to
    <    less than
    <=   less than or equal to
    >    greater than
    >=   greater than or equal to

It is a very common mistake to mix up the equality test operator (==) with the assignment operator (=). It is a quirk in the C language that the assignment operator is legal in all the contexts in which the equality test is valid, however they do very different things.
Most modern compilers will attempt to spot this mistake and issue a warning, but this cannot be a substitute for due care and attention when constructing logical expressions!

C also provides a means by which conditions can be chained together. Suppose you want to test whether the variable a is within the range 1.0 to 10.0. This would be written as:

    a >= 1.0 && a <= 10.0

The && operator (AND) is a boolean operator which links the two sub-expressions together; it is only TRUE if the first sub-expression is TRUE AND the second sub-expression is TRUE. Now suppose you wanted to test whether the variable a is outside that range. There are actually two ways to do this. Firstly you can use the OR operator (||):

    a < 1.0 || a > 10.0

The || operator (OR) is TRUE if the first sub-expression is TRUE OR the second sub-expression is TRUE. The alternative way to perform the test is to check whether the value is within the specified range, and invert the result, using the ! operator (NOT). The value must be outside the range if the value is NOT inside the range:

    ! ( a >= 1.0 && a <= 10.0 )

Note that in this example parentheses are used to dictate the order of precedence as they were in arithmetic expressions. Logical expressions are subject to rules of operator precedence in much the same way as arithmetic expressions are. The basic rules are that the operators <, <=, ==, >= and > have higher precedence than && or ||. The NOT operator (!) has higher precedence than all of the relational operators, hence the need for the parentheses in the example above, which would otherwise be interpreted as if it were written:

    ( ! a >= 1.0 ) && ( a <= 10.0 )

3.11 Exercise 4 - A program to calculate a factorial

(i) Type in the program above (call it prog4.c), compile it and run it.
(ii) Alter prog4.c so that the counter counts from num down to 1, rather than the other way.

(iii) A single precision floating-point variable can only represent values up to about 10^38. What is the largest value of num for which this program can calculate the factorial? Alter the program to use double precision floating-point, which can represent values up to about 10^308. What is the largest value of num for which the factorial can be calculated now?

3.12 The for-loop

The loop in the example factorial program:

    i = 1;                  /* Initialise counter variable */
    while ( i <= num ) {    /* Test counter variable */
        fact = fact * i;
        i = i + 1;          /* Increment counter variable */
    }

performs three important operations. Firstly it sets a counter variable (in this case i) to an initial value (1), secondly it tests that the counter variable is still within the required range, and thirdly it increments the counter variable. This form of definite loop is so common in C that a special shorthand statement has been devised for it: the for statement. The for statement encapsulates the three key operations into one statement, which helps readability. The code above can be replaced by the following:

    for ( i=1; i <= num; i = i + 1 ) {
        fact = fact * i;
    }

which is much less long-winded, and expresses all you need to know about where the loop starts, stops, and how the counter behaves in between, in one single statement. In general you should always use a for statement to control definite loops.

3.13 Compound statements

You have probably noticed that the example programs presented so far contain quite a few curly brackets ({ and }). Curly brackets are used in C to delimit (enclose) what is called a compound statement. A compound statement is quite simply a block of code made up from one or more statements.
A compound statement can be used pretty much anywhere a single statement can be used. When we introduced the while loop we said that the syntax was:

    while ( condition ) statement

You can see that in the example program statement is actually a compound statement consisting of the following two statements:

    {
        fact = fact * i;
        i = i + 1;
    }

Once we replaced the while loop with a for loop there is actually only one statement executed by the loop, so strictly speaking we can drop the curly brackets:

    for ( i=1; i <= num; i = i + 1 )
        fact = fact * i;

You may see this often if you are reading a C program written by an experienced programmer. It can be quite confusing at first working out where curly brackets are needed and where they aren't. Just remember that a compound statement is intended to make lots of statements look like a single statement. Be very careful that every curly bracket you use to open a compound statement has a companion bracket that closes that compound statement. If you don't you will almost certainly get an error message and your program won't compile.

3.14 Exercise 5 - More repetition

(i) Bearing in mind what you have learned about compound statements, what is wrong with writing the loop in the factorial program as follows?

    while ( i <= num )
        fact = fact * i;
        i = i + 1;

This program is syntactically correct, and will compile without an error. What do you think will happen when it is run?

(ii) Copy prog4.c to prog5.c and change the factorial program so that it uses the loop above. Compile and run the program to see if you are right.

(iii) Change prog5.c so that it uses a for loop in place of the while loop. Compile and run the program to check that it gives the correct answer.
3.15 Entering numbers into a program

So far the example programs have been performing arithmetic operations on values of variables which are hard-coded into the program, for example:

    a = 10.0;
    b = 2.0;
    sum = a * b;

This approach is not very useful if you want to write flexible programs. You don't want to have to edit the program and recompile it every time you want to calculate the product of a different pair of numbers. What is needed is some way to get numbers into a program, say by typing them at the keyboard when the program runs, or by reading them from a text file. Fortunately this is quite straightforward to do.

Earlier we saw how the standard function printf() was used to format text and numbers and print them to the screen. There is another standard function, called scanf(), which performs the opposite operation, namely reading values from the input and assigning these values to variables.

    /* prog6 - using scanf to read values from user input */
    #include <stdio.h>
    #include <stdlib.h>
    #include <math.h>

    int main()
    {
        float a, b, sum;

        /* Read two numbers */
        printf("Enter two real numbers: ");
        scanf("%f %f", &a, &b);

        /* Calculate the sum, print it out */
        sum = a + b;
        printf("The sum is %f\n", sum);
    }

You can see that scanf() looks very similar to printf() in this example. It reads two real numbers from the keyboard, and assigns the values to two variables a and b. It uses the same format specifier codes (%f to read a real number, %d to read an integer) as printf(). The only difference is that the second and third arguments, which are the variables intended to receive the two values, are prefixed by an ampersand (&) in the argument list.
The reason for the ampersand is that it asks the C compiler to ensure that the variables are passed to the function in such a manner that the scanf() function can actually write values to the variables. It does this by passing the address of the variable, rather than the value of the variable. This allows the function being called (scanf) to write the value to the correct location in the computer's memory. We will look at this in much more detail in Section 3.31.

3.16 Exercise 6 - Requesting values from the terminal

(i) Type the program which uses function scanf() into the file prog6.c and test that it works.

3.17 Subscripted variables (arrays)

So far the variables we have been dealing with have held a single value (a scalar). For some problems you might want to collect together a number of related numeric values, for example a list of measurements of a length, or a temperature, or similar. The naive approach is to declare N distinct variables, each with a different name to hold each of the distinct values you are dealing with, for example:

    float value1, value2, value3, value4;
    value1=10.0;
    value2=12.0;
    value3=15.0;
    value4=17.0;

and then to explicitly work with these variables. Clearly this is very cumbersome and inflexible. The program will have to be altered if the number of values changes, and it is very difficult to perform mathematical operations on these values (for example subtracting a constant from all the values, or similar). Imagine how long a program would be if it had to deal with hundreds, thousands or millions of distinct values!

Fortunately there is a simple solution offered by almost all programming languages, and C is no exception. It is also possible to define a single variable that represents a collection (or array) of values.
In C an array variable is declared as follows:

    float value[100];

This code fragment declares an array of 100 distinct single-precision real values, and associates them all with the variable name value. The square bracket notation is also used to access the individual elements of the array, for example value[0] accesses the first element of the array, value[1] the second, and value[99] the last. Array indices are always integers. Note that the first element of the array is numbered zero, and therefore that value[100] is not a valid element of this array. This is a very common source of confusion to programmers new to C. Accessing nonexistent elements of an array can cause all sorts of problems. At best you will get the wrong answer, at worst your program will crash with a segmentation fault or bus error.

The advantages of arrays are many-fold. For a start it is possible to use another variable as an array index (i.e. within the square brackets), which means that operations can easily be carried out on all values in an array simply by using a loop:

    int i;
    float value[100];
    ...
    for ( i = 0; i < 100; i = i + 1 )
        value[i] = value[i] + 10.0;

This example will add 10.0 to the value of every element of the array; much easier to write and understand than had the values all been held in distinct variables. We can also perform some very important operations on arrays that don't really make sense for a scalar variable (for example sorting the values into some order).
The following program will read ten numbers into an array, then calculate the mean and standard deviation of the values:

    /* prog7 - defining and using arrays */
    #include <stdio.h>
    #include <stdlib.h>
    #include <math.h>

    int main()
    {
        int i;
        float value[10], sum, mean, stddev;

        /* Read ten values from the keyboard into the array */
        for ( i = 0; i < 10; i = i + 1 ) {
            printf("Enter value %d: ", i);
            scanf("%f", &value[i]);
        }

        /* Calculate the mean */
        sum = 0.0;
        for ( i = 0; i < 10; i = i + 1 )
            sum = sum + value[i];
        mean = sum / 10.0;
        printf("Mean value is %f\n", mean);

        /* Calculate the standard deviation: the square root of the
           mean squared deviation from the mean */
        sum = 0.0;
        for ( i = 0; i < 10; i = i + 1 )
            sum = sum + powf(value[i] - mean, 2.0);
        stddev = sqrt(sum / 10.0);
        printf("Standard deviation is %f\n", stddev);
    }

Note that the powf() function calculates the first argument raised to the power of the second, so powf(2.0, 3.0) would return 8.0 (which is 2^3). The sqrt() function calculates the square-root of the argument.

3.18 Exercise 7 - Using arrays

(i) Type the above program into the file prog7.c and test that it works.

3.19 Return values from functions

You have seen how, when calling a function, you can specify a number of arguments (values which are to be used by that function to perform the operation you are requesting of it). Once these operations have been performed, the function can return a value back to the calling program. This value, called the return value, can be assigned to a variable, or used in an expression. In the example above we are using two functions, sqrt() and powf(), in this way, embedding them directly into numeric expressions.
Hopefully it is obvious why powf(), sqrt() or other mathematical functions should return the value they calculate (they would be pretty useless otherwise!); however, you should note that a large fraction of non-mathematical functions also return a value. Often this return value is used to communicate back the success (or otherwise) of the execution of the function. Even printf() returns a value, which indicates the number of characters actually printed, as does the main() function, which is the main body of the program itself.

In C it is legal to call a function and ignore its return value; this is what we have been doing when calling printf() in the previous example programs. If you don't assign the value to a variable, or use it in an expression, the return value is simply discarded.

3.20 Input/output using files

Reading data from the keyboard is very useful; however, if you have a large amount of data, or need to run a program on the same data time and again, it is not very sensible to repeatedly type the numbers into the program by hand. The solution is to type numbers into a separate file, then read them from that file as if they had been typed at the keyboard. The following program will read numbers from a file and calculate their mean and standard deviation:

    /* prog8 - reading from a text file */
    #include <stdio.h>
    #include <stdlib.h>
    #include <math.h>

    #define MAXVALS 100

    int main()
    {
        float value[MAXVALS], sum, mean, stddev;
        int i, count;
        FILE *fh;

        /* Open the data file */
        fh = fopen("file.dat", "r");
        if ( fh == NULL ) {
            printf("Cannot open data file\n");
            exit(EXIT_FAILURE);
        }

        /* Initialise counter */
        count = 0;

        /* Read all the values in the file */
        while ( ! feof(fh) && count < MAXVALS ) {
            if ( fscanf(fh, "%f", &value[count]) > 0 )
                count = count + 1;
        }

        /* Close the file */
        fclose(fh);

        /* Check the file contains data */
        if ( count == 0 ) {
            printf("Data file contains no values\n");
            exit(EXIT_FAILURE);
        } else {
            printf("Read %d values from file\n", count);
        }

        /* Calculate the mean */
        sum = 0.0;
        for ( i = 0; i < count; i = i + 1 )
            sum = sum + value[i];
        mean = sum / count;
        printf("Mean value is %f\n", mean);

        /* Calculate the standard deviation: the square root of the
           mean squared deviation from the mean */
        sum = 0.0;
        for ( i = 0; i < count; i = i + 1 )
            sum = sum + powf(value[i] - mean, 2.0);
        stddev = sqrt(sum / count);
        printf("Standard deviation is %f\n", stddev);
    }

This program introduces a number of new and important concepts dealing with files, return values from functions, conditional execution, error checking, indefinite loops and the use of the C pre-processor to represent constant values.

To use a file we need to perform three basic operations in sequence: to open the file (for reading and/or writing), to read or write our data, and then to close the file when we are finished with it. Every computer operating system allows files to be given filenames, which are used by the user to identify and distinguish the files on disc; however, different operating systems use different conventions to construct the filename and represent its location within a directory structure.

The fopen() function is used to open a file and associate a filehandle with it. In the example program above it is used as follows:

    FILE *fh;
    fh = fopen("file.dat", "r");

The first argument to fopen() is the name of the file, in this case enclosed in double quotes as it is a string literal. The second argument is the access mode, in other words whether the file is to be opened read-only ("r"), for read/write ("r+"), or write-only access ("w").
Note that if you open an existing file with the write mode "w" you will overwrite the existing contents. The value returned by the fopen function is a filehandle, which can then be passed to other input/output functions to instruct them which file to access. A filehandle is a variable of a special type FILE*.

Reading numbers from a file is achieved by the function fscanf(), which is very similar to the function scanf() used to read values from the keyboard. The only difference is that it has an additional argument, the filehandle representing the file from which the values should be read, for example:

    fscanf(fh, "%f", &value[count])

The filehandle argument slots in before the format specifier; other than that everything is the same as scanf(). Another point to notice is that the program actually makes use of the value returned by the function fscanf(). The C Standard dictates that the fscanf() function should return the number of values actually read from the input file. The format specifier we are using is only asking for a single real value to be read from the file, so our call to fscanf() will return the value 1, up until the point where we reach the end of the file and there are no more values to be read. Once this happens fscanf() will return the special value EOF (a negative number), which fails the > 0 test used in the program. The feof() function simply checks whether an earlier read has run into the end-of-file, and returns TRUE if it has, or FALSE otherwise.

Once the contents of the file have been read, the file can be closed using:

    fclose(fh);

The function fprintf() is similar to printf() except there is an extra argument which specifies a filehandle of the destination file.
So if you open a file with write access (see above) you can output lines of text to that file (instead of the terminal) using:

    FILE *fh;
    fh = fopen("results.dat","w");
    fprintf(fh,"Mean value is %f\n", mean);
    fprintf(fh,"Standard deviation is %f\n", stddev);
    fclose(fh);

3.21 Exercise 8 - Reading and writing text files

(i) Type prog8, the program that reads data from a file above, into the source file prog8.c and create a data file file.dat which contains a list of numbers. Compile and run the program to check that it opens the file, reads the numbers from the file and finally calculates the results and prints them out.

(ii) What happens if file.dat contains more than one number per line (separated by spaces)?

(iii) Add lines so that the results are also written to a new file called results.dat. Check that the output file is created and contains the correct results.

3.22 Conditional execution (the if-statement) and error checking

In prog8.c we are using the function fopen() to try to open a file on disc and associate a filehandle with it. In the case where fopen() is unable to open the named file (because the file does not exist, or the access permissions do not allow the file to be read, for example) then fopen() will return the special value NULL. Clearly if we cannot open the file then we cannot sensibly read any data, and we should report an error and exit from the program. It is good programming practice to check for such errors, and a helpful error message can save the user a lot of time. The program uses the following code to check for such an error:

    if ( fh == NULL ) {
        printf("Cannot open data file\n");
        exit(EXIT_FAILURE);
    }

The if-statement will only execute the compound statement that follows it if the logical expression is TRUE.
In this case, if fh is equal to NULL, an error message is printed and the program exits. A further clause (else) can be added to define what will happen if the logical expression is not TRUE:

    if ( fh == NULL ) {
        printf("Cannot open data file\n");
        exit(EXIT_FAILURE);
    } else {
        printf("The file was opened successfully\n");
    }

3.23 Indefinite loops

In prog4 we introduced the concept of a definite loop, which is a loop for which the trip count (the number of times the loop executes) was known, or could be calculated by looking at the program, before the loop started. In prog8 we use an indefinite loop, where the trip count is unknown prior to the commencement of the loop. Indefinite loops are very useful; in this example we are reading numbers from a data file, and we don't necessarily know how many numbers will be in the data file when we start reading it. The while loop in this case is

    while ( ! feof(fh) && count < MAXVALS )

in other words while we have not reached the end-of-file, and while we have not filled up the work array value. Using this simple technique the program will run successfully on any data file containing up to MAXVALS values.

3.24 Character strings

So far you have learned how to declare and use numeric variables, i.e. variables whose values are numbers. There are many occasions when the values you will want to represent are not numeric. Say for example you were writing a program to manipulate a list of names. Obviously it is difficult to assign the value "John Smith" to an integer or to a real number! Fortunately computer languages, C included, allow us to declare and use variables that represent characters and strings of characters (often called character strings, or string variables). In computer terminology a character is a single letter, number, or punctuation mark. For example A, D, h, 2, * and ) are all characters.
A character string is simply one or more characters strung together in sequence. The C language uses the data-type char to represent characters. A scalar variable declared to be of type char holds a single character. A single character value is written in single quotes (double quotes are reserved for strings). For example:

    char cval;
    cval = 'A';

In the C language a character string is represented as an array of characters.

    char name[20];

The standard library function fgets() is used to read a single line from a file as a character string. The function accepts three arguments: the string variable that will receive the result, the maximum number of characters to read (i.e. the length of the string variable), and the handle to the open file from which the string will be read. It reads characters up to and including the end of the line (the newline character), but stops early if the line is longer than the string can hold, so that it never stores more than the specified maximum number of characters. It's important not to read more characters than the string can hold, otherwise your program will overwrite other variables and will probably crash. The function fgets() actually returns a value (the location of the string variable), but we don't need to use this so we don't assign this returned value to a variable, we just discard it.

If you want to print a string variable to the screen, you can do this using printf(), just like for numeric variables, but you should use the format specifier %s in place of the %d or %f which would be used for numeric variables.

    char name[20];
    fgets(name,20,stdin);
    printf("The name string is: %s",name);

3.25 Structured data types

The data in the table below are taken from the paper "The Velocity-Distance Relation Among Extra-Galactic Nebulae" written by Edwin Hubble and Milton L. Humason in 1931 and published in the Astrophysical Journal.
The data show the recessional velocity (measured redshift), rvel in km/s, and photographic magnitude (proportional to log10 of the flux), pmag, of a number of clusters of nebulae. The second column, nneb, is the number of nebulae identified in the cluster.

    cluster      nneb    rvel    pmag
    Virgo           7     890    12.5
    Pegasus         5    3810    15.5
    Pisces          4    4630    15.4
    Cancer          2    4820    16.0
    Perseus         4    5230    16.4
    Coma            3    7500    17.0
    Ursa_Major      1   11800    18.0
    Leo             1   19600    19.0
    Iso_I          16    2350    13.8
    Iso_II         21     630    11.6

Typically such data will be available as a computer text file and to analyse the data we need to read them into arrays in a C program. This is easily done using a modified version of prog8 above. We can read each column into an array of the appropriate type and length. However it's useful to keep the properties of each cluster together in a single variable so you don't forget what relates to what and you can pass around the collection of variables for each cluster during any analysis.

The C language allows you to define new structured data types, which act like a collection of related variables. Using this feature you can define a new data type to hold all of the information pertaining to a single cluster (or some other entity) in a single variable. The following code does this:

    #define MAXCLUNAMLEN 20   /* room for the longest name plus the end-of-string marker */

    struct clusterinfo {
        char name[MAXCLUNAMLEN];
        int nneb;
        float rvel, pmag;
    };

The struct keyword introduces the definition of the new data type, and is followed by a nametag that identifies the structured type within your program, in this case clusterinfo. This is followed by a block which declares each of the members of the structured type, describing their name, data type and, in the case of array members, their size. Note that strings, integer and floating-point variables can be freely mixed within a structured data type.
Once this new type has been defined, you can declare variables of the new type, and use them:

    int main()
    {
       /* Declare a new variable 'clist' of type 'clusterinfo' */
       struct clusterinfo clist;
       /* Access the members */
       strcpy(clist.name,"Orion");
       clist.nneb = 10;
       clist.rvel = 1500.0;
       clist.pmag = 13.1;
       printf("name %s, nneb = %d, rvel = %f, pmag = %f\n",
              clist.name, clist.nneb, clist.rvel, clist.pmag);
    }

The dot operator ('.') is used when accessing the members of a structured variable, so clist.rvel refers to the rvel member of the structured variable clist. This is equivalent to the $ operator in R. Note that a string cannot be copied into an array member using the assignment operator (=); the standard library function strcpy(), whose prototype is in the header file string.h, copies the characters instead.

Just like variables, you cannot use a structured data type (i.e. define variables using it) before it has itself been defined. Definitions of structured data types are generally placed near the start of any program (after the header files are #included, but before any executable code).

You can define an array of a struct data type by adding a size specification in square brackets. The members of the array are then referenced in the normal way, using an integer counter.

    #define MAXCLU 100
    struct clusterinfo clist[MAXCLU];
    int ncu;
    ...
    ...
    /* Access the members */
    ncu=0;
    strcpy(clist[ncu].name,"cluster1");
    clist[ncu].nneb=4;
    ncu=1;
    strcpy(clist[ncu].name,"cluster2");
    ...

3.26 Reading in a table from a file

Below is prog9 which reads the Hubble cluster data from a file called hubble.dat into a structured array.
    /* prog9 - reading a tabulation file and performing linear regression */
    #include <stdio.h>
    #include <stdlib.h>
    #include <math.h>
    /* Long enough for the longest name plus the terminating null */
    #define MAXCLUNAMLEN 20
    #define MAXCLU 100
    #define MAXHEAD 60
    struct clusterinfo
    {
       char name[MAXCLUNAMLEN];
       int nneb;
       float rvel,pmag;
    };
    int main()
    {
       struct clusterinfo clist[MAXCLU];
       char header[MAXHEAD];
       int i, count;
       FILE *fh;
       float sumx=0.0, sumy=0.0, sumxx=0.0, sumxy=0.0;
       float del, gradient, intercept;
       /* Open the data file */
       fh = fopen("hubble.dat", "r");
       if ( fh == NULL ) {
          printf("Cannot open data file hubble.dat\n");
          exit(EXIT_FAILURE);
       }
       /* read header line from file */
       fgets(header,MAXHEAD,fh);
       /* Initialise counter */
       count = 0;
       /* Read all cluster info from the file. fscanf() returns the number
          of items it converted, so a complete row gives 4. The loop stops
          at the end of the file, at a row that does not match the format,
          or when the array is full. %19s stores at most 19 characters
          plus the terminating null. The name member is an array, so it is
          passed without the & operator. */
       while ( count < MAXCLU &&
               fscanf(fh, "%19s %d %f %f",
                      clist[count].name,&clist[count].nneb,
                      &clist[count].rvel,&clist[count].pmag) == 4 ) {
          count = count + 1;
       }
       /* Close the file */
       fclose(fh);
       /* Check the file contains data */
       if ( count == 0 ) {
          printf("Data file hubble.dat contains no data\n");
          exit(EXIT_FAILURE);
       } else {
          printf("Read info for %d clusters from file hubble.dat\n", count);
       }
       /* List data and perform linear regression on log(rvel) vs. pmag */
       printf("%s",header);
       for (i = 0; i < count; i = i + 1) {
          printf("%s %d %f %f \n",clist[i].name,clist[i].nneb,
                 clist[i].rvel,clist[i].pmag);
          sumy = sumy + log10(clist[i].rvel);
          sumx = sumx + clist[i].pmag;
          sumxx = sumxx + clist[i].pmag * clist[i].pmag;
          sumxy = sumxy + clist[i].pmag * log10(clist[i].rvel);
       }
       del=count*sumxx-sumx*sumx;
       gradient=(count*sumxy-sumx*sumy)/del;
       intercept=(sumxx*sumy-sumx*sumxy)/del;
       /* print results */
       printf("log10(rvel) = %f * pmag + %f\n", gradient,intercept);
    }

The correlation between the log of the velocity and the mean magnitude is closely linear. Measured data like these led Hubble to conclude that a galaxy's recessional velocity was proportional to its distance (by calibrating the photographic magnitude in terms of log10 of the distance to the source). This law, known as Hubble's law, demonstrates that galaxies are not only moving away from Earth, but from each other too, providing strong evidence that the Universe is expanding.

3.27 Exercise 9 - Reading a data table and performing linear regression

(i) Create a data file hubble.dat from the data table above.

(ii) Create a program prog9 which reads this data file and performs a least squares fit to find the gradient and intercept for the relationship

    log10(rvel) = grad × pmag + interc

(iii) Using the code in prog9 write down the formulae used to find the gradient $m$ and intercept $c$ in the linear relationship $y = mx + c$ given a set of values $x_i$, $y_i$.

3.28 User-defined functions

You have already made use of a few of the large number of mathematical functions in the C standard library. For example:

    root_val = sqrt(val);

will calculate the square-root of the variable val and assign it to the variable root_val. A list of such functions is given in Section 6.10.
But what if you wish to calculate a mathematical function that isn't included in the standard C library? The C language allows you to create user-defined functions, and to use them in the same way as the standard functions provided by the standard C library. User-defined functions can be used to implement virtually any mathematical function, or perform non-mathematical operations. User-defined functions can be as long and complicated as necessary, and are usually used to help to structure the code (to break the overall program into smaller, more understandable segments). In large C programs user-defined functions will form the bulk of the code.

3.28.1 Defining new functions

Creating user-defined functions is quite straightforward, but you need to supply the compiler with a few pieces of information about your new function.

Firstly you need to give the new function a name, in order to identify it. This name should be unique within your program. No other functions or variables that you define should have the same name, otherwise the compiler will not know which function you are referring to when you use that name. For this reason it is not a good idea to use the same name as a pre-existing function in the standard C library. Defining a new function called sin() for example could cause problems, as this function already exists in the standard library (it calculates the sine of an angle).

Secondly you need to specify the argument (or arguments) that will be passed to the function. Each argument has a name and a type associated with it.

Thirdly you need to specify the type of the result that the function returns.

As an example we are going to set up a “sawtooth” function. This is periodic with the period split into three consecutive phases, $t_{rise}$, $t_{fall}$ and $t_{cons}$. The period is therefore $t_p = t_{rise} + t_{fall} + t_{cons}$. The period starts at a time $t = t_{zero}$ where the function has a value of 0.0. The function rises linearly to a value of 1.0 at $t = t_{zero} + t_{rise}$.
It then falls linearly back to 0.0 at $t = t_{zero} + t_{rise} + t_{fall}$ and then remains constant at 0.0 for the rest of the period (a time span of $t_{cons}$). The following code defines such a function.

    /* function to return sawtooth amplitude for a given time */
    #include <tgmath.h>
    double sawtooth_amp(double trise, double tfall, double tcons,
                        double tzero, double t) {
       double tperiod, tphase, trf, ft;
       tperiod=trise+tfall+tcons;
       tphase=t-tzero;
       tphase=fmod(tphase,tperiod);
       if(tphase<0.0) {
          tphase=tperiod+tphase;
       }
       trf=trise+tfall;
       if(tphase<trise) {
          ft=tphase/trise;
       } else if(tphase<trf) {
          ft=(trf-tphase)/tfall;
       } else {
          ft=0.0;
       }
       return ft;
    }

All new function definitions are basically of the form:

    result-type function-name(arg-type arg1, arg-type arg2, ...)
    {
       <some code to calculate the result>
       return result;
    }

The return statement defines the value that will be returned by the function.

3.28.2 Function prototypes

When you use variables in a C program you are used to obeying the rule that a variable must be declared before it can be used. It is not legal to refer to a variable before it has been declared. A similar rule must be obeyed when writing a new user-defined function: you should not refer to the function before it has been declared and/or defined. A function is declared by writing a function prototype. A function prototype looks just like the first line of a function definition, followed by a semicolon. The function body (the code which is executed when the function is invoked) is missing.
The prototype of our sawtooth function would look like this:

    double sawtooth_amp(double trise, double tfall, double tcons,
                        double tzero, double t);

A function prototype serves two purposes: firstly it declares the function (so when the compiler encounters the named function in an expression it can distinguish it from a variable), and secondly it provides the compiler with information about the parameters that the function accepts, and the type of the result it returns. This latter point is very important, because it allows the compiler to check that the parameters you are passing to a function when you call it are of the correct type (and perform a limited range of conversions if required). Proper use of function prototypes helps the compiler to ensure that you haven't made any typographical errors when writing your program.

The prototype of a user-defined function should be placed in a program before any code that refers to the function. The best place to put function prototypes in your program is very close to the top of the file, after any #include statements you are using, for example:

    #include <stdio.h>
    #include <math.h>
    /* Function prototypes */
    double sawtooth_amp(double trise, double tfall, double tcons,
                        double tzero, double t);
    /* ... define your new functions here ... */
    int main()
    {
       /* ... main body of the program ... */
       /* ... use the function here ... */
    }

3.29 Prototypes of standard library functions

You may have been wondering about the precise purpose of the #include <stdio.h> and #include <math.h> pre-processor directives which you have been using without explanation to this point. The major purpose of these directives is to include into your program the prototypes of the standard library functions that you wish to use.
For example the standard header file stdio.h contains prototypes for input/output-related functions (e.g. printf, scanf, fprintf, fscanf, etc.), and math.h contains prototypes for mathematical functions (sin, cos, exp, sqrt, etc.). The original (1989) C standard defines fifteen so-called Standard headers for the Standard library, and later revisions of the standard add more. See Sections 3.7 and 6.6 for more information about Standard headers.

3.30 Exercise 10 - Tabulation of a user-defined function

(i) Type the function source code for sawtooth_amp() into a file sawtooth.c and compile it using the command:

    $ cc -c sawtooth.c

The -c switch tells the compiler to simply compile the source code and create an object file called sawtooth.o. If the compilation is successful then this file should have been created in the current working directory. Check that this file exists using:

    $ ls sawtooth*

The * is a wildcard used to specify any file whose name starts with sawtooth. Note that this compilation has not created an executable because no main() function was present in the source file.

(ii) Below is a short program which uses the function sawtooth_amp() and writes a tabulation to the file sawtooth.dat.
    /* prog10 - tabulation of sawtooth function */
    #include <stdio.h>
    #include <stdlib.h>
    #include <math.h>
    #define MAXT 1000
    double sawtooth_amp(double tr, double tf, double tc, double tz, double t);
    int main()
    {
       double tmin=-14.,tmax=14.,tsam;
       double trise=1.0,tfall=5.0,tcons=4.0,tzero=3.0;
       int nt=MAXT,i;
       double t[MAXT],ft[MAXT];
       FILE *fh;
       /* Set up time samples and calculate function */
       tsam=(tmax-tmin)/(nt-1);
       for(i=0; i<nt; i=i+1) {
          t[i]=tmin+tsam*i;
          ft[i]=sawtooth_amp(trise,tfall,tcons,tzero,t[i]);
       }
       /* Produce a tabulation of function on file */
       fh = fopen("sawtooth.dat","w");
       fprintf(fh,"i t amp\n");
       for(i=0; i<nt; i=i+1) {
          fprintf(fh,"%d %f %f\n",i,t[i],ft[i]);
       }
       fclose(fh);
    }

Note that this code contains a prototype for the sawtooth_amp() function but doesn't include the code for that function (which is in a separate file sawtooth.c). Type the program code into a file prog10.c and compile it using the command:

    $ cc prog10.c sawtooth.o -lm -o prog10

Note that we now include the object code file sawtooth.o in the list of input files to the compiler. The compiler will pick up this file to get the compiled (object) code for our user-defined function. Run the prog10 executable to generate the tabulation file sawtooth.dat and check that this file contains the tabulation expected.

3.31 Pointers

Pointers are an important and widely used part of the C language. They are an extremely powerful facility, and one of the prime reasons that C is such a versatile and widely-used language.

You have already met and understood the concept of a variable. A variable can be thought of as a container that holds a value (the variables you have met so far have all held the values of numbers).
A pointer (or pointer variable) is different from a normal variable; rather than containing a value it points to a location that contains a value. Just like normal variables you must declare pointers before you use them. A statement that declares a pointer variable looks very similar to one declaring a normal variable:

    int var;    /* Declare a variable */
    int *ptr;   /* Declare a pointer */

This example declares a variable var and a pointer ptr. Notice the presence of the * in the declaration of ptr; it is this that identifies ptr as a pointer variable rather than a normal variable.

The value of a pointer variable is undefined unless you actually assign something to it. You can make the pointer ptr point to the variable var like this:

    ptr = &var;

The ampersand (&) operator returns the location of the variable it is applied to, so &var is the location of the variable var. This is the syntax which we introduced in the argument list of scanf() above. You can assign this location to the pointer variable in the usual way, using the assignment operator (=).

An important point to note here is that ptr is NOT a variable of type int (integer). It is a variable of type “pointer-to-int”. You cannot assign integer values to ptr, so for example the following two statements are NOT legal:

    ptr = 10;
    ptr = var;

3.31.1 Assigning locations to pointer variables

The value of a pointer variable can be assigned to another pointer variable of the same type:

    int val;
    int *ptr1, *ptr2;
    ptr1 = &val;   /* ptr1 points to val */
    ptr2 = ptr1;   /* ptr2 now points to val */

In this example ptr1 is initialised to point at the variable val, then the value of ptr1 is assigned to ptr2. At this point both ptr1 and ptr2 will point at the variable val.
Pointers are not restricted to pointing at integer variables, they can point at variables of any type:

    float fval, *fptr;
    double dval, *dptr;
    int ival, *iptr;

However you must be careful; when assigning to pointers they must point to variables of the correct type. Using the definitions above:

    iptr = &ival;   /* OK */
    fptr = &fval;   /* OK */
    dptr = &dval;   /* OK */
    iptr = &fval;   /* Not legal, iptr is not pointer-to-float */
    dptr = &fval;   /* Not legal, dptr is not pointer-to-float */
    dptr = iptr;    /* Not legal, pointers are not the same type */
    fptr = dptr;    /* Not legal, pointers are not the same type */

If you try to make an illegal pointer assignment the compiler will complain and, depending on the compiler and its settings, will either refuse to compile your code or issue a warning that you should not ignore.

3.31.2 De-referencing pointers

Once a pointer variable has been assigned the location of a variable (in other words once it points at a variable) you can read from and write to that variable by using the pointer. To do this you must place an asterisk (*) in front of the name of the pointer, for example:

    int val, other;
    int *ptr;
    ptr = &val;
    *ptr = 10;              /* Assign value to variable val */
    other = *ptr;           /* Access value of variable val through the pointer */
    printf("%d\n", other);  /* Will print 10 */

When referring to a pointer in this way you are accessing (reading from and writing to) the location that the pointer points to, not the pointer variable itself. This is called de-referencing the pointer.

3.31.3 Using pointers

Pointers have a number of uses, but primarily they are used when your program needs to know the location of a variable, rather than just its value. You have already seen how you can write a user-defined function to compute and return a single value, sawtooth_amp() defined in the source file sawtooth.c.
In such a function the return statement can only return the value of one variable. When calling a function in C you can use variables to supply any input parameters necessary (the arguments), however when you do this you are passing the value of the variable, not the variable itself. The values of variables passed in this way can be used within the function, but any changes the function makes to them are not seen by the calling routine. This behaviour is called pass-by-value.

The way to get around this is to pass pointers to the variables as parameters (arguments) to the function, rather than the values of the variables. Using these pointers the function can be instructed to write values into the variables pointed to by the pointers. So in addition to providing the returned value the function can also change the values of variables within the calling routine.

The “sawtooth” function defined above is periodic so as well as having an amplitude at a given time there is also a phase angle defined as a function of time. Conventionally this phase angle has units of radians and takes values in the range 0 to $2\pi$. To illustrate the use of pointers to return values from a function, the function below calculates both the amplitude and phase of our sawtooth.
    #include <tgmath.h>
    #define PI 3.1415926535898
    /* function to return both amplitude and phase for a given time */
    void sawtooth_amp_pha(double trise, double tfall, double tcons,
                          double tzero, double t, double *amp, double *pha) {
       double tperiod, tphase, trf;
       tperiod=trise+tfall+tcons;
       tphase=t-tzero;
       tphase=fmod(tphase,tperiod);
       if(tphase<0.0) {
          tphase=tperiod+tphase;
       }
       trf=trise+tfall;
       if(tphase<trise) {
          *amp=tphase/trise;
       } else if(tphase<trf) {
          *amp=(trf-tphase)/tfall;
       } else {
          *amp=0.0;
       }
       *pha=(tphase/tperiod)*PI*2.0;
    }

The return type for function sawtooth_amp_pha() is declared void because now we are not using the return value. Instead we have declared two arguments as pointers to double, double *amp and double *pha. Within the body of the function the variables that these pointers point to are assigned values, *amp=... and *pha=.... The following program uses the sawtooth_amp_pha() function to produce a tabulation of both the amplitude and phase of the sawtooth.
    /* prog11 - tabulation of sawtooth function amplitude and phase */
    #include <stdio.h>
    #include <stdlib.h>
    #include <math.h>
    #define MAXT 1000
    void sawtooth_amp_pha(double tr, double tf, double tc, double tz,
                          double t, double *amp, double *pha);
    int main()
    {
       double tmin=-14.,tmax=14.,tsam;
       double trise=1.0,tfall=5.0,tcons=4.0,tzero=3.0;
       int nt=MAXT,i;
       double t[MAXT],amp[MAXT],pha[MAXT];
       FILE *fh;
       /* Set up time samples and calculate function */
       tsam=(tmax-tmin)/(nt-1);
       for(i=0; i<nt; i=i+1) {
          t[i]=tmin+tsam*i;
          sawtooth_amp_pha(trise,tfall,tcons,tzero,
                           t[i],&amp[i],&pha[i]);
       }
       /* Produce a tabulation of function on file */
       fh = fopen("sawtooth.dat","w");
       fprintf(fh,"i t amp pha\n");
       for(i=0; i<nt; i=i+1) {
          fprintf(fh,"%d %f %f %f\n",i,t[i],amp[i],pha[i]);
       }
       fclose(fh);
    }

In prog11 you can see that by passing pointers to variables, using &amp[i] and &pha[i] in the call to sawtooth_amp_pha(), rather than the values of the variables, it is possible to write user-defined functions that calculate and return more than just a single value. The function calculates the results and writes them directly into the variables that our pointers point to. This is called pass-by-reference; you are passing a reference to a variable rather than its value.

Because the function is using pointers to write directly to these variables, you no longer need the function to supply a return value, so the function is declared with return type void and there is no return statement.

You should recall that the library function scanf(), which reads input from the keyboard, uses the same mechanism. The arguments to this function are passed as pointers so that the function can assign the values it reads from the keyboard to the variables we want to contain them.
3.32 Exercise 11 - Using pointers to return values from a function

(i) Edit the source file sawtooth.c so that it includes the definition of the function sawtooth_amp_pha(). Re-compile this source file to produce an object file sawtooth.o which contains both sawtooth_amp() and sawtooth_amp_pha().

(ii) Type the program prog11 into prog11.c and compile and run the program. Remember to include the sawtooth.o file in the compiler input list so that it picks up the sawtooth functions. Run the program to check that it produces a tabulation which contains both the amplitude and phase as a function of time.

3.33 Passing arrays to functions

An array in C is a variable with multiple values. These multiple values are arranged in successive locations in the computer's memory. Specifying the location of the first element of the array, the number of elements in the array and the data-type of each element is sufficient to fully describe the array. This is the way that arrays are passed to functions in C. A side effect of this is that all arrays in C are passed to functions by reference, not by value, and therefore a function that accepts an array as an argument can write to the array, to alter the values in the elements.

In prog11 we used a call which illustrates this.

    sawtooth_amp_pha(trise,tfall,tcons,tzero,t[i],&amp[i],&pha[i]);

The functions sawtooth_amp() and sawtooth_amp_pha() calculate the value of the function at a single value of time t. We can improve on this by setting up a function which calculates the amplitude and phase for an array of time values. The following function does just that.
    /* function to return amplitude and phase for a series of times */
    void sawtooth(double *tr, double *tf, double *tc, double *tz,
                  int *nt, double *t, double *amp, double *pha) {
       int i;
       for(i = 0; i < *nt; i=i+1) {
          sawtooth_amp_pha(*tr,*tf,*tc,*tz,t[i],&amp[i],&pha[i]);
       }
    }

You will notice that we have declared all the arguments to the new function sawtooth() as pointers. This is strictly unnecessary for the first five arguments (the four timing parameters and the number of samples, which are single values) but as you will see it is required if we want to call the function from R as we do in Section 4.

3.34 Exercise 12 - Testing the final sawtooth function

(i) Type the function sawtooth() into the source file sawtooth.c and compile it. The object file sawtooth.o will now contain three functions, sawtooth_amp(), sawtooth_amp_pha() and sawtooth().

(ii) Write a program prog12 to use sawtooth() to generate the tabulation file sawtooth.dat. The call to the function should now occur outside the for loop and look something like:

    sawtooth(&trise,&tfall,&tcons,&tzero,&nt,t,amp,pha);

The variables t, amp and pha are declared as arrays so the variable names on their own (without an index specification) act as pointers. It is important that the value of nt passed to the function is less than or equal to the declared size of the arrays. If the arrays have fewer than nt elements then the sawtooth function will attempt to use array indices which are out of range and will overwrite other parts of memory.

4 Programming with C and R together

The First Year Computing Workshop provided an introduction to using the S language within the R environment. All the basic elements of procedural programming were covered albeit in a simple form. This Second Year Workshop has so far given you an introduction
to C using the same basic elements of procedural programming. If you look through the exercises in both workshops you will see they follow a similar pattern. What can be done using R can be done using C. Of course there is an obvious difference. R operates at a much higher level in the sense that the R environment provides facilities like plotting and complicated statistical analysis through simple commands. C acts at a much lower level, concerned with the details of data types, pointers, values and so on. In this last part of the Second Year Workshop we show you how to use C and R together to capitalise on the advantages of both.

4.1 Exercise 13 - Using R to repeat Exercise 9

(i) Write an R script, prog13.R, to perform the same linear regression on the data in the hubble.dat file as you did in Exercise 9. You can modify the script from Exercise 6 in the First Year Workshop to do this.

(ii) Check that the answer you get is the same as that from prog9. If there is a small difference can you suggest why that might be?

(iii) Include R code in prog13.R to plot the data points and plot the best fit linear regression line.

Comparing Exercise 9 using C and Exercise 13 using R it should be obvious that performing linear modelling (or linear least squares fitting) using R is significantly easier than using C. This is because in R you are making use of a large body of software which has been incorporated into the R environment. In addition, if you use the lm() command in R it performs a lot more statistical analysis over and above simple linear regression. For example you can calculate and list the estimated standard errors on the fitted parameters (gradient and intercept) using the object created.
(iv) Use the lm() command to do the linear regression in prog13 and then list the coefficients and standard errors as follows:

    lmfit<-lm(y~x)
    cat("fit parameters: ",lmfit$coefficients,"\n")
    cat("standard errors: ",sqrt(diag(vcov(lmfit))),"\n")

The function vcov() produces the covariance matrix. Then diag() picks out the diagonal of this matrix and finally we estimate the standard errors by taking the sqrt() of the variance values along the diagonal.

4.2 Exercise 14 - Plotting the sawtooth function

(i) Write an R script prog14.R which plots the sawtooth function using the tabulation produced in Exercise 12. Plot two panels, one for the amplitude vs. time and the other for the phase vs. time. Use this script to check that your sawtooth functions are working correctly. Are the parameters trise, tfall, tcons and tzero doing the right things? Try changing these and re-running prog12 to check.

This illustrates how you can perform some calculation or analysis using a C program, write the results to a tabulation file (in this case sawtooth.dat), read the file as a table into R and use R to plot out the results. Of course you could pass data in the opposite direction by performing some analysis using R, writing the results to a tabulation file using write.table() and then using a C program to read the tabulation and perform further analysis.

4.3 Calling a C function from R

Routines (functions) can be written in C (or Fortran 77 or C++) and linked to R at runtime so that they can be called from within R. Such routines are compiled and then linked into a shareable object library (also called a dynamic-link library, DLL, in Windows speak) and this library is then loaded into R when R is running. Two R functions, .C() and .Fortran(), are used to interface with the external routines.
There is a fixed mapping between vectors in R and variables/arrays in C or Fortran which must be used.

    R storage mode   C type            FORTRAN type
    logical          int *             INTEGER
    integer          int *             INTEGER
    double           double *          DOUBLE PRECISION
    complex          Rcomplex *        DOUBLE COMPLEX
    character        char **           CHARACTER*255
    raw              unsigned char *   none

All the C type specifications are pointers so all the arguments (parameters) of the C function are passed by reference not value. The C function should not return any value except through its arguments so the function should be defined as type void.

This may seem somewhat restricting but in practice it isn't. If you have a C function you want to call but it doesn't conform to the above restrictions you can always write a simple wrapper function that does conform. A function which does conform is the sawtooth() function we used in Exercise 12. This is acting as a wrapper for sawtooth_amp_pha(), which doesn't conform.

You can use R to compile and link this function into a shareable library using the shell command:

    $ R CMD SHLIB sawtooth.c

This will create a file sawtooth.so which is the shareable object. Actually you can perform the same task by using the C compiler directly but using the R command above ensures that the shareable object created is linked with all the required libraries.

Note: if you have previously compiled the sawtooth.c file the above command to create the shareable object might fail because it doesn't like the sawtooth.o object file you created. If this happens you should delete the sawtooth.o file and then reissue the R CMD command:

    $ rm sawtooth.o
    $ R CMD SHLIB sawtooth.c

This is because the code must be compiled with the -fPIC switch (PIC stands for Position Independent Code) in order to create a shareable object library.
The following R script defines an R function which uses .C() to call the C function sawtooth(), loads the shareable object and then calls the new R function.

    # Define the sawtooth function using the C interface
    sawtooth <- function(tr, tf, tc, tz, t) {
       a<-.C("sawtooth",
          as.double(tr),
          as.double(tf),
          as.double(tc),
          as.double(tz),
          as.integer(length(t)),
          as.double(t),
          amp = double(length(t)),
          pha = double(length(t)))
       a$t=t
       return(a)
    }
    # Load the shareable (dynamically loadable) library
    dyn.load("sawtooth.so")
    # Set up vectors to sample function
    t<-seq(length=1000,from=-14,to=14)
    # Call the sawtooth function and plot the results
    ss<-sawtooth(1.0,4.0,5.0,3.0,t)
    par(mfrow=c(1,2))
    plot(ss$t,ss$amp,type="l")
    plot(ss$t,ss$pha,type="l")

The arguments of .C() are straightforward. The first argument is a character string which is the name of the C function, in this case sawtooth. The rest of the arguments correspond directly to the arguments of our C function. Arguments like as.double(tr) specify that the R vector tr is to be passed as a numeric vector (double precision). Note that tr is given in the argument list of the R function we are defining. Arguments like as.integer(length(t)) specify that the length of the R vector t (also in the argument list of the R function being defined) is to be passed to the C routine as a pointer to an integer. Values are returned to R from the C function using arguments specified like amp=double(length(t)). This creates an R vector assigned to the name amp with the correct type (double) and length (the same size as the argument vector t).

When the .C() function executes it will create an R object (a list). In the above this object is assigned to the name a. This object will contain components called amp and pha which are the results returned from the function.
In addition, the line a$t=t adds a component t to a, holding the times at which the function was evaluated. This is not strictly necessary but it is neat. The object created will contain the original time sample vector and the amplitude and phase vectors created by the function call. Finally the object a is returned as the value of the function.

Before the newly defined R function sawtooth() can be called we must load the shareable object. This is done by the line

dyn.load("sawtooth.so")

R looks for this file in the current working directory unless a path specification is included in the file name. In the script the returned object is assigned to the name ss and plot() picks up the components using the usual $ operator, plot(ss$t,ss$amp,type="l").

4.4 Exercise 15 - calling the sawtooth C function from R

(i) Create the shareable object sawtooth.so using the R CMD SHLIB command.

(ii) Type the R script which defines and uses sawtooth into the file prog15.R and try executing this script. You should get a plot similar to the one produced by prog14.R above.

(iii) Check that the sawtooth function written in C and called in R is behaving properly. Do this by changing the value of the arguments passed to the sawtooth() function in prog15.R and examining the plots produced to see if the sawtooth shape is being generated properly by your code.

4.5 Using C functions in R

The sawtooth function demonstrates how vector objects can be passed between R and C. This example is a little more complicated than is often the case because we have constructed our sawtooth to generate both an amplitude and a phase. In order to pass these back we constructed an object called a within the definition of the R sawtooth() function and assigned this as the object ss when we used the function.
The object contains three components, amp, pha and the original t. If we simply want to return a single vector we can make things simpler.

# Define the sawtooth amplitude function using the C interface
sawtoothamp <- function(tr, tf, tc, tz, t) {
  a<-.C("sawtooth",
    as.double(tr),
    as.double(tf),
    as.double(tc),
    as.double(tz),
    as.integer(length(t)),
    as.double(t),
    amp = double(length(t)),
    pha = double(length(t)))
  return(a$amp)
}
# Load the shareable (dynamically loadable) library
dyn.load("sawtooth.so")
# Set up vectors to sample function
t<-seq(length=1000,from=-14,to=14)
# Call the sawtooth function and plot the results
s<-sawtoothamp(1.0,4.0,5.0,3.0,t)
plot(t,s,type="l")

We could simplify even more by using a C function which didn't return the phase array. In the above the phase is calculated unnecessarily by the C sawtooth() function and then ignored.

4.6 Handling 2-D arrays in C and R

In many applications it is useful to use 2-D arrays to sample functions of 2 variables or represent some form of image or 2-D distribution. In this section we consider the spectra of thermal sources as examples of functions of 2 variables (photon energy and temperature) which can be sampled using 2-D arrays.

The Planck function expressed as a function of frequency ν gives the spectral energy density of Black-body radiation.

\[ U(\nu) = \frac{8\pi h}{c^3}\,\frac{\nu^3}{\exp(h\nu/kT) - 1} \]

where h is Planck's constant, c is the velocity of light and k is Boltzmann's constant. This function has great significance in the history of Physics and still has prominence in many areas of research. It is particularly important in the analysis of observations of the Cosmic Microwave Background and modern cosmology. We can re-write the function in terms of photons per unit energy interval by substituting E = hν.
\[ N(E,T) = A\,\frac{E^2}{\exp(E/kT) - 1} \qquad (1) \]

Now we are counting photons and not energy, so the power in the numerator drops to 2, and we have gathered together the leading constants into a single normalisation constant A (which will not concern us here). Equation 1 is the form of the Black-body spectrum which is most useful to observers using photon counting spectral instruments.

The optical depth of a Black-body is infinite and the source is said to be optically thick. Every photon we see has been scattered many times, exchanging energy with the electrons and ions such that the radiation is in thermal equilibrium with the matter in the source. Black-body sources arise from dense matter. The spectrum of thermal radiation seen from much more rarefied plasma is different. If the optical depth of the source is zero (or close to zero) then the source is said to be optically thin. The photons we see are then characteristic of the emission processes involved and have not been thermalised by any scattering. The continuum thermal radiation seen from such a source is bremsstrahlung radiation, which is caused by the braking or deceleration of the negatively charged electrons as they pass close to positively charged ions. The volume emissivity of thermal bremsstrahlung radiation is given by

\[ J(\nu) = 6.8 \times 10^{-37}\, Z^2 N_e N_z\, T^{-1/2}\, G(\nu,T)\, \exp(-h\nu/kT) \quad \mathrm{Watts\ s^{-1}\ m^{-3}\ Hz^{-1}} \]

where Z is the nuclear charge of the ions, N_e and N_z are the number densities of the electrons and ions and G(ν, T) is the so-called Gaunt factor, calculated using quantum mechanics to describe the scattering process which produces the radiation. Again, we can re-write this in terms of photons per unit energy interval

\[ N(E,T) = B\,G(E,T)\,T^{-1/2}\,\frac{\exp(-E/kT)}{E} \qquad (2) \]

Note that the factor E in the denominator on the right-hand side converts the spectral energy to a number of photons. B is a normalisation constant which contains details of the geometry and composition of the source and will not concern us here.
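As a check on the step from the Planck function to Equation 1, the substitution E = hν can be written out explicitly. This is a sketch for the photon number density only, with n(E) introduced here as a temporary symbol: the energy in the interval dν is divided by the energy hν of one photon, and dν = dE/h.

```latex
n(E)\,dE = \frac{U(\nu)\,d\nu}{h\nu},
\qquad \nu = \frac{E}{h},
\qquad d\nu = \frac{dE}{h}

n(E) = \frac{1}{hE}\cdot\frac{8\pi h}{c^3}\,
       \frac{(E/h)^3}{\exp(E/kT) - 1}
     = \frac{8\pi}{h^3 c^3}\,\frac{E^2}{\exp(E/kT) - 1}
```

The power of E drops from 3 to 2, as stated for Equation 1, and for the number density the leading constants combine to 8π/(h³c³), which is the kind of factor absorbed into A.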
The Gaunt factor is a complicated function which can be approximated using polynomials. We won't be concerned with the physical details here, so instead of specifying the Gaunt factor using equations we will proceed by giving you a C function which calculates an approximation to the Gaunt factor.

#include <math.h>
/* Born approximation to the Gaunt factor
   Based on the Kellogg, Baldwin & Koch (ApJ 199, 299) polynomial fits to
   the numerical values in Karzas & Latter (ApJS 6, 167)
   arguments: ekev photon energy keV, temp temperature keV
   R. Willingale Nov 2013 */
double gaunt(double ekev, double temp) {
  double x,y,z,t2,y2,bo,aio,fac;
  x=(ekev/temp)*0.5;
  if(x<2.0) {
    t2=x*x/3.75/3.75;
    aio=(((((.0045813*t2+.0360768)*t2+.2659732)*t2+1.2067492)*t2+
      3.0899424)*t2+3.5156229)*t2+1.0;
    y=x*0.5;
    z=log(y);
    y2=y*y;
    bo=-aio*z+(((((.0000074*y2+.0001075)*y2+.00262698)*y2+
      .0348859)*y2+.23069756)*y2+.4227842)*y2-.57721566;
    fac=exp(x)*bo;
  } else {
    y=2./x;
    bo=(((((.00053208*y-.0025154)*y+.00587872)*y-.01062446)*y+
      .02189568)*y-.07832358)*y+1.25331414;
    fac=bo/sqrt(x);
  }
  return .551329*fac;
}

You will note that the photon energy is specified by the variable ekev and is expressed in units of keV (kilo electron volts). You will also notice that Boltzmann's constant is not used. This is because the temperature variable temp is also expressed in units of keV. We convert from temperature in Kelvin to eV by multiplying by Boltzmann's constant and dividing by the charge on the electron (and then to keV by dividing by 1000).

We can sample the Gaunt factor over a 2-D grid (array) of points using the following C function.
/* Generate a 2-D array of Gaunt factor values */
void gaunt_array(int *ne, double *ekev, int *nt, double *temp, double *garr) {
  int i,j,ij;
  for(i=0; i< *ne; i++) {
    for(j=0; j< *nt; j++) {
      ij=i+j*(*ne);
      garr[ij]=gaunt(ekev[i],temp[j]);
    }
  }
}

The pointer int *ne specifies the number of photon energy grid points. The pointer double *ekev specifies the array of photon energy values. Similarly the temperature grid points are specified by pointers int *nt and double *temp. The total number of grid points will be (*ne)*(*nt) and the grid of Gaunt factor values is returned in the array with pointer double *garr. Therefore garr must be declared as a double precision array of this total size (or larger) in the calling routine (either C or R). Within the function gaunt_array the output array is indexed as a 1-D array using the integer variable ij=i+j*(*ne). Successive rows of energy values corresponding to each of the temperature values are packed into garr in sequence. It is possible to define multidimensional arrays with 2 or more indices in C, but using the simple single index arithmetic (index ij) is more convenient in the above application. So the reference garr[ij] plays the role of a 2-index reference: garr[j][i] for a C array declared as garr[nt][ne], or garr[i+1,j+1] in R, where indices start at 1 and arrays are stored column by column.

If we compile the C functions gaunt() and gaunt_array() into a shareable library spec_fun.so we can call gaunt_array() from an R script.
# Define the Gaunt factor function
gaunt <- function(ekev,temp) {
  ne<- length(ekev)
  nt<- length(temp)
  nn<- ne*nt
  a<-.C("gaunt_array",
    as.integer(ne),
    as.double(ekev),
    as.integer(nt),
    as.double(temp),
    f=double(length=nn))
  dim(a$f)<- c(ne,nt)
  return(a$f)
}
# Load the shareable (dynamically loadable) library
dyn.load("spec_fun.so")
# Set up vectors to sample function in log10(ekev) and log10(temp)
ne<- 50
nt<- 50
lekev<-seq(length=ne,from=-1,to=3)
ltemp<-seq(length=nt,from=1,to=2)
ekev<- 10^lekev
temp<- 10^ltemp
# Call the Gaunt factor function
g<-gaunt(ekev,temp)
# Plot grid in perspective
persp(lekev,ltemp,g,theta=30,phi=30,col = "lightblue",
  xlab="log10(photon energy kev)",
  ylab="log10(temperature keV)",
  zlab="Gaunt factor")

Within the definition of the R function gaunt() we find the size of the grid, ne by nt, using the lengths of the ekev and temp array arguments. The output array f is specified with length nn=ne*nt. Before we return the output array we set the dimensions using dim(a$f)<- c(ne,nt) so that the result is a 2-D array. In the above R script the grid uses logarithmic sampling in both photon energy (ekev) and temperature (temp), so we cover a large dynamic range, 0.1 < ekev < 1000 and 10 < temp < 100, both in units of keV.

4.7 Exercise 16 - Investigating thermal spectra

(i) Create a C source file spec_fun.c containing the C functions gaunt() and gaunt_array() given above. Compile these functions using the R CMD SHLIB command to create the shareable object spec_fun.so.

(ii) Create an R script file prog16.R which defines the R function gaunt(), loads the shareable library, then generates and plots a 2-D grid of values of the Gaunt factor.

(iii) Modify the R script prog16.R so that it also plots the Gaunt factor vs. photon energy for a temperature of 1 keV and 100 keV on a single graph.
You can pick out the required Gaunt factor values from the 2-D array using indices as indicated in the following snippet of R script.

plot(ekev,g[1:ne,nt:nt],type="l",log="x",
  xlab="Photon Energy keV",ylab="Gaunt factor")
lines(ekev,g[1:ne,1:1],type="l")

(iv) Write 4 more functions in the C source file spec_fun.c to calculate the Black-body photon spectrum and Bremsstrahlung photon spectrum. You should use the formulae in Equations 1 and 2, replacing the energy term kT by the temperature T expressed in keV. Do not include the normalisation constants A or B. The functions should have names following the convention set by the Gaunt factor functions: bbody(), bbody_array(), brems(), brems_array(). Re-compile to create a new shareable object file which contains the new functions along with the original Gaunt factor functions.

(v) Write R scripts plot_bbody.R and plot_brems.R to plot 2-D sample grids for the Black-body and Bremsstrahlung photon spectra. The dynamic range of the photon spectra is large, so in order to get a full picture it is better to plot the photon flux values using a logarithmic scale. When using the persp() function for plotting a projection of the grid this can be achieved as follows.

b<-bbody(ekev,temp)
z<- log10(b)
z[b==0]<- NA
z[z<(max(z,na.rm=T)-4)]<- NA
persp(lekev,ltemp,z,theta=30,phi=30,col="lightblue")

(vi) Modify both plot_bbody.R and plot_brems.R so that they also plot the photon spectra for a temperature of 1 keV and 100 keV on a single graph.

(vii) Modify plot_bbody.R so that it also calculates the mean photon energy in the Black-body spectrum as a function of temperature T. Hint - this will require integrations over energy for each value of T. Is the result what you might expect from classical statistical mechanics if you consider the spectrum to arise from a photon gas?
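The packing used by gaunt_array() in Section 4.6, with element (i,j) stored at position i+j*(*ne), can be checked with a small sketch before writing further array functions. The function fill_grid() and the marker value 100*j+i are hypothetical, chosen only so that the storage order is visible in the numbers; they are not part of the workshop code.

```c
/* Fill a packed ne-by-nt grid using the same index arithmetic as
   gaunt_array(): element (i,j) is stored at position i + j*ne.
   The stored value 100*j + i is a hypothetical marker that makes
   the packing order visible. */
void fill_grid(int *ne, int *nt, double *grid)
{
    int i, j, ij;
    for (j = 0; j < *nt; j++) {
        for (i = 0; i < *ne; i++) {
            ij = i + j * (*ne);
            grid[ij] = 100.0 * j + i;
        }
    }
}
```

For ne=3 and nt=2 the packed array holds 0, 1, 2, 100, 101, 102: the first index i runs fastest, which is the column-by-column order R expects once dim() has been set to c(ne,nt).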
5 Programming unaided

So far we have provided you with examples of all the required code to help you complete the exercises. For the final exercise below you will be programming largely unaided to give you practice in writing your own code.

The gravitational potential of a sphere with uniform density, radius R_0 and mass M_0 is

\[ V(r) = \begin{cases} -\dfrac{G M_0}{r} & \text{if } r > R_0 \\ \dfrac{G M_0}{2 R_0}\left(\dfrac{r^2}{R_0^2} - 3\right) & \text{otherwise} \end{cases} \]

where r is the radius from the centre and G is the gravitational constant. If the centre of the sphere is at position (x_0, y_0, z_0) then the radius at position (x, y, z) is

\[ r = \sqrt{(x - x_0)^2 + (y - y_0)^2 + (z - z_0)^2} \]

If you have 2 or more spheres of different sizes and at different positions then the total gravitational potential will be the sum of the individual components,

\[ V_{tot} = \sum_n V_n \]

5.1 Exercise 17 - The gravitational potential of the Earth-Moon system

radius of Earth                        6371 km
mass of Earth                          5.972 × 10^24 kg
radius of Moon                         1737 km
mass of Moon                           7.348 × 10^22 kg
Earth-Moon distance centre-to-centre   384400 km
Gravitational constant                 6.674 × 10^-11 m^3 kg^-1 s^-2

(i) Write a C function to calculate the gravitational potential of a uniform sphere as a function of some given arbitrary position (x, y, z) above or below the surface.

(ii) Write a second C function that uses the above and which can be called from R to calculate the gravitational potential of a uniform sphere over a 3-D grid of points in (x, y, z).

(iii) Write an R script that uses these C functions to calculate the gravitational potential of the Earth (ignoring the influence of the Moon) over a plane which contains the centre of the Earth and which extends out to several Earth radii. Plot the grid in projection.

(iv) Write an R script that calculates the gravitational potential of the full Earth-Moon system along the line that joins the centres.
Plot this potential as a function of position.

(v) Add code to the R script to find the gravitational potential of the Earth-Moon system at the centre of the Earth and the centre of the Moon.

(vi) Finally add code to the R script to estimate the energy in Joules/kg required to escape from a) the surface of the Earth and b) the surface of the Moon.

6 Quick C Reference

6.1 Variables, types and declarations

The fundamental data types in C are:

char          a single character (usually 1 byte of 8 bits)
short int     a small integer, at least 2 bytes
int           an integer, at least 2 bytes (usually 4 bytes on current systems)
long int      a large integer, at least 4 bytes (8 bytes on most 64-bit Linux systems)
float         single precision floating point, usually 4 bytes
double        double precision floating point, usually 8 bytes
long double   high precision floating point

The integer types including char can be qualified by unsigned or signed to determine whether or not a sign bit is included.
unsigned int    range 0 to 65535 if 2 bytes (0 to 4294967295 if 4 bytes)
signed int      range -32768 to 32767 if 2 bytes (-2147483648 to 2147483647 if 4 bytes)
unsigned char   1 byte giving range 0 to 255
signed char     1 byte giving range -128 to 127

Derived types are created using the declaration operators:

*     pointer, a prefix operator
&     reference, a prefix operator (C++ only)
[]    array, a postfix operator
()    function, a postfix operator

6.2 Constants

Integer constants are written as:

9876          assumed type int
987654321L    type long
987654321l    type long
9876U         type unsigned int
9876u         type unsigned int

Integers can be expressed in octal or hexadecimal:

0123      leading zero indicates octal constant
0x134A    leading 0x (zero x) indicates hexadecimal

Floating point constants are written as:

4.321      type double
4.3e-5     type double
3.1f       type float
3.1e-1F    type float
3.1l       type long double
3.1e-1L    type long double

A character string constant is enclosed in double quotes.

"This is a character string constant"

6.3 Reserved identifiers

There is a set of identifiers reserved for use as keywords in C and C++ and these must not be used for any other purpose.
asm        auto       break      case       catch      char
class      const      continue   default    delete     do
double     else       enum       extern     float      for
friend     goto       if         inline     int        long
new        operator   private    protected  public     register
return     short      signed     sizeof     static     struct
switch     template   this       throw      try        typedef
union      unsigned   virtual    void       volatile   while

6.4 Input and output

scanf("%d",&value);                          // scan (read) standard input for value
printf("value typed %d \n",value);           // write to standard output
fgets(line,sizeof(line),stdin);              // get character string from file stream
sscanf(line,"%f",&ans);                      // scan string for value
nc=sprintf(textline,"an integer %d",ival);   // write to a character string

The most commonly used format specifiers are:

%d    integer decimal notation
%o    integer unsigned octal notation
%x    integer unsigned hexadecimal notation
%u    integer unsigned decimal
%c    single character
%s    string of characters
%e    floating point (single or double) exponential notation
%f    floating point (single or double) decimal notation
%g    floating point as %e or %f, whichever is shorter

The full width (w) and precision (p) can be specified by including them directly after the % character, as in %w.pf:

%10d       format integer with full width 10 characters
%15.10f    format real with full width 15, 10 digits after decimal point

The full list of escape characters is:

\n      newline
\t      horizontal tab
\v      vertical tab
\b      backspace
\r      carriage return
\f      form feed
\a      alert or bell
\\      backslash
\?      question mark
\'      single quote
\"      double quote
\0      null
\ooo    octal number
\xhhh   hexadecimal number

6.5 Operators

C has a very rich set of operators. Here is a list of common operators in order of precedence, highest first.

[]    subscripting                 pointer[expr]
()    function call                expr(expr_list)
++    post/pre increment           lvalue++ or ++lvalue
~     complement                   ~expr
!     not                          !expr
-     unary minus                  -expr
+     unary plus                   +expr
&     address of                   &lvalue
*     indirection (dereference)    *expr
()    cast                         (type)expr
*     multiply                     expr*expr
/     divide                       expr/expr
%     modulo (remainder)           expr%expr
+     add (plus)                   expr+expr
-     subtract (minus)             expr-expr
<<    left shift bits              expr<<shift
>>    right shift bits             expr>>shift
<     less than                    expr<expr
<=    less than or equal           expr<=expr
>     greater than                 expr>expr
>=    greater than or equal        expr>=expr
==    equal                        expr==expr
!=    not equal                    expr!=expr
&     bitwise AND                  expr&expr
^     bitwise exclusive OR         expr^expr
|     bitwise inclusive OR         expr|expr
&&    logical AND                  expr&&expr
||    logical inclusive OR         expr||expr
?:    conditional expression       expr?expr:expr
=     simple assignment            lvalue=expr
,     comma (sequencing)           expr,expr

In this table lvalue is an entity which can appear on the left hand side of an assignment, typically a variable name. This may be a simple variable, an array element or a pointer. You should be careful that an lvalue is what you intend, a pointer or a primitive type. The type of both sides of an assignment should be the same. If you look at some C code you will often see composite operators like += which means add and assign. These can be confusing and I suggest that to begin with you avoid using them.
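For readers who meet the composite (compound) assignment operators in other people's code, the sketch below shows each one next to its long form. The function name compound_demo() is hypothetical and exists only for this illustration.

```c
/* Apply a sequence of compound assignments to a starting value.
   The comment on each line gives the equivalent long form. */
int compound_demo(int start)
{
    int x = start;
    x += 5;   /* same as x = x + 5 */
    x -= 2;   /* same as x = x - 2 */
    x *= 3;   /* same as x = x * 3 */
    x /= 4;   /* same as x = x / 4 (integer division, remainder dropped) */
    x %= 5;   /* same as x = x % 5 */
    return x;
}
```

Starting from 1 the value passes through 6, 4, 12, 3 and finally 3, so compound_demo(1) returns 3. Writing the long forms instead gives the same result and is the style suggested above for beginners.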
6.6 Compiler directives

#include <sys/pci.h>        // include a system header file
#include "pciadc.h"         // include a local header file
#define DEVICE_ID 0x0adc    // define a replacement macro

Commonly used macro definitions in header files:

EXIT_SUCCESS    // function integer return OK
EXIT_FAILURE    // function integer return not OK

6.7 Conditional statement blocks

The basic conditional statement block has the form:

if(expression1) statement1;
else if (expression2) statement2;
else statement;

If the statements require more than one line you must use curly braces to gather together the scope of each conditional:

if(expression1) {
  statement1a;
  statement1b;
  ...
} else if (expression2) {
  statement2a;
  statement2b;
  ...
} else {
  statementa;
  statementb;
  ...
}

In either case the statement or statements following the (expression) are executed if the value of the expression is true or non-zero.

6.8 Definite loops

A typical definite loop has the form:

int a[10];
int i;
for(i=0;i<10;i++) {
  a[i]=i;
}

Indefinite and infinite loops

Indefinite loops come in two forms:

while(expression) {
  body of loop
}

do {
  body of loop
} while(expression);

In the second variant the body of the loop is executed before the expression is evaluated, thus ensuring the body is executed at least once. An infinite loop can be set up using:

for(;;) {
  ...
  if(finish loop expression) break;
  ...
}

Such a loop should be terminated using break as shown, or return. The break statement can be used to terminate any for(), while() or do structure.
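The three loop forms described above can be compared on one small task: counting how many times an integer must be halved before it reaches zero. This is a minimal sketch and the function names are hypothetical.

```c
/* Indefinite while loop: the test comes first, so the body may
   never execute. */
int halvings_while(int n)
{
    int count = 0;
    while (n > 0) {
        n = n / 2;
        count++;
    }
    return count;
}

/* Infinite loop terminated by break: same result as the while form. */
int halvings_break(int n)
{
    int count = 0;
    for (;;) {
        if (n <= 0) break;
        n = n / 2;
        count++;
    }
    return count;
}

/* do ... while loop: the body always executes at least once,
   so the count is 1 even when n is 0 on entry. */
int halvings_do(int n)
{
    int count = 0;
    do {
        n = n / 2;
        count++;
    } while (n > 0);
    return count;
}
```

For n=10 all three return 4 (10, 5, 2, 1, 0). For n=0 the first two return 0 while the do form returns 1, which is the "executed at least once" behaviour noted above.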
6.9 Functions and header files

The main program function: the integer argc is the number of command line arguments, which are passed in the character string array argv. Note that argv[0] is the name of the program as it occurs on the command line:

int main(int argc, char* argv[])

The prototype declarations of all functions are held in header files. There are an enormous number of these files and an even larger number of prototype function declarations. The system wide header files are usually found in /usr/include on Unix and Unix-like systems. For example the floating-point mathematics functions are declared in math.h. To use any of the functions you must declare them by including the appropriate header file at the top of your source file:

#include <math.h>

6.10 List of mathematical functions in C

Here is a list of some of the mathematical functions available in C. If you wish to use any of the following mathematical functions in your program, you will need to ensure that you have the line #include <math.h> in the first few lines of your program, and that you are using the -lm compiler switch when compiling your code.

sin(x)       Sine of x (x in radians)
cos(x)       Cosine of x (x in radians)
tan(x)       Tangent of x (x in radians)
asin(x)      Arcsine of x (result lies between -π/2 and +π/2)
acos(x)      Arccosine of x (result lies between 0 and +π)
atan(x)      Arctangent of x (result lies between -π/2 and +π/2)
atan2(y,x)   Arctangent of y/x (result lies between -π and +π)
exp(x)       Exponential function
log(x)       Natural logarithm (base e) of x
log10(x)     Logarithm to base 10 of x
pow(x,y)     x to the power y (x^y)
sqrt(x)      Square-root of x
fabs(x)      Absolute value of x
fmod(x,y)    Returns the remainder of x/y, with the sign of x
ceil(x)      Returns the smallest integer not less than x
floor(x)     Returns the largest integer not greater than x

In the above table, x and y are of type double. All the above functions return double results.