Second Year C and R Programming Workshop
Prof. R. Willingale
October 12, 2015
Department of Physics and Astronomy
University of Leicester
University Road
Leicester LE1 7RH
Telephone: +44-116-252-3556
Internet:  http://www.star.le.ac.uk/zrw
Email:     zrw@le.ac.uk
Document: CR1
Issue: 1.0

Contents
1 Introduction
  1.1 Getting started with SPECTRE
  1.2 Logging on to SPECTRE remotely
2 Linux shell commands
  2.1 NEDIT and EMACS text editors
  2.2 Editing text and the command-line - cut and paste
  2.3 Adding module load R to your bash profile
  2.4 Summary of a few Linux shell commands
  2.5 Exercise 1 - Creating a C source file
3 C programming
  3.1 Compiling and running a C program
  3.2 Exercise 2 - Your first C program
  3.3 Anatomy of the main C program
  3.4 Variable names and numeric data types
  3.5 Printing output and formatting numbers
  3.6 Arithmetic expressions and mathematical functions
  3.7 Symbolic constants and the pre-processor
  3.8 Exercise 3 - Variable types, functions and precision
  3.9 Repeating instructions
  3.10 Logical expressions and relational operators
  3.11 Exercise 4 - A program to calculate a factorial
  3.12 The for-loop
  3.13 Compound statements
  3.14 Exercise 5 - More repetition
  3.15 Entering numbers into a program
  3.16 Exercise 6 - Requesting values from the terminal
  3.17 Subscripted variables (arrays)
  3.18 Exercise 7 - Using arrays
  3.19 Return values from functions
  3.20 Input/output using files
  3.21 Exercise 8 - Reading and writing text files
  3.22 Conditional execution (the if-statement) and error checking
  3.23 Indefinite loops
  3.24 Character strings
  3.25 Structured data types
  3.26 Reading in a table from a file
  3.27 Exercise 9 - Reading a data table and performing linear regression
  3.28 User-defined functions
    3.28.1 Defining new functions
    3.28.2 Function prototypes
  3.29 Prototypes of standard library functions
  3.30 Exercise 10 - Tabulation of a user-defined function
  3.31 Pointers
    3.31.1 Assigning locations to pointer variables
    3.31.2 De-referencing pointers
    3.31.3 Using pointers
  3.32 Exercise 11 - Using pointers to return values from a function
  3.33 Passing arrays to functions
  3.34 Exercise 12 - Testing the final sawtooth function
4 Programming with C and R together
  4.1 Exercise 13 - Using R to repeat Exercise 9
  4.2 Exercise 14 - Plotting the sawtooth function
  4.3 Calling a C function from R
  4.4 Exercise 15 - calling the sawtooth C function from R
  4.5 Using C functions in R
  4.6 Handling 2-D arrays in C and R
  4.7 Exercise 16 - Investigating thermal spectra
5 Programming unaided
  5.1 Exercise 17 - The gravitational potential of the Earth-Moon system
6 Quick C Reference
  6.1 Variables, types and declarations
  6.2 Constants
  6.3 Reserved identifiers
  6.4 Input and output
  6.5 Operators
  6.6 Compiler directives
  6.7 Conditional statement blocks
  6.8 Definite loops
  6.9 Functions and header files
  6.10 List of mathematical functions in C

1 Introduction
This 2nd year Workshop is a direct follow-up to the introduction to computing using R
given in the 1st year Workshop.
In the 1st year workshop you used the UoL IT Windows machines to run the R
environment. In this workshop you will be using Linux running on the UoL IT SPECTRE
system.
The workshop is divided into three sections. The first section shows you how to use
SPECTRE and the Linux operating system. The second section is an introduction to C
programming. The third section is about using C and R in combination: transferring data
to/from programs written in C and the R environment; writing C functions which can be
called from within R. The total contact time is 12 hours.
This script is a pdf file and can be found on Blackboard under course PA2900 or at
http://www.star.le.ac.uk/zrw/compshop/C_R_workshop_2nd.pdf
We suggest you download the file to your Desktop for ease of use.
1.1 Getting started with SPECTRE
You can access SPECTRE using an X-Terminal client called NX running on the UoL IT
Windows system. So first you must log on to a UoL IT Windows machine. If you have
not yet used NX on your Windows account then you must install it.
Programs-->Program Installer-->NX Client (click)
Once installed you can start NX for a SPECTRE session.
Programs-->NX Client (click)
SPECTRE (click)
In order to logon to SPECTRE you must supply your UoL username and password. NX
will open a window for SPECTRE. In order to do the workshop you will need to start a
terminal window for SPECTRE.
Applications-->System Tools-->Terminal (click)
The terminal window will open within the NX window and the prompt within the terminal
window will look something like:
[zrw@spectre02~]$
This indicates the username (zrw in my case), the machine (spectre02 for this particular
login) and the directory (~ is the home directory of the user).
This terminal window serves the same function as the R Console. You type commands
against the prompt and the system will execute these commands. Under Linux (or more
generally UNIX) the prompt is produced by what is called a “shell”. This is a program
which runs within the terminal and acts as an interface or wrapper around the system
giving the user access to all aspects of the system.
One application you are already familiar with is R. Before you can use R on SPECTRE
you must load the module using the following command.
[zrw@spectre03 under]$ module load R
You must do this at the start of each SPECTRE session unless you put this command
line in to your .bash_profile file. We will do this later.
Once the R module is loaded you can start R by simply typing “R return” in response
to the prompt in the terminal window (note the command is upper-case; on SPECTRE
a lower-case r will not work).
[zrw@spectre03 under]$ R
R version 2.15.1 (2012-06-22) -- "Roasted Marshmallows"
Copyright (C) 2012 The R Foundation for Statistical Computing
ISBN 3-900051-07-0
Platform: x86_64-unknown-linux-gnu (64-bit)
R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type ’license()’ or ’licence()’ for distribution details.
Natural language support but running in an English locale
R is a collaborative project with many contributors.
Type ’contributors()’ for more information and
’citation()’ on how to cite R or R packages in publications.
Type ’demo()’ for some demos, ’help()’ for on-line help, or
’help.start()’ for an HTML browser interface to help.
Type ’q()’ to quit R.
>
The above should be familiar. R is using the terminal window as a Console. You could
now go ahead and redo the 1st year workshop using SPECTRE. You might like to try a
few of the R commands you learnt in the 1st year to try it out.
1.2 Logging on to SPECTRE remotely
You can log on to and use SPECTRE from your own computer/laptop on or off campus
providing you have an internet connection.
Details about remotely connecting to SPECTRE can be found at:
http://www2.le.ac.uk/offices/itservices/ithelp/services/hpc/spectre/access/nx
If you are using a Windows machine then you can download NX (free) from the following
link.
http://www.star.le.ac.uk/zrw/compshop/nx/
Download all the files onto your Desktop. If you execute the *.exe files they will install
NX and the required fonts. The files SPECTRE.nxs and SPECTREFullScreen.nxs can
be used to start up NX for login to SPECTRE.
If you start up NX without using the *.nxs configuration files you will need to set the
configuration by hand:
host:        spectre.le.ac.uk
port:        22
Desktop:     Unix KDE (or your choice)
Connection:  ADSL or LAN etc. depending on your internet connection
If you are using a Mac then you should use the native OS X services. To get full
functionality, including plotting, you need to install X-Windows software. This is available
with the Mac application Xcode which you can download free.
https://developer.apple.com/xcode/
The most recent versions of Mac OS X and Xcode do not include the X11 application.
You can download this free from:
http://xquartz.macosforge.org/landing/
Xcode will also give you a local C compiler (gcc) so that you will be able to do the
workshop on your local machine if you so desire. When you have Xcode and X11 start
up a Terminal window
Applications-->Utilities-->Terminal
This Terminal window will be running the bash shell (see details about shell commands
below). To connect to SPECTRE use the following command substituting your own
UoL username. Note the dollar sign is the shell prompt which will appear in the terminal
window.
$ ssh -X username@spectre.le.ac.uk
The ssh command sets up a secure shell connection to the target machine which in this
case is SPECTRE. The -X switch enables the X-protocol so that you can send plotting
commands and similar from SPECTRE to your local Mac.
2 Linux shell commands
The NX window which opens when you logon to SPECTRE provides a Windows-like
interface to the Linux operating system. This includes pull-down menus for starting
applications and so forth. However, the Terminal application which was described in the
introduction above provides a more basic interface using shell commands. Anything you
can do using the pull-down menus and clicking with the mouse can also be done using
shell commands and there are many things which are a lot easier to do this way.
In the following text we will use a $ at the start of a line to represent the shell prompt
(similar to the > used by R in the R console).
An important concept when using Linux is the current “working directory”. This is a
directory file (or folder in Windows speak) which will be used by applications as the
default for writing new files or reading old files. It is very similar to the same concept
used in the R environment. In fact, when you start R from a Terminal window the R
working directory will be the same as the Linux working directory.
You can find out what the current working directory is using the command:
$ pwd
/home/z/zrw
In the above example the command pwd has listed my home directory on SPECTRE. This
is performing the same function as the R command getwd() you are familiar with from
the 1st year workshop. You can list the files in the current working directory using the
command ls so
$ ls
bin
chaos...
You should have a file called bin in your home directory but you won’t have a file called
chaos!
You can create a new directory within the current working directory using the following
$ mkdir Cwork
This will create the directory file Cwork. You can then change the current working
directory using
$ cd Cwork
Here cd assumes that Cwork is within the current working directory. You can override
this by giving a complete path specification.
$ cd /home/z/zrw/Cwork
If you want to move back one step up the directory tree then use
$ cd ..
If you want to delete a file then use the rm command.
$ rm filename
This will not work if the file is a directory. To delete a directory use the command rmdir
(the directory must be empty).
$ rmdir dirfilename
Use the commands rm and rmdir carefully. Once you have deleted a file it can’t be
recovered.
You can copy a file to create another file using the cp command.
$ cp oldfilename newfilename
After this there will be two files, oldfilename and newfilename.
If you want to change the name of an existing file you use the mv command.
$ mv oldfilename newfilename
After this operation oldfilename will have disappeared and will have been replaced by
newfilename.
2.1 NEDIT and EMACS text editors
You can create or edit a text file using a text editor. You are already familiar with using
such an editor within the R environment. On Linux systems there is a choice of editors.
All UNIX and Linux systems offer the vi editor (the name stands for visual for some
reason) but most users find vi strange and awkward to use (I don’t, I think it’s wonderful
and I’m using it to type this document). nedit and emacs are more conventional. To
invoke the editor use
$ nedit filename &
or
$ emacs filename &
Both of these commands open up an editor window. The & at the end of the command
tells the shell to run the editor as a “background job”. This means that you will be
able to switch focus between the editor window and the original terminal window keeping
both windows active and accessible. If filename exists in the current working directory
then the contents of the file (or at least some lines at the start) will be displayed within
the window. If filename doesn’t exist the window will be blank and the editor will be
creating a new file.
When you have finished entering or editing text in the editor window you can save the
file using the File menu at the top left of the editor screen. You don’t need to close the
editor window; in fact it is best not to if you intend to make further edits to the file.
A simple way to list a file at the terminal is to use the more command.
$ more filename
This prints out a few lines at a time. You use “return” to scroll to the next few lines.
2.2 Editing text and the command-line - cut and paste
When using NX in Windows you can cut and paste text from one place to another, for
example from the Acrobat Reader window containing this document to the nedit window
used to edit a text file. Select the text with the mouse (hold down the left button and
sweep across text). Use ctrl-c to copy the highlighted text to the clipboard. Move to
the nedit window and use ctrl-v to paste the text from the clipboard into the file. You
will be able to lift the source code of programs listed in this document into source files on
SPECTRE using this method to save typing time. If you are logged on to SPECTRE from
another Linux machine running X-Windows or a Mac this cutting and pasting between
windows is still possible but the method is a little different.
When using the Terminal window you can recall and edit previous lines you have typed
using control key strokes in the same way that you can in the R Console window.
ctrl-p   go to previous line
ctrl-n   go to next line
ctrl-a   go to start of current line
ctrl-e   go to end of current line
ctrl-b   move 1 character back along current line
ctrl-f   move 1 character forward along current line
2.3 Adding module load R to your bash profile
Typing module load R at the start of each SPECTRE session is tedious. You can avoid
this by putting this line into your bash (shell) profile file. Use emacs or nedit to add this
line to your .bash_profile file. This file exists in your home directory so make sure you
are in your home directory (using the cd command without specifying any file) before
trying to edit the file.
$ cd
$ nedit .bash_profile &
Add the line module load R at the end of the file and save. Next time you start a terminal
screen the command R will be available without the need to load the module.
2.4 Summary of a few Linux shell commands
To summarize, here is a list of the Linux shell commands we have introduced so far.
pwd                  print the current working directory
ls                   list files in the current working directory
mkdir                create a new directory in current working directory
cd dirfilename       change directory to file in current working directory
cd /full/path/file   change directory using a complete path specification
cd ..                move one step up directory tree
nedit filename &     start editing a file
more filename        list a file at the terminal
rm filename          delete file from working directory
rmdir dirfilename    delete directory file
cp oldfile newfile   create a copy of a file
mv oldfile newfile   change the name of a file
More details about these commands and other aspects of the UNIX/Linux operating
system can be found at the link UNIXhelp for Users in
http://www.star.le.ac.uk/zrw/compshop
2.5 Exercise 1 - Creating a C source file
(i) Create a directory Cwork in your home directory on SPECTRE.
(ii) Change the working directory to Cwork.
(iii) Use nedit to create a C source file called prog1.c containing the following short C
program. Note that the file name extension .c indicates a C source file.
/* prog1 - a simple first program in C */
#include <stdio.h>
#include <stdlib.h>
#include <math.h>

int main()
{
    /* Declare variables */
    float a, b, sum;

    /* Assign values to variables */
    a = 10.0;
    b = 2.0;

    /* Calculate the sum, print it out */
    sum = a + b;
    printf("The sum is %f\n", sum);
}
(iv) Save the file and move back to the terminal window. Use the more command to check
that the file prog1.c contains the text you typed in.
3 C programming
3.1 Compiling and running a C program
Before a C program can be run it must be translated into the simple low-level instructions
which are understood by the Central Processing Unit (CPU). This translation step is
called compilation, and is achieved by a program called a compiler. In addition, the code
must be linked to all the system routines which are required to run the program.
The shell commands to compile, link and run the program you have just typed in are
$ cc prog1.c -lm -o prog1
$ ./prog1
This will create a new file that contains the translated version of the program, expressed
as instructions which can be understood by the CPU. We call such a file an executable,
as it is capable of being executed by the CPU (which your original source code was
not). cc is the shell command which runs the C compiler. The -o is called a
command switch (or sometimes a command qualifier) and tells the compiler what
filename to give the executable file (prog1 in this case). The -lm switch links the
mathematical library so that we can use mathematical functions in our programs. To run this newly created program,
you just use a command which is the filename of the executable (adding ./ before
the file name tells the computer to look in the current directory for the file to run).
Developing a computer program in C (or any other compiled language) involves the
cycle edit-save-compile-link-run-edit-save-compile-link-run... very similar to
the process you used in the 1st year to develop R scripts but including the additional
compile-link stage which is performed by the C compiler.
3.2 Exercise 2 - Your first C program
(i) Compile and run prog1. If it fails you will have to edit the source file, save the new
version of the file, re-compile the program and run it again.
(ii) Copy the file prog1.c to prog2.c and use the editor to change the value of the
variables being added together in prog2.c. Compile and run the program prog2 to
check it gets the correct answer.
3.3 Anatomy of the main C program
The prog1 program introduces several very important concepts in computer programming.
Ignore the first three lines for now. They are necessary but are easier to understand when
you know a little more about C and will be described in Section 3.7.
The line
int main()
declares that this is a main C program (function). The function main() is special and acts
as an entry point or starting point when the program is run. The int specifies that on
completion the program will return an integer value that can be used as a flag to indicate
whether the program ran correctly or not (see Section 3.19). You can ignore this for now
and just concentrate on the lines between the curly brackets ({ and }). The program
is basically a short list of instructions (called statements in C). In the C language each
statement is terminated by a semi-colon (;). Be very careful to check that
you place semi-colons where they are needed; missing one out will generate a lot of error
messages.
The program also contains comments, which are enclosed (delimited) by /* and */. The
comments are ignored by the compiler, so you can put whatever you like in the comments
(except other comments, i.e. comments cannot be nested). The point of a comment is to
aid readability, to explain to another human reading the program what the program is
doing (or supposed to do) at a given point. Learning to make liberal use of comments is
an important part of learning to program a computer. A program is as good as useless if
no-one can tell what it does!
The first thing the program does is to declare three “variables”. A variable is a named
entity that holds a value. The value of a variable can be updated as a program runs (hence
the name). Every variable must have a type. The C language supports both numeric and
non-numeric types; we will leave non-numeric types until Section 3.24. Computers use
two different types of numbers: integers and real numbers (commonly called floating-point
numbers). The line
float a, b, sum;
declares three variables named a, b and sum, and states that they contain floating-point
(real) values. If the word float was replaced by int then the variables would be declared
as integers. The next action the program takes is to assign values to variables. This is
called defining the variable value. Before these lines all three variables will contain an
undefined value. It is a common mistake to declare a variable but forget to give it an
initial value. Most modern compilers will warn you if you declare a variable and never use
it, or use it before you assign it a value (though it is very bad practice to rely on this).
After the variables a and b are assigned values they can be used in an arithmetic
expression. The assignment bears a passing resemblance to an equation in algebra (though
it is not an algebraic equation):
sum = a + b;
What this line does is add the values held in variables a and b, and assign this value to the
variable named sum. (A CPU would execute this using simple instructions like “get this
number from there” “add this number to that number” and “put that number there”).
The last thing the program does is print out the result. After all, a program which keeps
its answers to itself is pretty useless! More details about printf() are given in Section
3.5.
3.4 Variable names and numeric data types
Variables in C must be given a name. The variable name can be made up from upper-case
or lower-case letters (A-Z or a-z), numeric characters (0-9) or an underscore (_). There is
one important restriction though: a variable name cannot start with a numeric character.
Variable names are case-sensitive in C, in other words value and Value are different
variables. It is a bad idea to define variables with names that are distinguished only by
their case, as this makes code difficult to interpret, and can lead to mistakes that are very
hard to track down. Most implementations of C allow variables to be given long names
(at least 32 characters). You should choose names for your variables that are reasonably
descriptive of what the variable represents (to help other programmers read your code,
for example nvalues) without being unnecessarily long-winded or difficult to type (e.g.
number_of_values_in_the_file_that_I_am_reading_on_Tuesday).
Numeric variables in computer programs are not capable of representing every possible
real or integer number; they have limited precision and can only represent a limited range
of numbers. There is a trade-off between precision/range and the amount of storage space
required to represent the number held within the variable. For example, on most modern
systems the int type has a range -2147483648 to 2147483647 and occupies 4 bytes (each byte is 8 bits).
The float data type is often referred to as single-precision, and the double type as
double-precision. Historically single-precision mathematical operations were faster than
double-precision; however, with modern CPUs this is generally no longer the case. So the
important differences between single and double precision are the amount of memory or
disc space required to store them and the numerical precision they provide.
3.5 Printing output and formatting numbers
The printf statement is used to print output to the screen. printf() is an example of a
function in C. A function is a separate piece of code that you can call from your program
to perform a specific task. When calling a function you can supply arguments, which are
variables or values which the function will need to perform the task you want it to. These
arguments follow the function name and are enclosed in brackets. Functions are a very
important part of programming in C; a large number of powerful functions are provided
for you to use, and you can write your own functions. We will come back to functions in
more detail later on.
Fortunately you do not need to know how the printf() function works internally, you
only need to know what arguments to give it to print the output you want. Your program
does not need to concern itself with the mechanics of formatting numbers and printing
them out; this has all been done for you.
The printf function accepts one or more arguments. The first argument is a format
specifier, which is basically a template for what you want printed out. Take the example
in the prog1 program:
printf("The sum is %f\n", sum);
In this case the format specifier is the string "The sum is %f\n". What the printf()
function does is to work through the format specifier string and replace the format codes
(which look like %f) with the formatted representation of the values of the subsequent
arguments in order.
The format code %f is used to format real numbers, the format code %d is used for integer
values. Be very careful to ensure that the format codes are correct for the data types
of the arguments, otherwise you will quite probably get garbage! The full width and
precision can be specified by including them directly after the % character in the form %n.mf,
where n is the full width and m is the number of digits after the decimal point.
%10d      format integer with full width 10 characters
%15.10f   format real with full width 15, 10 digits after decimal point
A complete list of format specifiers is given in Section 6.4.
The \n character sequence stands for newline, and asks printf() to start a new line on
the screen. The example below shows both integer and real numbers in the same line of
output:
int first;
float second;
first = 10;
second = 11.0;
printf("The value of first is %d, the value of second is %f\n",
       first, second);
would print
The value of first is 10, the value of second is 11.000000
3.6 Arithmetic expressions and mathematical functions
The C language supports the following operators for use in arithmetic expressions:
+   addition
-   subtraction
*   multiplication
/   division
In complicated arithmetic expressions the compiler will use rules of operator precedence
to work out the order in which the operations should be performed. Operator precedence
can be tricky to get to grips with. A complete list of operators in precedence order is
given in Section 6.5. For now it is enough to know that multiplication and division have
higher precedence than addition and subtraction, for example:
a * b + c * d
will evaluate a * b, then c * d, then add together the two intermediate results. If you
are in doubt how an expression might be interpreted by the compiler, it is best to use
parentheses (brackets) to make sure it understands what you really mean. So you might
use
a * (b + c) * d
in which case b + c will be evaluated first and then this intermediate result will be
multiplied by a and d.
It is possible to mix numerical types in an expression, for example to multiply an integer
(int) by a single-precision floating-point number (float). When such an expression is
evaluated the values of the variables will be automatically converted to the same type as
the highest precision variable used in the expression (for example int * float becomes
a float, int * float * double becomes a double). It is also legal to assign values of one
numeric type to variables of another type, however in some cases this will lead to loss of
precision, so care should be taken when doing this. When real numbers are assigned to
integer variables, the values are truncated, not rounded (i.e. 67.9 becomes 67, not 68).
Many mathematical functions are available. For example
b = sin(a);
Here b will be assigned the value of sin(a) where a is in radians. The full list of
mathematical functions is given in Section 6.10.
3.7 Symbolic constants and the pre-processor
We have seen how the C language provides variables to represent values that can change
throughout the program. Sometimes you will find yourself wanting to use constant values
in your program. Many constants will have a special meaning, for example π or the
conversion factor from degrees to radians. You could type these numbers in every time
you need them, but this is tedious, error-prone and doesn’t make for very readable code.
A better solution is to use a symbolic constant, eg. PI or DEG_TO_RAD.
The C language (unlike FORTRAN or C++) does not intrinsically provide support for
such named constants, however there is a simple way to get the same effect by using
the C pre-processor. Before a C program undergoes compilation it passes through a
pre-processor phase. You have already seen pre-processor instructions (called directives) in
all of the example programs above; they begin with a hash sign, #, for example #define
or #include. The #define directive is very useful for defining constants. In the program
above we use #define like so:
#define MAXVALS 100
Whenever the C pre-processor sees the character sequence MAXVALS it will replace it with
the sequence 100. This means that the line of code which appears:
float values[MAXVALS];
will be compiled as if it had been written
float values[100];
Defining constants like this can be very useful. Firstly, as the name suggests, it is not
possible to change the value of a constant. If we assign the value of π to a variable, there
is a risk that it might inadvertently be changed, which can lead to some very hard-to-find
bugs. Secondly, a value that is used in several places (such as the array size MAXVALS)
only has to be changed on one line if it ever needs to be different.
Another important pre-processor directive that you have already encountered is
#include. The #include directive inserts the contents of a header file into your program,
which tells the compiler about the functions in the standard library that you wish to use.
For example if you want to use input/output functions (such as printf, fprintf, or scanf),
you must ensure you have the line
#include <stdio.h>
close to the start of the file. To use mathematical functions, you should include the math.h
header file, thus
#include <math.h>
In the example programs you have seen so far we have already included three standard
header files (stdlib.h, stdio.h and math.h) that cover all of the requirements of those
programs.
The pre-processor has a number of other useful features, for example macros and
conditional compilation, but those are beyond the scope of this workshop.
3.8 Exercise 3 - Variable types, functions and precision
(i) Make a copy of the source file prog2.c and call it prog3.c
(ii) Edit, compile and run prog3.c to do the following: Calculate an estimate of the
constant π using the inverse function asin(). Save the result as an integer (int), a single
precision real (float) and a double precision real (double). Print out the results using
the format specifiers %d and %f. How many significant figures are produced in each case?
(iii) Add lines to divide two integers and assign the result to an integer variable. What
can you say about the results? What happens if you assign the result to a float? What
happens if you assign the two integers to real variables and then calculate the ratio using
the real numbers assigning the result to a real?
3.9 Repeating instructions
How might you modify prog1 to calculate and print, say, the factorial of the variable a?
Remember that a factorial is defined as:
a! = 1 × 2 × 3 × 4 × ... × a
so basically you must multiply a by all the positive integers less than a. An obvious way to do this
might be to modify the program to explicitly perform the multiplications required as
follows:
fact = 1.0;
fact = fact * 2.0;
fact = fact * 3.0;
fact = fact * a;
however this is not very satisfactory: firstly it involves a lot of typing, and secondly
the program is difficult to modify if you want to calculate the factorial of a different
number. A more elegant (and the correct) solution is a loop. What a loop does in
a computer program is to execute a bunch of instructions more than once. A loop
will continue to execute the enclosed instructions until some condition (defined by the
programmer) is satisfied. The following program will calculate the factorial of a number:
/* prog4 - calculate a factorial */
#include <stdio.h>
#include <stdlib.h>
#include <math.h>
int main()
{
    int i, num;
    float fact;
    /* What number do we want to calculate the factorial of? */
    num = 10;
    /* Initialise the factorial */
    fact = 1.0;
    /* Count from 2 to num, multiply fact by counter each time */
    i = 2;
    while ( i <= num ) {
        fact = fact * i;
        i = i + 1;
    }
    /* Print the result */
    printf("The factorial of %d is %f\n", num, fact);
}
This program makes use of a while-loop. The basic syntax of a while loop is:
while ( condition )
statement
in other words it repeatedly executes statement while condition is satisfied. The program
uses the variable i as a counter, which starts with a value 2 and is repeatedly incremented.
Each time it is incremented a second variable, fact, is multiplied by the counter. This
continues while the value of i is less-than-or-equal-to the value which we are trying to
calculate the factorial of (num).
3.10 Logical expressions and relational operators
In an earlier section we covered arithmetic expressions which are used to calculate
numerical values (which are typically then assigned to a variable or passed as an argument
to a function). The C language also supports a different type of expression, the logical
expression, which can be used to test whether a certain condition is true. In the example
program above the while loop has an associated logical expression
while ( i <= num )
In this case the logical expression i <= num evaluates to TRUE so long as i is
less-than-or-equal-to num. The following relational operators are available in C:
==    equal to
!=    not equal to
<     less than
<=    less than or equal to
>     greater than
>=    greater than or equal to
It is a very common mistake to mix up the equality test operator (==) with the assignment
operator (=). It is a quirk in the C language that the assignment operator is legal in all
the contexts in which the equality test is valid, however they do very different things.
Most modern compilers will attempt to spot this mistake and issue a warning, but this
cannot be a substitute for due care and attention when constructing logical expressions!
C also provides a means by which conditions can be chained together. Suppose you want
to test whether the variable a is within the range 1.0 to 10.0. This would be written as:
a >= 1.0 && a <= 10.0
The && operator (AND) is a boolean operator which links the two sub-expressions
together; it is only TRUE if the first sub-expression is TRUE AND the second
sub-expression is TRUE. Now suppose you wanted to test whether the variable a is outside
that range. There are actually two ways to do this. Firstly you can use the OR operator
(||):
a < 1.0 || a > 10.0
The || operator (OR) is TRUE if the first sub-expression is TRUE OR the second
sub-expression is TRUE.
The alternative way to perform the test is to check whether the value is within the specified
range, and invert the result, using the ! operator (NOT). The value must be outside the
range if the value is NOT inside the range:
! ( a >= 1.0 && a <= 10.0 )
Note that in this example parentheses are used to dictate the order of evaluation, as
they were in arithmetic expressions. Logical expressions are subject to rules of operator
precedence in much the same way as arithmetic expressions are. The basic rules are that
the operators <, <=, ==, >= and > have higher precedence than && or ||. The NOT
operator (!) has higher precedence than all of the relational operators, hence the
need for the parentheses in the example above, which would otherwise be interpreted as
if it were written:
( ( ! a ) >= 1.0 ) && ( a <= 10.0 )
3.11 Exercise 4 - A program to calculate a factorial
(i) Type in the program above (call it prog4.c), compile it and run it.
(ii) Alter prog4.c so that the counter counts from num down to 1, rather than the other
way.
(iii) A single precision floating-point variable can only represent values up to about 10^38.
What is the largest value of num for which this program can calculate the factorial? Alter
the program to use double precision floating-point, which can represent values up to about 10^308.
What is the largest value of num for which the factorial can be calculated now?
3.12 The for-loop
The loop in the example factorial program:
i = 1;                    /* Initialise counter variable */
while ( i <= num ) {      /* Test counter variable */
    fact = fact * i;
    i = i + 1;            /* Increment counter variable */
}
performs three important operations. Firstly it sets a counter variable (in this case i) to an
initial value (1), secondly it tests that the counter variable is still within the required range,
and thirdly it increments the counter variable. This form of definite loop is so common
in C that a special shorthand statement has been devised for it; the for statement. The
for statement encapsulates the three key operations into one statement, which helps
readability. The code above can be replaced by the following:
for ( i=1; i <= num; i = i + 1 ) {
fact = fact * i;
}
which is much less long-winded, and expresses all you need to know about where the loop
starts, stops, and how the counter behaves in between, in one single statement. In general
you should always use a for statement to control definite loops.
3.13 Compound statements
You have probably noticed that the example programs presented so far contain quite a
few curly brackets ({ and }). Curly brackets are used in C to delimit (enclose) what is
called a compound statement.
A compound statement is quite simply a block of code made up from one or more
statements. A compound statement can be used pretty much anywhere a single statement
can be used.
When we introduced the while loop we said that the syntax was:
while ( condition )
statement
You can see that in the example program statement is actually a compound statement
consisting of the following two statements:
{
fact = fact * i;
i = i + 1;
}
Once we replace the while loop with a for loop there is actually only one statement
executed by the loop, so strictly speaking we can drop the curly brackets:
for ( i=1; i <= num; i = i + 1 )
fact = fact * i;
You may see this often if you are reading a C program written by an experienced
programmer. It can be quite confusing at first working out where curly brackets are
needed and where they aren’t. Just remember that a compound statement is intended
to make lots of statements look like a single statement. Be very careful that every curly
bracket you use to open a compound statement has a companion bracket that closes that
compound statement. If you don’t, you will almost certainly get an error message and
your program won’t compile.
3.14 Exercise 5 - More repetition
(i) Bearing in mind what you have learned about compound statements, what is wrong
with writing the loop in the factorial program as follows?
while ( i <= num )
fact = fact * i;
i = i + 1;
This program is syntactically correct, and will compile without an error. What do you
think will happen when it is run?
(ii) Copy prog4.c to prog5.c and change the factorial program so that it uses the loop
above. Compile and run the program to see if you are right.
(iii) Change prog5.c so that it uses a for loop in place of the while loop. Compile and
run the program to check that it gives the correct answer.
3.15 Entering numbers into a program
So far the example programs have been performing arithmetic operations on values of
variables which are hard-coded into the program, for example:
a = 10.0;
b = 2.0;
sum = a + b;
This approach is not very useful if you want to write flexible programs. You don’t want to
have to edit the program and recompile it every time you want to calculate the sum of
a different pair of numbers. What is needed is some way to get numbers into a program,
say by typing them at the keyboard when the program runs, or by reading them from
a text file. Fortunately this is quite straightforward to do. Earlier we saw how the
standard function printf() was used to format text and numbers and print them to the
screen. There is another standard function, called scanf(), which performs the opposite
operation, namely reading values from the input and assigning these values to variables.
/* prog6 - using scanf to read values from user input */
#include <stdio.h>
#include <stdlib.h>
#include <math.h>
int main()
{
    float a, b, sum;
    /* Read two numbers */
    printf("Enter two real numbers: ");
    scanf("%f %f", &a, &b);
    /* Calculate the sum, print it out */
    sum = a + b;
    printf("The sum is %f\n", sum);
}
You can see that scanf() looks very similar to printf() in this example. It reads two
real numbers from the keyboard, and assigns the values to two variables a and b. It
uses the same format specifier codes (%f to read a real number, %d to read an integer)
as printf(). The only difference is that the second and third arguments, which are
the variables intended to receive the two values, are prefixed by an ampersand (&) in the
argument list. The reason for the ampersand is that it asks the C compiler to ensure that
the variables are passed to the function in such a manner that the scanf() function can
actually write values to the variables. It does this by passing the address of the variable,
rather than the value of the variable. This allows the function being called (scanf) to
write the value to the correct location in the computer’s memory. We will look at this in
much more detail in Section 3.31.
3.16 Exercise 6 - Requesting values from the terminal
(i) Type the program which uses function scanf() into the file prog6.c and test that it
works.
3.17 Subscripted variables (arrays)
So far the variables we have been dealing with have held a single value (a scalar). For
some problems you might want to collect together a number of related numeric values,
for example a list of measurements of a length, or a temperature, or similar.
The naive approach is to declare N distinct variables, each with a different name to hold
each of the distinct values you are dealing with, for example:
float value1, value2, value3, value4;
value1=10.0;
value2=12.0;
value3=15.0;
value4=17.0;
and then to explicitly work with these variables. Clearly this is very cumbersome and
inflexible. The program will have to be altered if the number of values changes, and it is very
difficult to perform mathematical operations on these values (for example subtracting a
constant from all the values, or similar). Imagine how long a program would be if it had
to deal with hundreds, thousands or millions of distinct values! Fortunately there is a
simple solution offered by almost all programming languages, and C is no exception. It
is possible to define a single variable that represents a collection (or array) of values.
In C an array variable is declared as follows:
float value[100];
This code fragment declares an array of 100 distinct single-precision real values, and
associates them all with the variable name value.
The square bracket notation is also used to access the individual elements of the array,
for example value[0] accesses the first element of the array, value[1] the second, and
value[99] the last. Array indices are always integers. Note that the first element of
the array is numbered zero, and therefore that value[100] is not a valid element of this
array. This is a very common source of confusion to programmers new to C. Accessing
nonexistent elements of an array can cause all sorts of problems. At best you will get the
wrong answer, at worst your program will crash with a segmentation fault or bus error.
The advantages of arrays are many-fold. For a start it is possible to use another variable
as an array index (i.e. within the square brackets), which means that operations can easily
be carried out on all values in an array simply by using a loop:
int i;
float value[100];
...
...
for ( i = 0; i < 100; i = i + 1 )
value[i] = value[i] + 10.0;
This example will add 10.0 to the value of every element of the array; much easier to
write and understand than had the values all been held in distinct variables. We can
also perform some very important operations on arrays that don’t really make sense for
a scalar variable (for example sorting the values into some order).
The following program will read ten numbers into an array, then calculate the mean and
standard deviation of the values:
/* prog7 - defining and using arrays */
#include <stdio.h>
#include <stdlib.h>
#include <math.h>
int main()
{
    int i;
    float value[10], sum, mean, stddev;
    /* Read ten values from the keyboard into the array */
    for ( i = 0; i < 10; i = i + 1 ) {
        printf("Enter value %d: ", i);
        scanf("%f", &value[i]);
    }
    /* Calculate the mean */
    sum = 0.0;
    for ( i = 0; i < 10; i = i + 1 )
        sum = sum + value[i];
    mean = sum / 10.0;
    printf("Mean value is %f\n", mean);
    /* Calculate the standard deviation: the square root of the mean squared deviation */
    sum = 0.0;
    for ( i = 0; i < 10; i = i + 1 )
        sum = sum + powf(value[i] - mean, 2.0);
    stddev = sqrt(sum / 10.0);
    printf("Standard deviation is %f\n", stddev);
}
Note that the powf() function calculates the first argument raised to the power of the
second, so powf(2.0, 3.0) would return 8.0 (which is 2^3). The sqrt() function calculates
the square-root of the argument.
3.18 Exercise 7 - Using arrays
(i) Type the above program into the file prog7.c and test that it works.
3.19 Return values from functions
You have seen how, when calling a function, you can specify a number of arguments
(values which are to be used by that function to perform the operation you are requesting
of it). Once these operations have been performed, the function can return a value back
to the calling program. This value, called the return value, can be assigned to a variable,
or used in an expression. In the example above we are using two functions, sqrt() and
powf() in this way, embedding them directly into numeric expressions. Hopefully it is
obvious why powf(), sqrt() or other mathematical functions should return the value
they calculate (they would be pretty useless otherwise!), however you should note that a
large fraction of non-mathematical functions also return a value. Often this return value
is used to communicate back the success (or otherwise) of the execution of the function.
Even printf() returns a value, which indicates the number of characters actually printed,
as does the main() function, which is the main body of the program itself. In C it is
legal to call a function and ignore its return value; this is what we have been doing when
calling printf() in the previous example programs. If you don’t assign the value to a
variable, or use it in an expression, the return value is simply discarded.
3.20 Input/output using files
Reading data from the keyboard is very useful, however if you have a large amount of
data, or need to run a program on the same data time and again, it is not very sensible to
repeatedly type the numbers into the program by hand. The solution is to type numbers
into a separate file, then read them from that file as if they had been from the keyboard.
The following program will read numbers from a file and calculate their mean and standard
deviation:
/* prog8 - reading from a text file */
#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#define MAXVALS 100
int main()
{
    float value[MAXVALS], sum, mean, stddev;
    int i, count;
    FILE *fh;
    /* Open the data file */
    fh = fopen("file.dat", "r");
    if ( fh == NULL ) {
        printf("Cannot open data file\n");
        exit(EXIT_FAILURE);
    }
    /* Initialise counter */
    count = 0;
    /* Read all the values in the file */
    while ( ! feof(fh) && count < MAXVALS ) {
        if ( fscanf(fh, "%f", &value[count]) > 0 ) count = count + 1;
    }
    /* Close the file */
    fclose(fh);
    /* Check the file contains data */
    if ( count == 0 ) {
        printf("Data file contains no values\n");
        exit(EXIT_FAILURE);
    } else {
        printf("Read %d values from file\n", count);
    }
    /* Calculate the mean */
    sum = 0.0;
    for ( i = 0; i < count; i = i + 1 )
        sum = sum + value[i];
    mean = sum / count;
    printf("Mean value is %f\n", mean);
    /* Calculate the standard deviation: the square root of the mean squared deviation */
    sum = 0.0;
    for ( i = 0; i < count; i = i + 1 )
        sum = sum + powf(value[i] - mean, 2.0);
    stddev = sqrt(sum / count);
    printf("Standard deviation is %f\n", stddev);
}
This program introduces a number of new and important concepts dealing with files,
return values from functions, conditional execution, error checking, indefinite loops and
the use of the C pre-processor to represent constant values.
To use a file we need to perform three basic operations in sequence: to open the file (for
reading and/or writing), to read or write our data, and then to close the file when we
are finished with it. Every computer operating system allows files to be given filenames,
which are used by the user to identify and distinguish the files on disc, however different
operating systems use different conventions to construct the filename and represent its
location within a directory structure. The fopen() function is used to open a file and
associate a filehandle with it. In the example program above it is used as follows:
FILE *fh;
fh = fopen("file.dat", "r");
The first argument to fopen() is the name of the file, in this case enclosed in double
quotes as it is a string literal. The second argument is the access mode, in other words
whether the file is to be opened read-only ("r"), write-only ("w"), or for both reading
and writing ("r+" to update an existing file, "w+" to start a new one). Note that if you
open an existing file with "w" or "w+" you will overwrite the existing contents. The value
returned by the fopen function is a filehandle, which can
then be passed to other input/output functions to instruct them which file to access. A
filehandle is a variable of a special type FILE*. Reading numbers from a file is achieved
by the function fscanf(), which is very similar to the function scanf() used to read
values from the keyboard. The only difference is that it has an additional argument, the
filehandle representing the file from which the values should be read, for example:
fscanf(fh, "%f", &value[count])
The filehandle argument slots in before the format specifier; other than that everything is
the same as scanf(). Another point to notice is that the program actually makes use of
the value returned by the function fscanf(). The C Standard dictates that the fscanf()
function should return the number of values actually read from the input file. The format
specifier we are using is only asking for a single real value to be read from the file, so our
call to fscanf() will return the value 1, up until the point where we reach the end of the
file and there are no more values to be read. Once this happens fscanf() will return the
special value EOF, which is a negative number. (It returns zero if the next item in the file
cannot be read as a number.) Either way the test > 0 fails and count is not incremented.
The feof() function simply checks whether we have reached the end-of-file, and returns
TRUE if we have, or FALSE otherwise. Once the contents of the file have been read, the
file can be closed using:
fclose(fh);
The function fprintf() is similar to printf() except there is an extra argument which
specifies a filehandle of the destination file. So if you open a file with write access (see
above) you can output lines of text to that file (instead of the terminal) using:
FILE *fh;
fh = fopen("results.dat","w");
fprintf(fh,"Mean value is %f\n", mean);
fprintf(fh,"Standard deviation is %f\n", stddev);
fclose(fh);
3.21 Exercise 8 - Reading and writing text files
(i) Type prog8, the program that reads data from a file above, into the source file prog8.c
and create a data file file.dat which contains a list of numbers. Compile and run the
program to check that it opens the file, reads the numbers from the file and finally
calculates the results and prints them out.
(ii) What happens if file.dat contains more than one number per line (separated by
spaces)?
(iii) Add lines so that the results are also written to a new file called results.dat. Check
that the output file is created and contains the correct results.
3.22 Conditional execution (the if-statement) and error checking
In prog8.c we are using the function fopen() to try to open a file on disc and associate
a filehandle with it. In the case where fopen() is unable to open the named file (because
the file does not exist, or the access permissions do not allow the file to be read for
example) then fopen() will return the special value NULL. Clearly if we cannot open the
file then we cannot sensibly read any data, and we should report an error and exit from
the program. It is good programming practice to check for such errors, and can save a
lot of time for the user by giving a helpful error message. The program uses the following
code to check for such an error:
if ( fh == NULL ) {
printf("Cannot open data file\n");
exit(EXIT_FAILURE);
}
The if-statement will only execute the compound statement that follows it if the logical
expression is TRUE. In this case, if fh is equal to NULL, an error message is printed and
the program exits. A further clause (else) can be added to define what will happen if the
logical expression is not TRUE:
if ( fh == NULL ) {
printf("Cannot open data file\n");
exit(EXIT_FAILURE);
} else {
printf("The file was opened successfully\n");
}
3.23 Indefinite loops
In prog4 we introduced the concept of a definite loop, which is a loop for which the
trip count (the number of times the loop executes) was known, or could be calculated
by looking at the program, before the loop started. In prog8 we use an indefinite loop,
where the trip count is unknown prior to the commencement of the loop. Indefinite loops
are very useful; in this example we are reading numbers from a data file, and we don’t
necessarily know how many numbers will be in the data file when we start reading it. The
while loop in this case is
while ( ! feof(fh) && count < MAXVALS )
in other words while we have not reached the end-of-file, and while we have not filled up
the work array value. Using this simple technique the program will run successfully on
any data file containing up to MAXVALS values.
3.24 Character strings
So far you have learned how to declare and use numeric variables, i.e. variables whose
values are numbers. There are many occasions when the values you will want to represent
are not numeric. Say for example you were writing a program to manipulate a list of
names. Obviously it is difficult to assign the value “John Smith” to an integer or to a
real number! Fortunately computer languages, C included, allow us to declare and use
variables that represent characters and strings of characters (often called character strings,
or string variables). In computer terminology a character is a single letter, number, or
punctuation mark. For example A, D, h, 2, * and ) are all characters. A character string
is simply one or more characters strung together in sequence. The C language uses the
data-type char to represent characters. A scalar variable declared to be of type char holds
a single character. For example:
char cval;
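cval = 'A';
Note that a single character is written in single quotes ('A'); double quotes ("A") denote
a character string.
In the C language a character string is represented as an array of characters.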
char name[20];
The standard library function fgets() is used to read a single line from a file as a character
string. The function accepts three arguments: the string variable that will receive the
result, the maximum number of characters to store (i.e. the length of the string variable),
and the handle to the open file from which the string will be read. It reads up to and
including the next newline character, but stops early if the line is too long: it never stores
more than one fewer than the specified maximum number of characters, because the last
place is needed for the null character that marks the end of every C string. It’s important
not to specify a maximum larger than the string can hold, otherwise your program will
overwrite other variables and will probably crash. The function fgets() actually returns a
value (the location of the string variable, or NULL if nothing could be read), but we don’t
need to use this here so we don’t assign this returned value to a variable,
we just discard it.
If you want to print a string variable to the screen, you can do this using printf(), just
like for numeric variables, but you should use the format specifier %s in place of the %d or
%f which would be used for numeric variables.
char name[20];
fgets(name,20,stdin);
printf("The name string is: %s",name);
3.25
Structured data types
The data in the table below are taken from the paper “The Velocity-Distance Relation
Among Extra-Galactic Nebulae” written by Edwin Hubble and Milton L. Humason in
1931 and published in the Astrophysical Journal. The data show the recessional velocity
(measured redshift), rvel km/s, and photographic magnitude (proportional to log10 of the
flux), pmag, of a number of clusters of nebulae. The second column, nneb, is the number
of nebulae identified in the cluster.
cluster      nneb    rvel    pmag
Virgo           7     890    12.5
Pegasus         5    3810    15.5
Pisces          4    4630    15.4
Cancer          2    4820    16.0
Perseus         4    5230    16.4
Coma            3    7500    17.0
Ursa_Major      1   11800    18.0
Leo             1   19600    19.0
Iso_I          16    2350    13.8
Iso_II         21     630    11.6
Typically such data will be available as a computer text file and to analyse the data
we need to read them into arrays in a C program. This is easily done using a modified
version of prog8 above. We can read each column into an array of the appropriate type
and length. However it’s useful to keep the properties of each cluster together in a single
variable so you don’t forget what relates to what and you can pass around the collection
of variables for each cluster during any analysis.
The C language allows you to define new structured data types, which act like a collection
of related variables. Using this feature you can define a new data type to hold all of the
information pertaining to a single cluster (or some other entity) in a single variable. The
following code does this:
#define MAXCLUNAMLEN 16 /* room for a 15 character name plus the end-of-string marker */
struct clusterinfo {
    char name[MAXCLUNAMLEN];
    int nneb;
    float rvel, pmag;
};
The struct keyword introduces the definition of the new data type, and is followed
by a nametag that identifies the structured type within your program, in this case
clusterinfo. This is followed by a block which declares each of the members of the
structured type, describing their name, data type and in the case of array members, their
size. Note that strings, integer and floating-point variables can be freely mixed within a
structured data type. Once this new type has been defined, you can declare variables of
the new type, and use them:
int main()
{
    /* Declare a new variable 'clist' of type 'struct clusterinfo' */
    struct clusterinfo clist;
    /* Access the members. A string cannot be assigned to an array with =,
       so copy it with strcpy(), which needs #include <string.h> at the
       top of the file */
    strcpy(clist.name, "Orion");
    clist.nneb = 10;
    clist.rvel = 1500.0;
    clist.pmag = 13.1;
    printf("name %s, nneb = %d, rvel = %f, pmag = %f\n",
           clist.name, clist.nneb, clist.rvel, clist.pmag);
}
The dot operator (‘.’) is used when accessing the members of a structured variable, so
clist.rvel refers to the rvel member of the structured variable clist. This is equivalent
to the $ operator in R.
Note that just like variables you cannot use a structured data type (ie. define variables
using it) before it has itself been defined. Definitions of structured data types are generally
placed near the start of any program (after the header files are #included, but before any
executable code).
You can define an array of a struct data type by adding a size specification in square
brackets. The elements of the array are then referenced in the normal way, using an
integer counter.
#define MAXCLU 100
struct clusterinfo clist[MAXCLU];
int ncu;
...
...
/* Access the members (strcpy() needs #include <string.h>) */
ncu=0;
strcpy(clist[ncu].name,"cluster1");
clist[ncu].nneb=4;
ncu=1;
strcpy(clist[ncu].name,"cluster2");
...
3.26
Reading in a table from a file
Below is prog9 which reads the Hubble cluster data from a file called hubble.dat into a
structured array.
/* prog9 - reading a tabulation file and performing linear regression */
#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#define MAXCLUNAMLEN 16 /* room for a 15 character name plus the end-of-string marker */
#define MAXCLU 100
#define MAXHEAD 60
struct clusterinfo {
    char name[MAXCLUNAMLEN];
    int nneb;
    float rvel,pmag;
};
int main()
{
    struct clusterinfo clist[MAXCLU];
    char header[MAXHEAD];
    int i, count;
    FILE *fh;
    float sumx=0.0, sumy=0.0, sumxx=0.0, sumxy=0.0;
    float del, gradient, intercept;
    /* Open the data file */
    fh = fopen("hubble.dat", "r");
    if ( fh == NULL ) {
        printf("Cannot open data file hubble.dat\n");
        exit(EXIT_FAILURE);
    }
    /* read header line from file */
    fgets(header,MAXHEAD,fh);
    /* Initialise counter */
    count = 0;
    /* Read all cluster info from the file. fscanf() returns the number of
       items converted, so stop at the first line that is not complete.
       %15s limits the name to MAXCLUNAMLEN-1 characters, and name needs
       no & because an array name already gives its location */
    while ( count < MAXCLU &&
            fscanf(fh, "%15s %d %f %f",
                   clist[count].name,&clist[count].nneb,&clist[count].rvel,
                   &clist[count].pmag) == 4 ) {
        count = count + 1;
    }
    /* Close the file */
    fclose(fh);
    /* Check the file contains data */
    if ( count == 0 ) {
        printf("Data file hubble.dat contains no data\n");
        exit(EXIT_FAILURE);
    } else {
        printf("Read info for %d clusters from file hubble.dat\n", count);
    }
    /* List data and perform linear regression on log(rvel) vs. pmag */
    printf("%s",header);
    for (i = 0; i < count; i = i + 1) {
        printf("%s %d %f %f \n",clist[i].name,clist[i].nneb,
               clist[i].rvel,clist[i].pmag);
        sumy = sumy + log10(clist[i].rvel);
        sumx = sumx + clist[i].pmag;
        sumxx = sumxx + clist[i].pmag * clist[i].pmag;
        sumxy = sumxy + clist[i].pmag * log10(clist[i].rvel);
    }
    del=count*sumxx-sumx*sumx;
    gradient=(count*sumxy-sumx*sumy)/del;
    intercept=(sumxx*sumy-sumx*sumxy)/del;
    /* print results */
    printf("log10(rvel) = %f * pmag + %f\n", gradient,intercept);
}
The relationship between the log of the velocity and the mean magnitude is close to linear.
Measured data like these led Hubble to conclude that a galaxy’s recessional velocity is
proportional to its distance (by calibrating the photographic magnitude in terms of log10
of the distance to the source). This law, known as Hubble’s law, demonstrates that
galaxies are not only moving away from Earth, but from each other too, providing strong
evidence that the Universe is expanding.
3.27
Exercise 9 - Reading a data table and performing linear regression
(i) Create a data file hubble.dat from the data table above.
(ii) Create a program prog9 which reads this data file and performs a least squares fit to
find the gradient and intercept for the relationship
log10 (rvel) = grad × pmag + interc
(iii) Using the code in prog9 write down the formulae used to find the gradient m and
intercept c in the linear relationship y = mx + c given a set of values (x_i, y_i).
3.28
User-defined functions
You have already made use of a few of the large number of mathematical functions in the
C standard library. For example: root_val = sqrt(val); will calculate the square-root
of the variable val and assign it to the variable root_val. A list of such functions is
given in Section 6.10. But what if you wish to calculate a mathematical function that
isn’t included in the standard C library? The C language allows you to create user-defined
functions, and to use them in the same way as the standard functions provided
by the standard C library. User-defined functions can be used to implement virtually any
mathematical function, or perform non-mathematical operations. User-defined functions
can be as long and complicated as necessary, and are usually used to help to structure
the code (to break the overall program into smaller, more understandable segments). In
large C programs user-defined functions will form the bulk of the code.
3.28.1
Defining new functions
Creating user-defined functions is quite straightforward to do, but you need to supply
the compiler with a few pieces of information about your new function. Firstly you need
to give the new function a name, in order to identify it. This name should be unique
within your program. No other functions or variables that you define should have the
same name, otherwise the compiler will not know which function you are referring to
when you use that name. For this reason it is not a good idea to use the same name as a
pre-existing function in the standard C library. Defining a new function called sin() for
example could cause problems, as this function already exists in the standard library (it
calculates the sine of an angle). Secondly you need to be able to supply the argument (or
arguments) that will be passed to the function. Each argument has a name and a type
associated with it.
As an example we are going to set up a “sawtooth” function. This is periodic with the
period split into three consecutive phases, t_rise, t_fall and t_cons. The period is therefore
t_p = t_rise + t_fall + t_cons. The period starts at a time t = t_zero where the function has a value
of 0.0. The function rises linearly to a value of 1.0 at t = t_zero + t_rise. It then falls linearly
back to 0.0 at t = t_zero + t_rise + t_fall and then remains constant at 0.0 for the rest of the
period (a time span of t_cons). The following code defines such a function.
/* function to return sawtooth amplitude for a given time */
#include <tgmath.h>
double sawtooth_amp(double trise, double tfall, double tcons,
                    double tzero, double t)
{
    double tperiod, tphase, trf, ft;
    tperiod=trise+tfall+tcons;
    tphase=t-tzero;
    tphase=fmod(tphase,tperiod);
    if(tphase<0.0) {
        tphase=tperiod+tphase;
    }
    trf=trise+tfall;
    if(tphase<trise) {
        ft=tphase/trise;
    } else if(tphase<trf) {
        ft=(trf-tphase)/tfall;
    } else {
        ft=0.0;
    }
    return ft;
}
All new function definitions are basically of the form:
result-type function-name(arg-type arg1, arg-type arg2, ...)
{
<some code to calculate the result>
return result;
}
The return statement defines the value that will be returned by the function.
3.28.2
Function prototypes
When you use variables in a C program you are used to obeying the rule that a variable
must be declared before it can be used. It is not legal to refer to a variable before it has
been declared. A similar rule must be obeyed when writing a new user-defined function:
you should not refer to the function before it has been declared and/or defined. A function
prototype declares a function without defining it. It looks just like the first line of a
function definition followed by a semicolon; the function body (the
code which is executed when the function is invoked) is missing. The prototype of our
sawtooth function would look like this:
double sawtooth_amp(double trise, double tfall, double tcons,
double tzero, double t);
A function prototype serves two purposes: firstly it declares the function (so when the
compiler encounters the named function in an expression it can distinguish it from a
variable), and secondly it provides the compiler with information about the parameters
that the function accepts, and the type of the result it returns. This latter point is very
important, because it allows the compiler to check that the parameters you are passing
to a function when you call it are of the correct type (and perform a limited range of
conversions if required). Proper use of function prototypes helps the compiler to ensure
that you haven’t made any typographical errors when writing your program.
The prototype of a user-defined function should be placed in a program before any code
that refers to the function. The best place to put function prototypes in your program is
very close to the top of the file, after any #include statements you are using, for example:
#include <stdio.h>
#include <math.h>
/* Function prototypes */
double sawtooth_amp(double trise, double tfall, double tcons,
                    double tzero, double t);
/* ... define your new functions here ... */
int main()
{
    /* ... main body of the program ... */
    /* ... use the function here ... */
}
3.29
Prototypes of standard library functions
You may have been wondering about the precise purpose of the #include <stdio.h>
and #include <math.h> pre-processor directives which you have been using without
explanation to this point. The major purpose of these directives is to include into your
program the prototypes of the standard library functions that you wish to use. For
example the standard header file stdio.h contains prototypes for input/output-related
functions (eg. printf, scanf, fprintf, fscanf, etc.), and math.h contains prototypes
for mathematical functions (sin, cos, exp, sqrt, etc.). In all there are fourteen so-called
Standard headers that declare prototypes for functions in the Standard library. See
Sections 3.7 and 6.6 for more information about Standard headers.
3.30
Exercise 10 - Tabulation of a user-defined function
(i) Type the function source code for sawtooth_amp() into a file sawtooth.c and compile
it using the command:
$ cc -c sawtooth.c
The -c switch tells the compiler to simply compile the source code and create an object
file called sawtooth.o. If the compilation is successful then this file should have been
created in the current working directory. Check that this file exists using:
$ ls sawtooth*
The * is a wildcard used to specify any file which starts with sawtooth. Note that this
compilation has not created an executable because no main() function was present in the
source file.
(ii) Below is a short program which uses the function sawtooth_amp() and creates a
tabulation on the file sawtooth.dat.
/* prog10 - tabulation of sawtooth function */
#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#define MAXT 1000
double sawtooth_amp(double tr, double tf, double tc, double tz, double t);
int main()
{
    double tmin=-14.,tmax=14.,tsam;
    double trise=1.0,tfall=5.0,tcons=4.0,tzero=3.0;
    int nt=MAXT,i;
    double t[MAXT],ft[MAXT];
    FILE *fh;
    /* Set up time samples and calculate function */
    tsam=(tmax-tmin)/(nt-1);
    for(i=0; i<nt; i=i+1) {
        t[i]=tmin+tsam*i;
        ft[i]=sawtooth_amp(trise,tfall,tcons,tzero,t[i]);
    }
    /* Produce a tabulation of function on file */
    fh = fopen("sawtooth.dat","w");
    fprintf(fh,"i t amp\n");
    for(i=0; i<nt; i=i+1) {
        fprintf(fh,"%d %f %f\n",i,t[i],ft[i]);
    }
    fclose(fh);
}
Note that this code contains a prototype for the sawtooth_amp() function but doesn’t
include the code for that function (which is in a separate file sawtooth.c). Type the
program code into a file prog10.c and compile it using the command:
$ cc prog10.c sawtooth.o -lm -o prog10
Note that we now include the object code file sawtooth.o in the list of input files to
the compiler. The compiler will pick up this file to get the compiled (object) code for
our user defined function. Run the prog10 executable to generate the tabulation file
sawtooth.dat and check that this file contains the tabulation expected.
3.31
Pointers
Pointers are an important and widely used part of the C language. They are an extremely
powerful facility, and one of the prime reasons that C is such a versatile and widely-used
language. You have already met and understood the concept of a variable. A variable can
be thought of as a container that holds a value (the variables you have met so far have
all held the values of numbers). A pointer (or pointer variable) is different from a normal
variable; rather than containing a value it points to a location that contains a value. Just
like normal variables you must declare pointers before you use them. A statement that
declares a pointer variable looks very similar to one declaring a normal variable:
int var; /* Declare a variable */
int *ptr; /* Declare a pointer */
This example declares a variable var and a pointer ptr. Notice the presence of the * in
the declaration of ptr; it is this that identifies ptr as a pointer variable rather than a
normal variable.
The value of a pointer variable is undefined unless you actually assign something to it.
You can make the pointer ptr point to the variable var like this:
ptr = &var;
The ampersand (&) operator returns the location of the variable it is applied to, so &var is
the location of the variable var. This is the syntax which we introduced in the argument
list of scanf() above. You can assign this location to the pointer variable in the usual
way, using the assignment operator (=).
An important point to note here is that ptr is NOT a variable of type int (integer).
It is a variable of type “pointer-to-int”. You cannot assign integer values to ptr, so for
example the following two statements are NOT legal:
ptr = 10;
ptr = var;
3.31.1
Assigning locations to pointer variables
The value of a pointer variable can be assigned to another pointer variable of the same
type:
int val;
int *ptr1, *ptr2;
ptr1 = &val; /* ptr1 points to val */
ptr2 = ptr1; /* ptr2 now points to val */
In this example ptr1 is initialised to point at the variable val, then the value of ptr1 is
assigned to ptr2. At this point both ptr1 and ptr2 will point at the variable val.
Pointers are not restricted to pointing at integer variables, they can point at variables of
any type:
float fval, *fptr;
double dval, *dptr;
int ival, *iptr;
However you must be careful; when assigning to pointers they must point to variables of
the correct type. Using the definitions above:
iptr = &ival; /* OK */
fptr = &fval; /* OK */
dptr = &dval; /* OK */
iptr = &fval; /* Not legal, iptr is not pointer-to-float */
dptr = &fval; /* Not legal, dptr is not pointer-to-float */
dptr = iptr;  /* Not legal, pointers are not the same type */
fptr = dptr;  /* Not legal, pointers are not the same type */
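If you try to make an illegal pointer assignment the compiler will report it. Depending on
the compiler and its settings this is either an error, in which case your code won’t
compile, or a warning, which you should treat as an error and fix.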
3.31.2
De-referencing pointers
Once a pointer variable has been assigned a location of a variable (in other words once it
points at a variable) you can read from and write to that variable by using the pointer.
To do this you must place an asterisk (*) in front of the name of the pointer, for example:
int val, other;
int *ptr;
ptr = &val;
*ptr = 10;    /* Assign value to variable val */
other = *ptr; /* Access value of variable val through the pointer */
printf("%d\n", other); /* Will print 10 */
When referring to a pointer in this way you are accessing (reading from and writing to)
the location that the pointer points to, not the pointer variable itself. This is called
de-referencing the pointer.
3.31.3
Using pointers
Pointers have a number of uses, but primarily they are used when your program needs to
know the location of a variable, rather than just its value.
You have already seen how you can write a user-defined function to compute and return
a single value, sawtooth_amp() defined in the source file sawtooth.c. In such a function
the return statement can only return the value of one variable. When calling a function
in ‘C’ you can use variables to supply any input parameters necessary (the arguments),
however when you do this you are passing the value of the variable, not the variable itself.
The values of variables passed in this way can be used within the function but new values
or variables calculated by the function cannot be passed back to the calling routine. This
behaviour is called pass-by-value.
The way to get around this is to pass pointers to the variables as parameters (arguments)
to the function, rather than the value of the variables. Using these pointers the function
can be instructed to write values into the variables pointed to by the pointers. So in
addition to providing the returned value the function can also change the values of
variables within the calling routine.
The “sawtooth” function defined above is periodic so as well as having an amplitude at a
given time there is also a phase angle defined as a function of time. Conventionally this
phase angle has units of radians and takes values in the range 0 to 2π. To illustrate the
use of pointers to return values from a function the function below calculates both the
amplitude and phase of our sawtooth.
#include <tgmath.h>
#define PI 3.1415926535898
/* function to return both amplitude and phase for a given time */
void sawtooth_amp_pha(double trise, double tfall, double tcons,
                      double tzero, double t, double *amp, double *pha)
{
    double tperiod, tphase, trf;
    tperiod=trise+tfall+tcons;
    tphase=t-tzero;
    tphase=fmod(tphase,tperiod);
    if(tphase<0.0) {
        tphase=tperiod+tphase;
    }
    trf=trise+tfall;
    if(tphase<trise) {
        *amp=tphase/trise;
    } else if(tphase<trf) {
        *amp=(trf-tphase)/tfall;
    } else {
        *amp=0.0;
    }
    *pha=(tphase/tperiod)*PI*2.0;
}
The return type for function sawtooth_amp_pha() is declared void because now we
are not using the return value. Instead we have declared two arguments as pointers
to double, double *amp and double *pha. Within the body of the function the values that
these pointers point to are assigned, *amp=... and *pha=....
The following program uses the sawtooth_amp_pha() function to produce a tabulation
of both the amplitude and phase of the sawtooth.
/* prog11 - tabulation of sawtooth function amplitude and phase */
#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#define MAXT 1000
void sawtooth_amp_pha(double tr, double tf, double tc, double tz,
                      double t, double *amp, double *pha);
int main()
{
    double tmin=-14.,tmax=14.,tsam;
    double trise=1.0,tfall=5.0,tcons=4.0,tzero=3.0;
    int nt=MAXT,i;
    double t[MAXT],amp[MAXT],pha[MAXT];
    FILE *fh;
    /* Set up time samples and calculate function */
    tsam=(tmax-tmin)/(nt-1);
    for(i=0; i<nt; i=i+1) {
        t[i]=tmin+tsam*i;
        sawtooth_amp_pha(trise,tfall,tcons,tzero,
                         t[i],&amp[i],&pha[i]);
    }
    /* Produce a tabulation of function on file */
    fh = fopen("sawtooth.dat","w");
    fprintf(fh,"i t amp pha\n");
    for(i=0; i<nt; i=i+1) {
        fprintf(fh,"%d %f %f %f\n",i,t[i],amp[i],pha[i]);
    }
    fclose(fh);
}
In prog11 you can see that by passing pointers to variables, using &amp[i] and &pha[i]
in the call to sawtooth_amp_pha(), rather than their values, it is possible to write user-defined
functions that calculate and return more than just a single value. The function calculates
the results and writes them directly into the variables that our pointers point to. This is
called pass-by-reference; you are passing a reference to a variable rather than its value.
Because the function is using pointers to write directly to these variables, you no longer
need the function to supply a return value, so the return type is void and there
is no return statement.
You should recall that the library function scanf() to read input from the keyboard uses
the same mechanism. The arguments to this function are passed as pointers so that the
function can assign the values it reads from the keyboard to the variables we want to
contain them.
3.32
Exercise 11 - Using pointers to return values from a function
(i) Edit the source file sawtooth.c so that it includes the definition of the function
sawtooth_amp_pha(). Re-compile this source file to produce an object file sawtooth.o
which contains both sawtooth_amp() and sawtooth_amp_pha().
(ii) Type the program prog11 into prog11.c and compile and run the program.
Remember to include the sawtooth.o file in the compiler input list so that it picks up
the sawtooth functions. Run the program to check that it produces a tabulation which
contains both the amplitude and phase as a function of time.
3.33
Passing arrays to functions
An array in C is a variable with multiple values. These multiple values are arranged in
successive locations in the computer’s memory. Specifying the location of the first element
of the array, the number of elements in the array and the data-type of each element is
sufficient to fully describe the array. This is the way that arrays are passed to functions
in C. A side effect of this is that all arrays in C are passed to functions by reference, not
by value, and therefore a function that accepts an array as an argument can write to the
array, to alter the values in the elements. In prog11 we used a call which illustrates this.
sawtooth_amp_pha(trise,tfall,tcons,tzero,t[i],&amp[i],&pha[i]);
The functions sawtooth_amp() and sawtooth_amp_pha() calculate the value of the
function at a single value of time t. We can improve on this by setting up a function
which calculates the amplitude and phase for an array of time values. The following
function does just that.
/* function to return amplitude and phase for a series of times */
void sawtooth(double *tr, double *tf, double *tc, double *tz,
              int *nt, double *t, double *amp, double *pha)
{
    int i;
    for(i = 0; i < *nt; i=i+1) {
        sawtooth_amp_pha(*tr,*tf,*tc,*tz,t[i],&amp[i],&pha[i]);
    }
}
You will notice that we have declared all the arguments to the new function sawtooth()
as pointers. This is strictly unnecessary for the first five arguments, which are single
values, but as you will see it is required if we want to call the function from R as we do
in Section 4.
3.34
Exercise 12 - Testing the final sawtooth function
(i) Type the function sawtooth() into the source file sawtooth.c and compile
it. The object file sawtooth.o will now contain three functions, sawtooth_amp(),
sawtooth_amp_pha() and sawtooth().
(ii) Write a program prog12 to use sawtooth() to generate the tabulation file
sawtooth.dat. The call to the function should now occur outside the for loop and look
something like:
sawtooth(&trise,&tfall,&tcons,&tzero,&nt,t,amp,pha);
The variables t, amp and pha are declared as arrays so the variable names on their own
(without an index specification) act as pointers. It is important that the value of nt passed
to the function is less than or equal to the declared size of the arrays. If the arrays have fewer
than nt elements then the sawtooth function will attempt to use array indices which are
out of range and start overwriting other memory.
4
Programming with C and R together
The First Year Computing Workshop provided an introduction to using the S language
within the R environment. All the basic elements of procedural programming were covered
albeit in a simple form. This Second Year Workshop has so far given you an introduction
to C using the same basic elements of procedural programming. If you look through the
exercises in both workshops you will see they follow a similar pattern. What can be done
using R can be done using C. Of course there is an obvious difference. R operates at a
much higher level in the sense that the R environment provides facilities like plotting and
complicated statistical analysis through simple commands. C acts at a much lower level
concerned with the details of data types, pointers and values etc.
In this last part of the Second Year Workshop we show you how to use C and R together
to capitalise on the advantages of both.
4.1
Exercise 13 - Using R to repeat Exercise 9
(i) Write an R script, prog13.R, to perform the same linear regression on the data in the
hubble.dat file as you did in Exercise 9. You can modify the script from Exercise 6 in
the First Year Workshop to do this.
(ii) Check that the answer you get is the same as that from prog9. If there is a small
difference can you suggest why that might be?
(iii) Include R code in prog13.R to plot the data points and plot the best fit linear
regression line.
Comparing Exercise 9 using C and Exercise 13 using R it should be obvious that
performing linear modelling (or linear least squares fitting) using R is significantly easier
than using C. This is because in R you are making use of a large body of software which
has been incorporated into the R environment. In addition, if you use the lm() command
in R it performs a lot more statistical analysis over and above simple linear regression. For
example you can calculate and list the estimated standard errors on the fitted parameters
(gradient and intercept) using the object created.
(iv) Use the lm() command to do the linear regression in prog13 and then list the
coefficients and standard errors as follows:
lmfit<-lm(y~x)
cat("fit parameters: ",lmfit$coefficients,"\n")
cat("standard errors: ",sqrt(diag(vcov(lmfit))),"\n")
The function vcov() produces the covariance matrix. Then diag() picks out the diagonal
of this matrix and finally we estimate the standard errors by taking the sqrt() of the
variance values along the diagonal.
4.2
Exercise 14 - Plotting the sawtooth function
(i) Write an R script prog14.R which plots the sawtooth function using the tabulation
produced in Exercise 12. Plot two panels, one for the amplitude vs. time and the other
for the phase vs. time. Use this script to check that your sawtooth functions are working
correctly. Are the parameters trise, tfall, tcons and tzero doing the right things?
Try changing these and re-running prog12 to check.
This illustrates how you can perform some calculation or analysis using a C program,
write the results to a tabulation file (in this case sawtooth.dat), read the file as a table
into R and use R to plot out the results.
Of course you could pass data in the opposite direction by performing some analysis using
R, writing the results to a tabulation file using write.table() and then use a C program
to read the tabulation and perform further analysis.
4.3
Calling a C function from R
Routines (functions) can be written in C (or Fortran 77 or C++) and linked to R at run-time
so that they can be called from within R. Such routines are compiled and then linked
into a shareable object library (also called a dynamic-link library, DLL, in Windows
speak) and this library is then loaded into R when R is running. Two R functions,
.C() and .Fortran(), are used to interface with the external routines. There is a fixed
mapping between vectors in R and variables/arrays in C or Fortran which must be used.
R storage mode   C type            FORTRAN type
logical          int *             INTEGER
integer          int *             INTEGER
double           double *          DOUBLE PRECISION
complex          Rcomplex *        DOUBLE COMPLEX
character        char **           CHARACTER*255
raw              unsigned char *   none
All the C type specifications are pointers so all the arguments (parameters) of the C
function are passed by reference not value. The C function should not return any value
except through its arguments so the function should be defined as type void. This may
seem somewhat restricting but in practice it isn’t. If you have a C function you want
to call but it doesn’t conform to the above restrictions you can always write a simple
wrapper function that does conform. A function which does conform is the sawtooth()
function we used in Exercise 12. It acts as a wrapper for the sawtooth_amp_pha()
function, which doesn’t conform.
You can use R to compile and link this function into a shareable library using the shell
command:
$ R CMD SHLIB sawtooth.c
This will create a file sawtooth.so which is the shareable object. Actually you can perform
the same task by using the C compiler directly but using the R command above ensures
that the shareable object created is linked with all the required libraries.
Note: if you have previously compiled the sawtooth.c file the above command to create
the shareable object might fail because it doesn’t like the sawtooth.o object file you
created. If this happens you should delete the sawtooth.o file and then reissue the R
CMD command:
$ rm sawtooth.o
$ R CMD SHLIB sawtooth.c
This is because the code must be compiled with the -fPIC switch (PIC stands for
Position Independent Code) in order to create a shareable object library.
The following R script defines an R function that uses .C() to call the C sawtooth() function, loads the
shareable object and then calls the R function.
# Define the sawtooth function using the C interface
sawtooth <- function(tr, tf, tc, tz, t) {
a<-.C("sawtooth", as.double(tr), as.double(tf),
as.double(tc), as.double(tz), as.integer(length(t)), as.double(t),
amp = double(length(t)), pha = double(length(t)))
a$t=t
return(a)
}
# Load the shareable (dynamically loadable) library
dyn.load("sawtooth.so")
# Set up vectors to sample function
t<-seq(length=1000,from=-14,to=14)
# Call the sawtooth function and plot the results
ss<-sawtooth(1.0,4.0,5.0,3.0,t)
par(mfrow=c(1,2))
plot(ss$t,ss$amp,type="l")
plot(ss$t,ss$pha,type="l")
The arguments of .C() are straightforward. The first argument is a character string
which is the name of the C function, in this case sawtooth. The rest of the arguments
correspond directly to the arguments of our C function. Arguments like as.double(tr)
specify that the R vector tr is to be passed as a numeric vector (double precision). Note
that tr is given in the argument list of the R function we are defining. Arguments like
as.integer(length(t)) specify that the length of the R vector t (also in the argument
list of the R function being defined) is to be passed to the C routine as a pointer to
an integer. Values are returned to R from the C function using arguments specified like
amp=double(length(t)). This creates an R vector assigned to the name amp with the
correct type (double) and length (the same size as the argument vector t). When the
.C() function executes it returns an R list with one component for each argument
passed to the C function. In the above this list is assigned to
the name a. It contains components called amp and pha which are the results
returned from the function. In addition the line a$t=t adds a component t to a,
holding the times at which the function was evaluated. This is not strictly necessary but it is
neat. The object then contains the original time sample vector and the amplitude
and phase vectors created by the function call. Finally the object a is returned as the
value of the function.
Before the newly defined R function sawtooth() can be called we must load the shareable
object. This is done by the line
dyn.load("sawtooth.so")
R looks for this file in the current working directory unless a path specification is included
in the file name.
In the script the returned object is assigned to the name ss and plot() picks up the
components using the usual $ operator, plot(ss$t,ss$amp,type="l").
4.4
Exercise 15 - calling the sawtooth C function from R
(i) Create the shareable object sawtooth.so using the R CMD SHLIB command.
(ii) Type the R script which defines and uses sawtooth into the file prog15.R and try
executing this script. You should get a plot similar to the one produced by prog14.R
above.
(iii) Check that the sawtooth function written in C and called in R is behaving properly.
Do this by changing the value of the arguments passed to the sawtooth() function in
prog15.R and examining the plots produced to see if the sawtooth shape is being generated
properly by your code.
4.5
Using C functions in R
The sawtooth function demonstrates how vector objects can be passed between R and
C. This example is a little more complicated than is often the case because we have
constructed our sawtooth to generate both an amplitude and a phase. In order to pass
these back we constructed an object called a within the definition of the R sawtooth()
function and assigned this as the object ss when we used the function. The object contains
three components, amp, pha and the original t. If we simply want to return a single vector
we can make things simpler.
# Define the sawtooth amplitude function using the C interface
sawtoothamp <- function(tr, tf, tc, tz, t) {
a<-.C("sawtooth", as.double(tr), as.double(tf),
as.double(tc), as.double(tz), as.integer(length(t)), as.double(t),
amp = double(length(t)), pha = double(length(t)))
return(a$amp)
}
# Load the shareable (dynamically loadable) library
dyn.load("sawtooth.so")
# Set up vectors to sample function
t<-seq(length=1000,from=-14,to=14)
# Call the sawtooth function and plot the results
s<-sawtoothamp(1.0,4.0,5.0,3.0,t)
plot(t,s,type="l")
We could simplify even more by using a C function which didn’t return the phase array.
In the above this is being calculated unnecessarily by the C sawtooth() function and
ignored.
4.6
Handling 2-D arrays in C and R
In many applications it is useful to use 2-D arrays to sample functions of 2 variables or
represent some form of image or 2-D distribution. In this section we consider the spectra of
thermal sources as examples of functions of 2 variables (photon energy and temperature)
which can be sampled using 2-D arrays.
The Planck function expressed as a function of frequency ν gives the spectral energy
density of Black-body radiation.
$$ U(\nu) = \frac{8\pi h}{c^3}\,\frac{\nu^3}{\exp(h\nu/kT) - 1} $$
where h is Planck’s constant, c is the velocity of light and k is Boltzmann’s constant.
This function has great significance in the history of Physics and still has prominence in
many areas of research. It is particularly important in the analysis of observations of the
Cosmic Microwave Background and modern cosmology. We can re-write the function in
terms of photons per unit energy interval by substituting E = hν.
$$ N(E, T) = A\,\frac{E^2}{\exp(E/kT) - 1} \qquad (1) $$
Now we are counting photons and not energy so the power in the numerator drops to 2
and we have gathered together the leading constants into a single normalisation constant
A (which will not concern us here). Equation 1 is the form of the Black-body spectrum
which is most useful to observers using photon counting spectral instruments.
The optical depth of a Black-body is infinite and the source is said to be optically
thick. Every photon we see has been scattered many times exchanging energy with the
electrons and ions such that the radiation is in thermal equilibrium with the matter
in the source. Black-body sources arise from dense matter. The spectrum of thermal
radiation seen from much more rarefied plasma is different. If the optical depth of
the source is zero (or close to zero) then the source is said to be optically thin. The
photons we see are then characteristic of the emission processes involved and have not
been thermalised by any scattering. The continuum thermal radiation seen from such a
source is bremsstrahlung radiation which is caused by the braking or deceleration of the
negatively charged electrons as they pass close to positively charged ions. The volume
emissivity of thermal bremsstrahlung radiation is given by
$$ J(\nu) = 6.8 \times 10^{-37}\, Z^2 N_e N_z T^{-1/2}\, G(\nu, T) \exp(-h\nu/kT) $$
Watts s$^{-1}$ m$^{-3}$ Hz$^{-1}$ where $Z$ is the nuclear charge of the ions, $N_e$ and $N_z$ are the number
densities of the electrons and ions and G(ν, T ) is the so-called Gaunt factor calculated
using quantum mechanics to describe the scattering process which produces the radiation.
Again, we can re-write this in terms of photons per unit energy interval
$$ N(E, T) = B\, G(E, T)\, T^{-1/2}\, \frac{\exp(-E/kT)}{E} \qquad (2) $$
Note that the factor E in the denominator on the right-hand side converts the spectral
energy to a number of photons. B is a normalisation constant which contains details of
the geometry and composition of the source and will not concern us here. The Gaunt
factor is a complicated function which can be approximated using polynomials. We won’t
be concerned with the physical details here so instead of specifying the Gaunt factor using
equations we will proceed by giving you a C function which calculates an approximation
to the Gaunt factor.
#include <math.h>
/*
Born approximation to the Gaunt factor
Based on the Kellogg, Baldwin & Koch (ApJ 199, 299) polynomial fits
to the numerical values in Karzas & Latter (ApJS 6, 167)
arguments: ekev photon energy keV, temp temperature keV
R. Willingale Nov 2013
*/
double gaunt(double ekev, double temp) {
double x,y,z,t2,y2,bo,aio,fac;
x=(ekev/temp)*0.5;
if(x<2.0) {
t2=x*x/3.75/3.75;
aio=(((((.0045813*t2+.0360768)*t2+.2659732)*t2+1.2067492)*t2+
3.0899424)*t2+3.5156229)*t2+1.0;
y=x*0.5;
z=log(y);
y2=y*y;
bo=-aio*z+(((((.0000074*y2+.0001075)*y2+.00262698)*y2+
.0348859)*y2+.23069756)*y2+.4227842)*y2-.57721566;
fac=exp(x)*bo;
} else {
y=2./x;
bo=(((((.00053208*y-.0025154)*y+.00587872)*y-.01062446)*y+
.02189568)*y-.07832358)*y+1.25331414;
fac=bo/sqrt(x);
}
return .551329*fac;
}
You will note that the photon energy is specified by the variable ekev and is expressed
in units of keV (kilo electron volts). You will also notice that Boltzmann’s constant is not
used. This is because the temperature variable temp is also expressed in units of keV. We
convert from temperature in Kelvin to eV by multiplying by Boltzmann’s constant and
dividing by the charge on the electron.
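The conversion described above can be sketched as a small C function. The name kelvin_to_kev() and the macro names are assumptions made for this illustration; the constants are the standard SI values. The result in eV is divided by 1000 to give keV.

```c
/* Physical constants (SI values) */
#define BOLTZMANN_J_PER_K 1.380649e-23     /* Boltzmann's constant, J/K */
#define ELECTRON_CHARGE_C 1.602176634e-19  /* charge on the electron, C */

/* Convert a temperature in Kelvin to the equivalent energy kT in keV */
double kelvin_to_kev(double tkelvin) {
    /* kT in Joules divided by the electron charge gives eV */
    double ev = tkelvin*BOLTZMANN_J_PER_K/ELECTRON_CHARGE_C;
    return ev/1000.0;
}
```

A temperature of about 1.16e7 Kelvin corresponds to 1 keV, which is a useful number to remember when checking results.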
We can sample the Gaunt factor over a 2-D grid (array) of points using the following C
function.
/* Generate a 2-D array of Gaunt factor values */
void gaunt_array(int *ne, double *ekev, int *nt, double *temp, double *garr) {
int i,j,ij;
for(i=0; i< *ne; i++) {
for(j=0; j< *nt; j++) {
ij=i+j*(*ne);
garr[ij]=gaunt(ekev[i],temp[j]);
}
}
}
The pointer int *ne specifies the number of photon energy grid points. The pointer
double *ekev specifies the array of photon energy values. Similarly the temperature
grid points are specified by pointers int *nt and double *temp. The total number of
grid points will be (*ne)*(*nt) and the grid of Gaunt factor values is returned in the array
with pointer double *garr. Therefore garr must be declared as a double precision
array of this total size (or larger) in the calling routine (either C or R). Within the
function gaunt_array the output array is indexed as a 1-D array using the integer variable
ij=i+j*(*ne). The energy index i varies fastest, so the ne values for the first temperature
are packed into garr first, followed by the ne values for the second temperature, and so on.
It is possible to define multidimensional arrays
with 2 or more indices in C but using the simple single index arithmetic (index ij) is more
convenient in the above application because this packing is the column-by-column order
that R uses to store a 2-D array. So the reference garr[ij] in C corresponds to the element
garr[i+1,j+1] of an ne by nt array in R (R indices start at 1), or to garr[j][i] for a C
array declared as double garr[nt][ne].
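The single index arithmetic can be tried out in isolation. The following is a small sketch using hypothetical helper functions grid_index() and fill_grid(), which are not part of the workshop code. Each element is set to 10*i+j so that the packing order is easy to inspect.

```c
/* Offset of element (i,j) in a 1-D array that holds an ne by nt grid,
   with the first index i varying fastest (the order R uses) */
int grid_index(int i, int j, int ne) {
    return i + j*ne;
}

/* Fill a grid with the value 10*i+j so that the packing order
   can be inspected; grid must have room for ne*nt values */
void fill_grid(int ne, int nt, double *grid) {
    int i, j;
    for (j = 0; j < nt; j++) {
        for (i = 0; i < ne; i++) {
            grid[grid_index(i, j, ne)] = 10.0*i + j;
        }
    }
}
```

With ne=3 and nt=2 the array is filled in the order (0,0), (1,0), (2,0), (0,1), (1,1), (2,1), so all the values for the first j come before any value for the second j.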
If we compile the C functions gaunt() and gaunt_array() into a shareable library
spec_fun.so we can call gaunt_array() from an R script.
# Define the Gaunt factor function
gaunt <- function(ekev,temp) {
ne<- length(ekev)
nt<- length(temp)
nn<- ne*nt
a<-.C("gaunt_array", as.integer(ne), as.double(ekev),
as.integer(nt), as.double(temp), f=double(length=nn))
dim(a$f)<- c(ne,nt)
return(a$f)
}
# Load the shareable (dynamically loadable) library
dyn.load("spec_fun.so")
# Set up vectors to sample function in log10(ekev) and log10(temp)
ne<- 50
nt<- 50
lekev<-seq(length=ne,from=-1,to=3)
ltemp<-seq(length=nt,from=1,to=2)
ekev<- 10^lekev
temp<- 10^ltemp
# Call the Gaunt factor function
g<-gaunt(ekev,temp)
# Plot grid in perspective
persp(lekev,ltemp,g,theta=30,phi=30,col = "lightblue",
xlab="log10(photon energy keV)",
ylab="log10(temperature keV)", zlab="Gaunt factor")
Within the definition of the R function gaunt() we find the size of the grid ne by nt using
the lengths of the ekev and temp array arguments. The output array f is specified with
length nn=ne*nt. Before we return the output array we set the dimensions using
dim(a$f)<- c(ne,nt)
so that the result is a 2-D array. In the above R script the grid uses logarithmic sampling
in both photon energy (ekev) and temperature (temp) so we cover a large dynamic range
0.1 < ekev < 1000 and 10 < temp < 100 both in units of keV.
4.7
Exercise 16 - Investigating thermal spectra
(i) Create a C source file spec_fun.c containing the C functions gaunt() and
gaunt_array() given above. Compile these functions using the R CMD SHLIB command
to create the shareable object spec_fun.so.
(ii) Create an R script file prog16.R which defines the R function gaunt(), loads the
shareable library, generates and plots a 2-D grid of values of the Gaunt factor.
(iii) Modify the R script prog16.R so that it also plots the Gaunt factor vs. photon energy
for a temperature of 1 keV and 100 keV on a single graph. You can pick out the required
Gaunt factor values from the 2-D array using indices as indicated in the following snippet
of R script.
plot(ekev,g[1:ne,nt:nt],type="l",log="x",
xlab="Photon Energy keV",ylab="Gaunt factor")
lines(ekev,g[1:ne,1:1],type="l")
(iv) Write 4 more functions in the C source file spec_fun.c to calculate the Black-body
photon spectrum and Bremsstrahlung photon spectrum. You should use the formulae in
Equations 1 and 2 replacing the energy term kT by the temperature T expressed in keV.
Do not include the normalisation constants A or B. The functions should have names
following the convention set by the Gaunt factor functions, bbody(), bbody_array(),
brems(), brems_array(). Re-compile to create a new shareable object file which contains
the new functions along with the original Gaunt factor functions.
(v) Write R scripts plot_bbody.R and plot_brems.R to plot 2-D sample grids for the
Black-body and Bremsstrahlung photon spectra. The dynamic range of the photon
spectra is large so in order to get a full picture it is better to plot the photon flux values
using a logarithmic scale. When using the persp() function for plotting a projection of
the grid this can be achieved as follows.
b<-bbody(ekev,temp)
z<- log10(b)
z[b==0]<- NA
z[z<(max(z,na.rm=TRUE)-4)]<- NA
persp(lekev,ltemp,z,theta=30,phi=30,col="lightblue")
(vi) Modify both plot_bbody.R and plot_brems.R so that they also plot the photon
spectra for a temperature of 1 keV and 100 keV on a single graph.
(vii) Modify plot_bbody.R so that it also calculates the mean photon energy in the Black-body spectrum as a function of temperature T. Hint - to do this will require integrations
over energy for each value of T . Is the result what you might expect from classical
statistical mechanics if you consider the spectrum to arise from a photon gas?
5
Programming unaided
So far we have provided you with examples of all the required code to help you complete
the exercises. For the final exercise below you will be programming largely unaided to
give you practice in writing your own code.
The gravitational potential of a sphere with uniform density, radius $R_0$ and mass $M_0$ is
$$ V(r) = \begin{cases} -\dfrac{GM_0}{r} & \text{if } r > R_0 \\[2ex] \dfrac{GM_0}{2R_0}\left(\dfrac{r^2}{R_0^2} - 3\right) & \text{otherwise} \end{cases} $$
where $r$ is the radius from the centre and $G$ is the gravitational constant. If the
centre of the sphere is at position $(x_0, y_0, z_0)$ then the radius at position $(x, y, z)$ is
$r = \sqrt{(x - x_0)^2 + (y - y_0)^2 + (z - z_0)^2}$. If you have 2 or more spheres of different
sizes and at different positions then the total gravitational potential will be the sum of
the individual components, $V_{tot} = \sum_n V_n$.
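As a quick check on the two branches of the potential (a worked step added for illustration), the interior and exterior expressions agree at the surface $r = R_0$, and the interior expression gives the value at the centre of a single sphere:

```latex
\begin{align*}
V(R_0)\big|_{\text{inside}} &= \frac{GM_0}{2R_0}\left(\frac{R_0^2}{R_0^2} - 3\right)
  = -\frac{GM_0}{R_0} = V(R_0)\big|_{\text{outside}}, \\
V(0) &= \frac{GM_0}{2R_0}\,(0 - 3) = -\frac{3GM_0}{2R_0}.
\end{align*}
```

These two values are handy for testing any code that evaluates the potential.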
5.1
Exercise 17 - The gravitational potential of the Earth-Moon
system
radius of Earth                         6371 km
mass of Earth                           5.972 × 10^24 kg
radius of Moon                          1737 km
mass of Moon                            7.348 × 10^22 kg
Earth-Moon distance centre-to-centre    384400 km
Gravitational constant                  6.674 × 10^-11 m^3 kg^-1 s^-2
(i) Write a C function to calculate the gravitational potential of a uniform sphere as a
function of some given arbitrary position (x, y, z) above or below the surface.
(ii) Write a second C function that uses the above and which can be called from R to
calculate the gravitational potential of a uniform sphere over a 3-D grid of points in
(x, y, z).
(iii) Write an R script that uses these C functions to calculate the gravitational potential
of the Earth (ignoring the influence of the Moon) over a plane which contains the centre
of the Earth and which extends out to several Earth radii. Plot the grid in projection.
(iv) Write an R script that calculates the gravitational potential of the full Earth-Moon
system along the line that joins the centres. Plot this potential as a function of position.
(v) Add code to the R script to find the gravitational potential of the Earth-Moon system
at the centre of the Earth and the centre of the Moon.
(vi) Finally add code to the R script to estimate the energy in Joules/kg required to
escape from a) the surface of the Earth and b) the surface of the Moon.
6
Quick C Reference
6.1
Variables, types and declarations
The fundamental data types in C are:
char           a single character (1 byte, usually of 8 bits)
short int      a short integer, usually 2 bytes
int            an integer, at least 2 bytes (usually 4 on current systems)
long int       a large integer, at least 4 bytes (usually 8 on 64-bit Linux)
float          single precision floating point, usually 4 bytes
double         double precision floating point, usually 8 bytes
long double    high precision floating point
The integer types including char can be qualified by unsigned or signed to determine
whether or not a sign bit is included.
unsigned int     range 0 to 65535 if int is 2 bytes (0 to 4294967295 if 4 bytes)
signed int       range -32768 to 32767 if int is 2 bytes (-2147483648 to 2147483647 if 4 bytes)
unsigned char    1 byte giving range 0 to 255
signed char      1 byte giving range -128 to 127
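Because the sizes of the integer types depend on the machine and compiler, it is worth checking them on the system you are using. The sketch below uses the sizeof operator and the standard header limits.h. The function names int_bits() and print_type_sizes() are invented for this illustration.

```c
#include <limits.h>
#include <stdio.h>

/* Number of bits used to store an int on this machine */
int int_bits(void) {
    return (int)(sizeof(int)*CHAR_BIT);
}

/* Print the size in bytes of each fundamental type and the
   range of int, taken from the standard header limits.h */
void print_type_sizes(void) {
    printf("char        %zu\n", sizeof(char));
    printf("short int   %zu\n", sizeof(short int));
    printf("int         %zu\n", sizeof(int));
    printf("long int    %zu\n", sizeof(long int));
    printf("float       %zu\n", sizeof(float));
    printf("double      %zu\n", sizeof(double));
    printf("long double %zu\n", sizeof(long double));
    printf("int range %d to %d\n", INT_MIN, INT_MAX);
    printf("unsigned int range 0 to %u\n", UINT_MAX);
}
```

Calling print_type_sizes() from your main program will list the values for your system. The numbers printed will differ between machines, which is why the sizes quoted here are described as usual rather than guaranteed.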
Derived types are created using the declaration operators:
*     pointer, a prefix operator
&     reference, a prefix operator (C++ only)
[]    array, a postfix operator
()    function, a postfix operator
6.2
Constants
Integer constants are written as:
9876          assumed type int
987654321L    type long
987654321l    type long
9876U         type unsigned int
9876u         type unsigned int
Integers can be expressed in octal or hexadecimal:
0123      leading zero indicates octal constant
0x134A    leading 0x (zero x) indicates hexadecimal
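A short sketch shows that these are simply different ways of writing the same integer values. The function names are invented for this illustration. The %o and %X format specifiers print a value back in octal and hexadecimal.

```c
#include <stdio.h>

/* The constants from the table above returned as ordinary ints */
int octal_value(void) { return 0123; }      /* octal 123 */
int hex_value(void)   { return 0x134A; }    /* hexadecimal 134A */

/* Print the constants in decimal, then print the decimal
   values back in octal and hexadecimal notation */
void print_constants(void) {
    printf("0123   = %d decimal\n", octal_value());
    printf("0x134A = %d decimal\n", hex_value());
    printf("83 = %o octal, 4938 = %X hexadecimal\n", 83u, 4938u);
}
```

Octal 123 is 1*64+2*8+3 = 83 and hexadecimal 134A is 1*4096+3*256+4*16+10 = 4938. Beware of writing a leading zero on a decimal number by accident, since 0123 is not the same value as 123.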
Floating point constants are written as:
4.321      type double
4.3e-5     type double
3.1f       type float
3.1e-1F    type float
3.1l       type long double
3.1e-1L    type long double
A character string constant is enclosed in double quotes.
"This is a character string constant"
6.3
Reserved identifiers
There is a set of identifiers reserved for use as keywords in C and C++ and these must
not be used otherwise.
asm        auto       break      case       catch      char
class      const      continue   default    delete     do
double     else       enum       extern     float      for
friend     goto       if         inline     int        long
new        operator   private    protected  public     register
return     short      signed     sizeof     static     struct
switch     template   this       throw      try        typedef
union      unsigned   virtual    void       volatile   while
6.4
Input and output
scanf("%d",&value);                          // scan (read) standard input for value
printf("value typed %d \n",value);           // write to standard output
fgets(line,sizeof(line),stdin);              // get character string from file stream
sscanf(line,"%f",&ans);                      // scan string for value
nc=sprintf(textline,"an integer %d",ival);   // write to a character string
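The fgets() and sscanf() calls are often used together to read a number safely from a line of input. The following sketch wraps them in two helper functions, scan_float() and read_float(), whose names are assumptions made for this illustration. Both check the value returned by the library call so that bad input can be detected.

```c
#include <stdio.h>

/* Scan a character string for a single float.
   Returns 1 if a value was stored in *ans, 0 otherwise */
int scan_float(const char *line, float *ans) {
    return sscanf(line, "%f", ans) == 1;
}

/* Read one line from the stream fp and scan it for a float.
   Returns 1 if a value was read into *ans, 0 otherwise */
int read_float(FILE *fp, float *ans) {
    char line[256];
    if (fgets(line, sizeof(line), fp) == NULL) {
        return 0;               /* end of file or read error */
    }
    return scan_float(line, ans);
}
```

For example read_float(stdin, &ans) reads a value typed at the keyboard and returns 0 if the line did not start with a number.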
The complete set of format specifiers is:
%d    integer decimal notation
%o    integer unsigned octal notation
%x    integer unsigned hexadecimal notation
%u    integer unsigned decimal
%c    single character
%s    string of characters
%e    floating point (single or double) exponential notation
%f    floating point (single or double) decimal notation
%g    floating point as %e or %f, whichever is shorter
The full width (w) and precision (p) can be specified by including them directly after the
% character, %w.pf
%10d       format integer with full width 10 characters
%15.10f    format real with full width 15, 10 digits after decimal point
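The effect of the width and precision can be seen by writing formatted values into a character string and looking at the result. This is a sketch using snprintf(), which is the length-limited form of sprintf(). The function names format_int10() and format_real15() are invented for this illustration.

```c
#include <stdio.h>

/* Format an integer in a field 10 characters wide; returns the
   number of characters written to buf (not counting the null) */
int format_int10(char *buf, size_t size, int ival) {
    return snprintf(buf, size, "%10d", ival);
}

/* Format a real in a field 15 characters wide with 10 digits
   after the decimal point */
int format_real15(char *buf, size_t size, double x) {
    return snprintf(buf, size, "%15.10f", x);
}
```

The values are right justified in the field, so the integer 42 is written as 8 spaces followed by 42, and 0.5 is written as 3 spaces followed by 0.5000000000.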
The full list of escape characters is:
\n       newline
\t       horizontal tab
\v       vertical tab
\b       backspace
\r       carriage return
\f       form feed
\a       alert or bell
\\       backslash
\?       question mark
\'       single quote
\"       double quote
\0       null
\ooo     octal number
\xhhh    hexadecimal number
6.5
Operators
C has a very rich set of operators. Here is a list of common operators in order of
precedence.
[]    subscripting pointer[expr]
()    function call expr(expr_list)
++    post/pre increment lvalue++ or ++lvalue
~     complement ~expr
!     not !expr
-     unary minus -expr
+     unary plus +expr
&     address of &lvalue
*     indirection (dereference) *expr
()    cast (type)expr
*     multiply expr*expr
/     divide expr/expr
%     modulo (remainder) expr%expr
+     add (plus) expr+expr
-     subtract (minus) expr-expr
<<    left shift bits expr<<shift
>>    right shift bits expr>>shift
<     less than expr<expr
<=    less than or equal expr<=expr
>     greater than expr>expr
>=    greater than or equal expr>=expr
==    equal expr==expr
!=    not equal expr!=expr
&     bitwise AND expr&expr
^     bitwise exclusive OR expr^expr
|     bitwise inclusive OR expr|expr
&&    logical AND expr&&expr
||    logical inclusive OR expr||expr
?:    conditional expression expr?expr:expr
=     simple assignment lvalue=expr
,     comma (sequencing) expr,expr
In this table lvalue is an entity which can appear on the left hand side of an assignment,
typically a variable name. This may be a simple variable, an array element or a pointer.
You should be careful that an lvalue is what you intend, a pointer or a primitive type.
The type of both sides of an assignment should be the same. If you look at some C code
you will often see composite operators like += which means add and assign. These can be
confusing and I suggest you avoid using them to begin with.
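A few one-line functions illustrate how precedence decides the result of an expression, and what a composite operator means when you meet one in other people's code. The function names are invented for this illustration.

```c
/* Multiplication binds more tightly than addition */
int no_brackets(void)   { return 2 + 3 * 4; }     /* same as 2 + (3*4) */
int with_brackets(void) { return (2 + 3) * 4; }   /* brackets force the add first */

/* Integer division discards the remainder, % returns it */
int quotient(int a, int b)     { return a / b; }
int remainder_of(int a, int b) { return a % b; }

/* total += i is shorthand for total = total + i */
int sum_to(int n) {
    int i, total = 0;
    for (i = 1; i <= n; i++) {
        total += i;
    }
    return total;
}
```

If you are ever unsure of the precedence, add brackets. They cost nothing and make your intention clear to anyone reading the code.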
6.6
Compiler directives
#include <sys/pci.h>        // include a system header file
#include "pciadc.h"         // include a local header file
#define DEVICE_ID 0x0adc    // define a replacement macro
Commonly used macro definitions in header files:
EXIT_SUCCESS // function integer return OK
EXIT_FAILURE // function integer return not OK
6.7
Conditional statement blocks
The basic conditional statement block has the form:
if(expression1)
statement1;
else if (expression2)
statement2;
else
statement;
If a branch contains more than one statement you must use curly braces to gather together
the scope of each conditional:
if(expression1)
{
statement1a;
statement1b;
...
}
else if (expression2)
{
statement2a;
statement2b;
...
}
else
{
statementa;
statementb;
...
}
In either case the statement or statements following the (expression) are executed if the
value of the expression is true or non-zero.
6.8
Definite loops
A typical definite loop has the form:
int a[10];
int i;
for(i=0;i<10;i++)
{
a[i]=i;
}
Indefinite and infinite loops
Indefinite loops come in two forms:
while(expression)
{
body of loop
}
do
{
body of loop
}
while(expression);
In the second variant the body of the loop is executed before the expression is evaluated
thus ensuring the body is executed at least once.
An infinite loop can be set up using:
for(;;)
{
...
if(finish loop expression)
break;
...
}
Such a loop should be terminated using break as shown or return. The break statement
can be used to terminate any for(), while() or do structure.
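As a sketch of the loop forms described above, here is the same calculation written first as an infinite loop terminated by break and then as a while loop. The function names are invented for this illustration. Both count how many times a positive integer can be halved (using integer division) before it reaches zero.

```c
/* Count the halvings using an infinite loop terminated by break */
int count_halvings(int v) {
    int n = 0;
    for (;;) {
        if (v <= 0)
            break;          /* finish loop when nothing is left */
        v = v/2;
        n++;
    }
    return n;
}

/* The same calculation written as a while loop */
int count_halvings_while(int v) {
    int n = 0;
    while (v > 0) {
        v = v/2;
        n++;
    }
    return n;
}
```

Starting from 8 the value goes 4, 2, 1, 0, so both functions return 4. The while form is usually clearer when the finishing test belongs at the top of the loop.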
6.9
Functions and header files
The main program function: the integer argc is the number of command line arguments
which are passed in character string array argv. Note argv[0] is the name of the program
as it occurs on the command line:
int main(int argc, char* argv[])
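As a sketch of how argc and argv might be used, the following function adds up the numeric values of the command line arguments. The name sum_args() is invented for this illustration, and the conversion uses the standard library function strtod() declared in stdlib.h.

```c
#include <stdlib.h>

/* Add up the numeric values of the command line arguments.
   argv[0] is the program name so the loop starts at 1 */
double sum_args(int argc, char *argv[]) {
    int i;
    double total = 0.0;
    for (i = 1; i < argc; i++) {
        /* strtod converts the leading part of a string to double */
        total += strtod(argv[i], NULL);
    }
    return total;
}
```

A main program would simply pass its own arguments on, for example by calling sum_args(argc, argv) and printing the result. An argument which is not a number is converted to zero by strtod().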
The prototype declarations of all functions are held in header files. There are an enormous
number of these files and an even larger number of prototype function declarations.
The system wide header files are usually found at /usr/include on Unix and Unix-like
systems. For example the floating-point mathematics functions are declared in math.h.
To use any of the functions you must declare them by including the appropriate header
file at the top of your source file:
#include <math.h>
6.10
List of mathematical functions in C
Here is a list of some of the mathematical functions available in C. If you wish to use any
of the following mathematical functions in your program, you will need to ensure that
you have the line
#include <math.h>
in the first few lines of your program, and that you are using the -lm switch
when compiling your code so that the mathematics library is linked.
sin(x)        Sine of x (x in radians)
cos(x)        Cosine of x (x in radians)
tan(x)        Tangent of x (x in radians)
asin(x)       Arcsine of x (result lies between -π/2 and +π/2)
acos(x)       Arccosine of x (result lies between 0 and +π)
atan(x)       Arctangent of x (result lies between -π/2 and +π/2)
atan2(y,x)    Arctangent of y/x (result lies between -π and +π)
exp(x)        Exponential function
log(x)        Natural logarithm (base e) of x
log10(x)      Logarithm to base 10 of x
pow(x,y)      x to the power y (x^y)
sqrt(x)       Square-root of x
fabs(x)       Absolute value of x
fmod(x,y)     Returns the remainder of x/y, with the sign of x
ceil(x)       Returns the smallest integer not less than x
floor(x)      Returns the largest integer not greater than x
In the above table, x, and y are of type double. All the above functions return double
results.