Powerpoint slides - Dynamic Connectome Lab

Introduction to
programming languages
CSC8304 – Computing Environments for Bioinformatics - Lecture 6
Objectives



Concepts of programming
Programming languages
Development of computer programs
Why computer programs?


Problems:
• Arranging the text of a letter
• Collecting and maintaining data about customers
• Calculating the best investment portfolio
• Taking a photo with your mobile phone
• Synchronising the components of a car engine
Computer programs aim to solve such problems
related to electronically stored and processed data
Solving problems





Problem description
Data collection
Problem analysis (including data analysis)
Designing a solution
Implementing the solution
Algorithms




Algorithm = systematic processing of actual or virtual
data
Specification of input and output data
Specification of methods of data processing
E.g. Euclid’s greatest common divisor algorithm:
• a, b two positive integers – what is their gcd?
• x = a, y = b
• If x > y then n = x, d = y otherwise n = y, d = x
• Divide n by d with quotient q and remainder r: n = q * d + r, then x = d, y = r
• If y = 0 then gcd = x, otherwise repeat from the comparison of x and y
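The steps above can be sketched in Python as a loop. This is a minimal illustration, not part of the original slides; the function name gcd and the use of divmod are choices made for this example.

```python
def gcd(a: int, b: int) -> int:
    """Greatest common divisor of two positive integers (Euclid's algorithm)."""
    x, y = a, b
    while y != 0:
        # n is the larger value (the dividend), d the smaller one (the divisor)
        n, d = (x, y) if x > y else (y, x)
        q, r = divmod(n, d)  # n = q * d + r
        x, y = d, r
    return x


print(gcd(48, 18))  # 6
```

The loop stops as soon as the remainder is 0, at which point x holds the last non-zero divisor.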
Early computers



Binary data entry – punch-cards
Machine language: e.g. MOV A,B; LLR; etc.
Difficult to program – easy to make errors
Constants and variables






Constant: a fixed value, e.g. 5
Named constant: a fixed value with a name, e.g. a=5
Variable x – a placeholder for a value (e.g. number,
text)
‘:=’ assignment of a value to a variable: the
contents of the variable with the given name take the
specified value
This makes sense: x := x+1
• x := 5, x := x+1, now the value of x is 6
Other variables: s := ‘Hello!’, y := (2, ‘apples’, ‘table’)
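As a rough illustration, the same assignments might look like this in Python, which writes assignment as = rather than :=. The upper-case name A is only a naming convention for constants here; Python itself does not prevent reassignment.

```python
A = 5                        # named constant (by convention only)

x = 5                        # x := 5
x = x + 1                    # x := x+1, the old value is read, then replaced
print(x)                     # 6

s = "Hello!"                 # a text value
y = (2, "apples", "table")   # a combination of several values
print(s, y)
```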
Data types





Data is stored in variables with names
Variable: name + type + contents
Type determines what kind of contents the variable
may have: e.g. integer, floating point real, string,
combination of other data types
E.g.
• int x, x := 5 is allowed, x := 5.1 is not allowed
• string s, s := ‘hello kids’ is allowed, s := 3 is not allowed
Type definition for combined types:
• addr = record (int nr, string st, string ct, string pc)
• addr a, a := (5, ‘Hyde’, ‘York’, ‘YO2 4RH’)
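A hedged Python sketch of the record type above, using a dataclass. Note that Python does not enforce type annotations at run time, so the helper is_valid_int (a hypothetical name introduced for this example) shows the check that a statically typed language would perform for you.

```python
from dataclasses import dataclass


@dataclass
class Addr:
    """Combined type: house number, street, city, postcode."""
    nr: int
    st: str
    ct: str
    pc: str


def is_valid_int(value: object) -> bool:
    # bool is a subclass of int in Python, so it is excluded explicitly
    return isinstance(value, int) and not isinstance(value, bool)


a = Addr(5, "Hyde", "York", "YO2 4RH")
print(a.ct)               # York
print(is_valid_int(5))    # True
print(is_valid_int(5.1))  # False
```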
Operators




Operators: +, -, concatenate, <=:
• a:=5+3, s:=concatenate(‘hot’, ‘dog’), a<=5
Each type has a range of operators that can be
applied to variables of that type
Operator overloading: some operators may apply in
different ways to data of different types
In the case of subtypes, e.g. integer as a subtype of real,
additional operators may apply to the subtype – e.g. integer
division
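For illustration, here is how these operators behave in Python, where + is overloaded to mean addition for numbers and concatenation for strings. The variable names are chosen for this example only.

```python
a = 5 + 3            # numeric addition
s = "hot" + "dog"    # the same operator concatenates strings
is_small = a <= 5    # comparison gives a truth value

real_div = 7 / 2     # ordinary division
int_div = 7 // 2     # integer division, discards the fractional part

print(a, s, is_small, real_div, int_div)  # 8 hotdog False 3.5 3
```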
Early programming languages



Fortran, Cobol
Better than machine code
Introduce flow control
Flow control – 1 (conditions)




If-then-else
Branching depending on condition
If <condition> then <Tblock> else <Fblock>
E.g.
• If x=5 then a:=2 else a:=1
• If (signal, left) then (turn, left) else (turn, right)
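The first example can be written in Python roughly as follows. The wrapping function choose is a hypothetical name, added only so that the branch can be tried with different inputs.

```python
def choose(x: int) -> int:
    """Return 2 when x equals 5, otherwise 1."""
    if x == 5:   # condition
        a = 2    # block executed when the condition is true
    else:
        a = 1    # block executed when the condition is false
    return a


print(choose(5))  # 2
print(choose(7))  # 1
```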
Flow control – 2 (loops)







for – fixed length cycling
for <init statement>, <increment statement>, <condition statement>, <execution statement>
E.g.
• for {i:=1,a:=1}, i:=i+1, i<=100, do a:=a*i;
while, repeat – variable length cycling
while <condition statement>, <execution statement>
repeat <execution statement>, <condition statement>
E.g.
• while i<100, do a:=a*i, i:=i+1
• repeat a:=a*i, i:=i+1, until i=100
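A sketch of the three loop forms in Python, each multiplying the numbers 1 to n together. The function names are made up for this example, and since Python has no repeat-until statement, the last version emulates it with a loop that tests its condition at the end of the body.

```python
def factorial_for(n: int) -> int:
    a = 1
    for i in range(1, n + 1):  # fixed number of iterations
        a = a * i
    return a


def factorial_while(n: int) -> int:
    a, i = 1, 1
    while i <= n:              # condition is tested before each pass
        a = a * i
        i = i + 1
    return a


def factorial_repeat(n: int) -> int:
    # Emulated repeat-until: the body always runs at least once (assumes n >= 1)
    a, i = 1, 1
    while True:
        a = a * i
        i = i + 1
        if i > n:              # condition is tested after each pass
            break
    return a


print(factorial_for(5), factorial_while(5), factorial_repeat(5))  # 120 120 120
```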
Structured programming



Structured programming was introduced in the late
1960s and early 1970s
Pascal, C
Flow control is packaged into procedures, data are
separated between program structures → better
understanding, better design, better programs with
fewer errors
Procedures and functions




Procedures: blocks of program code containing flow control
structures, with a set of specified input data and a set of
specified output data
Functions: similar to procedures, but return a single output
value (i.e. like a mathematical function)
Procedures are called with a set of actual values for their formal
input variables and a set of variables to receive their formal
output variables
E.g.
• procedure Draw (int x,y,z,w)
• procedure Prediction (int x,y,z; var int a,b)
• int function Length (string s)
• Length(‘hello’)
• Draw(10,10,50,50)
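In Python both procedures and functions are written with def. The sketch below is only illustrative: the slides do not say what Draw or Prediction compute, so the bodies here are invented placeholders, and the var output parameters a, b are modelled as a returned pair.

```python
def length(s: str) -> int:
    """Function: returns a single value."""
    return len(s)


def draw(x: int, y: int, z: int, w: int) -> None:
    """Procedure: performs an action and returns nothing (placeholder body)."""
    print(f"drawing from ({x}, {y}) to ({z}, {w})")


def prediction(x: int, y: int, z: int) -> tuple[int, int]:
    """Procedure with two outputs, returned as a pair (placeholder arithmetic)."""
    a = x + y + z
    b = max(x, y, z)
    return a, b


print(length("hello"))      # 5
draw(10, 10, 50, 50)
a, b = prediction(1, 2, 3)  # actual values in, output variables on the left
```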
Recursive procedures



Recursive procedure: a procedure that calls itself
Data separation
E.g.
Procedure Gcd (int a,b; var int g)
  int x,y,r,q,n,d
  x:=a; y:=b;
  if x>y then {n:=x; d:=y} else {n:=y; d:=x};
  q:=n div d; r:=n - q*d;
  x:=d; y:=r;
  if y=0 then g:=x else Gcd(x,y,g);
end;
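A possible Python version of the procedure above, written as a function that returns the result instead of using an output parameter. Each call works on its own copies of n, d and r, which is the data separation the slide refers to.

```python
def gcd(a: int, b: int) -> int:
    """Greatest common divisor of two positive integers, computed by self-calls."""
    n, d = (a, b) if a > b else (b, a)  # n is the larger value, d the smaller
    q = n // d                          # integer quotient
    r = n - q * d                       # remainder
    if r == 0:
        return d                        # base case: d divides n exactly
    return gcd(d, r)                    # the function calls itself


print(gcd(48, 18))  # 6
```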
Object oriented programming



Object oriented programming emerged in the 1970s
and became the mainstream programming paradigm in
the late 1980s and early 1990s
Aims:
• Better description of real world problems
• Better software design
• Increased reliability of large software systems
Smalltalk, Delphi, C++, C#, Java
Classes and objects – 1




Class: encapsulation of data and data manipulation, such that
interaction with the outside is kept to the necessary minimum
Class: attributes and methods – some visible from the outside,
most visible only inside
E.g.
Class Square
  int llx,lly,dx,color
  Create
  Destroy
  Draw
  FillDraw
Square S, S.Create – an object is an instance of a class
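A minimal Python sketch of such a class. The slides do not say what Draw and FillDraw do, so here they simply return a text description; __init__ plays the role of Create, and there is no Destroy because Python reclaims unused objects automatically.

```python
class Square:
    def __init__(self, llx: int, lly: int, dx: int, color: int) -> None:
        # attributes: lower-left corner, side length, colour code
        self.llx = llx
        self.lly = lly
        self.dx = dx
        self.color = color

    def draw(self) -> str:
        return f"square at ({self.llx}, {self.lly}) with side {self.dx}"

    def fill_draw(self) -> str:
        return self.draw() + f", filled with colour {self.color}"


s = Square(0, 0, 10, 3)  # s is an object, an instance of the class Square
print(s.draw())
print(s.fill_draw())
```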
Classes and objects – 2



Classes can be defined as derivatives of other classes –
inheritance
Derived classes inherit attributes and methods from the parent
class and may add further attributes and methods to these or
may redefine some of the inherited ones
E.g. Class Rectangle (Square)
  int dy (new attribute)
  (int llx,lly,dx,color – inherited)
  Draw (redefined)
  FillDraw (redefined)
  Rotate (new method)
  (Create, Destroy – inherited)
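Inheritance could be sketched in Python as below. The parent class is repeated in a reduced form so that the example runs on its own, and the method bodies are invented for illustration (Rotate is assumed here to swap the two side lengths).

```python
class Square:
    def __init__(self, llx: int, lly: int, dx: int, color: int) -> None:
        self.llx, self.lly, self.dx, self.color = llx, lly, dx, color

    def draw(self) -> str:
        return f"square {self.dx}x{self.dx}"


class Rectangle(Square):          # Rectangle is derived from Square
    def __init__(self, llx: int, lly: int, dx: int, dy: int, color: int) -> None:
        super().__init__(llx, lly, dx, color)  # inherited attributes
        self.dy = dy                           # new attribute

    def draw(self) -> str:        # redefined method
        return f"rectangle {self.dx}x{self.dy}"

    def rotate(self) -> None:     # new method
        self.dx, self.dy = self.dy, self.dx


r = Rectangle(0, 0, 4, 2, 1)
print(r.draw())  # rectangle 4x2
r.rotate()
print(r.draw())  # rectangle 2x4
```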
Flow control with exceptions





Objects are instances of classes and many objects exist
simultaneously → concurrent execution of objects
Objects interact by sending messages – i.e. invoking those
methods of each other that are visible from the outside
Flow control: try – catch – throw
Exception: a signal that execution failed for some reason
E.g.
try
  R.Draw;
  return(‘OK’);
catch (exception e)
  throw GraphicsExceptionFault;
  return(‘Error’);
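In Python the same mechanism is spelled try, except and raise. The sketch below is an assumption-laden illustration: GraphicsError is a hypothetical exception class standing in for GraphicsExceptionFault, and draw fails only when given a non-positive width.

```python
class GraphicsError(Exception):
    """Hypothetical exception type used for this example."""


def draw(width: int) -> None:
    if width <= 0:
        raise GraphicsError("width must be positive")  # throw


def safe_draw(width: int) -> str:
    try:
        draw(width)
        return "OK"
    except GraphicsError:  # catch
        return "Error"


print(safe_draw(10))  # OK
print(safe_draw(0))   # Error
```

Raising the exception ends the normal flow of draw immediately; control passes to the nearest enclosing handler.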
Functional programming



Everything is written as a function, the program is a
combination of functions
LISP
Applied in AI (Artificial Intelligence)
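Although the slide names LISP, the idea of building a program by combining functions can be hinted at in Python as well. The helper compose and the two small functions are made up for this sketch.

```python
from functools import reduce


def compose(*functions):
    """Return a function that applies the given functions from right to left."""
    return reduce(lambda f, g: lambda x: f(g(x)), functions)


def square(x: int) -> int:
    return x * x


def increment(x: int) -> int:
    return x + 1


square_then_increment = compose(increment, square)

print(square_then_increment(3))      # 10
print(list(map(square, [1, 2, 3])))  # [1, 4, 9]
```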
Declarative programming





Instructions are not necessarily specified directly
What is wanted is declared, but how to get it is not
specified
Prolog – logic programming used in AI
SQL – database language
Declarative programming is closer to natural
language than imperative programming (describing
how to do things – e.g. C, C++, Java), but it may
imply much longer execution time
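To make the contrast concrete, the sketch below answers the same question twice: once imperatively with a Python loop, and once declaratively with an SQL query run through Python's built-in sqlite3 module. The table and the customer data are invented for this example.

```python
import sqlite3

customers = [("Ann", "York"), ("Bob", "Leeds"), ("Cid", "York")]

# Imperative: spell out how to find the answer, step by step
in_york = []
for name, city in customers:
    if city == "York":
        in_york.append(name)

# Declarative: state what is wanted and let the database decide how
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (name TEXT, city TEXT)")
conn.executemany("INSERT INTO customers VALUES (?, ?)", customers)
rows = conn.execute(
    "SELECT name FROM customers WHERE city = 'York' ORDER BY name"
).fetchall()
declared = [row[0] for row in rows]
conn.close()

print(in_york, declared)
```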
Compilation vs. interpretation



Compilation: the program is translated into a sequence of
machine code instructions that can be executed directly by the
processor. The whole program is translated (compiled) at once;
when compilation is finished, the compiled program is executed → compilers
Interpretation: the program is processed one
instruction/declaration at a time. Each one is translated into a
short piece of machine code that is executed, then the
next instruction/declaration is interpreted. The program is
translated (interpreted) as it is executed, and at any time only a
small part is translated into machine code → interpreters
Compilers usually generate faster running programs, while
interpreters leave more room for interactive use of programs
Interpreted or compiled?

BASIC

C/C++

Java

R

Matlab

Perl
Reusable software




Developing software takes a long time – it is desirable
to re-use existing software to solve sub-problems
of new problems
Re-use is facilitated by documentation – a description
of what is written in the program and why
Early programming languages offered little
support for re-use
Object oriented programming languages provide
extensive support for re-use
Component-based programming



Component-based programming is the current major
trend in software development
New software is built by combining existing
components in novel ways – relies heavily on reuse of existing software
E.g. classes or objects can be purchased or used as
service providers, so most of the software does not
have to be written from scratch – for example handling
of a printer or reading standard file formats (like
XML)
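As a small illustration of re-using an existing component, the sketch below reads XML with the parser that ships in Python's standard library (xml.etree.ElementTree) instead of writing one from scratch. The document content is invented for this example.

```python
import xml.etree.ElementTree as ET

# Made-up XML input for illustration
document = "<genes><gene id='g1'>BRCA1</gene><gene id='g2'>TP53</gene></genes>"

root = ET.fromstring(document)                           # parsing is done by the component
names = [gene.text for gene in root.findall("gene")]     # element contents
ids = [gene.get("id") for gene in root.findall("gene")]  # attribute values

print(names, ids)
```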
Software development







Problem analysis
Data analysis
Design
Development and integration
Prototype
Testing
Use and maintenance
Software development: problem analysis





What is the problem that needs a software solution?
E.g.
• Management of databases in a uniform manner
• Visualisation of complex scientific data
Identification of users
Collection of information and data about user needs
and requirements
Analysis of collected information and data
Software development: data & design





Collection and analysis of relevant data
Analysis of data formats – needs and requirements
Design the relevant information flow
Design data structures supporting the information
flow
Design processing of the data
Software development: integration & implementation



Development of software components implementing
the design
Acquiring existing components based on design
requirements and on analysis of the features of existing
components
Integration of existing components, and writing of
integration software and any other components
that cannot be bought off the shelf
Software development: prototype & testing



Development of a small-scale prototype to test
functionalities
Testing of components of the software system – test
scenarios, use cases
Elimination and correction of faults and errors
Software development: use and maintenance




Installation and training of users
Deployment of the software
Maintenance
Updates and patches
Summary








Algorithms
History of programming languages: machine code; early
languages: Fortran, Cobol; structured programming: Pascal, C;
object oriented programming: C++, C#, Java; functional
programming: Lisp; declarative programming: SQL
Constants, variables, data types
Flow control structures: if-then-else, for, while, repeat
Procedures and functions
Classes: encapsulation, inheritance
Compilers and Interpreters
Software development process
Q&A






Is it true that Java is a declarative language?
Is it true that only variables of the same type can be compared
by comparison operators?
Can we use the ‘for’ flow control mechanism to execute the
same set of operations 10 or 20 times depending on the
value of some processed data?
Is it true that a class is an instance of an object?
Can we use the try-catch-throw flow control in concurrent
environments, with many objects executed at the same time?
Can we develop a prototype of a software system before meeting the
users to collect user requirements?