Python Primer

Patrice Koehl
Modified by Xin Liu in Apr., 2011
Department of Computer Sciences, University of California, Davis.

Acknowledgments:

This primer is mostly a compilation of information found in books / web resources that I highly recommend:

- http://docs.python.org/reference/introduction.html ; this is a reference manual, and not a tutorial, but it provides invaluable information about the language: do not hesitate to consult it!
- http://wiki.python.org/moin/BeginnersGuide ; a Beginner's Guide with many links to resources for writing and running Python programs
- Michael Dawson, "Python programming for the absolute beginner", 3rd edition, Thomson Course Technology, ISBN: 978-1435455009; this is the class textbook; a great resource book on Python

Introduction

1. Why Python?

Python is an interpreted computer language developed in the late 1980s and first released in 1991. Its design philosophy emphasizes programmer productivity and code readability.

It is important to understand that there is always more than one way to solve a problem. In programming, Python focuses on getting the job done. One Python program may be faster than another, or more concise, or easier to understand, but if both do the same things, there won't be a judgment that defines which one is "better". This also means that you do not need to know every detail about the language to do what you want with it.

Python has strengths that make it an ideal language to learn and use:

- It is completely free, and available on all major operating systems.
- It is very easy to learn. Python was designed to be easy for humans to write, rather than easy for computers to understand. Python syntax is more like English than that of many other programming languages.
- Python "talks" text. It works with words and sentences, instead of characters. Files are series of lines, instead of individual bytes.
- Python is very portable. Python programs can be run on any computer, as long as Python is installed on it.
- Python is a "high-level" language: you do not have to worry about the computer's operation (such as allocation and de-allocation of memory, ...).

It is only fair to mention that these strengths can also translate into weaknesses:

- Python takes care of all "low-level" operations for you: this may not always lead to efficient code.
- Python is interpreted, and loses the efficiency of compiled languages.
- Python users tend to write programs for small, specific jobs. These programs are usually for the programmer's eye only, and as such are often incomprehensible to everyone but the original programmer. In that respect, I can only emphasize the need for clarity, as well as for useful comments in your source files!
- Python was designed to be easy for humans. As a consequence, it is relatively lenient on the style you use. This can lead to bad programming habits. As an analogy, think of what would happen to your English writing style if nobody had ever cared about how you write as long as they understand what you have written. To avoid this, the key is to first develop a method to solve your problem that is independent of Python (or any other language), and then to adapt this method to Python.

2. What is Python used for?

Python has been successfully embedded in many software applications as a scripting language. Python is a very useful programming language for web applications. Python is used widely for game development, for 3D animation packages, in the information security industry, ...

3. How do I get Python?

Python has been ported to many platforms, and will certainly run on the standard operating systems such as UNIX, Linux, Solaris, FreeBSD, all flavors of Windows, and Apple MacOS.

Python 2 versus Python 3

In December 2008, the Python developers released a completely new version of Python, Python 3.0, that is not backward compatible: this means that programs written with Python 1 or Python 2 may not run under Python 3.0. This primer uses Python 3 (version 3.1.3 at the time of writing).
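Because Python 2 and Python 3 are not compatible, it can help to confirm which one is actually running before trying the examples in this primer. The lines below are only a minimal sketch using the standard sys module; the helper name is_python3 is made up for this illustration.

```python
import sys

def is_python3():
    """Return True when the running interpreter is Python 3 or later."""
    return sys.version_info[0] >= 3

# Build the version number from its three parts, e.g. 3.1.3
version_text = ".".join(str(n) for n in sys.version_info[:3])

print("Running Python", version_text)
print("Python 3 or later:", is_python3())
```

If the second line printed is False, the examples that use print( ) and input( ) may behave differently from what is shown here.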
If you have Python 2 code, a tool called 2to3 will help you convert most Python 2.x code to Python 3.x code by handling most of the incompatibilities.

Where to get Python:

- You can get the source to the latest stable release of Python from http://www.python.org. Remember that you want Python 3.
- Binary distributions for some ports are available at the same address.
- You can get binary packages of Python for Linux, Solaris, Mac OS and Windows from ActiveState at http://www.activestate.com/ActivePython (free for download).

Installing Python on Linux/UNIX

Python is freely available, and usually comes packaged with most Linux/UNIX distributions. Type python from a shell prompt to check this. If you see something that starts with the text Python, then you already have Python. If you don't, check the install media (CD or DVD from which you installed Linux): the Python package should be available on it. Otherwise, get the binaries from the site mentioned above.

Installing Python on Mac OS

Again, Python comes packaged with the different flavors of Mac OS X, and you probably have nothing to do! Check it by typing python in a terminal. If you do not have it, I would advise getting it from ActiveState.

Installing Python on Windows

Installing ActivePython is quite straightforward. Download ActivePython's Python installer for Windows from the ActiveState web site (again, it is free for download). Choose the appropriate version for the operating system you have (32 bit or 64 bit). I would strongly advise using the MSI installer, in which case you will need Windows Installer 2.0+ (which you probably already have). It should work under Vista, but I have not tried it.

4. Getting an IDE for Python

IDEs (Integrated Development Environments) are great tools for learning a computer language and using it efficiently. There are many IDEs available for Python: see http://wiki.python.org/moin/IntegratedDevelopmentEnvironments for a list of such IDEs.
I strongly recommend IDLE, which is available for nearly all platforms. See http://www.python.org/idle/doc/idlemain.html ; it should come by default when you install Python on Windows; on Linux and MacOS platforms, however, you have to install it yourself. For the latter, please check: http://challenge.ncss.edu.au/gsg-osx/ .

5. Using Python

You have two main ways to use Python:

- Use it interactively through a command window: again, I strongly advise using the IDLE interface.
- Write the Python program (module) using a text editor, and then execute this program through the Python interpreter. Again, this can be done using IDLE; alternatively, you can use standard text editors for this task (see below).

Editing a Python program

Python source code is just plain text and should be written with a plain text editor, rather than a word processor. If you are using Windows, you can use Notepad, despite its annoying tendency to change the file extension to .txt. You may also use Word, as long as you save the file as text, with line breaks. I would really recommend getting a good programmer's editor. For Windows and Mac, I can recommend jEdit (http://www.jedit.org/ ): it is free (open source), and runs under Windows, Mac OS X, Unix and Linux. It is easy to use and highly customizable, with many useful plugins. Another option for Mac users is TextWrangler (http://www.barebones.com/products/textwrangler/ ).

Naming a Python program

Traditionally, UNIX programs take no extension, while Windows files take a three-letter extension to indicate their type (.exe for an executable, .doc for a document (usually a Word file), .xls for a spreadsheet, ...); the standard extension for a Python program is .py. Obviously, the choice of the name in front of the extension is entirely yours!

Using Python in an IDE

If you are mainly using your computer in a graphical environment like Windows or X, you may not be familiar with using the command line interface, or "shell".
The "shell" is the program that gets input from you through the keyboard. The "shell prompt", or just "prompt", refers to the text that prompts you to enter a command. The standard prompt in IDLE is:

    >>>

i.e. 3 chevrons. In the examples of this primer, the text that you would type follows the >>> prompt, and the text the computer generates is shown on the lines below it:

    >>> print("Hello World!")
    Hello World!

6. Your first Python program

Traditionally, the first program anyone writes in a new language is called "Hello World!", where you make the program print that statement. Python allows us to do so using the print function. The simplest form of print takes a single argument and writes it to the standard output, i.e. the command window you have open. So your program consists of the single statement:

    print("Hello World!")

You can execute this command directly in the IDLE main window, or you can incorporate it into a Python module, hello.py. The file hello.py contains:

    #
    print('Hello World!\n')
    #

The different elements of a Python script:

- Documenting the program: any line starting with a sharp (#) is treated as a comment line and ignored. This allows you to provide comments on what your program is doing: this is extremely useful, so use it! More generally, a line in a Python script may contain some Python code, and be followed by a comment. This means that we can document the program "inline".
- Keywords and built-in names: instructions that Python recognizes and understands. The word print in the program above is one example (in Python 3, print is a built-in function rather than a reserved keyword). There are two kinds: functions (such as print), which are the verbs of the programming language and tell Python what to do, and control keywords, such as if and else.
The number of reserved Python keywords is small. In Python 3.1 they are:

    False     None      True      and       as        assert
    break     class     continue  def       del       elif
    else      except    finally   for       from      global
    if        import    in        is        lambda    nonlocal
    not       or        pass      raise     return    try
    while     with      yield

Note that print and exec, which were keywords in Python 2, are ordinary built-in functions in Python 3. It is a good idea to respect keywords and built-in names, and not use them as names in your programs!

- Modules: Python comes with a large list of modules that increase its functionality; these modules add names to the small list provided above, but are only available when the module has been specifically imported. For example, adding:

    import numpy

  makes the numerical functions of the module "numpy" accessible to the programmer (numpy is not part of the standard distribution and must be installed separately).
- Statements: Statements are the sentences of the program. Python is lenient however, and does not need a full stop to end a statement: a statement normally ends at the end of its line. The indentation levels of consecutive lines are used to generate INDENT and DEDENT tokens, which in turn are used to determine the grouping of statements.
- White space: White space is the name given to tabs, spaces, and new lines. Python is quite strict about where you put white space in your program. For example, we have seen that indentation is used to define the block structure of statements.
- Escape sequences: Python provides a mechanism called "escape sequences" to output special characters/actions: the sequence \n in the program above tells Python to start a new line. Here is a list of the more common escape sequences (also called "metacharacters"):

    Escape sequence    Meaning
    \t                 Tab
    \n                 Start a new line
    \r                 Carriage return
    \'                 Single quote
    \"                 Double quote
    \\                 Backslash
    \b                 Back up one character ("backspace")
    \a                 Alarm (rings the system bell)

Simple exercises:

1) Write a program printline.py that prints the sentence "This is my second program":
   a. As a single line
   b. With a single word on each line.
2) Find an online manual for Python.
3) Which of the following statements are likely to cause problems:
   a. print("This is a valid statement\n")
   b. print("This is a valid statement"\n)
   c. print("This is a "valid" statement")
   d. printx("This is a valid statement\n")

Chapter 1: Scalar Variables and Data Types

1. Python as a calculator

The Python interpreter acts as a simple calculator: you can type an expression at it and it will write the value. Expression syntax is straightforward: the operators +, -, * and / work just like on your regular calculator; parentheses can be used for grouping. For example:

    >>> 1+3
    4
    >>> # This is a comment
    >>> 2+2          # and a comment on the same line as code
    4
    >>> (60-5*6)/3   # In Python 3, / always returns a floating point number
    10.0
    >>> 7//3         # Integer division returns the floor
    2
    >>> 7//-3
    -3

Remember that, by default, Python only has a limited set of keywords and built-in functions. For example, it only knows how to do the basic mathematical operations (+, -, *, /). If you want a more scientific calculator, you need to first import the math functions included in the module "math":

    from math import *

2. Python Variables

A variable is a name that refers to a memory location. Variables provide an easy handle to keep track of data stored in memory. Most often, we do not know the exact value of what is in a particular memory location; rather we know the type of data that is stored there.

Python has three main types of variables:

- Scalar variables hold the basic building blocks of data: numbers and characters.
- Array variables hold lists referenced by numbers (indices).
- Dictionary variables hold lists referenced by labels.

The name of a variable can be practically any combination of characters, of arbitrary length. Note that the type of a variable usually cannot be guessed from its name: I strongly advise you to choose a name for a variable that makes this type explicit. For example you can use names like X, X_list, X_dic to define a scalar, a list, and a dictionary, respectively.
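As an illustration of this naming advice, the short sketch below stores one value of each kind under a name that makes its type explicit. The names gene, gene_list and gene_dic, and the values they hold, are invented for this example; lists and dictionaries themselves are covered in Chapter 2.

```python
# A scalar: a single string
gene = "ATGTTG"

# A list: several values referenced by an integer index
gene_list = ["ATGTTG", "GGCCT", "AATT"]

# A dictionary: several values referenced by a label (key)
gene_dic = {"first": "ATGTTG", "second": "GGCCT"}

# type() reports what kind of data each name refers to
print(type(gene).__name__)       # str
print(type(gene_list).__name__)  # list
print(type(gene_dic).__name__)   # dict
```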
There are a few rules regarding variable names that you need to be aware of:

- The first character of the name of a variable cannot be a digit.
- Spaces are not allowed in a name: use an underscore instead.
- Variables are case sensitive: this means that abc refers to a different location in memory than ABC.

Creating a variable is as simple as making up a variable name and assigning a value to it. Assigning a value to a variable is easy: all you have to do is write an equation, with the variable name on the left, an = sign, and the value on the right. The = sign is called the assignment operator:

    >>> Width = 4                        # Note that the value of an assignment is not written
    >>> Height = 3*12
    >>> Area = Width*Height
    >>> print(Area)
    144
    >>> x = y = z = 0                    # Python allows multiple assignments: x, y and z are set to 0
    >>> DNA = 'aattgcg'                  # assign a string variable
    >>> Name_list = ['John', 'David']    # set up a list of names

3. Special variable

In interactive mode, Python has one special variable, _, that points to the place in memory that stores the most recent result:

    >>> 4+5
    9
    >>> print(_)
    9

This special variable "_" should be considered as "read-only", i.e. I strongly advise against assigning a value to it!

4. Scalar variables

Python has two types of scalar values: numbers and strings. Both types can be assigned to a scalar variable.

4.1 Numbers

Numbers are specified in any of the common integer or floating point formats:

    >>> x = 1          # Integer
    >>> y = 5.14       # Floating point
    >>> z = 3.25E-7    # Scientific notation

Numbers can also be represented using binary or hexadecimal notations, but we will not need that.

Table of the most common number operators in Python:

    Operator    Meaning
    =           Assign
    +           Add
    -           Subtract
    *           Multiply
    /           Divide
    //          Integer divide
    **          Exponentiation
    %           Modulus
    abs(x)      Absolute value of x
    int(x)      x converted to integer
    float(x)    x converted to float
    +=          Assign add
    -=          Assign subtract
    *=          Assign multiply
    /=          Assign divide

Python allows us to use all standard arithmetic operators on numbers, plus a few others.
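The operators listed above can be tried out in a short script. This is only an illustrative sketch; the variable names and values are arbitrary.

```python
x = 17
y = 5

print(x + y)     # 22
print(x - y)     # 12
print(x * y)     # 85
print(x / y)     # 3.4   (division always gives a float in Python 3)
print(x // y)    # 3     (integer division)
print(x % y)     # 2     (remainder of the division)
print(x ** 2)    # 289
print(abs(-x))   # 17
print(int(3.7))  # 3     (the fractional part is dropped)
print(float(x))  # 17.0

total = 10
total += 5       # same as total = total + 5
total *= 2       # same as total = total * 2
print(total)     # 30
```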
The mathematical operations are performed in the standard order of precedence: exponentiation comes first, then multiplication and division, which have a higher precedence than addition and subtraction: 2+3*4 is equal to 14, and not 20. If we want the multiplication to be performed on 2+3, we need to include parentheses: (2+3)*4. These are exactly the rules used by Python.

Some of the operators listed in the table above are unusual, and require more explanation:

The modulo operator:

    i = 52
    j = 3
    k = i % j

In the example given above, the variable k holds the remainder of the division of 52 by 3, i.e. 1.

Operating and assigning at once:

Operations that fetch a value from memory, modify it and store it back in memory are very common: Python has introduced a special syntax for those. Generally, if op is an operator:

    i = i op b

can be written as:

    i op= b

Let us see an example:

    #
    a = 5*4
    print("5 times four is", a, end='\n')
    a += 4
    print("Plus four is", a)
    a /= 3
    print("Divided by three is", a)

In this example, "a" takes successively the values 20, 24 and 8.0 (the division operator / always produces a floating point number). This works for +=, -=, *=, /=, **= and %=.

4.2 Strings

A string is a group of characters attached together, enclosed by quotation marks. Python accepts both single and double quotes, as long as the opening and closing marks match. Just like with numbers, many operations can be performed on strings: the most common ones are listed in the table below.
    String operator               Meaning
    a+b                           Concatenates strings a and b
    a*i                           Repeats string a i times
    a[i:j:k]                      Returns a string containing all characters of a between positions i and j (j not included), with step k; if k is negative, starts from the right
    a[::-1]                       Returns a string that is the reverse of a
    a.split(sep)                  Splits string a into a list, using sep to decide where to cut
    a.strip()                     Returns a string equal to a, but stripped of any "white" characters at the beginning and end of a (space, tab, CR, ...)
    a.upper()                     Returns a string equal to a, but with all letters uppercase
    a.lower()                     Returns a string equal to a, but with all letters lowercase
    a.capitalize()                Returns a string equal to a, but with the first character capitalized and all other letters lowercase
    a.count('sub')                Counts the number of instances of the substring 'sub' in the string a
    a.replace('sub1','sub2',n)    Returns a string equal to a, but with n instances of substring sub1 replaced with substring sub2; if n is not given, all instances are replaced

Concatenating strings:

The + operator, when placed between two strings, creates a single string that concatenates the two original strings. In the following example:

    >>> A = "ATTGC"
    >>> B = "GGCCT"
    >>> C = A+B

the variable C contains the string "ATTGCGGCCT". Note that the concatenation operator can be attached to an assignment:

    C += "G"

adds a "G" at the end of the string contained in C.

Repeating a string

The operator "*" repeats a string a given number of times:

    >>> text = "No! "
    >>> newtext = text*5
    >>> print(newtext)
    No! No! No! No! No!

Indexing and slicing strings

Characters within a string can be accessed both forwards and backwards. Forwards, a string starts at position 0 and the character desired is found via an offset value: String[i] is the character at position i (starting from 0) from the left side of the string. You can also find a character by using a negative offset value from the end of the string: String[-i] is the character at position i (starting from 1) from the right side of the string.
    >>> S = 'Index'
    >>> S[0]
    'I'
    >>> S[3]
    'e'
    >>> S[-1]
    'x'
    >>> S[-3]
    'd'

Slicing is a very useful and heavily used operation in Python: it allows you to extract specific substrings of a string. The syntax for slicing is:

    b = S[i:j:k]

b collects characters between positions i and j (j not included), starting at i, every k characters. Note that you do not have to specify i, j and/or k:

- if you do not specify i, you start at the first character of the string
- if you do not specify j, you go up to the last character of the string
- if you do not specify k, it is set by default to 1

Note also that k can be negative, in which case you move from right to left. For example, b = S[::-1] reverses the string S and stores it in b.

Examples:

    >>> S = 'This is a string'
    >>> b = S[1:3]       # Select substring from position 1 to 3, 3 not included
    >>> print(b)
    hi
    >>> S[5:12:3]        # Select every third character, between positions 5 and 12 (12 not included)
    'iat'
    >>> S[1:5:-1]        # Moves from right to left, but the order 1:5 is wrong: we get nothing
    ''
    >>> S[5:1:-1]        # correct syntax
    'i si'
    >>> S[10::]          # all characters from position 10 till the end
    'string'
    >>> S[::-1]          # reverse the whole string
    'gnirts a si sihT'

The other string manipulations described below apply a function on the string. The syntax is:

    string.function(argument)

where string is the string considered, function is the function applied, and argument stands for the parameters of the function, if any.

Breaking a string into a list

A string can be broken down into a list using the function split. The syntax is:

    A.split(sep)

where A is the string, and sep the separator. If sep is not provided, Python uses white space.
Examples:

    >>> text = "This is a test case; it has two parts"
    >>> text.split()
    ['This', 'is', 'a', 'test', 'case;', 'it', 'has', 'two', 'parts']
    >>> text.split(';')
    ['This is a test case', ' it has two parts']
    >>> text.split('a')
    ['This is ', ' test c', 'se; it h', 's two p', 'rts']

Stripping a string

A string may have leading or trailing white characters, such as blanks, tabs, or carriage returns. It is a good idea to remove those, using the function strip(). The related functions lstrip() and rstrip() only remove the white characters on the left and on the right, respectively:

    >>> S = ' This is a test '
    >>> S.lstrip()                   # Remove leading space only
    'This is a test '
    >>> S.strip()                    # Remove leading and trailing spaces
    'This is a test'

Changing case

- Setting the whole string as upper case: apply function upper()
- Setting the whole string as lower case: apply function lower()
- Capitalizing the string: apply function capitalize()

    >>> S = 'This Is A Test'
    >>> S.upper()                    # All upper case
    'THIS IS A TEST'
    >>> S.lower()                    # All lower case
    'this is a test'
    >>> S.lower().capitalize()       # Set proper case
    'This is a test'

Counting occurrences of substrings

count is a function that finds and counts the number of occurrences of a substring in a string:

    >>> S = 'aattggccttaa'
    >>> S.count('a')                 # Number of characters 'a' in the string
    4
    >>> S.count('A')                 # Remember that Python is case sensitive
    0
    >>> S.count('at')                # Number of 'at' in the string
    1
    >>> S.count('Gc')
    0

Replace

replace is a function that substitutes one substring for another:

    String.replace('sub1','sub2',n)

String is the string on which replace is applied; the first n instances of 'sub1' are replaced with 'sub2'; if n is not provided, all instances of 'sub1' are replaced. The original string is left unchanged: replace returns a new string.

    >>> S = 'This is a test case'
    >>> S.replace('is','was')        # replaces all instances of 'is'
    'Thwas was a test case'
    >>> S.replace('is','was',1)      # replaces only first instance
    'Thwas is a test case'

5. Input data in a Python program

Often when we write a Python script, we need to be able to ask the user for additional data when he/she runs the program.
This is done using the function input (raw_input() in Python 2.x is replaced with input() in Python 3):

    answer = input("Question :")

where:

- "Question" is the string printed on the screen to let the user know what he/she needs to input
- answer is a string that stores the answer of the user.

Note that the result of input is always a string. If you expect an integer or a float from the user, you need to change the type:

    age = int(input("What is your age :"))

age is now an integer that contains the age entered by the user.

Exercises:

1. Without the aid of a computer, work out the order in which each of the following expressions would be computed, and their value.
   i. 2 + 6/4-3*5+1
   ii. 17 + -3**3/2
   iii. 26+3**4*2
   iv. 2*2**2+2
   Verify your answer using Python.
2. Without the aid of a computer, work out these successive expressions and give the values of a, b, c and d upon completion. Then check your answer using a Python script:
       a = 4
       b = 9
       c = 5
       d = a*2+b*3
       b %= a
       a = b-1
3. Write a Python program that:
   i. Reads a sentence from standard input
   ii. Writes this sentence on standard output all in lower case
   iii. Writes this sentence on standard output with all vowels in upper case and all consonants in lower case
   iv. Writes the sentence in reverse order
4. Write a Python program that:
   i. Reads a sentence from standard input
   ii. Counts the number of words and the number of characters, not including spaces
   iii. Counts the number of vowels.
5. Write a Python program that reads from standard input the amount of a restaurant bill and outputs two options for tipping, one based on 15% of the bill, the other based on 20% of the bill.
6. Write a Python program that:
   i. Reads a sentence
   ii. Removes all vowels
   iii. Replaces all v and b in the original sentence with b and v, respectively (for example, the string 'cvvbt' becomes 'cbbvt')
   iv. Counts the number of letters in the modified sentence
   v. Writes the resulting sentence and number of letters on standard output

Chapter 2: Lists, Arrays and Dictionaries

6. Higher order organization of data

In the previous chapter, we have seen the concept of scalar variables that define memory space in which we store a scalar, i.e. a single number or string. Scalar values however are usually insufficient to deal with real data. Imagine writing a program that analyzes the whole human genome. Since it contains approx. 30,000 genes, we would have to create 30,000 variables to store their sequences! Initializing these variables would already be a daunting task, but imagine that you wanted to count the number of times the sequence ATG appears in each gene: you would have to write one line of code for each gene, hence 30,000 lines, changing only the variable name for the gene!

Fortunately, Python thinks that laziness is a virtue, and would never tolerate that you have to write 30,000 lines of code. Two special types of variables exist to help manage long lists of items, namely arrays and dictionaries. These variables store lists of data, and each piece of data is referred to as an element. In this chapter, we will look in detail at what lists are, and how they are stored and manipulated within arrays and dictionaries.

We are all familiar with lists: think about a shopping list, a soccer team roster, all integers between 1 and 10, genes in the human genome, ... A list can be ordered (increasing or decreasing values for numbers, lexicographic order for strings), or unordered.

Python has two types of lists, tuples and lists. Tuples are immutable, i.e. they cannot be modified once created, while lists are mutable, i.e. they can be modified once created.

7. Tuples

2.1 Tuples in Python

By definition, a tuple is a sequence of comma-separated values enclosed in parentheses.
Examples:

- (1,2,3,4,5,6,7,8,9,10) is the tuple of integers between 1 and 10
- ('Monday','Tuesday','Wednesday','Thursday','Friday','Saturday','Sunday') is the tuple containing the days in the week.

2.2 Accessing tuple values

We have seen how to build a tuple. Another thing that is useful is to be able to access a specific element or set of elements from a tuple. The way to do this is to place the number or numbers of the element(s) we want in square brackets after the tuple. Let us look at an example:

    >>> week = ('Monday','Tuesday','Wednesday','Thursday','Friday','Saturday','Sunday')
    >>> print(week[2])

Which day do you think will be printed? If you try it, you will get:

    Wednesday

Why didn't we get "Tuesday", the second element of the list? This is because Python starts counting from 0 and not 1! This is something important to remember.

The index of the element you want does not have to be a literal: it can be a variable as well. As an exercise, write a small program that reads in a number between 1 and 7, and outputs the corresponding day of the week. The answer is given below.

    >>> # Read in day considered
    >>> day = int(input("Enter a number from 1 to 7 : "))
    >>> #
    >>> # print the corresponding element of the list of days of the week
    >>> # (subtract 1, since positions are counted from 0):
    >>> #
    >>> print(week[day-1])

The examples above show you how to single out one element of a tuple; if you want more than one element, you can "slice" the tuple, the same way we slice strings. For example,

    >>> print(week[0:2])

will print

    ('Monday', 'Tuesday')

Remember that the range i:j means from position i to position j, j not included.

8. Lists

A list in Python is created by enclosing its elements in square brackets:

    >>> ["Monday","Tuesday","Wednesday","Thursday","Friday","Saturday","Sunday"]

Elements in a list are accessed the same way elements are accessed in tuples.

Special lists: ranges

Often the lists we use have a simple structure: the numbers from 0 to 9, or the numbers from 10 to 20.
We do not need to write these lists explicitly: Python has the option to specify a range of numbers. The two examples cited would be written in Python as the first two lines below; the third example shows that a step can be given as well:

    >>> list(range(10))
    [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
    >>> list(range(10,21))
    [10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20]
    >>> list(range(1,10,2))
    [1, 3, 5, 7, 9]

Note that lists (and tuples) in Python can be mixed: you can include strings, numbers, scalar variables and even lists in a single list!

9. Arrays

There is not much we can do with a list or tuple that is written out explicitly, except print it. Even when you print it, the statements can become cumbersome. Also, there is no way to manipulate such a list directly: if we wanted to create a new list from an existing list by removing its last element, we could not. The solution offered by Python is to store lists and tuples into arrays. In this primer, "array" simply means a variable that holds a list or a tuple.

9.1 Assigning arrays

Names for arrays follow the same rules as those defined for scalar variables. We store a list into an array the same way we store a scalar into a scalar variable, by assigning it with =:

    >>> days = ('Monday','Tuesday','Wednesday','Thursday','Friday','Saturday','Sunday')

for a tuple, or

    >>> days = ['Monday','Tuesday','Wednesday','Thursday','Friday','Saturday','Sunday']

for a list.

Note: the name of an array does not indicate if it contains a list or a tuple: try to use names that are explicit enough that there are no ambiguities.

Figure 2.1: Scalar variables and arrays. A scalar variable is like a single storage box, while an array behaves like a chest of drawers. Each of the drawers is assigned a number, or index, which starts at 0.

Once we have assigned a list to an array, we can use it where we would use a list. For example,

    >>> print(days)

will print:

    ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday']

9.2 Accessing one element in an array

We can access one element in an array by using the index of the drawer it has been assigned to.
Remember how we access an element in a list:

    ['Monday','Tuesday','Wednesday','Thursday','Friday','Saturday','Sunday'][0]

gives us the first element in the list, i.e. 'Monday'. We could do:

    days[0]

to access the first element of the array days.

Accessing an element in an array works both ways: we can either retrieve the value contained in the position considered, or assign a value to that position. For example,

    numbers = [0,1,5]

creates an array named numbers that contains the list [0,1,5]. This list can then be modified:

    numbers[0] = 3
    numbers[1] = 4
    numbers[2] = 5

The array numbers now contains the list [3,4,5].

Important: you can change elements in a list, but you will get an error message if you try the same thing in a tuple.

9.3 Array manipulation

Python provides a list of functions that manipulate lists. Let A be a list:

    Type             Notation                 Function
    Adding values    A.append(obj)            Adds obj at the end of list A
                     A.extend(list)           Adds the elements of list at the end of list A
                     A.insert(index,item)     Adds item at position index in A, and moves the remaining items to the right
    Remove values    del A[i]                 Removes element at position i in the list A
                     item = A.pop(index)      Removes object at position index in A, and stores it in variable item
                     A.remove(item)           Searches for item in A, and removes first instance
    Reverse          A.reverse()              Reverses list A
    Sorting          A.sort()                 Sorts list A in place, in increasing order
    Searching        i = A.index(item)        Searches for item in list A, and puts index of first occurrence in i
    Counting         N = A.count(item)        Counts the number of occurrences of item in A
    Length           N = len(A)               Finds number of items in A, and stores it in N

Important again: you can manipulate elements in a list, but you will get an error message if you try the same thing in a tuple.

9.4 From arrays to string and back

join: from array to string

You can concatenate all elements of a list of strings into a single string using the function join. The syntax is:

    >>> A = ''.join(LIST)

It concatenates all elements of LIST and stores them in the string A.
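A short sketch of join in practice follows; the list contents are invented for the example. The string placed in front of .join is the separator inserted between the elements, so an empty string glues them together directly.

```python
letters = ["A", "T", "G", "C"]
words = ["This", "is", "a", "test"]

# Empty separator: the elements are concatenated directly
dna = "".join(letters)
print(dna)        # ATGC

# A space as separator rebuilds a sentence from its words
sentence = " ".join(words)
print(sentence)   # This is a test

# join only accepts strings, so numbers must be converted first
numbers = [1, 2, 3]
csv_line = ",".join(str(n) for n in numbers)
print(csv_line)   # 1,2,3
```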
Note the presence of '' before join: it is the separator placed between the elements, here an empty string.

split and list: from string to array

We have seen in Chapter 1 that the function split will break down a string into a list, using a specified delimiter to mark the different elements. If the delimiter is omitted, split separates each word in the string. For example, if A="This is a test", A.split() will generate the list ["This","is","a","test"]. split however cannot break down a word into characters: to do that, we use the function list. For example, if we want to break a paragraph into sentences, a sentence into words, and a word into characters, we use:

    >>> sentences = paragraph.split('.')
    >>> words = sentence.split(' ')
    >>> characters = list(word)

In all three cases, the result is a list and the input was a string.

10. Dictionary

5.1 Definition

Figure 2.2: The dictionary variable. Each element is a pair made of a key and a value.

A dictionary is a special array for which each element is indexed by a string instead of an integer. The string is referred to as a key, while the corresponding item is the value. Each element of a dictionary is therefore a pair (key,value).

Arrays are very good for maintaining and manipulating lists. They have one limitation however: each element in an ARRAY is indexed with an integer value, varying from 0 to its last index, len(ARRAY)-1. You can think about it as the array representing a street of houses, with each house defined by its number. There are instances however in which it would be more convenient to refer to the different houses using the names of their inhabitants: this is exactly what a dictionary variable does (see figure 2.2). Dictionaries are also referred to as associative arrays or hashes.

5.2 Assigning values to dictionaries

A dictionary is a set of pairs of values, with each pair containing a key and an item. Dictionaries are enclosed in curly brackets.
For example:

>>> country = {"Paris":"France", "Washington":"USA", "London":"England", "Ottawa":"Canada", "Beijing":"China"}

creates a dictionary of countries with their capitals, where the capitals serve as keys. Note that keys must be unique. If you try to add a new entry with the same key as an existing entry, the old one will be overwritten. Dictionary values on the other hand need not be unique.

Dictionaries can also be created by zipping two tuples:

>>> seq1=("Paris","Washington","London","Ottawa","Beijing")
>>> seq2=("France","USA","England","Canada","China")
>>> d = dict(zip(seq1,seq2))
>>> d
{'Paris': 'France', 'Washington': 'USA', 'London': 'England', 'Ottawa': 'Canada', 'Beijing': 'China'}

5.3 Accessing elements in dictionaries

This is similar to looking inside an array, except that the positions in the dictionaries are indexed by their keys, and not by an integer index. Using the dictionary d defined above,

>>> d["Paris"]
'France'
>>> d["Beijing"]
'China'

If you give a key that is not in the dictionary however, Python will output an error message (a KeyError). To circumvent this problem, it is often better to use the function get: if the key does not exist, get returns None (so nothing is displayed at the prompt), or a default value that you have defined:

>>> d.get("Paris")
'France'
>>> d.get("Mexico City")
>>> d.get("Mexico City","This key is not defined")
'This key is not defined'

5.4 Manipulating dictionaries

Adding new key-value pairs to a dictionary is simply done by assignment. For example, we could have added Germany and Mexico in our dictionary d using:

>>> d["Berlin"]="Germany"
>>> d["Mexico City"]="Mexico"

We can also change the entries in a dictionary just by reassigning them. To remove an entry in a dictionary, we need to use del.
For example, to remove the key-value pair ("Paris":"France") from our dictionary d:

>>> del d["Paris"]

Other useful functions on dictionaries are illustrated in the example below:

>>> d = {"A":"California","B":"Nevada","C":"Oregon"}
>>> list(d.keys())      # list all keys of d
['A', 'B', 'C']
>>> list(d.values())    # list all values of d
['California', 'Nevada', 'Oregon']
>>> 'A' in d            # check if a given key is known in d
True
>>> 'D' in d
False

Note that in Python 3, keys and values return views of the dictionary rather than lists, hence the calls to list above. The Python 2 function has_key no longer exists in Python 3: use the operator in instead.

Exercises:

7. Write a program that prints all the numbers from 1 to 100. Your program should have far fewer than 100 lines of code!

8. Starting with the word GENE1="ATGTTGATGTG", write a Python program that creates the new words GENE2, GENE3, GENE4 and GENE5 such that:
   i. GENE2 only contains the last two letters of GENE1
   ii. GENE3 only contains the first two letters of GENE1
   iii. GENE4 only contains the letters at positions 2, 4, 6, 8 and 10 in GENE1
   iv. GENE5 only contains the first 3 and last 3 letters of GENE1

9. Suppose you have a Python program that reads in a whole page from a book into an array PAGE, with each item of the array corresponding to a line. Add code to this program to create a new array SENTENCES that contains the same text, but now with each element in SENTENCES being one sentence.

10. Let d be a dictionary whose pairs key:value are country:capital. Write a Python program that prints the keys and values of d, with the keys sorted in alphabetical order. Test your program on
    d = {"France":"Paris","Belgium":"Brussels","Mexico":"Mexico City","Argentina":"Buenos Aires","China":"Beijing"}

Chapter 3: Control Structures

11. Higher order organization of Python instructions

In the previous chapters, we have introduced the different types of variables known by Python, as well as the operators that manipulate these variables. The programs we have studied so far have all been sequential, with each line corresponding to one instruction: this is definitely not optimal.
For example, we have introduced in the previous chapter the concept of lists and arrays, to avoid having to use many scalar variables to store data (remember that if we were to store the whole human genome, we would need either 30,000 scalar variables, one for each gene, or a single array, whose items are the individual genes); if we wanted to perform the same operation on each of these genes, we would still have to write one line for each gene. In addition, the programs we have written so far would attempt to perform all their instructions, once given the input. Again, this is not always desired: we may want to perform some instructions only if a certain condition is satisfied.

Python offers solutions to these issues in the form of control structures: the if structure, which controls whether a block of instructions is executed, and the for structure (and equivalent), which repeats a set of instructions a preset number of times. In this chapter, we will look in detail at the syntax and usage of these two structures.

Figure 3.1: The three main types of flow in a computer program: sequential, in which instructions are executed successively, conditional, in which the blocks “instructions 1” and “instructions 2” are executed if the Condition is True or False, respectively, and repeating, in which instructions are repeated over a whole list.

12. Logical operators

Most of the control structures we will see in this chapter test if a condition is true or false. For programmers, “truth” is easier to define in terms of what is not truth! In Python, there is a short, specific list of false values:

- The Boolean value False.
- An empty string, "", is false (a string that contains only a space, " ", is not empty and is therefore true).
- The number zero (0 or 0.0) is false. Note that, unlike in some other languages, the string "0" is not empty and is therefore true.
- An empty list, [], an empty tuple, (), and an empty dictionary, {}, are false.
- The singleton None (i.e. no value) is false.

Practically everything else is true.

2.3 Comparing numbers and strings

We can test whether a number is bigger, smaller, or the same as another.
Similarly, we can test if a string comes before or after another string, based on the alphabetical order (strings are compared character by character, and all uppercase letters come before the lowercase letters). All the results of these tests are True or False. Table 3.1 lists the common comparison operators available in Python.

Notice that the numeric operators look a little different from what we have learned in Math: this is because Python programs are written in plain text, so symbols like ≤, ≥ and ≠ are not used. Notice also that the comparison for equality uses two = symbols (==): this is because the single = is reserved for assignment.

Table 3.1: Python comparison operators

Comparison    Corresponding question
a == b        Is a equal to b?
a != b        Is a not equal to b?
a > b         Is a greater than b?
a >= b        Is a greater than or equal to b?
a < b         Is a less than b?
a <= b        Is a less than or equal to b?
a in b        Is the value a in the list (or tuple) b?
a not in b    Is the value a not in the list (or tuple) b?

These comparisons apply both to numeric values and to strings. Note that comparing a number to a string is not meaningful: in Python 3, a test for equality between a number and a string simply gives False, and an ordering test such as < produces an error message. I would strongly advise to make sure that the types of the variables that are compared are the same!

2.4 Combining logical operators

We can join together several tests into one, by the use of the logical operators and, or and not:

a and b    True if both a and b are true.
a or b     True if either a, or b, or both are true.
not a      True if a is false.

13. Conditional structures

13.1 If

The most fundamental control structure is the if structure. It is used to protect a block of code that only needs to be executed if a prior condition is met (i.e. is True). The general format of an if statement is:

if condition:
    code block

Note about the format:
- The : indicates the end of the condition, and flags the beginning of the code block.
- Notice that the code block is indented with respect to the rest of the code: this is required; it allows you to clearly identify which part of the code is conditional to the current condition.
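The comparison and logical operators above, used as the condition of an if statement, can be sketched as follows. The values of a, b and days, and the list messages, are arbitrary choices made for this illustration.

```python
# Arbitrary illustrative values (assumptions made for this sketch).
a = 3
b = 7
days = ["Monday", "Tuesday"]

is_equal = (a == b)            # comparison for equality uses ==
in_range = (a < b and b < 10)  # both tests must be true
either = (a > b or b > 5)      # at least one test must be true
negated = not (a > b)          # true because a > b is false
found = "Monday" in days       # membership test on a list
before = "apple" < "banana"    # alphabetical order on strings

messages = []
if a < b:
    # this indented block runs only when the condition is True
    messages.append("a is smaller than b")
if "Sunday" not in days:
    messages.append("Sunday is missing")

print(is_equal, in_range, either, negated, found, before)
print(messages)
```

Only the indented lines under each if depend on its condition; the two print calls are not indented and therefore always run.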
The condition is one of the logical expressions we have seen in the previous section. The code block is a set of instructions grouped together. The code block is only executed if the condition is True.

if statements are very useful to check inputs from users, to check if a variable contains 0 before it is used in a division, and so on. As an example, let us write a small program that asks the user to enter a number, reads it in, and checks if it is divisible by n, where n is also read in:

number=int(input("Enter your number --> "))
n=int(input("Test divisibility by --> "))
i=number%n
if i != 0:
    print("The number ",number," is not divisible by ",n,"\n")

13.2 Else

When making a choice, sometimes you have two different things you want to do, depending upon the outcome of the conditional. This is done using an if … else structure that has the following format:

if condition:
    block code 1
else:
    block code 2

Block code 1 is executed if the condition is true, and block code 2 is executed otherwise. Here is an example of a program asking for a password, and comparing it with a pre-stored string:

hidden="Mypasscode"
password=input("Enter your password : ")
if password == hidden:
    print("You entered the right password\n")
else:
    print("Wrong password !!\n")

Python also provides a control structure when there are more than two choices: the elif structure is a combination of else and if. It is written as:

if CONDITION1:
    block code 1
elif CONDITION2:
    block code 2
else:
    block code 3

Note that any number of elif can follow an if.

14. Loops

One of the most obvious things to do with an array is to apply a code block to every item in the array: loops allow you to do that. Every loop has three main parts:

- An entry condition that starts the loop
- The code block that serves as the “body” of the loop
- An exit condition

Obviously, all three are important.
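The three parts can be labeled on a small counting loop. This sketch uses the while structure, which is described in section 4.2, because it spells out the condition explicitly; the names limit, count and visited are chosen for this illustration only.

```python
limit = 3      # hypothetical bound, chosen for the example
count = 0
visited = []

# Entry condition: the loop starts only if count < limit is True.
# The same test acts as the exit condition: the loop stops as soon as it is False.
while count < limit:
    visited.append(count)  # body of the loop
    count += 1             # changes count so that the test eventually becomes False

print(visited)
```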
Without the entry condition, the loop won't be executed; a loop without a body won't do anything; and finally, without a proper exit condition, the program will never exit the loop (this leads to what is referred to as an infinite loop, and often results from a bug in the exit condition).

There are two types of loops: determinate and indeterminate. Determinate loops carry their end condition with them from the beginning, and repeat their code block an exact number of times. Indeterminate loops rely upon code within the body of the loop to alter the exit condition so the loop can exit. We will see one determinate loop structure, for, and one indeterminate loop structure, while.

4.1 For loop

The most basic type of determinate loop is the for loop. Its basic structure is:

for variable in listA:
    code block

Note the syntax similar to the syntax of an if statement: the : at the end of the first line, and the indentation of the code block.

A for loop is simple: it loops over all possible values of variable as found in the list listA, executing the code block each time. For example, the Python program:

>>> names=["John","Jane","Smith"]
>>> j=0
>>> for name in names:
...     j+=1
...     print("The name number",j,"in the list is",name)

will print out:

The name number 1 in the list is John
The name number 2 in the list is Jane
The name number 3 in the list is Smith

Note that if the list is empty the loop is not executed at all.

The for loop is very useful for iterating over the elements of an array. It can also be used to loop over a set of integer values: remember that you can create a sequence of integers using the function range.
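In Python 3, range does not build its list of integers immediately; wrapping it in list shows the values it generates. A brief sketch, with bounds chosen arbitrarily for the illustration:

```python
# range(stop), range(start, stop) and range(start, stop, step);
# the stop value itself is never included.
first_five = list(range(5))
two_to_five = list(range(2, 6))
every_third = list(range(0, 10, 3))

# Looping directly over a range, without building the list first.
squares = []
for i in range(1, 4):
    squares.append(i ** 2)

print(first_five, two_to_five, every_third, squares)
```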
Here is an example program that computes the sum of the squares of the numbers from 0 to N, where N is read as input:

>>> N=int(input("Enter the last integer considered --> "))
>>> Sum=0
>>> for i in range(0,N+1,1):
...     Sum+=i**2
...
>>> print("The sum of the squares between 0 and ",N," is ",Sum)

4.2 While loop

Sometimes, we face a situation where neither Python nor we know in advance how many times a loop will need to execute. This is the case for example when reading a file: we do not know in advance how many lines it has. Python has a structure for that: the while loop:

while TEST:
    code block

The while structure executes the code block as long as the TEST expression evaluates as True. For example, here is a program that prints the numbers between 0 and N, where N is input:

>>> N=int(input("Enter N --> "))
>>> print("Counting numbers from 0 to ",N,"\n")
>>> i=0
>>> while i < N+1:
...     print(i)
...     i+=1

Note that it is important to make sure that the code block includes a modification of the test: if we had forgotten the line i+=1 in the example above, the while loop would have become an infinite loop.

Note that any for loop can be written as a while loop. In practice however, when the number of repetitions is known in advance it is better to use a for loop: it is more concise, and Python usually executes it faster.

4.3 Break points in loops

Python provides two statements that can be used to control a loop from inside its code block: break allows you to exit the loop, while continue skips the rest of the code block and moves on to the next step of the loop. Here is an example of a program that counts and prints numbers from 1 to N (given as input), skipping numbers that are multiples of 5 and stopping the count if the number reached, squared, is equal to N:

>>> N = int(input("Enter N --> "))    # Input N
>>> for i in range(0,N+1,1):          # Start loop from 0 to N (N included)
...     if i**2 == N:                 # Test if i**2 is equal to N…
...         break                     # if it is, stop counting
...     else:
...         if i%5==0:                # Test if i is a multiple of 5
...             continue              # if it is, move to next value
...         print(i)

Exercises:

1.
Write a program that reads in an integer value n and outputs n! (Reminder: n! = 1 x 2 x 3 x 4 x … x n).

2. Write a program that reads in a word, and writes it out with the letters at even positions in uppercase, and the letters at odd positions in lowercase.

3. In cryptography, the Caesar cipher is a type of substitution cipher in which each letter in the plaintext is replaced by a letter some fixed number of positions down the alphabet. For example, with a shift of 3, A would be replaced by D, B would become E, …, X would become A, Y would become B, and Z would become C.
   a. Write a program that reads in a sentence, and encrypts it using the Caesar cipher with a shift of 3.
   b. Write a program that reads in a sentence that has been encrypted with the Caesar cipher with a shift of 3, and decrypts it.
   c. Repeat a and b above for a Caesar cipher with a shift N, where N is given as input, with N between 0 and 10.