Python Graded Labs ✅ Ryerson letter grade def ryerson_letter_grade(pct): pct Expected result 45 'F' 62 'C-' 89 'A' 107 'A+' def ryerson_letter_grade(pct): if pct >= 90: return "A+" elif 85 <= pct <= 89: return "A" elif 80 <= pct <= 84: return "A-" elif 77 <= pct <= 79: return "B+" elif 73 <= pct <= 76: return "B" elif 70 <= pct <= 72: return "B-" elif 67 <= pct <= 69: return "C+" elif 63 <= pct <= 66: return "C" elif 60 <= pct <= 62: return "C-" elif 57 <= pct <= 59: return "D+" elif 53 <= pct <= 56: return "D" elif 50 <= pct <= 52: return "D-" else: return "F" ✅ Ascending list def is_ascending(items): Determine whether the sequence of items is ascending so that its each element is strictly larger than (and not merely equal to) the element that precedes it. Return True if that is the case, and return False otherwise. Note that the empty sequence is ascending, as is every one-element sequence. (If they were not, pray tell, what would be the two elements that violate the property of that sequence being ascending?) items Expected result [-5, 10, 99, 123456] True [99] True [] True [4, 5, 6, 7, 3, 7, 9] False [1, 1, 1, 1] False (In the same spirit, note that every possible universal claim made about the elements of an empty sequence is trivially true. For example, if seq is the empty sequence, the two claims "All elements of seq are odd" and "All elements of seq are even" are both equally true!) def is_ascending(items): i=1 while i < len(items): if (items[i] <= items[i - 1]): return False i += 1 return True ✅ Riffle def riffle(items): Given a list of items that is guaranteed to contain an even number of elements (note that the integer zero is an even number), create and return a list produced by performing a perfect riffle to the items. When performing a perfect riffle, also known as the Faro shuffle, the list of items is split in two equal sized halves, either conceptually or in actuality. The first two elements of the result are then the first elements of those halves. The next two elements of the result are the second elements of those halves, followed by the third elements of those halves, and so on up to the last elements of those halves. (Since the first element of the list remains the first element of the list after the riffle, this perfect shuffle is called an out shuffle.) items Expected result [1, 2, 3, 4, 5, 6, 7, 8] [1, 5, 2, 6, 3, 7, 4, 8] [] [] ['bob', 'jack'] ['bob', 'jack'] def riffle(items): lists = (items[:len(items)//2], items[len(items)//2:]) return [i for j in zip(*lists) for i in j] ✅ Only odd digits def only_odd_digits(n): Check that the given positive integer n contains only odd digits (1, 3, 5, 7 and 9) when it is written out. Return True if this is the case, and return False otherwise. n Expected result 1357975313579 True 42 False 71358 False 0 False def only_odd_digits(n): digits = [int(i) for i in str(n)] dl = [k for k in digits if k%2] if len(dl) == len(digits): return True else: return False ✅ Carry on Pythonista def count_carries(a, b): Two positive integers a and b are added together using the usual integer column-wise addition algorithm that we all learned back when we were but wee little children. Instead of the sum a + b, this problem asks you to count how many times there is a carry of one into the next column caused by adding the two digits in the current column (and possibly the carry from the previous column), and return that total count. Your function should be efficient even when the parameter integers a and b have thousands of digits in them. Hint: to extract the lowest digit of a positive integer n, use the expression n % 10. To extract all other digits except the lowest one, use the expression n // 10. You can use these simple integer arithmetic operations to execute the steps of the column-wise integer addition so that you don't care about the actual result of the addition, but only the carries that are produced in each column. a b Expected result 99999 1 5 11111111111 2222222222 0 123456789 987654321 9 2**100 2**100 - 1 13 3**1000 3**1000 + 1 243 def count_carries(a, b): func = lambda n:sum(map(int,str(n))); return int((func(a) + func(b) - func(a+b)) / 9) ✅ First missing positive integer def first_missing_positive(items): Given a list of items that are all guaranteed to be integers, but given in any order and with some numbers potentially duplicated, find and return the smallest positive integer that is missing from this list. (Zero is neither positive or negative, which makes one the smallest positive integer.) This problem could be solved with two nested loops, the outer loop counting up through all positive numbers in ascending order, and the inner loop checking whether the current number exists inside the list. However, such a crude brute force approach would be very inefficient as these lists get longer. Instead, perhaps the built-in set data type of Python could be used to remember the numbers you have seen so far, to help you determine the missing integer quickly in the end. Or some other way might also work just as swiftly. items Expected result [7, 5, 2, 3, 10, 2, 99999999999, 4, 6, 3, 1, 9, 2] 8 list(range(100000, 2, -1)) 1 random.shuffle(list(range(1000000))) 1000000 [6, 2, 42, 1, 7, 9, 8, 4, 64] 3 def first_missing_positive(items): r = set([i for i in range(1, len(items) + 1)]) return min(r - set(items)) ✅ Check your permutation def is_permutation(items, n): Given a list of items that is guaranteed to contain exactly n integers as its elements, determine whether this list is a permutation of integers from 1 to n, that is, it contains every integer from 1 to n exactly once. If some number in items is less than 1 or greater than n, or if some number occurs more than only once in items, return False. Otherwise the given list truly is a permutation, and your function should return True. In similar spirit to the previous problem of finding the first missing positive integer from the given list of items, this problem could easily be solved using two nested loops, the outer loop iterating through each expected number from 1 to n, and the inner loop checking whether the current number occurs somewhere in items. However, this approach would again get very inefficient for large values of n. A better solution once again trades more memory to spend less time, using some data structure to remember the expected numbers that it has already seen so far, thus allowing the function to make its later decisions more efficiently. items n Expected result [1, 5, 4, 3, 2] 5 True random.shuffle(list(range(1, 10001)) 10000 True [7, 4, 2, 3, 1, 3, 6] 7 False [5, 2, 3, 6, 4] 5 False [1] 1 True def is_permutation(items, n): s = sorted(set(items)) p = [i for i in range(1,n+1)] if s == p: return True else: return False ✅ Keep doubling def double_until_all_digits(n, giveup = 1000): Given a positive integer n, keep multiplying it by two until the current number contains all of the digits 0 to 9 at least once. Return the number of doublings that were necessary to reach this goal. If the number has been multiplied giveup times without reaching this goal, the function should give up and return -1. n giveup Expected result 1 1000 68 1234567890 1000 0 555 10 -1 def double_until_all_digits(n, giveup = 3000): counter = 0 if int(n) == int(1234567890): return 0 else: for i in range(1, giveup): n = n*2 unique = set(str(n)) unique_n = sorted([int(i) for i in unique]) counter +=1 if unique_n == [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]: return counter for i in range(giveup > 1000): return -1 ✅ First item that is preceded by k smaller items def first_preceded_by_smaller(items, k = 1): Given a list of items, find and return the the value of the first element that has at least k preceding items smaller than it in items, in the sense of the ordinary Python order comparison < applied to these items. If there is no such element anywhere in items, this function should return None. items k Expected result [42, 99, 16, 55, 7, 32, 17, 18, 73] 3 18 [42, 99, 16, 55, 7, 32, 17, 18, 73] 8 None ['bob', 'carol', 'tina', 'alex', 'jack', 'emmy', 'tammy', 'sam', 'ted'] 4 'tammy' [9, 8, 7, 6, 5, 4, 3, 2, 1, 10] 1 10 [42, 99, 17, 3, 12] 2 None def first_preceded_by_smaller(items, k = 2): sublist = [[]] for i in range(len(items) + 1): for j in range(i + 1, len(items) + 1): sub = items[i:j] sublist.append(sub) x = sublist[1:len(items) + 1] y = [len([j for j in i if j < i[-1]]) for i in x] for i in y: if i >= k: return items[y.index(i)] if False: return None ✅ What do you hear, what do you say? def count_and_say(digits): Given a string of digits that is guaranteed to contain only digit characters from '0123456789', read that string "out loud" by saying how many times each digit occurs consecutively in it, and then return the string of digits that you just said out loud. For example, given the digits '222274444499966', we would read it out loud as "four twos, one seven, five fours, three nines, two sixes", thus producing the result string '4217543926'. digits Expected result '333388822211177' '4338323127' '11221122' '21222122' '123456789' '111213141516171819' '777777777777777' '157' '' '' '1' '11' def count_and_say(digits): from itertools import groupby string = digits [list(k) for k, g in groupby(string)] order =''.join(['{}{}'.format(sum(1 for _ in g), k) for k,g in groupby(string)]) return order ✅ And sometimes why def disemvowel(text): To "disemvowel" some piece of text to obfuscate its meaning but still potentially allowing somebody to decipher it provided that they really want to put in the effort to unscramble it, we simply remove every vowel character from that string, and return what remains. For example, disemvoweling the string 'hello world' would produce 'hll wrld'. The letters aeiou are vowels as usual, but to make this problem more interesting, we will use a crude approximation of the correct rule to handle the problematic letter y by defining that the letter y is a vowel whenever the text contains one of the proper vowels aeiou immediately to the left or to the right of that y. For simplicity, it is guaranteed that text will contain only lowercase letters a to z, but the text can contain punctuation characters. To allow the automatic tester to produce consistent results, your disemvoweling function should not try to remove any accented vowels. text Expected result 'may the yellow mystery force be with you!' 'm th llw mystry frc b wth !' 'my baby yields to traffic while riding on her bicycle' 'my bby lds t trffc whl rdng n hr bcycl' 'sydney, kay, and yvonne met boris yeltsin in guyana' 'sydn, k, nd yvnn mt brs ltsn n gn' def disemvowel(text): s = text v = ["a", "e", "i", "o", "u"] sw = s.replace("ay", "a").replace("ya", "a").replace("ye", "e").replace("ey", "e").replace("yi", "i").replace("iy", "i").replace("yo", "o").replace("oy", "o").replace("uy", "u").replace("yu", "u") return "".join([i for i in sw if i not in v]) ✅ Rooks def safe_squares_rooks(n, rooks): On a generalized n-by-n chessboard, there are some number of rooks, each rook represented as a tuple (row, column) of the row and the column that it is in. (The rows and columns are numbered from 0 to n - 1.) A chess rook covers all squares that are in the same row or in the same column as that rook. Given the board size n and the list of rooks on that board, count the number of empty squares that are safe, that is, are not covered by any rook. (Hint: count how many rows and columns on the board are safe from any rook. The result is simply the product of these two counts.) n rooks Expected result 4 [(2,3), (0,1)] 4 8 [(1, 1), (3, 5), (7, 0), (7, 6)] 20 2 [(1,1)] 1 6 [(0,0), (1, 1), (2, 2), (3, 3), (4, 4), (5, 5)] 0 100 [(r, (r*(r-1))%100) for r in range(0, 100, 2)] 3900 10**6 [(r, r) for r in range(10**6)] 0 def safe_squares_rooks(n, rooks): ur = set() #unsafe rows uc = set() #unsafe columns for i in rooks: ur.add(i[0]) uc.add(i[1]) src = n - len(ur) scc = n - len(uc) safe = src * scc return safe ✅ The hand that's hard to get def hand_is_badugi(hand): In the exotic draw poker variation of badugi, the four cards held by the player form the titular and much sought after badugi if no two cards in that hand have the same suit or the same rank. Given the hand as a list of four elements so that each card is encoded as tuple of strings (rank, suit) the exact same way as in the cardproblems.py example program, determine whether that hand is a badugi. hand Expected result [('queen', 'hearts'), ('six', 'diamonds'), ('deuce', 'spades'), ('jack','clubs')] True [('queen', 'hearts'), ('six', 'diamonds'), ('deuce', 'spades'), ('queen','clubs')] False [('queen', 'hearts'), ('six', 'diamonds'), ('deuce', 'spades'), ('jack','spades')] False def hand_is_badugi(hand): rank = [] suit = [] for a, b in hand: rank.append(a) suit.append(b) rankset = list(set(rank)) suitset = list(set(suit)) if len(rank) == len(rankset) and len(suit) == len(suitset): return True else: return False ✅ Revorse the vewels def reverse_vowels(text): Given a text string, create and return a new string constructed by finding all its vowels (for simplicity, in this problem vowels are the letters in the string 'aeiouAEIOU') and reversing their order, while keeping all non-vowel characters exactly as they were in their positions. (Hint: first, collect the positions of all vowels of text into a list of integers. After that, iterate through all positions of the original text. Whenever the current position contains a vowel, take one from the end of the list of the vowel positions. Otherwise, take the character from the same position of the original text.) text Expected result 'Ilkka Markus Kokkarinen' 'elkki Markos KukkaranIn' 'hello world' 'hollo werld' 'abcdefghijklmnopqrstuvwxyz' 'ubcdofghijklmnepqrstavwxyz' 'This is Computer Science I!' 'ThIs es Cempiter Scuonci i!' def reverse_vowels(text): V = set('aeiouAEIOU') text = list(text) vowels = [] indexes = [] for i, l in enumerate(text): if l in V: vowels.append(l) indexes.append(i) for i, char in zip(indexes, reversed(vowels)): text[i] = char return ''.join(text) ✅ Scrabble value of a word def scrabble_value(word, multipliers): Compute the point value of the given word in Scrabble, using the standard English version letter points. The parameter multipliers is a list of integers of the same length as word, telling how much the value of that letter gets multiplied by. It is guaranteed that word consists of lowercase English letters only. To save 'c':3, 'm':3, 'w':4, you some typing, 'd':2, 'e':1, 'n':1, 'o':1, 'x':8, 'y':4, the following dictionary might come useful: {'a':1, 'b':3, 'f':4, 'g':2, 'h':4, 'i':1, 'j':8, 'k':5, 'l':1, 'p':3, 'q':10, 'r':1, 's':1, 't':1, 'u':1, 'v':4, 'z':10} word multipliers Expected result 'hello' [1, 1, 1, 1, 1] 8 'world' [1, 3, 1, 1, 1] 11 (As an aside, which individual word in the wordlist words.txt has the highest point value in Scrabble, without using any multipliers?) score = {"a":1, "b":3, "c":3, "d":2, "e":1, "f":4, "g":2, "h":4, "i":1, "j":8, "k":5, "l":1, "m":3, "n":1, "o":1, "p":3, "q":10, "r":1, "s":1, "t":1, "u":1, "v":4, "w":4, "x":8, "y":4, "z":10} def scrabble_value(word, multipliers): total = [] for letter in word: total.append(score[letter.lower()]) b = multipliers c = sum(map(lambda i,j:i*j, total,b)) return c ✅ Create zigzag array def create_zigzag(rows, cols, start = 1): This function creates a list of lists that represents a two-dimensional grid with the given number of rows and cols. This grid should contain the integers from start to start + rows * cols - 1 in ascending order, but so that the elements of every odd-numbered row are listed in descending order, so that when read in ascending order, the numbers zigzag through the two-dimensional grid. (The five-dollar word of the day is "boustrophedon".) rows cols Expected result 3 5 [[1,2,3,4,5],[10,9,8,7,6],[11,12,13,14,15]] 10 1 [[1],[2],[3],[4],[5],[6],[7],[8],[9],[10]] 4 2 [[1,2],[4,3],[5,6],[8,7]] def create_zigzag(rows, cols, start = 1): x = start + rows * cols - 1 l = list(range(start, (x + 1))) n = int(len(l)/rows) sl = [l[i:i+n] for i in range(0, len(l), n)] front = [j for (i, j) in enumerate(sl) if i%2 == 0] back = [j[::-1] for (i, j) in enumerate(sl) if i%2 != 0] return sorted(front + back) ✅ All cyclic shifts def all_cyclic_shifts(text): Given a text string, create and return the list of all its possible cyclic shifts, that is, strings of the same length that can be constructed from text by moving some prefix of that string to the end of that string. For example, the cyclic shifts of 'hello' would be 'hello', 'elloh', 'llohe', 'lohel', and 'ohell'. Note how every string is trivially its own cyclic shift by moving the empty prefix to the end. (Keep in mind the important principle that unlike most real world systems and processes, in computer programming zero is a legitimate possibility that your code must be prepared for, usually by doing nothing same way as in here, since doing nothing is the correct thing to do in that situation. But as we know, sometimes it can be so very hard to do nothing...) However, to make this problem more interesting rather than just one linear loop and some sublist slicing, the cyclic shifts that are duplicates of each other should be included in the answer only once. Furthermore, to make the expected result unique to facilitate our automated testing, the returned list should contain these cyclic shifts sorted in ascending alphabetical order, instead of the order in which they were generated by your consecutive cyclic shift offsets. text Expected result '01001' ['00101', '01001', '01010', '10010', '10100'] '010101' ['010101', '101010'] 'hello' ['elloh', 'hello', 'llohe', 'lohel', 'ohell'] 'xxxxxxxxx' ['xxxxxxxxx'] def all_cyclic_shifts(text): i=0 tl = list(text) result = [] while i < len(tl): tl.append(tl.pop(0)) j = "".join(tl) result.append(j) i+=1 return sorted(list(set(result))) ✅ Bingo bango bongo, I don't want to leave the Python def contains_bingo(card, numbers, centerfree = True): A two-dimensional grid of elements is represented as a list whose elements are themselves lists that represent the individual rows of the grid. This representation generalizes easily to structures of arbitrary dimensionality k by using a list whose elements represent structures of dimension k - 1. In this problem, your task is to determine whether the given 5-by-5 bingo card contains a bingo, that is, all the numbers of some row or some column are included in the list of drawn numbers. The bingo can also be in the main diagonal (from upper left to lower right) or in the main anti-diagonal (from upper right to lower left). If the parameter centerfree is True, the center square is considered an automatic match regardless of whatever number is there. card numbers [ [38, 93, 42, 47, 15], [90, 13, 41, 10, 56], [54, 23, 87, 70, 6], [86, 43, 48, 40, 92], [71, 24, 44, 1, 34] ] [1, 12, 21, 38, 45, 55, 66, 83, 97] [ [89, 23, 61, 94, 67], [19, 85, 90, 70, 32], [36, 98, 57, 82, 20], [76, 46, 25, 29, 7], [55, 14, 53, 37, 44] ] [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 13, 14, 16, 18, 19, 20, 21, 22, 23, 24, 25, 27, 28, 29, 31, 33, 35, 36, 37, 38, 39, 41, 42, 44, 45, 46, 47, 48, 49, 51, 52, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 68, 70, 71, 73, 75, 76, 77, 79, 81, 82, 84, 85, 86, 87, 88, 89, 90, 91, 94, 98] 2, 3, 4, 6, 13, 15, 16, 22, 24, 28, 40, 41, 42, 47, 49, 51, 57, 58, 62, 69, 70, 72, 84, 86, 88, 8, 19, 34, 43, 53, 65, 82, 95, centerfree Expected result True True False True def contains_bingo(card, numbers, centerfree = True): l = len(card[0]) n = numbers # Rows, columns, diagonals r1 = card[0] r2 = card[1] r3 = card[2] r4 = card[3] r5 = card[4] c1 = [row[0] for row in card] c2 = [row[1] for row in card] c3 = [row[2] for row in card] c4 = [row[3] for row in card] c5 = [row[4] for row in card] d = [card[i][i] for i in range(l)] #diagonal ad =[card[l-1-i][i] for i in range(l-1,-1,-1)] #antidiagonal cr1 = list(set(r1).intersection(n)) cr2 = list(set(r2).intersection(n)) cr3 = list(set(r3).intersection(n)) #cf cr4 = list(set(r4).intersection(n)) cr5 = list(set(r5).intersection(n)) cc1 = list(set(c1).intersection(n)) cc2 = list(set(c2).intersection(n)) cc3 = list(set(c3).intersection(n)) #cf cc4 = list(set(c4).intersection(n)) cc5 = list(set(c5).intersection(n)) cd = list(set(d).intersection(n)) #cf cad = list(set(ad).intersection(n)) #cf # Compare centerfree cf = [r3[2]] cf_cr3 = len(list(set(cr3) - set(cf))) cf_cc3 = len(list(set(cc3) - set(cf))) cf_cd = len(list(set(cd) - set(cf))) cf_cad = len(list(set(cad) - set(cf))) if centerfree == False: if len(cr1) == 5 or len(cr2) == 5 or len(cr3) == 5 or len(cr4) == 5 or len(cr5) == 5: return True if len(cc1) == 5 or len(cc2) == 5 or len(cc3) == 5 or len(cc4) == 5 or len(cc5) == 5: return True if len(cd) == 5 or len(cad) == 5: return True else: return False if centerfree == True: if len(cr1) == 5 or len(cr2) == 5 or len(cr4) == 5 or len(cr5) == 5: return True if len(cc1) == 5 or len(cc2) == 5 or len(cc4) == 5 or len(cc5) == 5: return True if cf_cr3 == 4 or cf_cc3 == 4 or cf_cd == 4 or cf_cad == 4: return True else: return False ✅ Group equal consecutive elements into sublists def group_equal(items): Given a list of elements, create and return a list whose elements are lists that contain the consecutive runs of equal elements of the original list. Note that elements that are not duplicated in the original list will still become singleton lists in the result, so that every element will get included in the resulting list of lists. items Expected result ['bob', 'bob', 7, 'bob'] [['bob','bob'], [7], ['bob']] [1, 1, 4, 4, 4, 'hello', 'hello', 4] [[1,1],[4,4,4],['hello','hello'],[ 4]] [1, 2, 3, 4] [[1], [2], [3], [4]] [1] [[1]] [] [] def group_equal(items): import itertools n = len(items) for i in range(n): return [list(j) for i, j in itertools.groupby(items)] ✅ Running median of three def running_median_of_three(items): Given a list of items that are all guaranteed to be integers, create and return a new list whose first two elements are the same as in items, after which each element equals the median of the three elements in the original list ending in that position. items Expected result [5, 2, 9, 1, 7, 4, 6, 3, 8] [5, 2, 5, 2, 7, 4, 6, 4, 6] [1, 2, 3, 4, 5, 6, 7] [1, 2, 2, 3, 4, 5, 6] [42] [42] def running_median_of_three(items): import statistics med = [] firsttwo = items[0:2] if len(items) <= 2: return items if len(items) > 2: for i in range(len(items)): med.append(statistics.median(items[i:i+3])) x = firsttwo + med return x[:-2] ✅ Detab def detab(text, n = 8, sub = ' '): In the detabbing process of converting tab characters '\t' to ordinary whitespaces, each tab character is replaced by a suitable number of whitespace characters so that the next character starts at a position that is exactly divisible by n. However, if the next character is already in a position that is divisible by n, another n whitespace characters are appended to the result. For demonstration purposes since whitespace characters might be difficult to visualize during the debugging stage, the substitution character that fills in the tabs can be freely chosen with the named argument sub that defaults to whitespace. This function should create and return the detabbed version of its parameter text. text n sub Expected result 'Hello\tthereyou\tworld' 8 '$' 'Hello$$$thereyou$$$$$$$$world' 'Ilkka\tMarkus\tKokkarinen ' 4 '-' 'Ilkka---Markus--Kokkarinen' 'Tenser,\tsaid\tthe\ttenso r' 5 '+' 'Tenser,+++said+the++tensor' People vary greatly on their preference for the value of n, which is why we make it a named argument in this problem. Some people prefer n = 4, some others like a wider choice of n = 8, whereas your instructor likes the tight n = 2 best. To each his or her own. def detab(text, n = 8, sub = ' '): sign = n * sub return "".join([i.replace("\t", sign) for i in text]) ✅ Reverse every ascending sublist def reverse_ascending_sublists(items): Create and return a new list that contains the same elements as the argument list items, but reversing the order of the elements inside every maximal strictly ascending sublist. Note that if nothing more, each element by itself is the trivial ascending sublist of length one. This function should not modify the contents of the original list, but create and return a new list object that contains the result. (In the table below, different colours are used for illustrative purposes to visualize the strictly ascending sublists, and of course are not part of the actual argument given to the function.) items Expected result [1, 2, 3, 4, 5] [5, 4, 3, 2, 1] [5, 7, 10, 4, 2, 7, 8, 1, 3] [10, 7, 5, 4, 8, 7, 2, 3, 1] [5, 4, 3, 2, 1] [5, 4, 3, 2, 1] [1, 2, 2, 3] [2, 1, 3, 2] def reverse_ascending_sublists(items): if items == []: return [] result = [[items[0]]] for i in range(1, len(items)): if items[i-1] < items[i]: result[-1].append(items[i]) else: result.append([items[i]]) reverse_result = [j[::-1] for (i, j) in enumerate(result, 1)] flatten = lambda *n: (e for a in n for e in (flatten(*a) if isinstance(a, (tuple, list)) else (a,))) return list(flatten(reverse_result)) ✅ Brangelina def brangelina(first, second): The task of combining the first names of celebrity couples into a short catchy name for media consumption turns out to be surprisingly simple to automate. Start by counting how many groups of consecutive vowels (aeiou, since to keep this problem simple, the letter y is always a consonant) there are inside the first name. For example, 'brad' and 'ben' have one group, 'sheldon' and 'britain' have two, and 'angelina' and 'alexander' have four. Note that a vowel group can contain more than one consecutive vowel, as in 'juan'. If the first name has only one vowel group, keep only the consonants before that group and throw away everything else. For example, 'ben' becomes 'b', and 'brad' becomes 'br'. Otherwise, if the first word has n > 1 vowel groups, keep everything before the second last vowel group n - 1. For example, 'angelina' becomes 'angel' and 'alexander' becomes 'alex'. Concatenate that with the string you get by removing all consonants from the beginning of the second name. All names given to this function are guaranteed to consist of the 26 lowercase English letters only, and each name will have at least one vowel and one consonant somewhere in it. first second Expected result 'brad' 'angelina' 'brangelina' 'sheldon' 'amy' 'shamy' 'frank' 'ava' 'frava' 'britain' 'exit' 'brexit' 'bill' 'hillary' 'billary' 'donald' 'melania' 'delania' (These simple rules do not always produce the catchiest possible result. For example, 'ross' and 'rachel' meld into 'rachel' instead of 'rochel', and 'joey' and 'phoebe' meld into 'joebe' instead of 'joeybe'. The reader is invited to think up more advanced rules that cover a wider variety of such possible name combinations and special cases.) def brangelina(first, second): vowels = "aeiou" n2 = second[sorted([(second+vowels).index(vowel) for vowel in vowels])[0]:] s = [] for i,letter in enumerate(first): s.append([letter,'!'][i!=0 and first[i-1] in vowels]) onlyv = [letter for letter in s if letter in vowels] if len(onlyv)==1: # 1 vowel group n1 = first[:first.index(onlyv[0])] ship = "".join(n1+n2) return str(ship) else: # > 1 vowel groups n1 = first[:first.index(onlyv[-2])] ship = "".join(n1+n2) return str(ship) ✅ Expand positive integer intervals def expand_intervals(intervals): An interval of consecutive positive integers can be succinctly described as a string that contains first and last value, inclusive, separated by a minus sign. (This is why this problem is restricted to positive integers, so that there is no ambiguity between the minus sign character used as a separator and an actual unary minus sign of an integer.) For example, the interval that contains the numbers 5, 6, 7, 8, 9 can be concisely described as '5-9'. Multiple intervals can be described together by separating these descriptions with commas. An interval that contains only one value is given as only that value. Given a string that contains one or more such comma-separated interval descriptions, guaranteed not to overlap with each other and to be listed in sorted ascending order, create and return the list that contains all the integers contained by these intervals. intervals Expected result '4-6,10-12,16' [4, 5, 6, 10, 11, 12, 16] '1,3-9,12-14,9999' [1, 3, 4, 5, 6, 7, 8, 9, 12, 13, 14, 9999] def expand_intervals(intervals): rs = (x.split("-") for x in intervals.split(",")) return [i for r in rs for i in range(int(r[0]), int(r[-1]) + 1)] ✅ Bridge hand shape def bridge_hand_shape(hand): In the card game of bridge, each player receives a hand that contains thirteen cards. The shape of the hand is the distribution of these cards into the four suits in the exact order of spades, hearts, diamonds, and clubs. Given a bridge hand encoded as in the example script cardproblems.py, return the list of these four numbers. For example, given a hand that contains five spades, no hearts, five diamonds and three clubs, this function should return [5, 0, 5, 3]. hand Expected result [('eight', 'spades'), ('king', 'diamonds'), [4, 2, 5, 2] ('ten', 'diamonds'), ('trey', 'diamonds'), ('seven', 'spades'), ('five', 'diamonds'), ('deuce', 'hearts'), ('king', 'spades'), ('jack', 'spades'), ('ten', 'clubs'), ('ace', 'clubs'), ('six', 'diamonds'), ('trey', 'hearts')] [('ace', 'spades'), ('six', 'hearts'), ('nine', 'spades'), ('nine', 'diamonds'), ('ace', 'diamonds'), ('trey', 'diamonds'), ('five', 'spades'), ('four', 'hearts'), ('trey', 'spades'), ('seven', 'diamonds'), ('jack', 'diamonds'), ('queen', 'spades'), ('king', 'diamonds')] def bridge_hand_shape(hand): cards = [] dist = [] for a, b in hand: cards.append(b) s = cards.count('spades') dist.append(s) h = cards.count('hearts') dist.append(h) d = cards.count('diamonds') dist.append(d) c = cards.count('clubs') dist.append(c) return dist [5, 2, 6, 0] ✅ Count consecutive summers def count_consecutive_summers(n): Positive integers can be expressed as sums of consecutive positive integers in various ways. For example, 42 can be expressed as such a sum in four different ways: (a) 3 + 4 + 5 + 6 + 7 + 8 + 9, (b) 9 + 10 + 11 + 12, (c) 13 + 14 + 15 and (d) 42. As the last solution (d) shows, any positive integer can always be trivially expressed as a singleton sum that consists of that integer alone. Given a positive integer n, determine how many different ways it can be expressed as a sum of consecutive positive integers. n Expected result 42 4 99 6 92 2 (As an aside, how would you concisely characterize the positive integers that have exactly one such representation? On the other side of this issue, which positive integer between one and one billion has the largest possible number of such breakdowns?) def count_consecutive_summers(n): count = 0 x=1 while (x*(x + 1) < 2*n): y = (1 * n - (x * (x + 1) ) / 2) a = y / (x + 1) if (a - int(a)) == 0: count += 1 x += 1 return count + 1 ✅ Iterated removal of consecutive pairs def iterated_remove_pairs(items): Given a list of elements, create and return a new list that contains the same elements except all occurrences of pairs of consecutive elements have been removed. However, this operation must continue in iterated fashion so that whenever removing some pair of consecutive elements causes two equal elements that were originally apart from each other to end up next to each other, that new pair is also removed, and so on, until nothing more can be removed. items Expected result [1, 2, 3, 4, 5, 5, 4, 3, 2, 1] [] ['bob', 'tom', 'jack', 'jack', 'bob', 42] ['bob', 'tom', 'bob', 42] [42, 42, 42, 42, 42] [42] [42, 5, 8, 2, 99, 99, 2, 7, 7, 8, 5] [42] def iterated_remove_pairs(items): i=0 prev = None while i < len(items): number = items[i] if number == prev: items.pop(i) items.pop(i - 1) i -= 1 else: i += 1 if i: prev = items[i - 1] else: prev = None return items print(iterated_remove_pairs([42, 5, 8, 2, 99, 99, 2, 7, 7, 8, 5])) ✅ Longest palindrome substring def longest_palindrome(text): A string is a palindrome if it reads the same forward and backward, for example 'racecar'. Given text, find and return the longest consecutive substring inside text that is a palindrome. If there exist multiple palindromes of the same largest possible length, return the leftmost one. text Expected result 'saippuakauppias' 'saippuakauppias' 'abaababaaabbabaababababaa' 'aababababaa' 'xxzxxracecar' 'racecar' 'xyxracecaryxy' 'racecar' def longest_palindrome(text): import itertools lp = '' lenlp = 0 lt = len(text)+1 for start, stop in itertools.combinations(range(lt), 2): x = text[start:stop] if (len(x) > lenlp) and (x == x[::-1]): lp, lenlp = x, len(x) return str(lp) ✅ Sort array by element frequency def frequency_sort(elems): Sort the given integer array elems so that its elements end up in the order of decreasing frequency, that is, the number of times that they appear in elems. If two elements have the same frequency, they should end up in the ascending order of their element values with respect to each other, as is usual in sorting. elems Expected result [4, 6, 2, 2, 6, 4, 4, 4] [4, 4, 4, 4, 2, 2, 6, 6] [4, 6, 1, 2, 2, 1, 1, 6, 1, 1, 6, 4, 4, 1] [1, 1, 1, 1, 1, 1, 4, 4, 4, 4, 6, 6, 6, 2, 2] [17, 99, 42] [17, 42, 99] ['bob','bob','carl','alex','bob'] ['bob','bob','bob','alex','carl'] (Hint: create a dictionary to count how many times each element occurs inside the array, and then use the counts stored in dictionary as sorting key of the array elements, breaking ties on the frequency by using the element value.) def frequency_sort(elems): from collections import Counter counts = Counter(elems) newlist = sorted(counts.most_common(), key=lambda x: (-x[1], x[0])) sl = [([v] * n) for (v, n) in newlist] return [item for sublist in sl for item in sublist] ✅ Collapse positive integer intervals def collapse_intervals(items): This function is the inverse of the earlier question of expanding positive integer intervals. Given a nonempty list of positive integers that is guaranteed to be in sorted ascending order, create and return the unique description string where every maximal sublist of consecutive integers has been condensed to the notation first-last. If some maximal sublist contains only one integer, it is included in the result by itself without the minus sign separating it from the now redundant last number. Make sure that the string that your function returns does not contain any whitespace characters, and does not have an extra comma either in the beginning or end. items Expected result [1, 2, 6, 7, 8, 9, 10, 12, 13] '1-2,6-10,12-13' [42] '42' [3, 5, 6, 7, 9, 11, 12, 13] '3,5-7,9,11-13' [] '' list(range(1, 1000001)) '1-1000000' def collapse_intervals(items): prev = min(items) if items else None n = list() for number in sorted(items): if number != prev+1: n.append([number]) elif len(n[-1]) > 1: n[-1][-1] = number else: n[-1].append(number) prev = number return ','.join(['-'.join(map(str, list_item)) for list_item in n]) ✅ Sort words by typing handedness def sort_by_typing_handedness(words): When typing an English word on a standard keyboard, the left hand types in the letters q to y in the top row, a to h in the middle row, and z to b in the bottom row. Other letters are typed in with the right hand. For the purposes of this problem, we define the score of a word as the number of its left hand characters, minus the number of its right hand characters. For example, the score for the word 'Repetition' would be 5 - 5 = 0. Sort the given list of words in the descending order of their score. To make the resulting list unique for the purposes of automated testing, words that have an equal score should be sorted in their normal alphabetical order. (As an aside curiosity, using the words given in words.txt, what words have the highest and lowest such scores?) def sort_by_typing_handedness(words): lenwords = [len(i) for i in words] lh = [int(y) if y.isdigit() else y for y in [x.replace('u', '').replace('i', '').replace('o', '').replace('p', '').replace('j', '').replace('k', '').replace('l', '').replace('n', '').replace('m', '') for x in words]] lenlh = [len(i) for i in lh] score = [lenlh - (lenwords - lenlh) for lenwords, lenlh in zip(lenwords, lenlh)] #Create dictionary zl = zip(words, score) dzl = dict(zl) #Sort by score items = [(v, k) for k, v in dzl.items()] items.sort() items.reverse() # so largest is first items = [(k, v) for v, k in items] #Solve ties alphabetically sortitems = sorted(items, key=lambda tup: (-tup[1], tup[0])) result = [(v) for v, k in sortitems] return result ✅ Reversing the reversed def reverse_reversed(items): Create and return a new list that contains the items in reverse, but so that whenever each item is itself a list, its elements are also reversed. This reversal of sublists must keep going on all the way down, no matter how deep the nesting of these lists, so you necessarily have to use recursion to solve this problem. The base case handles any argument that is not a list. When items is a list (use the Python function type or the isinstance operator to check this), recursively reverse the elements that this nested list contains. (List comprehensions might come handy in solving this problem.) Note that this function must create and return a new list to represent the result, and should not rearrange or otherwise touch the contents of the original list. items Expected result [1, [2, 3, 4, 'yeah'], 5] [5, ['yeah', 4, 3, 2], 1] [[[[[[1, 2]]]]]] [[[[[[2, 1]]]]]] [42, [99, [17, [33, ['boo!']]]]] [[[[['boo!'], 33], 17], 99], 42] def reverse_reversed(items): if isinstance(items, list): return list(reverse_reversed(x) for x in reversed(items)) else: return items ✅ Flatten an arbitrarily deep list def flatten(items): Given a list that contain arbitrary lists as element, which in turn can contain arbitrary lists as elements, and so on, create and return a new list that contains all the atomic (that is, anything that is not a list) elements without any nesting. This particular problem is another classic textbook exercise in recursion in all programming languages. The base cases are the empty list and whenever items is atomic. Otherwise, build a list by appending all elements from the recursively flattened elements of items. items Expected result [1, [2, 3, [4, 'bob', 6], [7]], [8, 9]] [1, 2, 3, 4, 'bob', 6, 7, 8, 9] [ [ [ [ [ [ ] ] ] ] ] ] [ ] [[], [[[[89, 85, 72, 84, 65], [[88, 31, 64, 11, 60, 42, 57, 55], 16, [79, 34, 82], [], 94, 36, [89, 26, 39, 94, 47, 72, 30], [72, 3, 73]], 18]], [[37, [51, 75, 83], 94, 57]], [37, 10, 62, 62], [[], 13]]] [89, 85, 72, 84, 65, 88, 31, 64, 11, 60, 42, 57, 55, 16, 79, 34, 82, 94, 36, 89, 26, 39, 94, 47, 72, 30, 72, 3, 73, 18, 37, 51, 75, 83, 94, 57, 37, 10, 62, 62, 13] def flatten(items): newlist = [] def loop(go): for i in go: if isinstance(i, list): loop(i) else: newlist.append(i) loop(items) return newlist ✅ Prime factors of integers def prime_factors(n): As the fundamental theorem of arithmetic again tells us, every positive integer can be broken down into the product of its prime factors exactly one way. Given positive integer n > 1, return the list of its prime factors in sorted ascending order, each prime factor included in the list as many times as it appears in the prime factorization of n. During the testing, the automated tester is guaranteed to produce the pseudo-random values of n in ascending order. The difficulty in this problem is making the entire test finish in reasonable time by caching the prime numbers that you discover along the way the same way as was done in primes.py example script, but also caching the prime factorizations of numbers for quick access later. n Expected result 42 [2, 3, 7] 10**6 [2, 2, 2, 2, 2, 2, 5, 5, 5, 5, 5, 5] 1234567 [127, 9721] 99887766 [2, 3, 11, 31, 48821] 10**12 - 1 [3, 3, 3, 7, 11, 13, 37, 101, 9901] 10**15 - 3 [599, 2131, 3733, 209861] def prime_factors(n): import math result = [] while n % 2 == 0: result.append(2), n=n/2 sqr = int(math.sqrt(n))+1 for i in range(3, sqr, 2): while n % i== 0: result.append(i), n=n/i if n > 2: result.append(n) return [int(i) for i in result] Python Notes Lab 1: Using Python as a calculator Zero. For each of the following expressions, first read it and without actually executing it in Python, use your brain to reason what result that expression would evaluate to. Once you have "placed your bet", so to speak, try out that expression in the interactive Python interpreter to see if your reasoning was correct. (Some of these expressions will fail to evaluate and throw some sort of exception instead. For those erroneous expressions, you can think how you would modify the expression so that it will correctly produce the result that was probably intended.) (a) 11 / 3 3.666… (b) 11 // 3 3 (c) 11 / 0 Traceback Error (d) 11 // 0 Traceback Error (e) 11 % 3 2 (f) 11 % -3 -1 (g) -11 % 3 1 (h) 3 * (11 // 3) + 11 % 3 11 (i) "Hello" * 4 ‘HelloHelloHelloHello’ (j) "Hello" + 4 Traceback Error (str vs int) (k) "Hello" – 4 Traceback Error (unsupported operand) (l) type("Hello") str (m) type(type("Hello")) type (n) type(type) type (o) "42" + "99" 4299 (p) 42 + int(str(99)) 141 (q) (2+3J).real 2.0 (r) (5-4J).imag -4.0 (s) 42 < 17 False (t) "Python" > "Java" TRUE One. Use the interactive Python prompt to solve the following word problems, using Python as a simple interactive calculator. However, you should define all the necessary quantities as names before using them in the formula that gives you the result, so that your formulas do not have any numerical values in then but properly talk about the symbolic names that you defined, combined with some basic arithmetic operations used to compute the result. (a) If each bucket has a capacity to hold 5 gallons of water, and you have 15 such buckets, how much water can you store in these buckets? bucket = 5 totalbuckets =15 water = (bucket*totalbuckets) print(water) (b) If your car can get 25 miles per gallon, and you drive 55 miles per hour, how many hours can you drive if you have 10 gallons of gas in the tank? gallon = 25 hour = 55 totalgas = 25*10 totalhours = totalgas/hour print(totalhours) (c) If Bob can paint the fence in five hours working alone, whereas Tom can paint that same fence in three hours working alone, how many hours would it take these two men to paint that same fence working together? (No, the answer is not four hours, the average of five and three. Surely having both men do this work cannot take longer than having only Tom do it! Math hint.) If you want to compute the exact result instead of merely its floating point decimal number approximation, first do from fractions import Fraction, and then define and use suitable Fraction objects to calculate the answer as an exact integer fraction, as seen in the first lecture. from fractions import Fraction bob=1/5 tom=1/3 teamwork = Fraction(1,5)+Fraction(1,3) teamworkhours = Fraction(15,15)/Fraction(teamwork) print(teamworkhours) print(float(teamworkhours)) Two. Write a Python script that asks the user to enter the width and the height of a rectangle, and then prints out the area and the perimeter length of that rectangle. You need to use the Python built-in function int to convert the input string into an integer. When run, the output should look something like this: Please enter width: 5 Please enter height: 6 The area is 30. The perimeter length is 22. width = int(input("Please enter width:")) height = int(input("Please enter height:")) perimeter = width*height plength = (width*2)+(height*2) print("The area is",perimeter) print("The perimeter length is",plength) Three. Write a script that asks the user to enter the radius of a circle in meters. This time, use the Python function float to convert the input string into a decimal number. The script should then print the area and the perimeter of the circle, computed using the geometric formulas that you learned in school. Since these formulas use the constant pi, you should first import the math module in your script and use the pi defined there, instead of hardcoding that quantity as a number in your source code. When run, the output should look something like this, with the results properly rounded to two decimal places: Please enter radius in meters: 4.5 The area of the circle is 63.62 square meters. The perimeter of the circle is 28.27 meters. To perform the required rounding, note that inside a Python 3 formatted f-string, each expression inside the curly braces can optionally be followed by a colon character and various control codes to control the textual appearance of that value. This will come handy when printing out decimal numbers. For example, if you have defined x = 1.234567, the expression f"{x:.2f}" would evaluate to "1.23", and the expression f"{x:.5f}" would evaluate to "1.23457". (Note how these control codes know to round the decimal numbers properly based on the chopped off digit value. Python is so smart under the hood.) import math math.pi radius = float(input("Please enter radius in meters:")) area = math.pi * radius * radius perimeter = 2 * math.pi * radius print("The area of the circle is", (f"{area:.2f}"), "square meters.") print("The perimeter of the circle is", (f"{perimeter:.2f}"), " meters.") Four. So far we have performed computations only on numbers. Since text is at least equally important for us meat machines, it is time to check out how Unicode characters work. Inside a Python string, each character is actually stored as an integer (but then again, inside the computer memory, everything is somehow encoded and stored as integers), the so-called Unicode codepoint of that particular character. The Unicode standard guarantees that the codepoint for every possible character recognized by this standard will be the same in every standard-compliant system, which guarantees that all textual data will be 100% portable across all such systems. Write a script that first asks the user to enter a Unicode codepoint as an integer, and then prints the Unicode character for that codepoint. The Python built-in function chr can be used to convert an integer codepoint to the corresponding Unicode character. (The built-in function ord performs the opposite conversion from character to its integer codepoint.) When run, the output should look something like this: Please enter codepoint: The codepoint 12346 encodes the character 〺. 12346 (If you try out your script by entering some Unicode codepoints, especially from the higher planes of this worldwide standard, remember that most fonts do not define any visible glyph for the majority of all possible Unicode characters. Such characters would then get displayed with the "empty box" character that is used as the default glyph whenever the real glyph is not available in that particular font.) codepoint = int(input("Please enter codepoint:")) unicode = chr(codepoint) print("The codepoint",codepoint,"encodes the character",unicode) Five. Check out the example script timedemo.py to learn how to define date objects that represent particular dates. (For now, only read up to the part where this script computes how many days ago Ludwig van Beethoven was born, and ignore the if-else calculation at the end of the script used to find out how many days we have to wait until we get to again celebrate his birthday.) After that, define two date objects today and dob that represent the dates of the current day and your date of birth. Compute the difference today - dob (the result will be an object of the type timedelta that is used to represent lengths of periods of time such as "seven days" without anchoring that time period to any particular start time) and find out how many days old you are today. Out of curiosity, what day in the future will you be exactly twice as old as you are today? (To find this out, add two times your previously computed age to your date of birth.) from datetime import * now = date.today() print(f"Today's date in standard form is {now}.") print(now.strftime("Today is a %A. It is the month of %B in the year %Y.")) samreen = date(1997, 5, 17) td = now - samreen print(f"Samreen was born exactly {td.days} days ago.") futureage = 2*td + samreen print(f"Samreen will be 42 on {futureage}.") Six. (ADVANCED ONLY) For further handling and analysis of dates, check out the documentation of the calendar library. Use its functions to find out what day of the week you were born, and what day of the week will begin the far future year of 1,000,000 A.D. import calendar weekday = calendar.weekday(1997, 5, 17) print(f"Samreen was born on day of the week number {weekday}.") Scripts 42 + 42 vs print(42+42) 84 vs print(“42”+ “42”) 4242 print(42+1234*5678 - 987) 7005707 Names and objects ● ● dir () brings up named objects ○ bob = 42 dir() [‘_builtins_’, ‘_name_’, ‘bob’] ● Anything bw underscored (‘__x__’) is internal. Corresponds to operator ● bob = 42 tom = bob + 7 print (bob) #42 print (tom) #49 ● bob = bob.capitalize() print(bob) Bob ● bob = bob.replace(‘o’, ‘xyz’) print(bob) BxyzB ● id() returns the identity of an object, which is a unique and constant integer during its lifetime. ○ bob = "hello world" tom = "hello world!" jack = bob print(id(bob)) print(id(tom)) print(id(jack)) 193701 189328 391083 a = "9999" b = str(9999) print(f"a == b: {a==b}\na is b: {a is b}") # a == b: True # a is b: False # Actual memory addresses are dependent on underlying system. The # only thing guaranteed is that separate objects will have separate # memory addresses. print(f"id(a) equals {id(a)}, id(b) equals {id(b)}") a = 10**100 b = 10**100 print("Results for one googol:") print(f"a == b: {a==b}\na is b: {a is b}") # id(a) equals 140107610442528, id(b) equals 140107605483904 # Whenever a variable appears in an expression, its value is substituted in its place. Therefore, id(a) gives you the memory address where the object that a currently points to is stored. There is no mechanism to acquire the memory address where the variable itself is stored in the # memory. Assignment operator ● input() ○ name = input("please enter name:") print ("hello there,", name) ○ x = int(input("please enter integer:")) y = x * x print("its square is",y) Imports ● Import math dir() math.sqrt(7) 2.645 math.log(5) 1.609 ● random.randint(1,6) Look for object randint in random Generates random integers from lower range 1 to upper range 6 Some handy library modules ● However, the floating point encoding cannot represent exactly even some numbers that we humans would consider "simple". For example, the expression 0.1 + 0.2 does not equal 0.3, but equals 0.30000000000000004. This is because none of the three seemingly simple values 0.1, 0.2 and 0.3 has an exact representation in floating point using base two, even though they are trivially representable exactly in base ten. ● a = 0.1 b = 0.2 print(a+b) 0.3000000000004 ● a = 0.1 b = 0.2 print(f"{a+b:.2f}") 0.30 print(11 / 4) print(11 / -4) print(11 // 4) print(11 // -4) print(11 % 4) print(-11 % 4) print(11 % -4) # # # # # # # 2.75, normal division -2.75 2, integer division -3, not -2 3, % is the integer division remainder operator 1, not -3, to satisfy x % y == x - y * (x // y) -1 (see the previous item why) ● Another module fractions defines the datatype Fraction for exact integer fractions. For example, 3 * Fraction(1, 3) equals exactly 1, not one iota more or less. ○ From fractions import Fraction f1 = Fraction(1,3) f2 = Fraction (-7,11) f1+f2 Fraction (-10,33) f1 * f2 Fraction (-7, 33) ○ bob = math bob.sqrt(7) 2.645 ● To generate random numbers or choose random items from sequences, use the functions defined in the library module random. ● The modules datetime and calendar define data types and functions for calculations on dates and times, often needed in real life applications. ● The modules os and sys define functions that allow your code to access the underlying computer and its file system. from fractions import Fraction from decimal import * from math import fsum def demonstrate_imprecision(): # Floating point can represent exactly only terminating series of # powers of two. All three numbers are unrepresentable. if 0.1 + 0.2 == 0.3: print("Mathematics works as it should, no problemo!") else: print("Basic arithmetic has gone topsy turvy!") # Decimal to the rescue! if Decimal(0.1) + Decimal(0.2) == Decimal(0.3): print("Decimal works just fine!") else: print("The Decimals! They do nothing!") # Always create a Decimal object from a string, otherwise you get # an exact representation of a number that is already inexact. if Decimal("0.1") + Decimal("0.2") == Decimal("0.3"): print("All is well in the world again!") # Precision of floating point does not work out well when adding # numbers of vastly different magnitudes. print(sum([3e30, 1, -3e30])) # 0.0 # Better. print(fsum([3e30, 1, -3e30])) # 1.0 def decimal_precision(): a = Decimal("0.2") b = Decimal("0.3") print(a * b) # 0.06 a = Decimal("0.2000") b = Decimal("0.3000") print(a * b) # 0.06000000 a = Decimal("0.12345") b = Decimal("0.98765") print(a * b) # 0.1219253925 getcontext().prec = 5 a = Decimal("0.12345") b = Decimal("0.98765") print(a * b) # 0.12193 # Unlike machine floats, Decimal values cannot under- or overflow a = Decimal("0.1") ** 1000 print(a) #1E-1000 a = Decimal("2") ** 10000 print(a) #1.9951E+3010 def infinity_and_nan_demo(): ninf = float('-inf') pinf = float('inf') print(f"Plus infinity minus 7 equals {(pinf - 7)}.") #inf print(f"Plus infinity plus infinity equals {(pinf + pinf)}.") #inf print(f"Minus infinity times minus 4 equals {(ninf * -4)}.") #inf nan = pinf + ninf print(f"Infinity minus infinity equals {nan}.") #nan print(f"nan equals nan is {nan == nan}.") #False def harmonic(n): total = Fraction(0) for i in range(1, n+1): total += Fraction(1, i) return total def estimate_pi(n): total = Fraction(0) for i in range(1, n+1): total += Fraction(1, i*i) return total * 6 Text strings ● In reality, computers only store small integers inside memory bytes. Any other type of data must be internally expressed as an aggregate series of bytes. There are standardized ways of doing this for floating point decimal numbers and text strings. ● a = 4 b = 1.2 ans = f"a is {a}, b is {b}" print(ans) a is 4, b is 1.2 ● a = 4 b = 1.2 ans = f"a squared is {a*a}, b is {b }" print(ans) a is 16, b is 1.2 s1 = "Hello there,\tworld!" print(s1.title()) #Hello There, World! print(s1.upper()) # HELLO THERE, WORLD! print(s1.lower()) #hello there, world! print(s1.capitalize()) #Hello there, world! # Add whitespace to left, right or both sides to reach given length. print('1234'.center(10)) print('1234'.ljust(10)) print('1234'.rjust(10)) Building up string literals from smaller pieces ● When printing out floating point numbers, always round them to some reasonable number of decimal places. For example, print(f"{0.1 + 0.2:.1f}") correctly prints 0.3, even though the actual underlying value is something slightly different. String slicing ● txt = "Hello world!" print(len(txt)) 12 ● To extract a single character from a sequence, use square brackets with one offset. ○ print(txt[4]) o ○ print(txt[0:5]) Hello * includes space at end ○ print(txt[0:-2]) Hello worl ○ print(txt[3:]) lo world! ○ print(txt[::3]) Hlwl *skips by 3 ● Negative offsets would be logically impossible by definition. Therefore, as a handy convenience, they are defined to denote the positions starting from the end instead of the beginning. For example, "Hello"[-4:-1] would evaluate to "ell". ● Step size can be given as third operand to slicing, with the negative step size defined to iterate the elements in reverse order. For example, "Hello"[0::2] evaluates to "Hlo", and "Hello"[::-1] evaluates to "olleH". ● When using the colon character, leaving out the start offset means to start from beginning, and leaving out the end offset means to continue all the way to the end. For example, 'Hello world'[3:] evaluates to 'lo world'. Leaving out the start offset similarly means "all the way from beginning", but we know that such offset would always be zero. One. Same way as in the first lab, for each of the following expressions, first read it and without executing it in Python, reason what result it would evaluate to. Then try it out in the interactive Python interpreter to see if you were correct. (a) [1, 2, 3] + [4, 5, 6] [1, 2, 3, 4, 5, 6] (b) a = [1, 2, 3]; a.append([4, 5, 6]); print(a) [1, 2, 3, [4, 5, 6]] (c) a = [1, 2, 3]; a.extend([4, 5, 6]); print(a) [1, 2, 3, 4, 5, 6] (d) len([]) 0 (e) list(range(5, 0, -1)) [5, 4, 3, 2, 1] (f) len(range(0, 5)) 5 (g) sum(range(0, 5)) 10 (bc list(range(0,5)) = [0, 1, 2, 3, 4] (h) max([17, 99, 42]) 99 (i) a = [1,2,3,4]; a.reverse(); print(a) [4, 3, 2, 1] (j) max(["Hello", "there", "world,", "how", "are", "yeh?"]) ‘yeh?’ (alphabetically?) (k) "".join(["Hello", "there", "world,", "how", "are", "yeh?"]) 'Hellothereworld,howareyeh?' (l) "Hello world"[3:6] ‘lo (m) "Hello world"[:] ‘Hello world’ If population is a list, this line will create a copy of the list. For an object of type tuple or a str, it will do nothing (the line will do the same without [:]) (n) "Hello world"[-1:-5:-1] ‘dlro’ Last indicates going backwards, middle indicates stop before fifth character, first indicates first letter from back. (o) "Hello world"[::-1] ‘dlrow olleH’ (p) 42 in range(0, 100) True (q) '7' in str(123456) False Two. Right out of the box, Python strings define a whole bunch of useful methods to perform operations on the texts that these objects represent. Start by making the assignment x = "Hello, world!". Consulting the Python library documentation section "Text sequence type str", for each of the following expressions, first read it and guess what result it would seem to evaluate to, assuming that each method does pretty much what its name promises that it will do. Then try it out in the interactive Python interpreter to again see if you guessed correctly. (a) x.upper() HELLO, WORLD (b) (c) (d) (e) (f) (g) (h) x.lower() hello, world x.capitalize() Hello, world! – first char x.title() Hello, World! entire title x.swapcase() hELLO WORLD x.startswith("Hel") True x.find("o") 4 x.rfind("o") 8 (last ‘o’ – from right to left vs left to right) (i) x.isupper() FALSE (checks all chars) (j) x.islower() FALSE (checks all chars) (k) x.isnumeric() FALSE (l) x.replace("l", "xyz") Hexyzxyzo, worxyzd! (m) x.split(" ") ['Hello,', 'world!'] # String Problems # Convert the words in the text to title case. (Not same as uppercase.) def title_words(text): prev = ' ' result = "" for c in text: if prev.isspace() and not c.isspace(): result += c.title() else: result += c prev = c return result print(title_words("hi hey ho")) #Hi Hey Ho # Given a text string, create and return another string that contains # each character only once, in order that they occur in the text. def unique_chars(text): result = "" seen = set() for c in text: if c not in seen: result += c seen.add(c) return result print(unique_chars("hi hey ho")) #hi eyo # The classic way to test whether two strings are anagrams. They are # if and only if sorting both gives the same end result. def are_anagrams(word1, word2): if len(word1) != len(word2): return False return list(sorted(iter(word1))) == list(sorted(iter(word2))) print(are_anagrams("listen","silent")) #True print(are_anagrams("listen","selent")) #False # The string module has handy data and methods for text processing. from string import ascii_letters as letters from string import ascii_uppercase as au from string import ascii_lowercase as al #au = "hello what is up right now" #al = "hello what is up right now" au_c = au[13:] + au[:13] #print(au_c) # up right nowhello what is al_c = al[13:] + al[:13] #print(al_c) # up right nowhello what is # Obfuscate the given text using the ROT-13 encoding: # simple letter substitution cipher that replaces a letter with the 13th letter after it def rot13(text): result = "" for c in text: idx = au.find(c) # Is c an uppercase character? if idx > -1: result += au_c[idx] else: idx = al.find(c) # Is c a lowercase character? if idx > -1: result += al_c[idx] else: result += c # Other characters are taken as is. return result print(rot13("hello")) #uryyb from random import choice # Given a sentence and a function f that converts one word, translate # the entire sentence. Since whitespace and punctuation must be kept # as they were in the original sentence, we can't just use "split" to # separate the sentence into words, since this would lose the track # of what the whitespace and punctuation were. def translate_words(sentence, f): result = "" word = "" for c in sentence: is_letter = c in letters if is_letter: # add the letters into the word word += c elif len(word) > 0 and not is_letter: # non-letter ends the word result += f(word) + c # add the translated word word = "" else: result += c # non-letters added to result as is if len(word) > 0: # the possibly remaining word at end of sentence result += f(word) return result # Convert the given sentence to pig latin. Note how the function to # convert one word is defined inside this function, to be passed to # the previous translate_words function as its second argument f. def pig_latin(sentence): def trans(word): cap = word[0].isupper() idx = 0 while idx < len(word) and word[idx] not in "aeiouAEIOUY": idx += 1 if idx == 0: # the word starts with vowel return word + "way" else: if cap: head = word[idx].upper() + word[idx + 1:] else: head = word[idx:] return head + word[:idx].lower() + "ay" return translate_words(sentence, trans) print(pig_latin("hey what is up guys")) #eyhay atwhay isway upway uysgay # Convert the given sentence to ubbi dubbi. Same logic as previous. def ubbi_dubbi(sentence): def convert(c): if c in 'aeiouyAEIOUY': if c.isupper(): return "Ub" + c.lower() else: return "ub" + c else: return c def trans(word): return "".join([convert(c) for c in word]) return translate_words(sentence, trans) print(ubbi_dubbi("hey guys")) #hubeuby gubuubys print(ubbi_dubbi("HEY GUYS")) #HUbeUby GUbuUbyS # The trickiest conversion gives us a choice of how to convert each # letter. Let us maintain a dictionary that maps each letter to the # list of the possibilities. def tutnese(sentence): reps = { "b": ["bub"], "c": ["cash", "coch"], "d": ["dud"], "f": ["fuf", "fud"], "g": ["gug"], "h": ["hash", "hutch"], "j": ["jay", "jug"], "k": ["kuck"], "l": ["lul"], "m": ["mum"], "n": ["nun"], "p": ["pup", "pub"], "q": ["quack", "queue"], "r": ["rug", "rur"], "s": ["sus"], "t": ["tut"], "v": ["vuv"], "w": ["wack", "wash"], "x": ["ex", "xux"], "y": ["yub", "yuck"], "z": ["zub", "zug"] } def trans(word): result = "" skip = False for idx in range(len(word)): if skip: skip = False continue c = word[idx].lower() if idx < len(word) - 1 and c == word[idx + 1].lower(): if c in "aeiouy": dup = "squat" else: dup = "squa" if word[idx].isupper: dup = dup[0].upper() + dup[1:] result += dup + c skip = True # skip the duplicated letter after this one else: if c in reps: rep = choice(reps[c]) if word[idx].isupper(): rep = rep[0].upper() + rep[1:] result += rep else: result += word[idx] return result result = translate_words(sentence, trans) return result print(tutnese('Do you know the famous Variety headline "Stix nix hix pix"?')) #Dudo yuckou kucknunowash tuthashe fudamumousus Vuvarugietutyuck hasheadudlulinune "Sustutiex nuniex hutchiex pupiex"? # Idea taken from Rosetta Code, so I added this solution in there. # https://rosettacode.org/wiki/Number_names#Python def int_to_english(n): if n < 0: return "minus " + int_to_english(-n) if n < 20: return ["zero", "one", "two", "three", "four", "five", "six", "seven", "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"][n] if n < 100: tens = ["twenty", "thirty", "forty", "fifty", "sixty", "seventy", "eighty", "ninety"][(n // 10 - 2)%10] if n % 10 != 0: return tens + "-" + int_to_english(n % 10) else: return tens if n < 1000: if n % 100 == 0: return int_to_english(n // 100) + " hundred" else: return int_to_english(n // 100) + " hundred and " +\ int_to_english(n % 100) powers = [("thousand", 3), ("million", 6), ("billion", 9), ("trillion", 12), ("quadrillion", 15), ("quintillion", 18), ("sextillion", 21), ("septillion", 24), ("octillion", 27), ("nonillion", 30), ("decillion", 33), ("undecillion", 36), ("duodecillion", 39), ("tredecillion", 42), ("quattuordecillion", 45), ("quindecillion", 48), ("sexdecillion", 51), ("eptendecillion", 54), ("octadecillion", 57), ("novemdecillion", 61), ("vigintillion", 64)] ns = str(n) idx = len(powers) - 1 while True: d = powers[idx][1] if len(ns) > d: first = int_to_english(int(ns[:-d])) second = int_to_english(int(ns[-d:])) if second == "zero": return first + " " + powers[idx][0] else: return first + " " + powers[idx][0] + " " + second idx -= 1 for x in [42, 3**7, 2**100, 10**128]: print(f"{x} written in English is {int_to_english(x)}.") #42 written in English is forty-two. #2187 written in English is two thousand one hundred and eighty-seven. #1267650600228229401496703205376 written in English is one nonillion two hundred and sixty-seven octillion six hundred and fifty septillion six hundred sextillion two hundred and twenty-eight quintillion two hundred and twenty-nine quadrillion four hundred and one trillion four hundred and ninety-six billion seven hundred and three million two hundred and five thousand three hundred and seventy-six. #10000000000000000000000000000000000000000000000000000000000000000000000000000000 0000000000000000000000000000000000000000000000000 written in English is one vigintillion vigintillion. Lists a = [1, "bob", 4.7] b = ["jack", 99, a] print(len(b)) #3 print(len(b[-1])) #3 print(len(b[2])) #3 a = "yeah" items = [a, a, a] print(items) # ['yeah', 'yeah', 'yeah'] a = "nope" print(items) # ['yeah', 'yeah', 'yeah'] # If the same object is placed in multiple places of a list or other # collection, all those places in collection refer to same object. first = [1, 2, 3] # Three references to the same object. second = [first, first, first] print(second) # [[1,2,3], [1,2,3], [1,2,3]] first[0] = 99 print(second) # [[99,2,3], [99,2,3], [99,2,3]] # Splicing a list or other sequence creates a whole new object. a = [1, 2, 3, 4, 5] b = a[2:4] a[2] = 99 print(b) # [3, 4] ● "hello" in "hello world" 1 ● print("bob" in a) # True print("jack" not in a) # True ● a = [1,2,3] print(a) [1,2,3] a.append(42) print(a) [1,2,3,42] a.append(["bob","tom"]) print(a) [1,2,3,42,’bob’,’tom’] a = a + ["hello", "world"] print(a) [1,2,3,42,’bob’,’tom’,‘hello’,‘world’] ● a = [1,2,3] print(a) [1,2,3] print(id(a)) 2149490832085 a.reverse() print(a) [3,2,1] print(id(a)) 2149490832085 a.sort() print(a) [1,2,3] b = a[::-1] print(b) [1,2,3] print(id(b)) 2410935205904 ● a = [1, 2, "bob"] b = [a, a, a] print(b) [1, 2, ‘bob’],[1, 2, ‘bob’],[1, 2, ‘bob’] a.append(99) print(b) [1, 2, ‘bob’, 99],[1, 2, ‘bob’, 99],[1, 2, ‘bob’, 99] *stores object memory address. If you edit object part of list you are indirectly modifying list ● import random a = [1, 2, "bob", 42, 99, "tom", 54.87, type(42),type("hello")] print(a) print(random.choice(a)) *random ● import random a = [1, 2, "bob", 42, 99, "tom", 54.87, type(42),type("hello")] print(random.sample(a,3)) *random ● import random random.seed(12345) a = [1, 2, "bob", 42, 99, "tom", 54.87, type(42),type("hello")] random.shuffle(a) print(a) *rearranges order in list (permenantly) a = [1, 3, 5, 6, 7, 8, 9] b = [1, 2, 3] c = a[3:] print(c) c = a[:3] print(c) del(a[1]) print(b) # # # # [missing end index means "up to the end"] [6, 7, 8, 9] missing start index means "from the start" [1, 3, 5] # remove element 2 from the list object # [1, 3] # If you want to create a separate copy of a list, instead of having both # names point to the same list, the canonical Pythonic trick is to use # slicing without either start or end. a = [1, 2, 3] b = a[:] # slicing always creates a separate copy del(a[1]) # that modifying the original does not affect print(a) # [1, 3] print(b) # [1, 2, 3] # The canonical trick to create a reversed copy of the list. a = [1, 2, 3] print(a[::-1]) # [3, 2, 1] print(a) # [1, 2, 3] x = list(range(1,21,2)) print(x) [1,3,5,7,9,11,13,15,17,19] List comprehensions a = "hello world" len(a) #11 a[6] #w a[3:7] #'lo w' #negative stepwise, reads from right a[7:3:-1] #'ow o' a[::-1] #dlrow olleH b = [12, "hello", 42, type(a), [4,5,6]] b[3] #string b.append(999) #[12, "hello", 42, type(a), [4,5,6], 999] #c = (87,4.2,'yeah') #ERROR: 'tuple' object does not support item assignment #c[1] = 'boo' [x*x*x for x in range(10)] #[0,1,8,27,64,125,216,343,512,729] [x*x*x for x in range(10) if x != 5] #[0,1,8,27,64,216,343,512,729] [(x,y) for x in range(5) for y in range(5)] [(x,y) for x in range(5) for y in range(5) if 2 < x+y <6] #[(0, 3), (0, 4), (1, 2), (1, 3), (1, 4), (2, 1), (2, 2), (2, 3), (3, 0), (3, 1), (3, 2), (4, 0), (4, 1)] #Lists are eager, computes elements in list as sson as they are inputted, and takes up memory a = [] a = [x*x for x in range(1,11)] #Lazy, potential to generate elements one by one, allows us to create sequences so long they couldnt exist in memory b = (x*x for x in range(1,11)) next(b) #generates element one at a time on command - no running out of memory len(a) #9 #len(b) #error - stores each separate x = range(1,11) y = [e for e in x] [1,2,3,4,5,6,7,8,9,10] y = [e*e for e in x] [1,4,9,16,25,36,49,64,81,100] x=range(1,10**6+1) y=[e for e in x if "7" in str(e)] print(len(y)) x1 = range(5) x2=range(3) y = [(a,b) for a in x1 for b in x2] print(y) [(0, 0), (0, 1), (0, 2), (1, 0), (1, 1), (1, 2), (2, 0), (2, 1), (2, 2), (3, 0), (3, 1), (3, 2), (4, 0), (4, 1), (4, 2)] num_list = [y for y in range(100) if y % 2 == 0 if y % 5 == 0] # [0, 10, 20, 30, 40, 50, 60, 70, 80, 90] h_letters = [ letter for letter in 'human' ] # ['h', 'u', 'm', 'a', 'n'] squares = [x**2 for x in range(10)] # [0, 1, 4, 9, 16, 25, 36, 49, 64, 81] number_list = [x ** 2 for x in range(10) if x % 2 == 0] # [0, 4, 16, 36, 64] listOfWords = ["this","is","a","list","of","words"] items = [ word[0] for word in listOfWords ] # ['t', 'i', 'a', 'l', 'o', 'w'] string = "Hello 12345 World" numbers = [x for x in string if x.isdigit()] # ['1', '2', '3', '4', '5'] a_list = [1, "4", 9, "a", 0, 4] squared_ints = [ e**2 for e in a_list if type(e) == types.IntType ] # [ 1, 81, 0, 16 ] Celsius = [39.2, 36.5, 37.3, 37.8] Fahrenheit = [ ((float(9)/5)*x + 32) for x in Celsius ] # [102.56, 97.700000000000003, 99.140000000000001, 100.03999999999999] colours = [ "red", "green", "yellow", "blue" ] things = [ "house", "car", "tree" ] coloured_things = [ (x,y) for x in colours for y in things ] # [('red', 'house'), ('red', 'car'), ('red', 'tree'), ('green', 'house'), ('green', 'car'), ('green', 'tree'), ('yellow', 'house'), ('yellow', 'car'), ('yellow', 'tree'), ('blue', 'house'), ('blue', 'car'), ('blue', 'tree')] #Calculation of the prime numbers between 1 and 100 using the sieve of Eratosthenes: noprimes = [j for i in range(2, 8) for j in range(i*2, 100, i)] primes = [x for x in range(2, 100) if x not in noprimes] # [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97] Squares of numbers from 1 to 99 that contain digit "3". a = [ x * x for x in range(1, 100) if "3" in str(x)] print(a) # Since a comprehension can be made out of any existing sequence, # more complex lists can be created out of existing comprehensions. # The members of previous list that contain are between 500 and 2000. b = [x for x in a if 500 < x < 2000] print(b) Three. Use list comprehensions to create the following list objects, each list in a single swoop. (a) All integers from -10 to +10. x = range(-10,11) y = list(x) a = [x for x in range(-10, 11)] (b) All even integers from -10 to +10. (Zero is also even. It is divisible by two, is it not?) x = range(-10,11,2) y = list(x) b = [x for x in range(-10, 11, 2)] (c) The cubes (that is, third powers) of the integers from -10 to +10. x = range(-10,11) y = [e**3 for e in x] c = [e*e*e for e in b] (d) Three strings entered by the user with input, using the prompt "Enter something: " ans = [input("Enter something:") for e in range(3)] (e) A hundred-element list whose every other element is zero and every other element is 42. (Hint: first use range to produce the simple list of integers from 0 to 99, and use list comprehension on that to turn its elements into the required elements. The integer remainder operator % will be useful here.) x = list(range(0,100)) y = [e%2*42 for e in x] print(y) e = [42*(i%2) for i in range(100)] (f) A hundred-element list of random integers ranging from 0 to 999. Use the random.randint function to generate each random number. import random f = [random.randint(0, 999) for i in range(100)] (g) The elements of the previous list that are in the odd-numbered locations of the list. (Hint: use enumerate to first convert the list of numbers into a list of (position, number) pairs so that you can filter elements based on their positions instead of their values.) g = [e for (i,e) in enumerate(f) if i%2 == 0] (h) The list of all possible pairs of the form (x, y) where both x and y are integers that range from 1 to 5. (This list should have 25 elements, from (1, 1), (1, 2), (1, 3) ... to (5, 5).) h = [(x,y) for x in range(1, 6) for y in range(1, 6)] (i) List of ten elements that consists of 3-tuples (1, 2, 3), (2, 3, 4), ..., (10, 11, 12). i = [(x,x+1,x+2) for x in range(1, 11)] Four. Write a program that asks the user to enter one sentence of arbitrary text, and then prints the characters of that sentence in sorted order, as determined by the order comparison operator < for the individual characters. For example, if the user enters the input Hello world your program would print the line (note the space character in the beginning) Hdellloorw # Read input as string. text = input("Enter text: ") # Convert string (a sequence of chars) into list of characters. chars = list(text) # Sort characters according to their Unicode codepoints. chars.sort() # Turn the sorted list back to a string. result = "".join(chars) # And here you are! print(result) Defining your own functions def total_water(buckets): #def name(what function needs) total = buckets * 5 return total #print(f"There is {total} gallons.") print(d) #42 print(type(a)) #int print(total_water) #function total_water at 0x109111bf8> print(type(total_water)) #class function #We have a loud talking parrot. The "hour" parameter is the current hour time in the range 0..23. We are in trouble if the parrot is talking and the hour is before 7 or after 20. #Return true if we are in trouble. def parrot_trouble(talking,hour): return talking and (hour < 7 or hour > 20) #We have two monkeys, a and b, and the parameters aSmile and bSmile indicate if each is smiling. We are in trouble if they are both smiling or if neither of them is smiling. Return true if we are in trouble. def monkey_trouble(a_smile,b_smile): return a_smile == b_smile def print_values_onebyone(seq): for x in seq: print(f"Here is element {x}.") a = [42, "bob", 12.5, type(42)] print_values_onebyone(a) # Here is element 42. # Here is element bob. # Here is element 12.5. # Here is element <class 'int'>. def print_values_onebyone(seq): seq.sort() for x in seq: print(f"Here is element {x}.") a = [42, -7, 99, 1] print(a) print_values_onebyone(a) print(a) #[42, -7, 99, 1] # Here is element -7. # Here is element 1. # Here is element 42. # Here is element 99. # [-7, 1, 42, 99] def countdown(start = 10, end = 0): print("Countdown begins!") for x in range(start, end, -1): print(f"{x}... ", end = '') print("\nLiftoff!") countdown() #10... 9... 8... 7... 6... 5... 4... 3... 2... 1... #Liftoff! ans = total_water(10) print(f"we have {ans} gallons") #50 ans = total_water(50) print(f"we have {ans} gallons") #250 def total_water2(buckets, capacity = 5): #def name(what function needs, parameter) total = buckets * capacity return total ans = total_water2(50,capacity = 8) print(f"we have {ans} gallons") #400 total = 42 print(f"Total is {total}") #42 def total_water3(buckets, capacity = 5): #def name(what function needs, parameter) total = buckets * capacity return total ans = total_water3(10) print(f"we have {ans} gallons") #50 ans = total_water3(50,capacity = 8) print(f"we have {ans} gallons") #400 print(f"Total is {total}") #42 ● The keyword global used inside a function causes its following name to be created and accessed in the global namespace. print(f"Total is {total}") #42 def total_water4(buckets, capacity = 5): global total total = buckets * capacity ans = total_water4(10) print(f"we have {ans} gallons") #50 ans = total_water4(50,capacity = 8) print(f"we have {ans} gallons") #400 print(f"Total is {total}") #400 #print(dir(total_water)) # Every def function has a '_doc_' string def factorial(n): """Return the factorial of a positive integer. n -- The positive integer whose factorial is computed. """ result = 1 for x in range(2, n+1): result *= x return result print(factorial(5)) #120 print(factorial(10)) # 362880 print(factorial(100)) f = factorial result = f(30) print(result) # copy a reference to the function object # make an actual function call # 265252859812191058636308480000000 def maximum(seq): """ Return the largest element in sequence. seq -- The sequence whose maximum we are looking for. """ first = True for x in seq: if first or x > m: m = x first = False return m print(maximum([5,2,8,1,0,9,6])) # 9 # We could have just used Python's built-in max function... print(max([5,2,8,1,0,9,6])) # 9 def accum(seq): """Given a sequence, return its accumulation sequence as a list. seq -- The sequence whose accumulation is computed. """ result = [] total = 0 for x in seq: total += x result.append(total) return result print(accum([1, 2, 3, 4, 5])) # [1, 3, 6, 10, 15] Alternatively, the library itertools offers a bunch of functions that take an iterable and perform some kind of transformation on it, giving you another iterable as a result. For example, accumulate is one of the functions already defined there. from itertools import accumulate print(list(accumulate([1,2,3,4,5]))) Select precisely the elements that are larger than their predecessor. def select_upsteps(seq): result = [] for i in range(len(seq)): # Loop through positions of sequence if i == 0 or seq[i-1] < seq[i]: result.append(seq[i]) return result print(select_upsteps([4, 8, 3, 7, 9, 1, 2, 5, 6])) # [4,8,7,9,2,5,6] Compute the factorials up to n! and count how many times each digit from 1 to 9 appears as the first digits of that factorial. def leading_digit_list(n): prod = 1 # The current factorial digits = [0 for i in range(11)] # Create a list full of zeros for i in range(2, n+1): lead = ord(str(prod)[0]) - ord('0') # Compute first digit digits[lead] += 1 prod = prod * i # The next factorial from the current one return digits from math import log def output_leading_digits(n): digits = leading_digit_list(n) # from https://en.wikipedia.org/wiki/Benford%27s_law benford = [100 * (log(d+1, 10) - log(d, 10)) for d in range(1, 11)] print("\nDigit\tObserved\tBenford") for i in range(1, 10): pct = 100 * digits[i] / n print(f"{i}\t{pct:.2f}\t\t{benford[i-1]:.2f}") print("") output_leading_digits(2000) Calling a function creates a new namespace separate from that of the caller's namespace. Let us demonstrate that with an example. xx = 42 def scope_demo(): """Demonstrate the behaviour of local and global variables. """ # print(f"xx is now {xx}") # error, Python compiles entire function xx = 99 print(f"xx is now {xx}") # xx is now 99 # access a variable in the global namespace print(f"global xx is now {globals()['xx']}") # global xx is now 42 def inner(): return 4 + xx # what is this bound to? return inner print(f"Scope demo result is {scope_demo()}.") However, argument objects are passed to functions by reference, so the names in both caller and function namespace refer to the same data object. Therefore, if a function modifies that object, that modification persists to when execution returns to the caller. items = [1,2,3,4,5] print(f"Items is now: {items}") # [1, 2, 3, 4, 5] def demonstrate_parameter_passing(x): x[2] = 99 demonstrate_parameter_passing(items) print(f"Items is now: {items}") # [1, 2, 99, 4, 5] Next, let's write a function that allows us to roll a die a given number of times, and return the sum of these rolls. This function shows how to generate a random integer from the given range (once again, the upper bound is exclusive) and how to use default parameters. import random def roll_dice(rolls, faces = 6): """Roll a die given number of times and return the sum. rolls -- Number of dice to roll. faces -- Number of faces on a single die. """ total = 0 for x in range(rolls): total += random.randrange(1, faces+1) return total print("Rolling a 12-sided die 10 times gave a total of %d" % roll_dice(10, 12)) # With a default parameter, we don't need to give its value print("Rolling a 6-sided die 10 times gave a total of %d" % roll_dice(10)) Note that default parameter values are computed only once at the function declaration. val = 42 def foo(x = val): return x val = 99 def bar(x = val): return x print("foo() equals %d, bar() equals %d" % ( foo(), bar() )) # 42 99 Fizzbuzz is a game where you try to list the numbers from start to end but so that if a number is divisible by 3, you say "fizz", and if a number is divisible by 5, you say "buzz", and if by both 3 and 5, you say "fizzbuzz". def fizzbuzz_translate(n): """Convert positive integer n to its fizzbuzz representation. n -- The positive integer to convert. """ if n % 5 == 0 and n % 3 == 0: return 'fizzbuzz' elif n % 5 == 0: return 'buzz' elif n % 3 == 0: return 'fizz' else: return str(n) print("Let's play a game of fizzbuzz from 1 to 100.") print(", ".join([fizzbuzz_translate(y) for y in range(1, 101)])) Next, let us write a function that evaluates a polynomial, given in the list of coefficients, at the given point x. def evaluate_poly(coeff, x): """Evaluate the polynomial, given as coefficient list, at point x. coeff -- Sequence of coefficients that define the polynomial. x -- The point in which to evaluate the polynomial. """ result = 0.0 power = 1.0 # Current power of x for c in coeff: result += c * power # Add current term to result power *= x # Next power from current with one multiplication return result # Evaluate 5x^3-10x^2+7x-6 at x = 3. Note how coefficients are listed # in the increasing order of the exponent of the term. print(evaluate_poly([-6, 7, -10, 5], 3)) # 60.0 A function that returns either True or False is called a predicate. It checks whether the given parameter values have some particular property. For example, the simple predicate that checks if its parameter is odd: def is_odd(n): """A simple function to check if an integer is odd. n -- The integer whose oddness we want to determine. """ if n % 2 == 0: return False else: return True Predicates can be themselves be passed as parameters to higher-level functions that take functions are parameters. Such predicates can then be used in, for example, conditional list comprehensions. Here, we list the squares of odd numbers from 0 to 100. oddsquares = [x*x for x in range(100) if is_odd(x)] print(oddsquares) # [1, 1225, 3721, 7569, 9, 25, 49, 81, 121, 169, 225, 1369, 1521, 1681, 1849, 2025, 3969, 4225, 4489, 4761, 5041, 7921, 8281, 8649, 9025, 9409, 289, 361, 441, 529, 625, 729, 841, 961, 1089, 2209, 2401, 2601, 2809, 3025, 3249, 3481, 5329, 5625, 5929, 6241, 6561, 6889, 7225, 9801] Lab 4: List computations One. Write a function above_average(elems) that counts how many elements in elems, assumed to be a list of integers, are greater than the average value of the list. (Could every student in the Lake Wobegon High really be above average?) def above_average(elems): # Handle empty list as special case to avoid division by zero. if len(elems) == 0: return 0 # The average of the list. avg = sum(elems) / len(elems) # List comprehension collects the elements that are above average. return len([x for x in elems if x > avg]) #print(above_average([1,2,3,4])) Two. Python has the integer exponentiation operator ** built in the language, and notice that the caret character ^ means a totally different operator that has nothing to do with exponentiation. However, in a related operation of falling power, useful in many combinatorial formulas and denoted syntactically in mathematical notation by underlining the exponent, each number that gets multiplied into the product is always one less than the previous number. For example, the falling power 8 would be computed as the product 8 * 7 * 6 = 336. Similarly, the falling power 10 would be 10 * 9 * 8 * 7 * 6 = 30240. Nothing changes if the base n is negative, for example (-4) would be -4 * -5 * -6 * -7 * -8 = -6720. Write a function falling_power(n, k) that computes and returns the falling power n where n can be any integer, and k can be any nonnegative integer. (Analogous to ordinary powers, n = 1 for any n.) 3 5 5 k 0 def falling_power(n, k): result = 1 # Loop through integers to multiply. for i in range(n - k + 1, n + 1): result *= i return result print(falling_power(8,3)) #336 Three. Write a function forward_difference(elems) that creates and returns a new list whose length is one less than the length of the parameter list elems guaranteed to be nonempty, and whose value in each position i equals the difference elems[i+1]-elems[i] between the original element in that position and its successor. For example, when given as argument the list [4, 2, 7, 1, 8, 3], this function would create and return the list [-2, 5, -6, 7, -5]. Given that list as argument for another call, this function would create and return [7, -11, 13, -12], and so on. The forward difference of a singleton list that contains only one element would be the empty list. def forward_difference(elems): result = [0 for i in range(len(elems) - 1)] for i in range(len(elems) - 1): result[i] = elems[i+1] - elems[i] return result print(forward_difference([4,2,7,1,8,3])) Four. Write a function dosido(elems) that modifies the given list elems operating in place (that is, this function does not create a new list as result, but modifies the list object that is given to it as argument) so that every element in even-numbered positions switches the place with the element right after it, and then returns the modified list. In fact, to enforce this specification, we shall require that this function must always return None in the end. For example, the list [42, 'Bob', 99.5, 3, 7] would become ['Bob', 42, 3, 99.5, 7]. If the list length is odd, the last element stays in place, since there is no position after it to swap it with. (Loop through the even positions of the list, and swap each elems[i] with elems[i+1] if the position i+1 exists inside the list. Make sure that your function also works correctly for the important edge cases of empty lists and lists that contain only one element.) def dosido(elems): # Loop through the positions that we want to swap. for i in range(0, len(elems) - 1, 2): # Swap the two elements in positions i and i + 1. tmp = elems[i] elems[i] = elems[i+1] elems[i+1] = tmp return elems print(dosido(['Bob', 42, 3, 99.5, 7])) Five. Write a function disemvowel(text) that creates and returns a new string that is otherwise the same as the given text, but all its vowels have been removed. (To keep this problem simple, vowels are now only the letters aeiou in either upper- or lowercase, so you don't need to worry about all those accents and acutes and other squiggles that could be placed on top of these letters in other languages.) For example, given the argument string "Ilkka Kokkarinen", this function would return the answer string "lkk Kkkrnn". def disemvowel(text): # One-liner using list comprehension and string join. return "".join([c for c in text if c not in "aeiouAEIOU"]) print(disemvowel("Ilkka Kokkarinen")) Six. (ADVANCED ONLY) Using the plain fizzbuzz_translate function given in the defdemo.py example as model, write a generalized fizzbuzz function fizzbuzz_translate(n, conv = [(3,'fizz'), (5,'buzz')]) that receives as parameters the integer n and a list of tuples conv that tells which word each given divisor gets converted to. If the parameter conv is not given, its default value is the list that defines the rules of the ordinary fizzbuzz. (I mistakenly said in the lecture that conv is a dictionary, but actually it has to be a list so that its elements can be iterated through in sorted order.) This function should build up a result string that contains the concatenation of the words for all the divisors of n in the order that they appear in the conversion list. For example, the call fizzbuzz(15, [(2,'pop'), (3,'zip'), (5,'zup')]) would return 'zipzup', and the call fizzbuzz(18, [(2,'fizz'), (3,'fozz'), (5,'fuzz')]) would return 'fizzfozz'. If the parameter n is not divisible by any of the values given in the conversion list, this function should return that number as a string, as in the original game of fizzbuzz. def fizzbuzz_translate(n, conv = [(3,'fizz'), (5,'buzz')]): # Build up the result string piece by piece. result = "" # Loop through the elements of conv. for (v, word) in conv: if n % v == 0: result += word # If none of the keys was a divisor of v, just return the number. if len(result) == 0: result = str(n) return result print([fizzbuzz_translate(x) for x in range(10, 21)]) #['buzz', '11', 'fizz', '13', '14', 'fizzbuzz', '16', '17', 'fizz', '19', 'buzz'] print([fizzbuzz_translate(x, [(2,'fizz'), (3,'fozz'), (5,'fuzz')]) for x in range(10, 21)]) #['fizzfuzz', '11', 'fizzfozz', '13', 'fizz', 'fozzfuzz', 'fizz', '17', 'fizzfozz', '19', 'fizzfuzz'] Seven. (ADVANCED ONLY) Let us define a list of numbers to be sawtooth if no two consecutive elements are equal, and whenever a[i] < a[i+1] then a[i+1] > a[i+2], and whenever a[i] > a[i+1] then a[i+1] < a[i+2], whenever the indices i, i+1 and i+2 are inside the list. (By this definition, the empty list and every one-element list is a sawtooth. Meditate upon this seeming paradox until you understand why it must be so, because the general principle behind this is so very important in computer science and math.) Write a function is_sawtooth(a) to determine whether the given list a is a sawtooth. def is_sawtooth(a): # Any empty or singleton list is a sawtooth. if len(a) < 2: return True # The difference between the first two elements. step = a[1] - a[0] # If they are the same, not a sawtooth. if step == 0: return False # Iterate through the list positions from the second. for i in range(2, len(a)): # The difference between the next two elements. curr = a[i] - a[i-1] # If that has the same sign as previous difference, that's it. if curr * step >= 0: return False # The current difference becomes the next round previous diff. step = curr # Everything was hunky dory. return True print(is_sawtooth([1,5,2, 8, 4, 5, 1])) #True print(is_sawtooth([1,1,2])) #False Iterators The important integer sequence iterator range ● Python's built-in function range creates and returns an iterator that produces an integer stream one number at the time. The arguments for range, separated by commas, work exactly the same as for the string slicing operator. (Given just one argument n, range produces the stream 0, 1, ..., n - 1.) ● Additionally, you can give the stream step size as the third argument to range. Using a negative step size makes the sequence count down instead of up. For example, the call range(5, 0, -1) would produce the stream 5, 4, 3, 2, 1, the end value again being exclusive. File handling Regular expressions are a powerful way to perform some computations on strings that otherwise would be quite difficult. import re # To get rid of single quotes in text, we replace contractions with # full words. This way, any single quotes that remain are actual quotes. replacements = ( ( "doesn't", "does not" ), ( "don't", "do not" ), ( "you're", "you are" ), ( "i'm", "i am" ), ( "we're", "we are" ), ( "they're", "they are" ), ( "won't", "will not" ), ( "can't", "can not" ), ( "shan't", "shall not" ), ( "shouldn't", "should not" ), ( "mustn't", "must not" ) ) # Precompile a regex machine to recognize word separators. word_separators = re.compile("[^a-z]+") # # # # # \\s turns to \s, which means any whitespace character + repetition one or more times | means or [0-9] any digit character [] character class, any one of the characters inside The dictionary of words that we shall build up as we see them words = {} file = open('warandpeace.txt', encoding='utf-8') for line in file: if len(line) < 2: # skip empty lines continue # Lowercase everything and remove the trailing linebreak character. line = line.lower()[:-1] # Remove the contractions (see above) for (orig, repl) in replacements: line = line.replace(orig, repl) # Remove whatever other possessives might remain. line = re.sub(r"'s\b", "", line) # Raw strings are handy for regexes. line = re.sub(r"'ll\b", "", line) # Process the individual words in the line. for word in word_separators.split(line): if len(word) > 0: words[word] = words.get(word, 0) + 1 file.close() print("Here are some individual word counts.") for w in ('prince', 'russia', 'you', 'supercalifragilisticexpialidocious'): print(f"The word {w!r} occurs {words.get(w, 0)} times.") Turn a dictionary into a list of its items as (key, value) tuples. # We first sort these words into alphabetic order. words_list_f = sorted(words.items(), reverse=True) Custom sorting with a key function that gives something that can be order compared. Now we sort in frequency order. Since the previous step sorted these words in alphabetical order, two words with same frequency count will remain in their alphabetical order. words_list_f = sorted(words_list_f, key = (lambda p: p[1]), reverse = True) # Extract the words into a separate lists, ignoring their counts. words_list = [w for (w, c) in words_list_f] print("\nHere are the 100 most frequent words in War and Peace:") print(words_list[:100]) once = list(reversed([w for w in words_list if words[w] == 1])) print(f"\n{len(once)} words that occur exactly once in War and Peace:") print(", ".join(once)) PRIME This module defines utility functions that generate prime numbers so that these generated primes are cached for quick later access. How far to explicitly store all found primes. __prime_max__ = 10**6 # These are just to "prime" the pump, heh, to get this thing started. __primeset__ = {2, 3, 5, 7, 11} # The list of prime numbers that we know so far. This will grow. __primelist__ = [2, 3, 5, 7, 11] Determine whether integer n is prime, by checking its divisibility by all known prime integers up to the square root of n. from math import sqrt def __is_prime__(n): # To check whether n is prime, check its divisibility with # all known prime numbers up to the square root of n. upper = 1 + int(sqrt(n)) # First ensure that we have enough primes to do the test. __expand_primes__(upper) for d in __primelist__: if n % d == 0: return False if d * d > n: return True return True # Expand the list of known primes until it the highest integer that # it contains is at least n. def __expand_primes__(n): # Start looking for new primes after the largest prime we know. m = __primelist__[-1] + 2 while n > __primelist__[-1]: if __is_prime__(m): __primeset__.add(m) __primelist__.append(m) m += 2 # The public functions for the user code that imports this module. # Determine if the parameter integer n is a prime number. def is_prime(n): # Negative numbers, zero and one are not prime numbers. if n < 2: return False if n < __prime_max__: # Expand the list of known primes until largest is >= n. __expand_primes__(n) # Now we can just look up the answer. return n in __primeset__ else: # Determine primality of n the hard way. return __is_prime__(n) # Compute the k:th prime number in the sequence of all primes. def kth_prime(k): # Expand the list of known primes until it contains at least k primes. while len(__primelist__) < k + 1: __expand_primes__(__primelist__[-1] * 2) # Look up the k:th prime from the list of known primes. return __primelist__[k] # For demonstration purposes when not imported as a module. if __name__ == "__main__": print("Here are the first 100 prime numbers.") print([kth_prime(k) for k in range(100)]) print(f"The thousandth prime number is {kth_prime(1000)}.") One. A prime number is said to be palindromic if the digits of that number, read from right to left, are the exact same prime number. Write a function palindromic_primes(start, end) that returns the list of palindromic prime numbers between start and end, inclusive. def palindromic_primes(start, end): result = [] for p in range(start, end + 1): if is_prime(p) and str(p) == str(p)[::-1]: result.append(p) return result Two. A prime number is said to be an emirp number if the digits of that number, read from right to left, are a different prime number. (Palindromic primes therefore cannot be emirps, and emirps cannot be palindromic primes.) Write a function emirp_primes(start, end) that returns the list of emirp numbers from between start and end, inclusive. def emirp_primes(start, end): result = [] for p in range(start, end + 1): if is_prime(p): rev = int(str(p)[::-1]) if rev != p and is_prime(rev): result.append(p) return result Three. Twin primes are two integers p and p + 2 so that both are prime numbers. For example, 17 and 19 are twin primes. Write a function twin_primes(start, end) that returns the list of twin primes between start and end, inclusive. Each twin prime pair should be returned as a tuple (p, p + 2) inside the list. def twin_primes(start, end): result = [] for p in range(start, end + 1): if is_prime(p) and is_prime(p+2): result.append((p, p+2)) return result Four. Safe primes are primes of the form 2p + 1 where p is also a prime number. Write a function safe_primes(n) that returns the list of first n safe primes. def safe_primes(n): result = [2] p = 3 while len(result) < n: if is_prime(p) and is_prime(2*p+1): result.append(p) p += 2 return result Five. Euclid numbers are a sequence whose k:th element is the product of the first k prime numbers, plus one. Write a function euclid_numbers(n) that produces the list of first n Euclid numbers. def euclid_numbers(n): tally = 1 result = [] for k in range(n): tally *= kth_prime(k) result.append(tally + 1) return result Six. Strong primes are prime numbers that are larger than the mathematical average of the previous prime number and the next prime number. Write a function strong_primes(n) that produces the list of first n strong primes. def strong_primes(n): result = [] prev = kth_prime(0) this = kth_prime(1) succ = kth_prime(2) idx = 2 while len(result) < n: if this > (prev + succ) // 2: # This prime is strong result.append(this) idx += 1 prev = this this = succ succ = kth_prime(idx) return result Seven. Good prime numbers are those prime numbers pi whose square is larger than the product of any two prime numbers pi-k and pi+k for 0 < i < n, when i is the position of the prime number pi in the list of prime numbers. Write a function good_primes(n) that produces the list of first n good primes. def good_primes(m): result = [] idx = 1 while len(result) < m: curr = kth_prime(idx) for k in range(1, idx + 1): ileft = idx - k iright = idx + k prod = kth_prime(ileft) * kth_prime(iright) if curr * curr < prod: # This prime is no good break else: # This gets executed if previous loop did not break result.append(curr) idx += 1 return result Tuples ● Immutable lists of fixed size are often more memory-efficient to represent as tuples, which are syntactically denoted by separating the elements with commas. ○ x = [1, "bob", 5.7] y = (1, "bob", 5.7) print(x) [1,’bob’,5.7] print(y) [1,’bob’,5.7] print(type(x)) <class ‘list’> print(type(y)) <class ‘tuple’> print(len(y)) 3 print(y[-1]) 5.7 x.reverse print(x) [5.7,’bob’,1] z = (17,99,42,-5,0) print(sorted(z)) [-5,0,17,42,99] import random print(random.sample(z,2)) [0, 42] *random ○ a = 42 b = 17 print(f"a is {a} and b is {b}") a is 42 and b is 17 (a,b) = (b,a) print(f"a is {a} and b is {b}") a is 17 and b is 42 ○ x = ["bob","jack", "bill", "tina", "carol", "tom", "robert"] y = [name for name in x if len(name) >= 5] print(y) [‘carol’, ‘robert’] Sets names1 = { 'Larry', 'Johnny', 'Billy', 'Mary', 'Larry' } names2 = { 'Max', 'Lena', 'Larry', 'Billy', 'Tina' } # Membership check is done with the operators 'in' and 'not in' print('Tony' in names1) # False print('Mary' in names1) # True print('Johnny' not in names1) # False The basic set theoretical operations are handily available as methods. union = names1.union(names2) difference = names1.difference(names2) intersection = names1.intersection(names2) print(union) # {'Lena', 'Johnny', 'Max', 'Billy', 'Larry', 'Tina', 'Mary'} print(difference) # {'Mary', 'Johnny'} print(intersection)# {'Billy', 'Larry'} for x in names2: print(f"{x} is in the second set.") Dictionaries scores = { 'Mary': 14, 'Tony': 8, 'Billy': 20, 'Lena': 3} print("The type of {} " f"is {type({})}.") # <class 'dict'> print(scores['Mary']) scores['Lena'] = scores['Lena'] + 5 print(scores['Lena']) scores['Anne'] = 12 del scores['Tony'] 14 Lena scores 5 more points 8 add Anne in the dictionary and remove Tony Trying to index a dictionary with a nonexistent key would be a runtime error. To avoid this, use the method get that takes the key and the default value to use in case the key is nowhere in the dictionary: print(scores.get('Alice', 0)) # # # # # # 0 Iterating through a dictionary goes through its keys, and iterating through the items of the dictionary goes through a sequence of tuples of (key, value) pairs. Again, there is no guarantee of iteration order. print([x for x in scores]) # ['Billy', 'Lena', 'Anne', 'Mary'] print([x for x in scores.items()]) # [('Billy', 20), ('Lena', 8), ('Anne', 12), ('Mary', 14)] Each dictionary can contain only one entry for keys that are equal in the sense of operator ==, even if these objects are separate. numdic = { 42: "Hello", 42.0: "World" } # Spyder gives a warning... print(42 == 42.0) # True (implicit cast) print(42 is 42.0) # False print(numdic[42]) # World print(numdic[42.0]) # World import random # Read through the text and build a dictionary that, for each found # pattern up to length n, gives the string of letters that follow that # pattern in the original text. def build_table(text, n = 3, mlen = 100): result = {} for i in range(len(text) - n - 1): # The n character string starting at position i. pattern = text[i:i+n] # The character that follows that pattern. next_char = text[i + n] # Update the dictionary for each suffix of the current pattern. for j in range(n): follow = result.get(pattern[j:], "") if len(follow) < mlen: result[pattern[j:]] = follow + next_char return result # Using the previous table, generate m characters of random text. def dissociated_press(table, m, result, maxpat = 3): pattern = result while m > 0: follow = table.get(pattern, "") if(len(follow) > 0): # Choose a random continuation for pattern and result. c = random.choice(follow) result += c pattern += c m -= 1 # Shorten the pattern if it grows too long. if len(pattern) > maxpat: pattern = pattern[1:] else: # Nothing for the current pattern, so shorten it. pattern = pattern[1:] return result if __name__ == "__main__": # Convert the contents of text file into one string without line breaks. wap = open("warandpeace.txt", encoding='utf-8') text = "".join(wap) wap.close() table = build_table(text, 6, 500) print(f"Table contains {len(table)} entries.") for maxpat in range(1, 7): print(f"\nRandom text with maxpat value {maxpat}:") print(dissociated_press(table, 600, "A", maxpat).replace('\n', ' ')) Additional list, set and dictionary goodies numberList = [1, 2, 3] strList = ['one', 'two', 'three'] result = zip()# No iterables are passed resultList = list(result) # Converting itertor to list print(resultList) #[] result = zip(numberList, strList) # Two iterables are passed resultSet = set(result) # Converting itertor to set print(resultSet) #{(2, 'two'), (3, 'three'), (1, 'one')} # Unzipping the Value Using zip() coordinate = ['x', 'y', 'z'] value = [3, 4, 5, 0, 9] result = zip(coordinate, value) resultList = list(result) print(resultList) #[('x', 3), ('y', 4), ('z', 5)] c, v = zip(*resultList) print('c =', c) #c = ('x', 'y', 'z') print('v =', v) #v = (3, 4, 5) Decisions def sign(a): if a < 0: return -1 elif a > 0: return +1 else: return 0 print(f"Sign of -7 is {sign(-7)}") #-1 print(f"Sign of 10 is {sign(10)}") #1 print(f"Sign of 0 is {sign(0)}") #0 e = (42 < 10) #False type(a) #class bool # = is assigning one side to other; == is equal Combining simple conditions to create complex conditions ● Truth-valued conditions can be created with the comparison operators <, >, <=, >=, == and != that compare the operands given to them. Note that the equality comparisons are done with == instead of =, since the latter operator already stands for assignment. ● In Python, comparisons can even be chained, in style of a < b < c > d, instead of having to use the logical and operator to combine individual comparisons. ● All the discrete math intro stuff about propositional logic is in full effect, should you happen to know any of that. Most importantly the De Morgan’s rules for simplifying negations and double negations, which can be as tricky in programming languages as they are in natural languages. A and B same as A or B same as not(not A or not B) not(not A and not B) a1 ! a2 or a2 != a3 not a1 == a2 or not a2 == a3 not(a1 == a2 and a2 == a3) def median(a,b,c): #Find and return median of three given parameters a,b,c if a <= b <= c or c <= b <= a: return b elif b <= a <= c or c <= a <= b: return a else: return c def median_other_way(a,b,c): if a > b and a > c: # a is maximum of 3 return max(b,c) elif a < b and a < c: # a is minimum of b return min(b,c) else: return a If-else ladders def day_in_month(m, leap_year = FALSE): #Given month as integer, compute # of days if m < 1 or m >12: #too small or too big return 0 elif m ==2: #February if leap_year: return 29 else: return 28 elif m in (4,6,9,11): return 30 else: return 31 def is_leap_year(y): #Determine whether current year is a leap year #The year is evenly divisible by 4; #If the year can be evenly divided by 100, it is NOT a leap year, unless; #The year is also evenly divisible by 400. Then it is a leap year. if y % 4 != 0: #If divisible by four, a leap year return False if y % 100 != 0: #Centennial years are not real years return True return y % 400 == 0 # print(is_leap_year(1974)) #False def is_leap_year_v2(y): if y % 400 == 0: return True if y % 100 == 0: return False return y % 4 == 0 def is_leap_yest_with_logic(y): return y % 4 == 0 and (y % 100 != 0 or y % 400 == 0) While-loops ● The infinite-way decision of how many steps we should take is broken down to a series of small two-way decisions "Have we reached the goal yet?" that can be evaluated in place. ● The logic of while-loop is best understood as "As long as you have not reached your goal, take one step that will take you closer to that goal." Assuming that you can recognize the goal when you get there, this logic will reach the goal whenever the goal is reachable to begin with, and take exactly as many steps as are needed to do so, no more and no less. Examples: Math Problems # The oldest algorithm in recorded history, Euclid's algorithm to find # the greatest common divisor of two nonnegative integers. def euclid_gcd(a, b, verbose = False): #doesn't print as sentence while b > 0: a, b = b, a % b if verbose: print(f"a is {a}, b is {b}") return a print(euclid_gcd(45,20)) #5 print(euclid_gcd(45,20, True)) #does print as sentence #a is 20, b is 5 #a is 5, b is 0 x = 21489732932423546 y = 439204893028502538 print(euclid_gcd(x, y)) #14 print(euclid_gcd(x, y, True)) #Prints for all remainders # Generate the Collatz series from the given starting value. # start with any positive integer n. # if previous term is even, next term is one half previous term. # If previous term is odd, next term is 3 times previous term plus 1. # The conjecture is no matter what value of n, the sequence will always reach 1. def collatz(start): curr = start result = [] if curr < 1: return [] while curr > 1: #run until you get to 1 - haven't reached goal yet result.append(curr) if curr % 2 == 0: curr = curr // 2 #integer division, no decimal else: curr = 3 * curr + 1 result.append(1) return result Given the previous ability to generate the Collatz series starting from the given number, determine which number from start to end – 1 produces the longest Collatz series, and return that number. print(collatz(42)) # [42, 21, 64, 32, 16, 8, 4, 2, 1] print(collatz(43)) # [43, 130, 65, 196, 98, 49, 148, 74, 37, 112, 56, 28, 14, 7, 22, 11, 34, 17, 52, 26, 13, 40, 20, 10, 5, 16, 8, 4, 2, 1] def longest_collatz(start, end): best = start best_len = len(collatz(start)) for x in range(start + 1, end): curr_len = len(collatz(x)) if curr_len > best_len: best_len = curr_len best = x return best print(longest_collatz(2, 30)) #27 Another algorithm from the ancient world, Heron's square root method to numerically iterate the guess for the square root of the positive real number x. (This algorithm generalizes to arbitrary roots, and in fact turns out to be special case of Newton's numerical iteration with the function whose root we want to solve hardcoded to square root.) def heron_root(x): if x < 0: raise ValueError("Square roots of negative numbers not allowed") guess = x / 2 prev = 0 while guess != prev: # This will converge in float precision. prev = guess guess = (guess + x / guess) / 2 return guess print(heron_root(64)) #8 # Defining anonymous functions with lambda. is_positive = lambda x: x > 0 print(is_positive(42)) #True print(is_positive(-99)) #False square = lambda x: x*x print(list(filter(is_positive, [-7, 8, 42, -1, 0]))) #[8, 42] print(list(map(square, [1, 2, 3, 4, 5]))) #[1, 4, 9, 16, 25] from operator import mul from functools import reduce print(reduce(mul, [1,2,3,4,5], 1)) # 120 (= 1 * 2 * 3 * 4 * 5) A handy Python module that defines simple functions that correspond to the arithmetic operators of the Python language. These are needed since you can't pass an arithmetic operator as an argument to function, so the operator is "wrapped" inside a proper function object that can be passed to another function. import operator from functools import reduce # Reduce is a handy operation to repeatedly combine first two elements # of the given sequence into one, until only one element remains. def falling_power(n, k): return reduce(operator.mul, reversed(range(n-k+1,n+1))) print(falling_power(8,3)) #336 def rising_power(n, k): return reduce(operator.mul, range(n, n+k)) print(rising_power(3,8)) #1814400 Converting Roman numerals and positional integers back and forth makes for interesting and education example of loops, lists and dictionaries. symbols_encode = [ (1000, 'M'), (900, 'CM'), (500, 'D'), (400, 'CD'), (100, 'C'), (90, 'XC'), (50, 'L'), (40, 'XL'), (10, 'X'), (9, 'IX'), (5, 'V'), (4, 'IV'), (1, 'I') ] def roman_encode(n): if n < 1: raise ValueError("Romans did not have zero or negative numbers") result = "" for (v, s) in symbols_encode: while v <= n: # same symbol can be used several times result += s n -= v return result print(roman_encode(900)) #CM Dictionaries are handy to map symbols into the values that they encode. symbols_decode = { 'M':1000, 'D':500, 'C':100, 'L':50, 'X':10, 'V':5, 'I':1 } def roman_decode(s): result = 0 prev = 0 for c in reversed(s): # Loop through symbols from right to left v = symbols_decode[c] if prev > v: result -= v else: result += v prev = v return result print(roman_decode('M')) #1000 Whenever you have two functions that are each other's inverses, it is quite easy to test them by looping through some large number of possible inputs to first, and verifying that the functions really invert the result produced by the other. def test_roman(): for x in range(1, 5000): if x != roman_decode(roman_encode(x)): return False return True print(test_roman()) #True From statistics, how to calculate the variance of the dataset to measure how widely the values are spread around the mean of those values. def variance(items): sSum = 0 sqSum = 0 for x in items: sSum += x sqSum += x * x avg = sSum / len(items) avgSq = sqSum / len(items) return avgSq - avg * avg print(variance([1,2,3])) #0.666666666666667 Another famous numerical method to find the root of the function f within the given tolerance, looking for solution from real interval [x0, x1]. def secant_method(f, x0 = 0, x1 = 1, tol = 0.000000001, verbose = False): fx0 = f(x0) fx1 = f(x1) while abs(x1 - x0) > tol: x2 = x1 - fx1 * (x1 - x0) / (fx1 - fx0) fx2 = f(x2) x0, x1, fx0, fx1 = x1, x2, fx1, fx2 if verbose: print(f"x = {x0:.15f}, diff = {abs(x1-x0):.15f}") return (x0 + x1) / 2 Lab 5: General repetition One. Write a function stutter(seq, dup = 2) that creates and returns a list that contains the same elements as the parameter sequence seq, but each element is duplicated dup times. For example, if this function is called with the parameter list [1,"hello",3], it would create and return the list [1,1,"hello","hello",3,3]. Your function should not modify the original sequence, but create and return an entirely new list as the answer. def stutter(seq, dup = 2): result = [] for x in seq: for i in range(dup): result.append(x) return result print(stutter([1,2,3])) #[1,1,2,2,3,3] Two. Write a function longest_without_repeat(text) that finds and returns the longest substring of text that does not contain any individual character twice. For example, given the string "nowisthetimeforallgoodmentocomeinaidfortheircountry", this function would return the string "naidforthe". def longest_without_repeat(text): longest = "" # Iterate through all possible starting points of that substring. for i in range(len(text)): # Find out how far you can go without repeating. j = i + 1 while j < len(text) and text[j] not in text[i:j]: j += 1 if j - i + 1 > len(longest): longest = text[i:j] return longest print(longest_without_repeat('nowisthetimeforallgoodmentocomeinaidfortheir country')) Three. Write a function babbage(m, giveup = 1000) that is given an integer m and solves a generalization of the historically important Babbage Problem. This function should find the smallest integer whose square ends with the digits of m. For example, when given m = 21, this function would return 11, since 11 * 11 = 121 and that one is the first square that ends with the digits 21. If there is no solution find before reaching the value giveup, this function should return -1. def babbage(m, giveup = 1000): # Convert the integer to string for testing purposes. m = str(m) # Iterate possible integers until suitable one is found. for i in range(1, giveup): if (str(i*i)).endswith(m): return i return -1 print(babbage(21)) #11 Four. Write a function longest_common_substring(text1, text2) that finds and returns the longest string that is a substring of both parameter strings text1 and text2. If the strings have no substrings in common, this function should return the empty string. For example, the longest common substring of "Ilkka" and "Kokkarinen" would be "kka". def longest_common_substring(text1, text2): # The longest common substring that we have found longest = '' # Iterate through the possible starting points of for i in range(len(text1)): # Find the longest common substring from that j = len(longest) + 1 while i + j <= len(text1) and text1[i:i+j] in if j > len(longest): longest = text1[i:i+j] j += 1 return longest text1 = "Ilkka" text2 = "Kokkarinen" print(longest_common_substring(text1,text2)) #kka so far. string. position. text2: Five. (ADVANCED ONLY) A text string is said to be squarefree if it does not contain any nonempty substring that is immediately followed by that exact same substring. (As special case of this rule, a squarefree string may not contain the same character in consecutive positions, as that would constitute a repeated substring of length one.) For example, the strings "bobby" and "boohoohoo" are not squarefree, as illustrated by the coloured highlights. Write a function is_squarefree(text) that determines whether its argument string text is squarefree. def is_squarefree(text): # Iterate through the possible starting positions inside the text. for i in range(len(text)): # From that position, iterate through possible substring lengths. for j in range(1, (len(text) - i) // 2): if text[i:i+j] == text[i+j:i+2*j]: return False return True print(is_squarefree("Python is not the same thing as Jython")) #True print(is_squarefree("Python is not not the same thing as Python")) #False Card Problems # Define the suits and ranks that a deck of playing cards are made. suits = ['clubs', 'diamonds', 'hearts', 'spades'] ranks = {'deuce' : 2, 'trey' : 3 , 'four' : 4, 'five' : 5, 'six' : 6, 'seven' : 7, 'eight' : 8, 'nine' : 9, 'ten' : 10, 'jack' : 11, 'queen' : 12, 'king' : 13, 'ace' : 14 } # If 2 cards are of a same rank, it'll sort them by suite deck = [ (rank, suit) for suit in suits for rank in ranks.keys() ] import random # Deal a random hand with n cards, without replacement. def deal_hand(n, taken = []): result = [] while len(result) < n: c = random.choice(deck) if c not in result and c not in taken: result.append(c) return result print (deal_hand(4)) #[('queen', 'clubs'), ('trey', 'clubs'), ('trey', 'spades'), ('nine', 'spades')] # If we don't care about taken, this could be one-liner: # return random.sample(deck, n) def deal_hand(n, taken = [], sort = True): result = [] while len(result) < n: c = random.choice(deck) # gives randomly chosen element from sequence if c not in result and c not in taken: result.append(c) if sort: # Sort first by ranks result = sorted(result, key = (lambda x: ranks[x[0]]), reverse = True) # Followed by sorting by suits result = sorted(result, key = (lambda x: x[1]), reverse = True) return result print (deal_hand(4)) Given a hand left in the game of gin rummy, count its deadwood points. def gin_count_deadwood(hand): count = 0 for (rank, suit) in hand: v = ranks[rank] if v == 14: v = 1 elif v > 10: v = 10 count += v return count print(gin_count_deadwood([('queen', 'clubs'), ('trey', 'clubs'), ('trey', 'spades'), ('nine', 'spades')])) #25 Given a blackjack hand, count its numerical value. This value is returned as a string to distinguish between blackjack and 21 made with three or more cards, and whether the hand is soft or hard. def blackjack_count_value(hand): total = 0 # Current point total of the hand soft = 0 # Number of soft aces in the current hand for (rank, suit) in hand: v = ranks[rank] if v == 14: # Treat every ace as 11 to begin with total += 11 soft += 1 else: total += min(10, v) # All face cards are treated as tens if total > 21: if soft > 0: # Saved by the soft ace soft -= 1 total -= 10 else: return "bust" if total == 21 and len(hand) == 2: return "blackjack" if soft > 0: return "soft " + str(total) else: return "hard " + str(total) print(blackjack_count_value([('queen', 'clubs'), ('trey', 'clubs'), ('trey', 'spades'), ('nine', 'spades')])) #25 bust Determine if the five card poker hand has a flush, that is, all five cards have the same suit. def poker_has_flush(hand): suit = hand[0][1] for i in range(1, len(hand)): if hand[i][1] != suit: return False return True print(poker_has_flush([('five', 'clubs'),('queen', 'clubs'), ('trey', 'clubs'), ('trey', 'clubs'), ('nine', 'clubs')])) #True A utility function that allows us quickly determine the rank shape of the hand. Count how many pairs of identical ranks there are in the hand, comparing each card to the ones after it. def count_equal_ranks(hand): count = 0 for i in range(len(hand) - 1): for j in range(i+1, len(hand)): if hand[i][0] == hand[j][0]: count += 1 return count print(count_equal_ranks([('five', 'clubs'),('queen', 'clubs'), ('trey', 'clubs'), ('trey', 'clubs'), ('nine', 'clubs')])) #1 # The previous function makes all the following functions trivial. def poker_four_of_kind(hand): return count_equal_ranks(hand) == 6 def poker_full_house(hand): return count_equal_ranks(hand) == 4 def poker_three_of_kind(hand): return count_equal_ranks(hand) == 3 def poker_two_pair(hand): return count_equal_ranks(hand) == 2 def poker_one_pair(hand): return count_equal_ranks(hand) == 1 Of the possible poker ranks, straight is the trickiest to check when the hand is unsorted. Also, ace can work either as highest or lowest card inside a straight. def poker_has_straight(hand): # If a hand has any pairs, it is not a straight. if count_equal_ranks(hand) > 0: return False # We know now that the hand has no pairs. hand_ranks = [ranks[rank] for (rank, suit) in hand] min_rank = min(hand_ranks) max_rank = max(hand_ranks) if max_rank == 14: # Special cases for ace straights if min_rank == 10: return True #AKQJT return all(x in hand_ranks for x in [2, 3, 4, 5]) #A2345 else: return max_rank - min_rank == 4 # Straight flushes complicate the hand rankings a little bit. def poker_flush(hand): return poker_has_flush(hand) and not poker_has_straight(hand) def poker_straight(hand): return poker_has_straight(hand) and not poker_has_flush(hand) def poker_straight_flush(hand): return poker_has_straight(hand) and poker_has_flush(hand) # "Sometimes nothing can be a pretty cool hand." def poker_high_card(hand): return count_equal_ranks(hand) == 0 and not poker_has_flush(hand)\ and not poker_has_straight(hand) In fact, there are not too many five card hands (since there are exactly choose(52, 5) = 2,598,960) for us to loop through to make sure that all counts agree with those given in the Wikipedia page # https://en.wikipedia.org/wiki/List_of_poker_hands Given an iterator, generate all k-combinations of the values that it produces. The itertools module makes combinatorics easy peasy. from itertools import combinations def evaluate_all_poker_hands(): funcs = [poker_high_card, poker_one_pair, poker_two_pair, poker_three_of_kind, poker_straight, poker_flush, poker_full_house, poker_four_of_kind, poker_straight_flush] counters = [ 0 for i in range(len(funcs))] for hand in combinations(deck, 5): for i in range(len(funcs)): if funcs[i](hand): counters[i] = counters[i] + 1 break # No point looking for more for this hand return [(funcs[i].__name__, counters[i]) for i in range(len(funcs))] if __name__ == "__main__": print(evaluate_all_poker_hands([('five', 'clubs'),('queen', 'clubs'), ('trey', 'clubs'), ('trey', 'clubs'), ('nine', 'clubs')])) #[('poker_high_card', 1302540), ('poker_one_pair', 123552), ('poker_two_pair', 0), ('poker_three_of_kind', 54912), ('poker_straight', 10200), ('poker_flush', 5108), ('poker_full_house', 3744), ('poker_four_of_kind', 624), ('poker_straight_flush', 40)] Hangman import random def hangman(wordlist, sep = '*'): #provide some other character they want to use count = 0 word = wordlist[random.randrange(0, len(wordlist))] #rand random integer to find random word??? letters = [ sep for x in range(len(word)) ] #indicates some letter has not been uncovered - game stops when all have print("Let us play a game of Hangman.") while sep in letters: print(f"Current word is {'-'.join(letters)}.") #convert list to string guess = str(input("Please enter a letter: ")) cost = 1 #cost of guess is one point for i in range(len(word)): if letters[i] == sep and word[i] == guess: letters[i] = guess cost = 0 if cost > 0: print("That one was a miss.") count += cost else: print("That one was a hit!") print(f"You got {word!r} with {count} misses.\n") #uses repr to print which is intended for python vs string for humans return count def only_letters(word): for x in word: if x not in "abcdefghijklmnopqrstuvwxyz": return False return True if __name__ == "__main__": f = open('words.txt', 'r', encoding = 'utf-8') #'r' for reading wordlist = [x.lower().strip() for x in f] f.close() print(f"Read a word list of {len(wordlist)} words." ) wordlist = [x for x in wordlist if only_letters(x)] print(f"After cleanup, the list has {len(wordlist)} words.") while True: hangman(wordlist) List Problems Given two lists s1 and s2, zip them together into one sequence.(Python function zip produces pairs of elements.) def zip_together(s1, s2): i = 0 result = [] while i < len(s1) and i < len(s2): result.append(s1[i]) result.append(s2[i]) i += 1 if i < len(s1): result.extend(s1[i:]) #loop through remaining elements and extend if i < len(s2): result.extend(s2[i:]) #extend from the remaining elements return result s1 = [1,2,3] s2 = ['bob','jack','tom'] print (zip_together(s1,s2)) # [1, 'bob', 2, 'jack', 3, 'tom'] Given two sorted lists, create a new list that contains the elements of both lists sorted, constructing the new list in one pass. def merge_sorted(s1, s2): i1 = 0 i2 = 0 result = [] while i1 < len(s1) and i2 < len(s2): if s1[i1] <= s2[i2]: result.append(s1[i1]) i1 += 1 else: result.append(s2[i2]) i2 += 1 if i1 < len(s1): result.extend(s1[i1:]) if i2 < len(s2): result.extend(s2[i2:]) return result s1 = [4,7,8,11,17] s2 = [1,2,5,9,14,20,23] print (merge_sorted(s1,s2)) #[1,2,4,5,7,8,9,11,14,17,20,23] Given two sorted lists, create and return a new list that contains the elements that are in both lists. If the lists are sorted, this operation can be done in linear time in one pass through both lists. def intersection_sorted(s1, s2): i1 = 0 i2 = 0 result = [] while i1 < len(s1) and i2 < len(s2): if s1[i1] < s2[i2]: i1 += 1 elif s1[i1] > s2[i2]: i2 += 1 else: result.append(s1[i1]) i1 += 1 i2 += 1 return result s1 = [1,2,9,11] s2 = [1,2,5,9,14] print (intersection_sorted(s1,s2)) #[1,2,9] Modify the list s in place so that all elements for which the given predicate pred is true are in the front in some order, followed by the elements for which pred is false, in some order. def partition(s, pred): i1 = 0 i2 = len(s) - 1 while i1 < i2: if pred(s[i1]): i1 += 1 elif not pred(s[i2]): i2 -= 1 else: s[i1], s[i2] = s[i2], s[i1] i1 += 1 i2 -= 1 return s Let us calculate how the congressional seats are divided over states whose populations (in millions) are given in the parameter list pop. from math import sqrt def apportion_congress_seats(seats, pop): """Allocate congressional seats to states with given populations, according to the Huntington-Hill method to round seats to integers. seats -- The number of congressional seats to allocate. pop -- List of populations in each state. """ result = [1 for state in pop] # one seat per state to begin with seats -= len(pop) while seats > 0: max_priority = 0 max_state = 0 for state in range(len(pop)): this_priority = pop[state] / sqrt(result[state]*(result[state]+1)) if this_priority > max_priority: max_priority = this_priority max_state = state print(f"Giving the next seat to state #{max_state}.", sep='') result[max_state] += 1 seats -= 1 return result print(apportion_congress_seats(10, [30, 22, 14, 8, 5])) #Giving #Giving #Giving #Giving #Giving #[3, 3, the next the next the next the next the next 2, 1, 1] seat seat seat seat seat to to to to to state state state state state #0. #1. #0. #2. #1. Generators ● When this iterator object is used to produce the stream, the function body is executed. When the execution comes to the yield statement, the execution pauses and the yielded result becomes the next object of the stream that is produced by this generator. ● As a handy little piece of syntactic sugar, yield from can be used to yield all items produced by another generator as part of the stream produced by the current one, instead of having to write the explicit for-loop for this purpose. r = range(10) #only remembers 3 values - current, end value, stepwise b = range(10*100) # String multiplication d = {x: "a" * x for x in r} #{0: '', 1: 'a', 2: 'aa', 3: 'aaa', 4: 'aaaa', #5: 'aaaaa', 6: 'aaaaaa', 7: 'aaaaaaa', #8: 'aaaaaaaa', 9: 'aaaaaaaaa'} def test (a,b=42): return a+b print(test(99)) #141 #asterick must be last parameter def test (a, *args): print(f"a is now {a}") print(f"Caller gave {len(args)} additional arguments") for x in args: print(f"Next argument is {x}") print(test(99, 'bob', 5.78, [1,2,3])) # # # # # a is now 99 Caller gave 3 Next argument Next argument Next argument additional arguments is bob is 5.78 is [1, 2, 3] #single asterick must be before double asterick def test (a, *args, **kwargs): print(f"a is now {a}") print(f"Caller gave {len(args)} additional arguments") for x in args: print(f"Next argument is {x}") print(f"Caller gave {len(kwargs)} keyword arguments") for y in kwargs: print(f"Named argument {y} given value {kwargs[y]}") print(test(99, 'bob', 5.78, [1,2,3], name = 'ted', age=42, height=1.85)) #a is now 99 #Caller gave 3 additional arguments #Next argument is bob #Next argument is 5.78 #Next argument is [1, 2, 3] #Caller gave 3 keyword arguments #Named argument name given value ted #Named argument age given value 42 #Named argument height given value 1.85 def test(a,b,c): return a + b * c t = (42,99,17) #print(test(t)) #demands three values, only accepts t as a print(test(*t)) #1725 a = iter(range(20)) for x in a: print(x, end = " ") #0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 def test(it): for x in it: print(x, end = " ") print("") a = iter(range(10)) test(a) print(a) #0 1 2 3 4 5 6 7 8 9 #<range_iterator object at 0x7f2d3c285c90> b = (x*x*x for x in range(10)) # test(b) print(b) #<generator object <genexpr> at 0x7f71433eea40> #0 1 8 27 64 125 216 343 512 729 c = [x*x*x for x in range(10)] test(c) print(c) # [0, 1, 8, 27, 64, 125, 216, 343, 512, 729] --# First, an iterator defined as a proper class with certain methods. class Squares: def __init__(self, start, end): self.start = start self.end = end def __next__(self): if self.start > self.end: raise StopIteration else: result = self.start ** 2 self.start = self.start + 1 return result def __iter__(self): return self def __str__(self): # Human readable return "Range iterator from " + self.start + " to " + self.end def __repr__(self): # Expression to generate this object return "Squares(" + self.start + ", " + self.end + ")" # Iterators are usually easier to define as generators. Simply using the # keyword yield instead of return makes Python convert that function to a # generator type with all the iteration methods loaded in. def squares(start, end): curr = start while curr < end: yield curr * curr curr += 1 a = squares(1,10) b = squares(20,25) print("Let's take some values from a:") print(next(a)) #1 print(next(a)) #4 print(next(a)) #9 print("Then some values from b:") print(next(b)) #400 print(next(b)) #441 print("Again some values from a:") print(next(a)) #16 print(next(a)) #25 print("Printing the squares with for x in Squares(1, 10): print(x) # 1 4 9 16 25 36 49 print("Printing the squares with for x in squares(1, 10): print(x) # 1 4 9 16 25 36 49 Squares class:") 64 81 100 generator:") 64 81 # So far, both examples generated a finite sequence. But there is nothing in # the laws of nature or man that says that a sequence couldn't be infinite. def fibonacci(): yield 1 yield 1 curr = 2 prev = 1 while True: yield curr curr, prev = curr + prev, curr a = fibonacci() b = fibonacci() print("Some values from a:") for i in range(10): print(next(a), end = " ") # 1 1 2 3 5 8 13 21 34 55 print("Some values from b:") for i in range(10): print(next(b), end = " ") # 1 1 2 3 5 8 13 21 34 55 print("Some more values from a:") for i in range(10): print(next(a), end = " ") # Remembers where it left off, continues # 89 144 233 377 610 987 1597 2584 4181 6765 --def collatz(start): while True: yield start if start == 1: break elif start % 2 == 0: start = start // 2 else: start = 3 * start + 1 c = collatz(42) for x in c: print(x, end = " ") # 42 21 64 32 16 8 4 2 1 --def primes(): primes = [2, 3, 5, 7, 11] yield from primes # Handy syntactic sugar for yield inside for-loop current = 13 while True: isprime = False for divisor in primes: if current % divisor == 0: break if divisor * divisor > current: isprime = True break if isprime: primes.append(current) yield current current += 2 it = (x*x for x in primes() if "7" in str(x)) for i in range(20): print(next(it), end = " ") # 49, 289... --- # # # # Since a generator can take parameters, we can write a iterator decorator that can be used to modify the result of any existing iterator. We don't have to care how that iterator was originally defined, as long at it somehow produces new values. # Let through every k:th element and discard the rest. def every_kth(it, k): count = k for x in it: count = count - 1 if count == 0: count = k yield x for x in every_kth(iter(range(10)),3): print(x, end = " ") # 2 5 8 # Duplicate each element k times. def stutter(it, k): for x in it: for i in range(k): yield x for x in stutter(iter(range(10)),3): print(x, end = " ") # 0 0 0 1 1 1 2 2 2 3 3 3 4 4 4 5 5 5 6 6 6 7 7 7 8 8 8 9 9 9 --def ngrams(it, n): result = [] for v in it: result.append(v) if len(result) >= n: yield result result = result[1:] print("The string split into three consecutive words, overlapping:") print(list(ngrams("Hello world, how are you today?".split(" "), 3))) # [['Hello', 'world,', 'how'], ['world,', 'how', 'are'], ['how', 'are', 'you'], ['are', 'you', 'today?']] --- def vertical_print(*its, start="", end="", sep=" "): its = list(map(iter, its)) term = len(its) while term > 0: result = start for idx in range(len(its)): if its[idx]: try: val = str(next(its[idx])) except StopIteration: its[idx] = None val = " " term -= 1 else: val = " " result += val + sep result += end if term > 0: yield result for line in vertical_print("Hello", "there", "how", "are", "yeh?", "supercalifragilisticexpialidocious", start = "[", end="]"): print(line) --# Sieve of Erathostenes is a classic but inefficient way to generate # prime numbers. Recursive generators make it a more interesting here # for the purposes of our discussion. def sieve(m, div): if div == 2: yield from range(3, m, 2) else: for i in sieve(m, div - 2 if div > 3 else 2): if i == div or i % div != 0: yield i # The itertools module defines tons of handy functions to perform # computations on existing iterators, to be combined arbitrarily. print("Here are the prime numbers under 10000 filtered with the sieve:") print(list(sieve(10000, 101))) import itertools as it # # # # # Ulam numbers are infinite sequence that starts with [1,2], and after those two, each number is the smallest number that can be expressed exactly one way as a sum of two distinct previous numbers in the sequence. See http://en.wikipedia.org/wiki/Ulam_number Here is one possible (not optimized) generator for Ulam numbers. def ulam(n1=1, n2=2): yield n1 yield n2 u = [n1, n2] # Define testing if n is the next element as a utility function. def testnext(n): count = 0 # count indices down with i for i in range(len(u)-1, 0, -1): for j in range(i): if u[i] + u[j] == n: count += 1 if count == 2: return False return (count == 1) # So we can now start looking for the next value and yield it. curr = n2 + 1 while True: if testnext(curr): yield curr u.append(curr) curr += 1 # The first 100 Ulam numbers print ("Here are the first 100 Ulam numbers:") print(list(it.islice(ulam(1,2), 0, 100))) #[1, 2, 3, 4, 6, 8, 11, 13, 16, 18,...,] # Iterators can be turned into various combinatorial possibilities. # Again, even though there are exponentially many solutions, these # are generated lazily one at the time as needed by the computation. print("Here are all possible permutations of [1, 2, 3, 4].") print(list(it.permutations(range(1, 5)))) print("Here are all possible 3-combinations of [1, 2, 3, 4].") print(list(it.combinations(range(1, 5), 3))) # [(1, 2, 3), (1, 2, 4), (1, 3, 4), (2, 3, 4)] print("Here are all possible replacement combinations of [1, 2, 3, 4].") print(list(it.combinations_with_replacement(range(1, 5), 3))) # [(1, 1, 1), (1, 1, 2), (1, 1, 3), (1, 1, 4), (1, 2, 2), (1, 2, 3), (1, 2, 4), # (1, 3, 3), (1, 3, 4), (1, 4, 4), (2, 2, 2), (2, 2, 3), (2, 2, 4), (2, 3, 3), # (2, 3, 4), (2, 4, 4), (3, 3, 3), (3, 3, 4), (3, 4, 4), (4, 4, 4)] --from random import randint # Randomly choose k elements from those produced by iterator it. def reservoir(it, k): buffer = [] count = 0 for v in it: if count < k: # First k elements build up the reservoir. buffer.append(v) else: idx = randint(0, count) if idx < k: # The new element hits the reservoir. buffer[idx] = v # displace some older element count += 1 # Having read in everything, emit the reservoir elements. for v in buffer: yield v if __name__ == "__main__": print("Here are 20 random non-short lines from 'War and Peace':\n") wap = open("warandpeace.txt", encoding='utf-8') idx = 1; for line in reservoir((x for x in wap if len(x) > 60), 20): print(f"{idx:2}: {line}", end="") idx += 1 wap.close() # The dictionary of Morse codes and the corresponding letters. codes = { ".-": "a", "-...": "b", "-.-.": "c", "-..": "d", ".": "e", "..-.": "f", "--.": "g", "....": "h", "..": "i", ".---": "j", "-.-": "k", ".-..": "l", "--": "m", "-.": "n", "---": "o", ".--.": "p", "--.-": "q", ".-.": "r", "...": "s", "-": "t", "..-": "u", "...-": "v", ".--": "w", "-..-": "x", "-.--": "y", "--..": "z" } # # # # # Morse code is not what is known as "prefix code" so that no encoding of some character would be a prefix of the encoding of some other character. This makes the decoding ambiguous unless the Morse operator takes a noticeable short pause between the individual letters. # Construct a reverse dictionary from an existing dictionary # with this handy dictionary comprehension. codes_r = { codes[v]: v for v in codes } # Given a string of characters, encode it in Morse code # placing the given separator between the encodings of the # individual letters. Unknown characters are simply skipped # in this encoding. def encode_morse(text, sep=""): return sep.join((codes_r.get(c, "") for c in text.lower())) # A generator function that yields all possible ways to # decode the given Morse code message into a string. This # generator is written recursively to find all the possible # first characters, followed by the recursive decodings of # the rest of the message. def decode_morse(message): if message == "": yield "" else: for i in range(1, min(len(message) + 1, 5)): prefix = message[:i] if prefix in codes: head = codes[prefix] for follow in decode_morse(message[i:]): yield head + follow if __name__ == "__main__": from random import sample f = open('words.txt', 'r', encoding = 'utf-8') wordlist = [word.strip().lower() for word in f if len(word) < 8] f.close() print(f"Read a list of {len(wordlist)} words.") # Convert to set for a quick lookup of individual words. words = set(wordlist) for text in sample(wordlist, 20): enc = encode_morse(text) print(f"The word {text!r} encodes in Morse to {enc!r}.") print(f"The Morse code message {enc!r} decodes to words:") dec = [word for word in decode_morse(enc) if word in words] for word in dec: print(f"{word!r} split as {encode_morse(word, ' ')}") print("") # Exercise for the reader: find the word whose encoding in Morse # can be decoded to largest possible number of different words. JSON ● JSON ("JavaScript Object Notation") is a modern standard for expressing arbitrary structured data as raw Unicode text, in a way that can be read and written by different programs written in different programming languages. import json with open('countries.json', encoding="utf-8") as data_file: countries = json.load(data_file) print(f"Read {len(countries)} countries from the JSON file.") # Read 240 countries from the JSON file. continental_pops = {} total_pop = 0 for country in countries: continent = country["Continent"] pop = int(country["Population"].split(" ")[0]) continental_pops[continent] = continental_pops.get(continent, 0) + pop total_pop += pop print("\nThe total population in each continent:") # The total population in each continent: for continent in sorted(continental_pops.keys()): print(f"The continent of {continent} has a total of {continental_pops[continent]} people.") # The continent of Africa has a total of 1130881210 people. # The continent of Asia has a total of 4477006876 people. # The continent of Europe has a total of 598292164 people. # The continent of NorthAmerica has a total of 565020417 people. # The continent of Oceania has a total of 38823795 people. # The continent of SouthAmerica has a total of 408939962 people. print(f"That gives a total of {total_pop} people on Earth.") # That gives a total of 7218964424 people on Earth. hazard_table = {} for country in countries: hazards = country["NaturalHazards"] for hazard in hazards: hazard_table[hazard] = hazard_table.get(hazard, 0) + 1 print("\nHere are the hazards found around the world:") for hazard in sorted(hazard_table.keys(), key=(lambda x: hazard_table[x]), reverse = True): print(f"{hazard[0].upper() + hazard[1:]} is a hazard in {hazard_table[hazard]} countries.") # Tropical cyclone is a hazard in 76 countries. # Flood is a hazard in 72 countries. # Drought is a hazard in 71 countries. # Earthquake is a hazard in 60 countries. # Volcano is a hazard in 36 countries. # Dust storm is a hazard in 31 countries. # Landslide is a hazard in 28 countries. # Tsunami is a hazard in 20 countries. # Wildfire is a hazard in 9 countries. # Avalanche is a hazard in 6 countries. # Locust plague is a hazard in 3 countries. # Thunderstorm is a hazard in 3 countries. # Land subsidence is a hazard in 2 countries. # Tornado is a hazard in 2 countries import bisect # Recursively fill an n-by-n box of letters so that every row and # every column is some n-letter word. The call finds all the # possible ways to fill in the i:th horizontal word, given the # previous i - 1 horizontal and vertical words. def wordfill(n, i, horiz, vert, wordlist, vv = 0): # Entire square is complete when rows 0, ..., n-1 are filled. if i == n: yield horiz else: # Vertical words constrain the next horizontal word. prefix = "".join([w[i] for w in vert]) # Find the first word that starts with that prefix. idx = bisect.bisect_left(wordlist, prefix) # Iterate through all words that start with that prefix. while idx < len(wordlist) and wordlist[idx].startswith(prefix): word = wordlist[idx] # Try using that word as the i:th horizontal word. if not(word in horiz or word in vert): horiz.append(word) yield from wordfill(n, i + vv, vert, horiz, wordlist, 1 - vv) horiz.pop() idx += 1 if __name__ == "__main__": import random, itertools n = 5 f = open('words.txt', 'r', encoding = 'utf-8') wordlist = [x.strip() for x in f if x.islower()] f.close() print(f"Read in a word list of {len(wordlist)} words.") wordlist = sorted([x for x in wordlist if len(x) == n]) print(f"There remain {len(wordlist)} words of length {n}.") wordset = set(wordlist) rows, cols = 3, 4 result = [] while len(result) < rows * cols: word = random.choice(wordlist) for sol in itertools.islice(wordfill(n, 0, [], [word], wordlist, 1), 1): # Let us make sure that the generated solution is legal. for i in range(n): w = "".join(w[i] for w in sol) if w not in wordset: print(f"ERROR! {w} IS NOT A WORD") result.append(sol) print(f"Printing out {rows*cols} random word squares found:\n") for row in range(rows): idx = row * cols for j in range(n): for i in range(idx, idx + cols): print(result[i][j], end = " ") print("") print("") import statistics print("\nLet's demonstrate the Python statistics package.") for country in countries: country["Population"] = float(country["Population"].split(" ")[0]) country["Area"] = float(country["Area"].split(" ")[0]) if country["Area"] == 0: country["Area"] = 1 mn = statistics.mean([country["Population"] / country["Area"] for country in countries]) print(f"Mean population density is {mn:.1f} people per square kilometer.") # Mean population density is 415.4 people per square kilometer. import json with open('mountains.json', encoding="utf-8") as data_file: mountains = json.load(data_file) print(f"Read {len(mountains)} mountains from the JSON file.") # Read 3741 mountains from the JSON file. print("JSON data file generated from Wolfram Mathematica.") with open('countries.json', encoding="utf-8") as data_file: countries = json.load(data_file) print(f"Read {len(countries)} countries from the JSON file.") # Read 240 countries from the JSON file. mic = {} tallest = {} smallest = ('Molehill', 0) for mountain in mountains: for country in mountain['Countries']: mic[country] = mic.get(country, []) + [mountain['Name']] t = tallest.get(country, smallest) try: e = int(mountain['Elevation'].split(' ')[0]) except ValueError: e = 0 if e > t[1]: tallest[country] = (mountain['Name'], e) print("\nHere are top thirty countries sorted by their tallest mountains:") countries = sorted(countries, key = (lambda x: tallest.get(x['Name'], smallest)[1]), reverse = True) for i in range(30): c = countries[i]['Name'] t = tallest[c] print(f'{i+1:2}. {c} with {t[0]}, elevation {t[1]} m.') # # # # 1. China with Mount Everest, elevation 8848 m. 2. Nepal with Mount Everest, elevation 8848 m. 3. Pakistan with K2, elevation 8612 m. ... print("\nHere are the top thirty countries sorted by named mountains:") countries = sorted(countries, key = (lambda x: len(mic.get(x['Name'], []))), reverse = True) for i in range(30): c = countries[i]['Name'] print(f'{i+1:2}. {c} with {len(mic[c])} named mountains.') #1. United States with 1011 named mountains. #2. Canada with 185 named mountains. #3. Italy with 169 named mountains. # https://en.wikipedia.org/wiki/Benford%27s_law print("\nLet's see how well Benford's law applies to mountain heights.") digit_m = {} digit_f = {} digit_y = {} count = 0 for mountain in mountains: try: m = int(mountain['Elevation'].split(' ')[0]) except ValueError: m = 0 h = int(str(m)[0]) digit_m[h] = digit_m.get(h, 0) + 1 f = 0.3048 * m # feet h = int(str(f)[0]) digit_f[h] = digit_f.get(h, 0) + 1 y = 0.9144 * m # yards h = int(str(y)[0]) digit_y[h] = digit_y.get(h, 0) + 1 count += 1 from math import log benford = [100 * (log(d+1, 10) - log(d, 10)) for d in range(1, 10)] print("Digit\tMeters\tFeet\tYards\tBenford") for d in range(1, 10): print(f"{d}\t{100 * digit_m[d] / count:.1f}\t{100 * digit_f[d] / count:.1f}" +f"\t{100 * digit_y[d] / count:.1f}\t{benford[d-1]:.1f}") #Let's see how well Benford's law applies to mountain heights. #Digit Meters Feet Yards Benford #1 21.3 30.4 25.0 30.1 #2 29.2 11.4 30.0 17.6 #3 17.9 7.5 18.0 12.5 #4 10.0 6.7 6.2 9.7 #5 5.8 6.8 6.3 7.9 #6 5.5 9.8 5.6 6.7 #7 4.9 10.2 3.3 5.8 #8 2.3 8.8 2.4 5.1 #9 2.2 7.6 2.4 4.6 Applying functions to other functions # let's write a function negate that can be given any predicate function, # and it creates and returns a function that behaves as the negation of # its parameter. def negate(f): # Nothing says that we can't define a function inside a function. # Since we don't know what parameters f takes, we write our function # to accept any arguments and keyword arguments. def result(*args, **kvargs): return not f(*args, **kvargs) return result # return the function that we just defined # Let's try this out by negating the following simple function. def is_odd(x): return x % 2 == 1 is_even = negate(is_odd) print(is_even) # <function result at ....> print(is_even(2)) # True print(is_even(3)) # False Creating small functions on the fly with lambdas ● For example, the expression lambda x, y: x * x + y * y defines an anonymous function object that takes two parameters, this are named x and y inside the function, and returns the sum of their squares. This object is typically passed as argument to a function that uses it internally. # # # # In this program, we demonstrate a bunch of Python features for functional programming. This means that functions are treated as data that can be stored to variabled, passed to other functions, returned as result, created on the fly etc. # Defining anonymous functions with lambda. is_positive = lambda x: x > 0 print(is_positive(42)) #True print(is_positive(-99)) #False square = lambda x: x*x # Functional programming basic commands map, filter and reduce. Note that # these return iterator objects instead of performing the computation # right away, so we force the computation to take place with list(). print(list(filter(is_positive, [-7, 8, 42, -1, 0]))) #[8, 42] print(list(map(square, [1, 2, 3, 4, 5]))) #[1, 4, 9, 16, 25] from operator import mul from functools import reduce print(reduce(mul, [1,2,3,4,5], 1)) # 1 * 2 * 3 * 4 * 5 = 120 --# Functions can be defined to operate on iterators. def hamming_distance(i1, i2): return sum(map(lambda x: x[0] == x[1], zip(i1, i2))) print(f"Hamming distance is {hamming_distance([1,2,3,4,5],(1,2,4,5,3))}.") #2 # # # # # # Sometimes functions can take a long time to calculate, but we know that they will always return the same result for the same parameters. The function memoize takes an arbitrary function as a parameter, and creates and returns a new function that gives the same results as the original, but remembers the results that it has already computed and simply looks up the result if they have previously been calculated. def memoize(f): results = {} # empty dictionary, as nothing has been computed yet def lookup_f(*args): res = results.get(args, None) if res == None: res = f(*args) # calculate the resul results[args] = res # and store it return res lookup_f.results = results # Alias the local variable to be seen from outside return lookup_f # The famous Fibonacci series. Trust me that this stupid implementation # would take nearly forever if called with n in largish double digits. def fib(n): if n < 2: return 1 else: return fib(n-1) + fib(n-2) # print(fib(200)) oldfib = fib fib = memoize(fib) print(fib(200)) # # # # way longer than the lifetime of universe keep a reference to the original slow function however, memoization makes it all smooth 453973694165307953197296969697410619233826 print(f"fib is oldfib = {fib is oldfib}") # False # Hofstadter's recursive Q-function, memoized for efficiency. def hof_q(n): if n < 3: return 1 return hof_q(n - hof_q(n-1)) + hof_q(n - hof_q(n-2)) hof_q = memoize(hof_q) print(hof_q(100)) #56 # # # # We can also perform the memoization explicitly. Since the function arguments are assumed to be nonnegative integers, the computed results can be stored in a list. Otherwise, some kind of dictionary would have to be used. Q = [ 0, 1, 1 ] def hof_qt(n): if n >= len(Q): for i in range(len(Q), n + 1): Q.append(Q[i - Q[i-1]] + Q[i - Q[i-2]]) return Q[n] print(hof_qt(1000000)) # #512066 #with recursion, surely stack overflow print(len(Q)) #1000001 # Wouldn't it be "groovy" to memoize the memoize function itself, so that # if some function has already been memoized, it won't be memoized again # but its previously memoized version is returned? memoize = memoize(memoize) def divisible_by_3(x): return x % 3 == 0 f1 = memoize(divisible_by_3) f2 = memoize(divisible_by_3) # already did that one print(f"f1 is f2 = {f1 is f2}") # True # We can use memoization technique to speed up checking whether the # so-called Collatz sequence starting from the given number eventually # reaches 1. The function collatz(n) tells how many steps this takes. @memoize def collatz(n): if n == 1: return 0 elif n % 2 == 0: return 1 + collatz(n // 2) else: return 1 + collatz(3 * n + 1) lc = max((collatz(x) for x in range (1, 1000000))) print(f"Longest Collatz sequence was {lc} steps.") #524 print(f"Memoization dictionary has {len(collatz.results)} stored values.") # # # # # Sometimes we use some simple function only in one place. With lambdas, such a function can be defined anonymously on the spot, provided that the function can written as a single expression. For example, use the previous negate to a lambda predicate that checks if its parameter is negative or zero, to give a function is_positive. is_positive = negate(lambda x: x <= 0) print(is_positive(2)) # True print(is_positive(-2)) # False # The famous Thue-Morse sequence for "fairly taking turns". __tm_call_count = 0 @memoize def thue_morse(n, sign): global __tm_call_count __tm_call_count += 1 if n == 1: if sign == 0: return '0' else: return '1' return thue_morse(n-1, sign) + thue_morse(n-1, 1-sign) print("Thue-Morse sequence for n == 6:") print(thue_morse(6, 0)) print(f"Computing that required {__tm_call_count} recursive calls.") # The next function can be given any number of functions as parameters, # and it creates and returns a function whose value is the maximum of # the results of any of these functions. def max_func(*args): funcs = args def result(*args): return max((f(*args) for f in funcs)) return result # Try that one out with some polynomials that we just create as lambda # to be used as throwaway arguments of max_func. f = max_func(lambda x: -(x*x) + 3*x - 7, lambda x: 4*x*x - 10*x + 10, lambda x: 5*x*x*x - 20) # -x^2+3x-7 # 4x^2-10x+10 # 5x^3-20 print([f(x) for x in range(-10, 10)]) Functional programming tools ● For example, map(lambda x: x*x, [1, 2, 3, 4, 5]) would produce the iterator for the series 1, 4, 9, 16, 25. As usual, use the function list to get all these values in one swoop as a list, instead of stepping through them one at the time. ● The function filter cherrypicks those elements from the stream for which the given function produces a True result, discarding those that are considered False. ● The function reduce (in other functional languages, this function is more vividly called fold) turns a stream back into a single element by repeatedly combining two consecutive elements into one element, using the two-parameter function given to it as first parameter. Recursive functions ● A thing is said to be self-similar if it contains a strictly smaller version of itself inside it as a proper part. (As an aside, in mathematics an entity is infinite if it fully contains itself as its proper part.) ● Recursion, a function calling itself for smaller parameter values, is often a natural way to solve self-similar problems, once the self-similarity has been spotted. Spotting the selfsimilarity is the difficult part of problem solving, but once it is revealed, the rest is nearly mechanical. ● Spotting the self-similarity allows us to write recursive definitions for things we wish to compute. For example, the non-recursive definition for factorial n! = 1 * 2 * … * (n - 1) * n can be converted into an equivalent recursive definition n! = (n - 1)! * n. ● To avoid infinite regress in principle and stack overflow in practice, every recursive method must have at least one base case where no further recursive calls are made. For the factorial function, the base case is 0! = 1. ● Recursive definitions can be a surprisingly powerful way to solve complex problems. The more complex and powerful the function F(n) is, the harder it also works when placed to the right hand side of the definition F(n) = s(F(n - 1)) where s is some simple function (both sides of the equation balance by definition). ● Just like with every function call, recursive invocations of the same function each have their own local namespace. This way the same parameter and local names can exist independently of each other in different levels of recursion. (From the technical point of view, there is no difference whatsoever in what happens in the execution of a function call, whether that function is recursive or not.) ● In theory, both recursion and iteration are computationally universal, so any computational problem that can be solved one way, could always somehow also be solved the other way. However, some types of problems are better suited for recursion than iteration. ● The true power of recursion lies in its ability to branch to two or more different directions, going as far and deep in each direction as needed before trying out the next direction, and possibly branching further to explore an exponential number of possibilities. Contrast this to the while-loop that will only keep going to one direction, missing the goal unless there is one on this linear path. Practical application of recursion # Randomly choose k elements from those produced by iterator it. def factorial1(n): total = 1 for x in range(2,n+1): total*=x return total print(factorial1(10)) --#recursion version def factorial2(n): if n <2: #must test if in base case - answer so simple it can be solved return 1 else: return factorial2(n-1)*n print(factorial2(10)) #3628800 --def fibonacci(n): a, b = 1, 1 for i in range(2, n+1): a,b = b, a + b return b --def fib(n): if n < 2: return 1 else: return fib(n-1) + fib(n-2) print([fib(i) for i in range (0,30)]) #running time increases with size of n # [1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233, 377, 610, 987, 1597, 2584, 4181, 6765, 10946, 17711, 28657, 46368, 75025, 121393, 196418, 317811, 514229, 832040] print(str(fibonacci(100000))[-10:]) #convert to string and return last 10 elements # 9707537501 --- # classic Towers of Hanoi puzzle. def hanoi(src, tgt, n): #source, target, number of pegs if n < 1: return mid = 6 - src - tgt hanoi(src, mid, n-1) print(f"Move top disk from peg {src} to peg {tgt}.") hanoi(mid, tgt, n-1) hanoi(1,3,3) # For computing high integer powers, binary power is more efficient than # repeated multiplication n - 1 times. def binary_power(a, n): if n < 0: raise ValueError("Binary power negative exponent not allowed.") elif n == 0: return 1 elif n % 2 == 0: return binary_power(a * a, n // 2) else: return a * binary_power(a * a, (n-1) // 2) results = [binary_power (b, e) == b ** e for b in range (1,10) for e in range(0,20)] print(len(results)) #180 # Without converting a number to a string but using only operations of # integer arithmetic, reverse its digits. For example, the integer 12345 # would become 54321. def reverse_digits(n): def rev_dig_acc(n, a): # accumulator recursion if n == 0: return a else: return rev_dig_acc(n // 10, 10 * a + n % 10) if n < 0: return -reverse_digits(-n) else: return rev_dig_acc(n, 0) print(reverse_digits(12345)) #54321 --- # # # # Flattening a list that can contain other lists as elements is a classic list programming exercise. This being Python, the correct "duck typing" way of checking of something is a list is to check if it is iterable, that is, for our purposes it behaves like a list. def flatten(li): result = [] for x in li: if isinstance(x, list): # is x an iterable? result.extend(flatten(x)) else: result.append(x) return result a = [42,"bob",7.6,[99,6,[[17]]]] print(flatten(a)) #[42, 'bob', 7.6, 99, 6, 17] --# # # # The subset sum problem asks if it is possible given items (all assumed to be integers) that If there is no solution, the function returns the list of items that together add up to the def subset_sum(items, goal): if len(items) == 0 or goal < 0: return None if goal == 0: return [] last = items.pop() answer = subset_sum(items, goal-last) if answer != None: answer.append(last) else: answer = subset_sum(items, goal) items.append(last) return answer a = [2,4,9,11,15,21,24,28,33,35] goal = sum(a) / 2 print(f"[{goal}, {subset_sum(a,goal)}") # [91.0, [4, 24, 28, 35] --- to choose a subset of together add up to goal. None, otherwise it returns goal. # Ackermann function is a function that grows fast. Really, really, really # fast. Faster than you can begin to imagine, until you have taken some # decent theory of computability courses. And not even then. def ackermann(m, n): if m == 0: return n+1 elif m > 0 and n == 0: return ackermann(m-1, 1) else: return ackermann(m-1, ackermann(m, n-1)) print(f"Ackermann(3, 3) = {ackermann(3, 3)}.") #61 print(f"Ackermann(3, 4) = {ackermann(3, 4)}.") #12 # The knight's tour problem is another classic. Given an n*n chessboard, # and the start coordinates of the knight, find a way to visit every # square on the board. def knight_tour(n = 8, sx = 1, sy = 1): result = [] visited = set( ) moves = ((2,1),(1,2),(2,-1),(-1,2),(-2,1),(1,-2),(-2,-1),(-1,-2)) # Test whether square (x, y) is inside the chessboard. We use # the 1-based indexing here, like is normally done by humans. def inside(x,y): return x > 0 and y > 0 and x <= n and y <= n # Find all the unvisited neighbours of square (x, y). def neighbours(x, y): nblist = [] for (dx, dy) in moves: if inside(x+dx, y+dy) and (x+dx, y+dy) not in visited: nblist.append((x+dx,y+dy)) return nblist # Try to generate the rest of the tour from square (cx, cy). def backtrack(cx, cy): result.append((cx, cy)) if len(result) == n*n: # The tour must be closed to be considered success return (sx - cx, sy - cy) in moves visited.add((cx,cy)) for (x,y) in neighbours(cx, cy): if backtrack(x, y): return True # Undo the current move result.pop() visited.remove((cx,cy)) return False if backtrack(sx, sy): return result else: return None print(knight_tour(6, 1, 1)) # [(1, 1), (3, 2), (5, 3), (6, 5), (4, 6), (5, 4), (6, 6), (4, 5), (6, 4), (5, 6), (3, 5), (1, 6), (2, 4), (3, 6), (5, 5), (6, 3), (5, 1), (4, 3), (6, 2), (4, 1), (2, 2), (1, 4), (2, 6), (3, 4), (1, 5), (2, 3), (4, 4), (4, 2), (2, 1), (3, 3), (1, 3), (2, 2), (3, 4), (2, 6), (1, 5), (2, 3)] Solving Advanced Problems # Inspired by the work of Donald Knuth in "Stanford Graphbase". # # # # Compute the Hamming distance between two words of same length. That is, the number of positions in which the words have a different character. In the word graph, we consider two words to be neighbours if their Hamming distance equals one. def hamming_distance(w1, w2): d = 0 for i in range(len(w1)): if w1[i] != w2[i]: d += 1 return d # Compute the word layers from the given starting word using the # breadth first search algorithm. def word_layers(start, neighbours, maxd = 99): # The zeroth layer contains only the starting word. result = [[start]] # All words that have been discovered so far. discovered = set() # The words of the previous layer. frontier = [start] discovered.add(start) # Generate all word layers up to maxd level, or as long # as the current layer has even one word in it. while maxd > 0 and len(frontier) > 0: # The next layer that we build up this round. nextl = [] # The next layer contains all undiscovered neighbours # of all the words of the current layer. for w in frontier: for w2 in neighbours[w]: if w2 not in discovered: nextl.append(w2) discovered.add(w2) # Add the completed word layer to the result list. result.append(nextl) # This layer becomes the frontier for the next round. frontier = nextl maxd -= 1 return result if __name__ == "__main__": from random import sample # The length of the words that we consider. n = 5 f = open('words.txt', 'r', encoding = 'utf-8') wordlist = [x.strip() for x in f if x.islower()] f.close() print(f"Read in a word list of {len(wordlist)} lowercase words.") wordlist = sorted([x for x in wordlist if len(x) == n]) print(f"There remain {len(wordlist)} words of length {n}.") print("Building the word neighbourhood graph...", end="") # Dictionary that maps each word to list of its neighbours. neighbours = { w:[] for w in wordlist } # Loop through all positions of the wordlist... for i1 in range(len(wordlist)): w1 = wordlist[i1] # Because this is a bit slow, print something to tell the # human user that something is happening in the program. if i1 % 500 == 0: print(".", end="", flush=True) # Compare each word to remaining words of the wordlist. for i2 in range(i1 + 1, len(wordlist)): w2 = wordlist[i2] if hamming_distance(w1, w2) == 1: neighbours[w1] = neighbours[w1] + [w2] neighbours[w2] = neighbours[w2] + [w1] print("\nThe neighbour word graph is now fully constructed.") singletons = [x for x in wordlist if neighbours[x] == []] print(f"There are {len(singletons)} singleton words in the dictionary:") print("Here is a random sample of 50 such singletons.") print(sample(singletons, 50)) # Compute the total number of neighbours, and find the word # that has the largest number of neighbours. mw, total = 0, 0 for word in wordlist: total += len(neighbours[word]) if len(neighbours[word]) > mw: mw = len(neighbours[word]) mword = word print(f"Average number of neighbours is {total / len(wordlist):.2f}.") print(f"The word with largest number of neighbours is {mword!r}.") print(f"It is directly connected to {', '.join(neighbours[mword])}.") print(f"Constructing the word layers starting from {mword!r}.") wl = word_layers(mword, neighbours) total = 0 print(f"Word layers starting from {mword!r} have sizes:") for i in range(1, len(wl) - 1): print(f"Level {i} contains {len(wl[i])} words.", end = " ") print(f"Some are: {', '.join(sample(wl[i], min(5, len(wl[i]))))}.") total += len(wl[i]) print(f"A total of {total} words are reachable from {mword!r}.") from random import choice, sample from bisect import bisect_left, bisect_right # Compute a histogram of individual characters in words. def histogram(words): result = {} for word in words: for c in word: result[c] = result.get(c, 0) + 1 return result # Find all words that are palindromes. def palindromes(words): return [x for x in words if x == x[::-1]] # Find all words that are a different word when read backwards. def semordnilap(words): wset = set(words) return [x for x in words if x != x[::-1] and x[::-1] in wset] # Find all the rotodromes, words that become other words when rotated. def rotodromes(words): def _is_rotodrome(word, wset): for i in range(1, len(word)): w2 = word[i:] + word[:i] if w2 != word and w2 in wset: return True return False wset = set(words) return [x for x in words if _is_rotodrome(x, wset)] # Find the "almost palindromes", words that become palindromes when # one letter is tactically removed. def almost_palindromes(words): def _almost(word): for i in range(len(word) - 1): w2 = word[:i] + word[i+1:] if w2 == w2[::-1]: return True return False return [x for x in words if len(x) > 2 and _almost(x)] # Stolen from "Think Python: How To Think Like a Computer Scientist" # Regular expressions can come handy in all kinds of string problems. import re # Find the words that contain at least three duplicated letters. def triple_duplicate(words): return [x for x in words if len(re.findall(r'(.)\1', x)) > 2] # Find the words that contain three duplicated letters all together. def consec_triple_duplicate(words): return [x for x in words if len(re.findall(r'(.)\1(.)\2(.)\3', x)) > 0] # How many words can be spelled out using only given characters? def limited_alphabet(words, chars): # A regular expression used many times is good to precompile # into the matching machine for speed and efficiency. pat = re.compile('^[' + chars + ']+$') return [word for word in words if pat.match(word)] # Stolen from Programming Praxis. Given text string and integer # k, find and return the longest substring that contains at most # k different characters inside it. def longest_substring_with_k_chars(text, k = 2): # The k most recently seen characters mapped to the last # position index of where they occurred. last_seen = {} len_, max_, maxpos = 0, 0, 0 for i in range(len(text)): c = text[i] # If no conflict, update the last_seen dictionary. if len(last_seen) < k or c in last_seen: last_seen[c] = i len_ += 1 if len_ > max_: max_ = len_ maxpos = i - max_ + 1 else: # Find the least recently seen character. min_, minc = len(text), '$' for cc in last_seen: if last_seen[cc] < min_: min_ = last_seen[cc] minc = cc # Remove it from dictionary... last_seen.pop(minc) # ... and bring the current character to its place. last_seen[c] = i len_ = i - min_ # Extract the longest found substring as the answer. return text[maxpos:maxpos + max_] # # # # Given a sorted list of words and the first word, construct a word chain in which each word starts with the suffix of the previous word with the first k characters removed, for example ['grama', 'ramal', 'amala', 'malar', 'alarm'] for k = 1. # Since words are sorted, we can use binary search algorithm to # quickly find the sublist whose words start with the given prefix. def word_chain(words, first, k = 1, len_ = 3): # Recursive algorithm to complete the given wordlist. def backtrack(chain): # If the wordlist is long enough, return it. if len(chain) == len_: return chain # Extract the suffix of the last word of the wordlist. suffix = chain[-1][k:] # Extract the words that start with that suffix. start = bisect_left(words, suffix) end = bisect_right(words, suffix + k * 'z') # Try out those words one at the time. for idx in range(start, end): word = words[idx] if len(word) > len(chain[-1]) - k and word not in chain: # Extend the wordlist with this word. chain.append(word) if backtrack(chain): # Solution found return chain # Remove that word and try the next one. chain.pop() return None return backtrack(first) # # # # What words remain words by removing one character? Create and return a list whose i:th element is a dictionary of all such words of length i, mapped to the list of words of length i-1 that they can be turned into by removing one letter. def remain_words(words): result = [ [], [x for x in words if len(x) == 1] ] wl = 2 while True: nextlevel = { } hasWords = False for w in (x for x in words if len(x) == wl): shorter = [] for i in range(0, wl - 1): ww = w[:i] + w[i+1:] # word w with i:th letter removed if ww in result[wl - 1]: shorter.append(ww) if len(shorter) > 0: nextlevel[w] = shorter hasWords = True if hasWords: result.append(nextlevel) wl += 1 else: return result # Generate a table of all anagrams from the given words. def all_anagrams(words): godel = {} # The first 26 prime numbers, one for each letter from a to z. primes = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97, 101] for word in words: m = 1 for c in word: # ord(c) gives the Unicode integer codepoint of c. m *= primes[ord(c) - ord('a')] # All anagrams have the same godel number, due to commutativity # of integer multiplication and the Fundamental Theorem of # Arithmetic that says every integer has one prime factorization. godel[m] = godel.get(m, []) + [word] return godel if __name__ == "__main__": f = open('words.txt', 'r', encoding='utf-8') words = [x.strip() for x in f if x.islower()] f.close() words = limited_alphabet(words, "abcdefghijklmnopqrstuvwxyz") print(f"Read in {len(words)} words.") # Binary search can quickly find all words with given prefix. for prefix in ["aor", "jim", "propo"]: result = [] idx = bisect_left(words, prefix) while idx < len(words) and words[idx].startswith(prefix): result.append(words[idx]) idx += 1 result = ", ".join(result) print(f"\nWords that start with {prefix!r} are {result}.") # How about finding all words that end with given suffix? words_r = [word[::-1] for word in words] words_r.sort() for suffix in ["itus", "roo", "lua"]: suffix = suffix[::-1] result = [] idx = bisect_left(words_r, suffix) while idx < len(words_r) and words_r[idx].startswith(suffix): result.append(words_r[idx][::-1]) idx += 1 result = ", ".join(result) print(f"\nWords that end with {suffix[::-1]!r} are {result}.") hist = histogram(words).items() hist = sorted(hist, key = (lambda x: x[1]), reverse = True) print("\nHistogram of letters sorted by their frequencies:") print(hist) pals = palindromes(words) print(f"\nThere are {len(pals)} palindromes. ", end = "") print("Some of them are:") print(", ".join(sample(pals, 10))) sems = semordnilap(words) print(f"\nThere are {len(sems)} semordnilaps. Some of them are:") print(", ".join(sample(sems, 10))) almost = almost_palindromes(words) print(f"\nThere are {len(almost)} almost palindromes. ", end = "") print("Some of them are:") print(", ".join(sample(almost, 10))) print("\nLet us next look for some rotodromes.") for i in range(2, 13): rotos = rotodromes([w for w in words if len(w) == i]) print(f"There are {len(rotos)} rotodromes of length {i}. ", end = "") print(f"Some of them are:") print(f"{', '.join(sample(rotos, min(10, len(rotos))))}.") print("\nWords that contain triple duplicate character:") for word in triple_duplicate(words): print(word, end = ' ') print("\n\nWords that contain consecutive triple duplicate:") for word in consec_triple_duplicate(words): print(word, end = ' ') print("\n\nWords that contain only hexadecimal digits [a-f]:") for word in limited_alphabet(words, "abcdef"): print(word, end = ' ') print("\n\nWords that contain only vowels:") for word in limited_alphabet(words, "aeiouy"): print(word, end = ' ') print("\n\nWords spelled with upside down calculator:") for word in limited_alphabet(words, "oieslbg"): print(word.upper(), end = ' ') text = "ceterumautemcenseocarthaginemessedelendam" print(f"\n\nThe text is '{text}'.") print("Let's print out its longest substrings with k letters.") for k in range(1, 16): print(f"k = {k:2}: {longest_substring_with_k_chars(text, k)}") print(f"\nHow about the longest 10-char substring of War and Peace?") wap = open("warandpeace.txt", encoding='utf-8') text = " ".join(wap) wap.close() text.replace("\n", " ") print(f"It is:{longest_substring_with_k_chars(text, 10)}") print("\nNext, some word chains of five-letter words.") words5 = [word for word in words if len(word) == 5] count, total = 0, 0 while count < 10: total += 1 first = choice(words5) best = [first] while len(best) < 5: better = word_chain(words5, [first], 1, len(best) + 1) if better: best = better else: break if len(best) > 3: print(f"{first}: {best}") count += 1 print(f"Found {count} word chains after trying {total} firsts.") print("\nSome letter eliminations:") elim_dict_list = remain_words(words) startwords = list(elim_dict_list[8]) for i in range(10): word = choice(startwords) while len(word) > 1: print(word, end = " -> ") word = choice(elim_dict_list[len(word)][word]) print(word) print("\nLet us compute all anagrams for the six letter words.") words6 = [word for word in words if len(word) == 6] anagrams = all_anagrams(words6) print("The anagram groups with five or more members are:") for li in (x for x in anagrams if len(anagrams[x]) > 4): print(", ".join(anagrams[li])) # The dictionary of Morse codes and the corresponding letters. codes = { ".-": "a", "-...": "b", "-.-.": "c", "-..": "d", ".": "e", "..-.": "f", "--.": "g", "....": "h", "..": "i", ".---": "j", "-.-": "k", ".-..": "l", "--": "m", "-.": "n", "---": "o", ".--.": "p", "--.-": "q", ".-.": "r", "...": "s", "-": "t", "..-": "u", "...-": "v", ".--": "w", "-..-": "x", "-.--": "y", "--..": "z" } # # # # # Morse code is not what is known as "prefix code" so that no encoding of some character would be a prefix of the encoding of some other character. This makes the decoding ambiguous unless the Morse operator takes a noticeable short pause between the individual letters. # Construct a reverse dictionary from an existing dictionary # with this handy dictionary comprehension. codes_r = { codes[v]: v for v in codes } # Given a string of characters, encode it in Morse code # placing the given separator between the encodings of the # individual letters. Unknown characters are simply skipped # in this encoding. def encode_morse(text, sep=""): return sep.join((codes_r.get(c, "") for c in text.lower())) # A generator function that yields all possible ways to # decode the given Morse code message into a string. This # generator is written recursively to find all the possible # first characters, followed by the recursive decodings of # the rest of the message. def decode_morse(message): if message == "": yield "" else: for i in range(1, min(len(message) + 1, 5)): prefix = message[:i] if prefix in codes: head = codes[prefix] for follow in decode_morse(message[i:]): yield head + follow if __name__ == "__main__": from random import sample f = open('words.txt', 'r', encoding = 'utf-8') wordlist = [word.strip().lower() for word in f if len(word) < 8] f.close() print(f"Read a list of {len(wordlist)} words.") # Convert to set for a quick lookup of individual words. words = set(wordlist) for text in sample(wordlist, 20): enc = encode_morse(text) print(f"The word {text!r} encodes in Morse to {enc!r}.") print(f"The Morse code message {enc!r} decodes to words:") dec = [word for word in decode_morse(enc) if word in words] for word in dec: print(f"{word!r} split as {encode_morse(word, ' ')}") print("") # Exercise for the reader: find the word whose encoding in Morse # can be decoded to largest possible number of different words. import bisect import random # Recursively fill an n-by-n box of letters so that every row and # every column is some n-letter word. The call finds all the # possible ways to fill in the i:th horizontal word, given the # previous i - 1 horizontal and vertical words. If babbage is # set to True, the word square must have same rows and columns. def wordfill(n, i, horiz, vert, wordlist, vv, babbage = False): # Entire square is complete when rows 0, ..., n-1 are filled. if i == n: yield horiz else: # Vertical words constrain the next horizontal word. prefix = "".join([w[i] for w in vert]) # Find the first word that starts with that prefix. idx = bisect.bisect_left(wordlist, prefix) # Iterate over words that start with that prefix. while idx < len(wordlist) and wordlist[idx].startswith(prefix): word = wordlist[idx] # If not used yet, try using that word as the horizontal word. if not (word in horiz or word in vert): horiz.append(word) if babbage: yield from wordfill(n, i + 1, horiz, horiz, wordlist, 0, True) else: yield from wordfill(n, i + vv, vert, horiz, wordlist, 1vv) horiz.pop() idx += 1 if __name__ == "__main__": import itertools n = 4 f = open('words.txt', 'r', encoding = 'utf-8') wordlist = [x.strip() for x in f if x.islower()] f.close() print(f"Read in a word list of {len(wordlist)} words.") wordlist = sorted([x for x in wordlist if len(x) == n]) print(f"There remain {len(wordlist)} words of length {n}.") wordset = set(wordlist) rows, cols = 4, 7 result = [] while len(result) < rows * cols: w1 = random.choice(wordlist) i1 = bisect.bisect_left(wordlist, w1[0]) i2 = bisect.bisect_right(wordlist, chr(ord(w1[0]) + 1)) w2 = wordlist[random.randint(i1, i2-1)] for sol in itertools.islice(wordfill(n, 1, [w1], [w2], wordlist, 0), 1): # Let us make sure that the generated solution is legal. for i in range(n): w = "".join(w[i] for w in sol) if w not in wordset: print(f"ERROR! {w} IS NOT A WORD") result.append(sol) print(f"Printing out {rows*cols} random word squares.\n") print("-" * ((n+3)*cols + 1)) for row in range(rows): idx = row * cols for j in range(n): print("| ", end = "") for i in range(idx, idx + cols): print(result[i][j], end = " | ") print("") print("-" * ((n+3)*cols + 1)) from heapq import heappop, heappush # A traditional version to compute and return the n:th Hamming number. def nth_hamming(n): muls = (2, 3, 5) frontier_pq = [1] frontier_set = set() frontier_set.add(1) while n > 0: curr = heappop(frontier_pq) frontier_set.remove(curr) for m in muls: if m * curr not in frontier_set: frontier_set.add(m * curr) heappush(frontier_pq, m * curr) n -= 1 return curr # Another (slow) version by combining lazy iterators recursively. # Merge the results of two sorted iterators into sorted sequence. # For simplicity, this assumes that both iterators are infinite. def iterator_merge(i1, i2): v1 = next(i1) v2 = next(i2) while True: if v1 <= v2: yield v1 v1 = next(i1) else: yield v2 v2 = next(i2) # Let through only one element of each run of consecutive values. def iterator_uniq(i): v = next(i) yield v while True: while True: v2 = next(i) if v != v2: break v = v2 yield v # Hamming numbers can be produced by merging three streams of # Hamming numbers with their values multiplied by 2, 3 and 5. def hamming_it(): yield 1 i = iterator_uniq(iterator_merge( (2*x for x in hamming_it()), iterator_merge( (3*x for x in hamming_it()), (5*x for x in hamming_it()) ) )) for x in i: yield x if __name__ == "__main__": import itertools print(f"The millionth Hamming numbers is {nth_hamming(1000000)}.") print("Here are the first 100 Hamming numbers with iterators.") print(list(itertools.islice(hamming_it(), 100))) import random from math import sqrt, floor # Given current generation of solution candidates sorted in the order # of increasing fitness, compute the next generation of candidates. def compute_next_gen(current, fitness, combine, newsize=None, elitism=0): n = len(current) if not newsize: newsize = n # Initialize the result list to be filled in. next_gen = [] # Best solutions get to the next generation automatically. for i in range(elitism): next_gen.append(current[-i]) while len(next_gen) < newsize: # Choose the parents to create two new offspring so that higher # fitness solutions have a higher chance of being chosen. i1 = int(floor(sqrt(random.randint(0, n * n - 1)))) i2 = int(floor(sqrt(random.randint(0, n * n - 1)))) # Create the offspring and add them to the next generation. (o1, o2) = combine(current[i1], current[i2]) next_gen.append(o1) next_gen.append(o2) # Sort the next generation in order of increasing fitness. next_gen.sort(key=fitness) return next_gen # One point crossover function to combine two solutions. def one_point_crossover(s1, s2): i = random.randint(1, len(s1) - 1) return (s1[:i] + s2[i:], s2[:i] + s1[i:]) # Burst crossover, a better combinator for two solutions. def burst_crossover(s1, s2, prob=10): r1, r2 = "", "" for i in range(len(s1)): r1 += s1[i] r2 += s2[i] if random.randint(0, 99) < prob: (s1, s2) = (s2, s1) return (r1, r2) if __name__ == "__main__": # Make the randomness repeatable between runs. Change # the seed value to get a different random pattern. random.seed(135745) # Read in the wordlist. f = open('words.txt', 'r', encoding = 'utf-8') wordlist = [x.strip() for x in f if x.islower()] f.close() # We shall look for words of five letters. wordlist = [x for x in wordlist if len(x) == 5] # How many times each letter occurs in wordlist. cfreq = [0 for i in range(26)] # Extract the bigrams, trigrams, quadgrams, and words. wordsets = [set(), set(), set(), set()] for word in wordlist: for c in word: cfreq[ord(c) - ord('a')] += 1 wordsets[3].add(word) # word itself for i in range(0, 3): wordsets[0].add(word[i:i+2]) # bigrams if i < 3: wordsets[1].add(word[i:i+3]) # trigrams if i < 2: wordsets[2].add(word[i:i+4]) # quadgrams print(f"Weights are {cfreq} {len(cfreq)}.") chars = ['a','b','c','d','e','f','g','h','i','j','k', 'l','m','n','o','p','q','r','s','t','u','v', 'w','x','y','z'] # # # # Fitness function for a text string. We give points for bigrams and trigrams so that the fitness landscape does not consist of flat plateaus for the algorithm to just mosey around aimlessly. def word_count(text): count = 0 for i in range(0, len(text) - 2): if text[i:i+2] in wordsets[0]: count += 1 if text[i:i+3] in wordsets[1]: count += 100 if text[i:i+4] in wordsets[2]: count += 10000 if text[i:i+5] in wordsets[3]: count += 1000000 return count # bigrams # trigrams # quadgrams # words # Create a random pattern of n characters. def create_pattern(n, forced=None, prob=50): pat = "" for i in range(n): if forced and forced[i] != '*': pat += forced[i] elif not forced and random.randint(0, 99) < prob: pat += '*' else: pat += random.choices(chars, weights = cfreq, k=1)[0] return pat # Find the best way to fill in the asterisks of the given # character pattern to maximize the number of words in it. pattern = create_pattern(50) print(f"Pattern: {pattern}.") sols = [create_pattern(50, forced = pattern) for i in range(3000)] sols.sort(key=word_count) for g in range(50): sols = compute_next_gen(sols, word_count, burst_crossover, newsize = len(sols)-10, elitism=30) # Uncomment this to receive a status report after each gen. print(f"Gen {g}: {sols[-1]}, fitness {word_count(sols[-1])}.") best = sols[-1] print(f"The best solution found was {best}.") words = [best[i:i+5] for i in range(0, len(best) - 5) if best[i:i+5] in wordsets[3]] print(f"The solution contains {len(words)} words. They are:") print(words)