Announcements

No Labs / Recitation this week
On Friday we will talk about Project 3
    Release late afternoon / evening tomorrow
    Cryptography

Midterm Curve

    0-35      D
    36-39     C-
    40-50     C
    51-55     C+
    56-59     B-
    60-68     B
    69-74     B+
    75-79     A-
    80-100    A

Data Structures

More List Methods
Our first encoding: Matrix

CQ: Are these programs equivalent?

Program 1:

    b = ['h', 'e', 'l', 'l', 'o']
    b.insert(len(b), "w")
    print(b)

Program 2:

    b = ['h', 'e', 'l', 'l', 'o']
    b.append("w")
    print(b)

A: yes
B: no

Advanced List Operations

    L = [0, 1, 2, 0]
    L.reverse()
    print(L)             # will print: [0, 2, 1, 0]
    L.remove(0)
    print(L)             # will print: [2, 1, 0]
    L.remove(0)
    print(L)             # will print: [2, 1]
    print(L.index(2))    # will print: 0

Why are Lists useful?

They provide a mechanism for creating a collection of items

    def doubleList(b):
        i = 0
        while i < len(b):
            b[i] = 2 * b[i]
            i = i + 1
        return b

    print(doubleList([1, 2, 3]))

Using Lists to Create Our Own Encodings

Python provides a number of encodings for us: Binary, ASCII, Unicode, RGB, Pictures etc.
We know the basic building blocks, but we are still missing something …
We need to be able to create our own encodings
What if I want to write a program that operates on proteins?

Under the hood: what is a matrix?

A matrix is not "pre defined" in Python
We "construct" a way to encode a matrix through the use of lists
    We will see how to do so using a single list (not ideal)
    We will see how to do so using a list of lists

Two different ways: Row by Row, Column by Column

    ( 1 2 )
    ( 3 4 )

Lists

Suppose we wanted to extract the value 3

    y = ["ABCD", [1, 2, 3], "CD", "D"]
    y[1][2]

The first set of [] gets the list in position 1 of y. The second [] selects the element in position 2 of that list.

This is equivalent to:

    z = y[1]
    z[2]

Let's make it more concrete!

Let's revisit encoding a matrix

Let's try something simple:

    A = [1, -1, 0, 2]
    B = [1, 0, 0, 0.5, 3, 4, -1, -3, 6]

Does this work?
We lose a bit of information in this encoding
    Which numbers correspond to which row
We can explicitly keep track of rows through a row length variable

    B = [1, 0, 0, 0.5, 3, 4, -1, -3, 6]
    rowLength = 3
    B[rowLength*y + x]

Let's convince ourselves

    x=0 y=0    B[3*0 + 0] = B[0] = 1
    x=1 y=1    B[3*1 + 1] = B[4] = 3
    x=2 y=1    B[3*1 + 2] = B[5] = 4

Can we encode it another way?

We can encode column by column, but we lose some information again
    Which numbers correspond to which column
We can explicitly keep track of columns through a column length variable

    B = [1, 0.5, -1, 0, 3, -3, 0, 4, 6]
    columnLength = 3
    B[columnLength*x + y]

Let's convince ourselves

    x=0 y=0    B[3*0 + 0] = B[0] = 1
    x=1 y=1    B[3*1 + 1] = B[4] = 3
    x=2 y=1    B[3*2 + 1] = B[7] = 4

Lists of Lists

Recall that when we had a string in our list

    B = ["ABCD", 0, 1, 3]

We could utilize the bracket syntax multiple times

    print(B[0][1])    # would print B

Lists can store other Lists

    B = [[0, 1, 3], 4, 5, 6]

Another way to encode a Matrix

Let's take a look at our example matrix
What about this?

    B = [[1, 0, 0], [0.5, 3, 4], [-1, -3, 6]]

Why is this important?

We can now write code that more closely resembles mathematical notation
    i.e., we can use x and y to index into our matrix

    B = [[1, 0, 0], [0.5, 3, 4], [-1, -3, 6]]
    for x in range(3):
        for y in range(3):
            print(B[x][y])

…but first some more notation

We can use the "*" to create a multi element sequence
    6 * [0] results in a sequence of 6 0's: [0, 0, 0, 0, 0, 0]
    3 * [0, 0] results in a sequence of 6 0's: [0, 0, 0, 0, 0, 0]
    10 * [0, 1, 2] results in what?

What is going on under the hood?

Let's leverage some algebraic properties
    3 * [0, 0] is another way to write [0, 0] + [0, 0] + [0, 0]
    We know that "+" concatenates two sequences together

What about lists of lists?
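
One way to explore that question is to try "*" on a list that contains a list. The sketch below is only an illustration (the names grid and flat are made up here, not from the slides). It suggests that "*" repeats the same inner list rather than making fresh copies, so a change to one "row" shows up in every row.

```python
# Hypothetical example: build a 3x2 "matrix" of zeros with "*".
grid = 3 * [[0, 0]]
print(grid)      # [[0, 0], [0, 0], [0, 0]]

# All three entries refer to the SAME inner list ...
grid[0][0] = 7
print(grid)      # [[7, 0], [7, 0], [7, 0]]

# ... whereas "*" on a flat list of numbers behaves as expected.
flat = 6 * [0]
flat[0] = 7
print(flat)      # [7, 0, 0, 0, 0, 0]
```

Building each inner list separately avoids this sharing.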
We have another syntax for creating lists

    [X for i in range(y)]

This creates a list with y elements of X

Example: [0 for i in range(6)] ≡ [0]*6

    [0, 0, 0, 0, 0, 0]

Example: [[0, 0] for i in range(3)] prints the same as [[0, 0]]*3

    [[0, 0], [0, 0], [0, 0]]

    (The two are not fully interchangeable: the first builds three separate inner lists, while [[0, 0]]*3 repeats one inner list three times.)

What does this do: [2*[0] for i in range(3)]?

Let's put it all together

    m1 = [[1, 2, 3, 0], [4, 5, 6, 0], [7, 8, 9, 0]]
    m2 = [[2, 4, 6, 0], [1, 3, 5, 0], [0, -1, -2, 0]]
    m3 = [4*[0] for i in range(3)]
    for x in range(3):
        for y in range(4):
            m3[x][y] = m1[x][y] + m2[x][y]

Data structures

We have constructed our first data structure!
As the name implies, we have given structure to the data
    The data corresponds to the elements in the matrix
    The structure is a list of lists
    The structure allows us to utilize math-like notation

Homework

Read through the entire project description when it becomes available

Announcements: Project 3

Taking a look at cryptography

    Message -> Encrypted Message -> Message
        (Encryption)          (Decryption)

Shift Cipher

How many possible elements do we have?
    Answer: 95 printable characters
What happens if we shift something more than 95 positions?
    Answer: we need to loop around (shifting 99 is the same as shifting 4)
What happens if we shift by a negative shift?
    Answer: we can modify a negative shift into a positive shift

Simple Example

Consider if we had 4 options: "A" "B" "C" "D"
Assume encoding for "A" is 0, "B" is 1, "C" is 2, and "D" is 3
    Shifting "A" by 2 yields "C" (0+2)
    Shifting "B" by 2 yields "D" (1+2)
    Shifting "C" by 2 yields "A" (2+2)?
    Shifting "D" by 2 yields "B" (3+2)?
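
The two question marks can be checked directly. This is a minimal sketch, assuming the four options are stored in a list called letters (a made-up name, as is naive_shift). It adds the shift to the position with no wrap-around and reports when the result falls outside the list.

```python
letters = ["A", "B", "C", "D"]

def naive_shift(letter, shift):
    """Add shift to the letter's position, with no wrap-around (hypothetical helper)."""
    position = letters.index(letter) + shift
    if 0 <= position < len(letters):
        return letters[position]
    return None  # the position is outside the list

for letter in letters:
    print(letter, "->", naive_shift(letter, 2))
```

Positions 4 and 5 do not exist in a four-element list, so plain addition is not enough for "C" and "D".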
Modular Arithmetic

Modular arithmetic allows us to count "in circles"

    0 mod 4 = 0        4 mod 4 = 0
    1 mod 4 = 1        5 mod 4 = 1
    2 mod 4 = 2        6 mod 4 = 2
    3 mod 4 = 3        7 mod 4 = 3

Simple Example

Consider if we had 4 options: "A" "B" "C" "D"
Assume encoding for "A" is 0, "B" is 1, "C" is 2, and "D" is 3
    Shifting "A" by 2 yields "C"    (0+2) mod 4 = 2
    Shifting "B" by 2 yields "D"    (1+2) mod 4 = 3
    Shifting "C" by 2 yields "A"    (2+2) mod 4 = 0
    Shifting "D" by 2 yields "B"    (3+2) mod 4 = 1

Negative Shifts?

Let's think about negative shifts … conceptually we just "go the other way"
    "D" shifted by -1 is "C"    (3-1) mod 4 = 2
    "C" shifted by -1 is "B"    (2-1) mod 4 = 1
Observe that since we are effectively moving in a circle, a negative shift can be expressed as a positive shift
    "D" shifted by 3 is "C"    (3 + 3) mod 4 = 2
    "C" shifted by 3 is "B"    (2 + 3) mod 4 = 1

What about large negative shifts?

Observe that shifting -10 is equivalent to shifting -2
    Why? We shift 2 times around our "circle" of letters and then need to shift -2 additional letters
Notice that both shifting -10 and shifting -2 are equivalent to shifting 2
    -10 mod 4 = 2
    -2 mod 4 = 2
Notice that we can compute a positive shift from a negative shift by: shift modulo the total number of elements

Clicker Question: did we encode the same matrix in both programs?

    Program 1: [[1, 2], [3, 4]]
    Program 2: [[1, 3], [2, 4]]

A: yes
B: no
C: maybe

Let's explore why it depends

    Program 1: [[1, 2], [3, 4]]

        ( 1 2 )
        ( 3 4 )

    Program 2: [[1, 3], [2, 4]]

        ( 1 3 )
        ( 2 4 )

Both lists of lists can either be a row by row or a column by column encoding

Let us encode something else using lists!

We call this structure a tree

    [Figure: tree diagrams, each starting from a node labeled Root]

How might we encode such a structure?

What structures do we know of in Python?
    Strings, Ranges, Lists
We know that lists allow us to encode complex structures via sub lists
    We used a sub list to encode either a row or a column in the matrix
    We used an 'outer' list of rows or columns as a matrix
Can we use the same strategy?

How might we encode a simple tree?
    Root
        Leaf1
        Leaf2
        Leaf3

    Tree = ['Leaf1', 'Leaf2', 'Leaf3']

Trees can be more complex

    Root
        Leaf1
        Leaf2
        (sub tree)
            Leaf3
            Leaf4
            Leaf5

    Tree = ['Leaf1', 'Leaf2', ['Leaf3', 'Leaf4', 'Leaf5']]

Trees can be more complex

    Root
        (sub tree)
            Leaf0
            Leaf1
        Leaf2
        (sub tree)
            Leaf3
            Leaf4
            Leaf5

    Tree = [['Leaf0', 'Leaf1'], 'Leaf2', ['Leaf3', 'Leaf4', 'Leaf5']]

Trees can be more complex

    Root
        (sub tree)
            Leaf0
            Leaf1
        Leaf2
        (sub tree)
            Leaf3
            Leaf4
            (sub tree)
                Leaf5
                Leaf6

    Tree = [['Leaf0', 'Leaf1'], 'Leaf2', ['Leaf3', 'Leaf4', ['Leaf5', 'Leaf6']]]

What is the intuition?

Each sub list encodes the 'branches' of the tree
We can think of each sub list as a 'sub tree'
We can use indexes (the bracket notation []) to select out elements or 'sub trees'

How can we select out the leaves?

    Root
        Tree[0]: Leaf1
        Tree[1]: Leaf2
        Tree[2]: Leaf3

    Tree = ['Leaf1', 'Leaf2', 'Leaf3']

Indexes provide us a way to "traverse" the tree

    Root
        0: (sub tree)
            0: Leaf0
            1: Leaf1
        1: Leaf2
        2: (sub tree)
            0: Leaf3
            1: Leaf4
            2: (sub tree)
                0: Leaf5
                1: Leaf6

    Tree = [['Leaf0', 'Leaf1'], 'Leaf2', ['Leaf3', 'Leaf4', ['Leaf5', 'Leaf6']]]

    Tree[2]          selects the sub tree ['Leaf3', 'Leaf4', ['Leaf5', 'Leaf6']]
    Tree[2][2]       selects the sub tree ['Leaf5', 'Leaf6']
    Tree[2][2][1]    selects 'Leaf6'

CQ: How do we select 'Leaf4' from the Tree?
    Tree = [['Leaf0', 'Leaf1'], 'Leaf2', ['Leaf3', 'Leaf4', ['Leaf5', 'Leaf6']]]

A: Tree[2][1]
B: Tree[3][2]
C: Tree[1][2]

Operations on Trees

Trees, since they are encoded via lists, support the same operations lists support
    We can "+" two trees
    We can embed two trees within a list
        This creates a larger tree with each of the smaller trees as sub trees

Example:

    Tree1 = ['Leaf1', 'Leaf2']
    Tree2 = ['Leaf3', 'Leaf4']
    Tree = [Tree1, Tree2]

"+" two trees

    Tree1 = [['Leaf0', 'Leaf1']]
    Tree2 = ['Leaf2']
    Tree3 = [['Leaf3', 'Leaf4', ['Leaf5', 'Leaf6']]]

    Tree4 = Tree1 + Tree2
    # Tree4 is [['Leaf0', 'Leaf1'], 'Leaf2']

    Tree = Tree4 + Tree3
    # Tree is [['Leaf0', 'Leaf1'], 'Leaf2', ['Leaf3', 'Leaf4', ['Leaf5', 'Leaf6']]]

Why are trees important?

They are a fundamental structure in computer science
They enable us to search very quickly
We will revisit trees later in the course

What have we covered so far:

Given a tree diagram we can write a list of lists
Given a complex list we can select elements

Homework

Work on Project 3
Read Sections 11.1, 11.2, 11.3 (15 pages)
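
As a recap of the two skills listed under "What have we covered so far", here is a minimal sketch. The helper name select and the idea of passing the indexes as a list called path are assumptions made for illustration; they do not come from the lecture. The helper applies one bracket per index, which is the same as writing Tree[2][2][1] by hand.

```python
# Build the tree from the "+" example.
Tree4 = [["Leaf0", "Leaf1"], "Leaf2"]
Tree3 = [["Leaf3", "Leaf4", ["Leaf5", "Leaf6"]]]
Tree = Tree4 + Tree3

def select(tree, path):
    """Follow a list of indexes, one bracket at a time (hypothetical helper)."""
    node = tree
    for index in path:
        node = node[index]
    return node

print(select(Tree, [2]))        # the sub tree holding Leaf3 through Leaf6
print(select(Tree, [2, 2]))     # the sub tree holding Leaf5 and Leaf6
print(select(Tree, [2, 2, 1]))  # Leaf6
```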