intro to ml

Apologies and Announcements

The website will be up this weekend. Apologies for the delay in initiating the discussion.
Please finalize your assignment groups as soon as possible.
Groups can have no more than 6 members; at least 4 members are recommended.
Groups cannot contain unregistered students.
The course list has been finalized and will be put up to help identify group members.
Course members who are unable to join a group will be clubbed together.

“You must authenticate your sensors so that tampering can be detected!”
“Couldn’t you have told me earlier?!”

Authentication by Secret Questions

Bank: Give me your A/C number and answer the following questions.
1. What is your date of birth?
2. What is your pet’s name?
3. How many marks did you get in the 10th standard exams?
4. How many cars do you own?
5. …

User: SBI31415926535
1. 05th August 2000
2. Mr. Bud Bud
3. Err … couldn’t hear you clearly
4. None, so give me that loan already!
5. …

Authentication by Secret Questions Using PUFs

Server: Give me your device ID and answer the following questions.
1. 10111100
2. 00110010
3. 10001110
4. 00010100
5. …

Device: TS271828182845
1. 1
2. 0
3. 1
4. 0
5. …

How do we ensure that these answers are unique and unpredictable?

Physically Unclonable Functions

[Figure: two delays, 0.50 ms and 0.55 ms.]
These tiny differences are difficult to predict or clone.
“Then these could act as the fingerprints for the devices!”

A simple Multiplexer PUF

[Figure: a multiplexer with a “select” bit and inputs labelled 0 and 1, with delays of p ms and q ms.]
Multiplexers are basically switching circuits.
“Correct. However, the devices are consistent, i.e., their delays do not change (too much) over time.”
It is difficult to deliberately create another mux that exhibits the same delays.

Arbiter PUFs

If the top signal reaches the finish line first, the “answer” to the question is 0; if the bottom signal reaches first, the “answer” is 1.
Question: 1011 (select bits 1, 0, 1, 1). Answer: 1?
Arbiter PUFs (continued)

Question: 0110 (select bits 0, 1, 1, 0). Answer: 0?

Some FAQs

Does it matter whether the “red” signal reaches first or the “blue”?
No, the color does not matter. The colors were added just for explanation.

Why go to all this fuss of having multiple multiplexers?
It was expected that this would make it more difficult to predict the answers. It also increases the number of possible questions.

Is it compulsory to have only 4 multiplexers?
Absolutely not. It depends on how long your “questions” are. It is common to have 64 multiplexers.
“Actually … that would make the total number of challenges $2^{64}$, more than 18 quintillion!!”
By the way, people usually call the questions “challenges” and the answers “responses”.
“Good … even if an attacker knows the responses to a few challenges, there is no way to guess the other answers. Right? Right? Hello! Melbo!!”

A Twist in the Tale

An attacker can see the responses to a few challenges and use ML to predict the responses to all other challenges.
It does not matter whether 32-bit or 64-bit challenges are used.
All muxes are different, so $p_1 \neq p_2 \neq \cdots$ and $q_1 \neq q_2 \neq \cdots$.
[Figure: a chain of 64 muxes with select bits $c_0, c_1, \ldots, c_{63}$, delays $p_i, q_i$, and exit times $t_i^u, t_i^l$ at each stage.]
$t_i^u$ is the (unknown) time at which the upper signal leaves the $i$-th mux.
$t_i^l$ is the time at which the lower signal leaves the $i$-th mux.
A Twist in the Tale (continued)

Observe that the answer is 0 if $t_{63}^u < t_{63}^l$ and 1 otherwise.
Also note that $t_1^u$ and $t_1^l$ depend on $t_0^u, t_0^l, p_1, q_1, r_1, s_1$ and $c_1$.
$c_1$ dictates which previous delay, $t_0^u$ or $t_0^l$, gets carried forward in which branch, and $p_1, q_1, r_1, s_1$ give the delays introduced by mux 1 itself.

$$t_1^u = (1 - c_1)\cdot(t_0^u + p_1) + c_1\cdot(t_0^l + s_1)$$
$$t_1^l = (1 - c_1)\cdot(t_0^l + q_1) + c_1\cdot(t_0^u + r_1)$$

A little bit of Math

Let us use the shorthand $\Delta_i = t_i^u - t_i^l$ to denote the lag.
Recall: all that matters is whether the top signal reaches first or not.
Thus, all that matters is whether $\Delta_{63} < 0$ or not.

$$\begin{aligned}
\Delta_1 &= (1 - c_1)\cdot(t_0^u + p_1 - t_0^l - q_1) + c_1\cdot(t_0^l + s_1 - t_0^u - r_1) \\
&= (1 - c_1)\cdot(\Delta_0 + p_1 - q_1) + c_1\cdot(-\Delta_0 + s_1 - r_1) \\
&= (1 - 2c_1)\cdot\Delta_0 + (q_1 - p_1 + s_1 - r_1)\cdot c_1 + (p_1 - q_1)
\end{aligned}$$

To make the notation simpler, let $d_i \stackrel{\text{def}}{=} 1 - 2c_i$. That’s it: $d_i$ just creates bits that take values in $\{-1, +1\}$ instead of $\{0, 1\}$. With this,

$$\Delta_1 = \Delta_0\cdot d_1 + \alpha_1\cdot d_1 + \beta_1$$
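The step from the $c_1$ form to the $d_1$ form is not spelled out on the slide. One way to fill it in (a sketch, not necessarily how it was presented in class) is to substitute $c_1 = (1 - d_1)/2$ into the last line of the expansion of $\Delta_1$ and collect terms:

```latex
\begin{aligned}
\Delta_1 &= d_1\cdot\Delta_0 + (q_1 - p_1 + s_1 - r_1)\cdot\frac{1 - d_1}{2} + (p_1 - q_1) \\
&= d_1\cdot\Delta_0 + \frac{p_1 - q_1 + r_1 - s_1}{2}\cdot d_1 + \frac{p_1 - q_1 - r_1 + s_1}{2}
\end{aligned}
```

The coefficient of $d_1$ and the constant term are the quantities the slides call $\alpha_1$ and $\beta_1$.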
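Here $\alpha_1 = (p_1 - q_1 + r_1 - s_1)/2$ and $\beta_1 = (p_1 - q_1 - r_1 + s_1)/2$.

A little bit of Math (continued)

Note that a similar relation holds for any stage:
$$\Delta_i = d_i\cdot\Delta_{i-1} + \alpha_i\cdot d_i + \beta_i$$
where $\alpha_i = (p_i - q_i + r_i - s_i)/2$ and $\beta_i = (p_i - q_i - r_i + s_i)/2$.
We can safely take $\Delta_{-1} = 0$ (absorb the initial delays into $p_0, q_0, r_0, s_0$).
We can keep going recursively:
$$\Delta_0 = \alpha_0\cdot d_0 + \beta_0 \quad (\text{since } \Delta_{-1} = 0)$$
$$\Delta_1 = \Delta_0\cdot d_1 + \alpha_1\cdot d_1 + \beta_1$$
Now plug in the value of $\Delta_0$ to get
$$\Delta_1 = \alpha_0\cdot d_1\cdot d_0 + (\alpha_1 + \beta_0)\cdot d_1 + \beta_1$$
$$\Delta_2 = \alpha_0\cdot d_2\cdot d_1\cdot d_0 + (\alpha_1 + \beta_0)\cdot d_2\cdot d_1 + (\alpha_2 + \beta_1)\cdot d_2 + \beta_2$$
We can begin to see a pattern here.

Linear Models

We have
$$\Delta_{63} = w_0\cdot x_0 + w_1\cdot x_1 + \cdots + w_{63}\cdot x_{63} + \beta_{63} = \mathbf{w}^\top\mathbf{x} + b$$
where $x_i = d_i\cdot d_{i+1}\cdots d_{63}$, $w_0 = \alpha_0$, $w_i = \alpha_i + \beta_{i-1}$ for $i > 0$, and $b = \beta_{63}$.
If $\Delta_{63} < 0$, the upper signal wins and the answer is 0.
If $\Delta_{63} > 0$, the lower signal wins and the answer is 1.
Thus, the answer is simply $\frac{\operatorname{sign}(\mathbf{w}^\top\mathbf{x} + b) + 1}{2}$.
This is nothing but a linear classifier!
This means that if someone can find the parameters $\mathbf{w}, b$, they would be able to predict the response to any challenge!!
“Exactly, this is why people stopped using arbiter PUFs for authentication after this was revealed.”

Linear/hyperplane Classifiers

The model is a single vector $\mathbf{w}$ of dimension $d$ (features are also $d$-dimensional) and a scalar term $b$ (called the bias).
Predict on a test point $\mathbf{x}$ by checking whether $\mathbf{w}^\top\mathbf{x} + b > 0$.
Decision boundary: the hyperplane where $\mathbf{w}^\top\mathbf{x} + b = 0$.
The vector $\mathbf{w}$ is called the normal or perpendicular vector of the hyperplane. Why?
Consider any two vectors $\mathbf{x}, \mathbf{y}$ on the hyperplane, i.e., $\mathbf{w}^\top\mathbf{x} + b = 0 = \mathbf{w}^\top\mathbf{y} + b$. This means $\mathbf{w}^\top(\mathbf{x} - \mathbf{y}) = 0$.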
Note that the vector $\mathbf{x} - \mathbf{y}$ is parallel to the hyperplane, and $\mathbf{w}$ is perpendicular to all such vectors.
The bias term $b$, if changed, shifts the plane. It can also be thought of as a threshold: how large does $\mathbf{w}^\top\mathbf{x}$ have to be for the decision to be 1?
[Figure: a hyperplane with its normal vector $\mathbf{w}$.]

XOR PUF

XOR: given a bunch of 0/1 bits, the output is 1 if an odd number of bits are 1; if an even number of bits (including none) are 1, the output is 0.
XOR is basically addition modulo 2: $(b_1 + \cdots + b_K) \bmod 2$.

Cracking the XOR PUF

It turns out that the XOR PUF can also be cracked using a linear model, although one of larger dimensionality.
Key insight: if we have a bunch of +1/−1 values, their product is +1 if and only if an even number of them are −1; otherwise the product is −1.
We can crack the individual PUFs using linear models, i.e., for the $i$-th PUF, $\frac{1 + \operatorname{sign}(\mathbf{w}_i^\top\mathbf{x})}{2}$.
Remember: a sign value of +1 corresponds to bit 1 and −1 corresponds to bit 0.
Note: $\prod_i \operatorname{sign}(\mathbf{w}_i^\top\mathbf{x})$ is +1 if an even number of the sign values are −1.
However, XOR is concerned with the parity of the +1 bits.
Solution: flip the signs!

Cracking the XOR PUF (continued)

The product $-\prod_i\left(-\operatorname{sign}(\mathbf{w}_i^\top\mathbf{x})\right) = (-1)^{K+1}\prod_i \operatorname{sign}(\mathbf{w}_i^\top\mathbf{x})$ is −1 if an even number of the sign values are +1; otherwise the product is +1.
The extra −1 is there since XOR is 0 if there is an even number of 1s.
Here, $K$ is the number of PUFs.
Thus, the output of $\frac{1 + (-1)^{K+1}\prod_i \operatorname{sign}(\mathbf{w}_i^\top\mathbf{x})}{2}$ is 0 if an even number of the sign values are +1; otherwise the output is 1.
This is exactly what we wanted!
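Both the linear model for a single arbiter PUF and the sign-flipping formula for the XOR can be checked numerically. The sketch below is only an illustration under stated assumptions: the delays are drawn at random (hypothetical values, not measured ones), the sizes `N` and `K` are arbitrary, the routing in `race` follows the timing equations from the slides (select bit 0 keeps the signals on their branches with delays p and q, select bit 1 swaps them with delays s and r), and the bias $b$ is kept explicit rather than dropped as in the XOR slides.

```python
import numpy as np

rng = np.random.default_rng(0)
N, K = 8, 3  # stages per PUF and number of PUFs (illustrative sizes)

def race(c, p, q, r, s):
    """Simulate the two signals stage by stage and return the response bit."""
    tu = tl = 0.0
    for ci, pi, qi, ri, si in zip(c, p, q, r, s):
        if ci == 0:   # select bit 0: each signal stays on its own branch
            tu, tl = tu + pi, tl + qi
        else:         # select bit 1: the signals swap branches
            tu, tl = tl + si, tu + ri
    return 0 if tu < tl else 1  # top signal first -> 0, else 1

def linear_model(p, q, r, s):
    """Turn per-stage delays into (w, b) using the alpha/beta formulas."""
    alpha = (p - q + r - s) / 2
    beta = (p - q - r + s) / 2
    w = alpha.copy()
    w[1:] += beta[:-1]   # w_i = alpha_i + beta_{i-1} for i > 0
    return w, beta[-1]   # b is the beta of the last stage

def features(c):
    """x_i = d_i * d_{i+1} * ... * d_{N-1}, with d_i = 1 - 2 c_i."""
    d = 1 - 2 * c
    return np.cumprod(d[::-1])[::-1]

# Hypothetical delays: K PUFs, each with four delay vectors (p, q, r, s).
delays = [tuple(rng.uniform(0.5, 1.5, size=N) for _ in range(4)) for _ in range(K)]
models = [linear_model(*dl) for dl in delays]

def xor_by_race(c):
    """XOR of the K response bits obtained by simulating each race."""
    return sum(race(c, *dl) for dl in delays) % 2

def xor_by_formula(c):
    """(1 + (-1)^(K+1) * prod_i sign(w_i^T x + b_i)) / 2."""
    x = features(c)
    signs = [np.sign(w @ x + b) for w, b in models]
    return int((1 + (-1) ** (K + 1) * np.prod(signs)) / 2)

challenges = rng.integers(0, 2, size=(1000, N))
agree = all(xor_by_race(c) == xor_by_formula(c) for c in challenges)
print("formula matches simulated XOR PUF:", agree)
```

Agreement on random challenges is a sanity check rather than a proof, and it says nothing yet about how an attacker would learn the weights from observed responses.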
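All we need to do is find a way to compute $\prod_i \operatorname{sign}(\mathbf{w}_i^\top\mathbf{x})$.
Although it does not seem so right away, there is a linear model hidden here.
Observation: $\prod_i \operatorname{sign}(\mathbf{w}_i^\top\mathbf{x}) = \operatorname{sign}\left(\prod_i \mathbf{w}_i^\top\mathbf{x}\right)$.
So we need to find a way to simplify $\prod_i \mathbf{w}_i^\top\mathbf{x}$.

Cracking the XOR PUF (continued)

Let’s take a toy example in 2 dimensions with $(\mathbf{w}_1^\top\mathbf{x})\cdot(\mathbf{w}_2^\top\mathbf{x})$, where $\mathbf{w}_1 = (a, b)$, $\mathbf{w}_2 = (p, q)$, $\mathbf{x} = (x, y) \in \mathbb{R}^2$.
$$\begin{aligned}
(\mathbf{w}_1^\top\mathbf{x})\cdot(\mathbf{w}_2^\top\mathbf{x}) &= (ax + by)\cdot(px + qy) \\
&= ap\cdot x^2 + (aq + bp)\cdot xy + bq\cdot y^2 \\
&= W^\top X
\end{aligned}$$
where $W = (ap,\ aq + bp,\ bq)$ and $X = (x^2,\ xy,\ y^2) \in \mathbb{R}^3$.
Thus, we can just learn a linear model in 3D instead of 2D.
Exercise: extend this intuition to more than 2 classifiers and higher dimensions. Try to do optimizations to reduce the dimensionality of $X$.
Note: we are not assured that the linear model we learn will be of this form, i.e., that its weights equal $(ap,\ aq + bp,\ bq)$ for some $a, b, p, q$.
However, we are assured that a linear model with 0 error does exist.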
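The toy identity can be checked numerically, and one unoptimized way to extend it to $K$ models in $d$ dimensions is to use Kronecker products as the lifted feature map. This is only a sketch of one possible approach to the exercise, not the intended solution: the helper `kron_all` and all numeric values are hypothetical, and the lifted vectors have $d^K$ entries, so no attempt is made at the dimensionality reduction the exercise asks for.

```python
import numpy as np
from functools import reduce

rng = np.random.default_rng(1)

def kron_all(vectors):
    """Kronecker product of a list of 1-D vectors (hypothetical helper)."""
    return reduce(np.kron, vectors)

# Toy example from the slide: two models in 2 dimensions.
a, b, p, q = rng.normal(size=4)
x, y = rng.normal(size=2)
W3 = np.array([a * p, a * q + b * p, b * q])   # lifted weights
X3 = np.array([x * x, x * y, y * y])           # lifted features
lhs = (a * x + b * y) * (p * x + q * y)
print("toy identity holds:", np.isclose(lhs, W3 @ X3))

# General case: K models in d dimensions, lifted to d**K dimensions.
d, K = 5, 3
ws = [rng.normal(size=d) for _ in range(K)]
xv = rng.choice([-1.0, 1.0], size=d)           # a +1/-1 feature vector
W = kron_all(ws)                               # one weight per product of K coordinates
X = kron_all([xv] * K)                         # the matching products of features
prod = np.prod([w @ xv for w in ws])
print("general identity holds:", np.isclose(prod, W @ X))
```

Because the product of the $K$ inner products equals $W^\top X$, its sign is the sign of a single linear model in the lifted space, which is the quantity the XOR formula needs.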
