Exam 3 Review

Identification Numbers
Information
Fall 2014
Mathematics in Management Science
Identification Numbers
Modern identification numbers serve at
least two functions:
The number should unambiguously
identify the person or thing with which it
is associated (codes).
The number should have a “self-checking” aspect (check digits).
Check Digits
A check digit is an extra digit appended to
a number for purposes of detecting
errors when copying or transmitting the
number.
The check digit is calculated from the rest
of the number and transmitted with the
number.
When an error occurs, a recalculation
of the check digit usually won’t match.
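As one concrete instance of a weighted scheme, here is a minimal sketch of the UPC-A check digit: the first 11 digits get weights 3, 1, 3, 1, ... and the check digit brings the weighted sum up to a multiple of 10. The function names are made up for illustration.

```python
def upc_check_digit(first11):
    """Return the check digit for the first 11 digits of a UPC-A number."""
    digits = [int(ch) for ch in first11]
    if len(digits) != 11:
        raise ValueError("expected exactly 11 digits")
    # Positions 1, 3, 5, ... (index 0, 2, 4, ...) get weight 3; the others get weight 1.
    total = sum(3 * d if i % 2 == 0 else d for i, d in enumerate(digits))
    # The check digit makes the weighted sum a multiple of 10.
    return (10 - total % 10) % 10


def upc_is_valid(code12):
    """Recalculate the check digit and compare it with the transmitted one."""
    return upc_check_digit(code12[:11]) == int(code12[11])
```

Recalculating the digit for a mistyped number, e.g. `upc_is_valid("036000291462")`, gives a mismatch and so flags the error.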
Check Digit Examples
USPS money orders; traveler’s checks
Credit cards; car rentals
UPC – Universal Product Code
ISBN – International Standard Book Number
BIN – Bank Identification Number
VIN – Vehicle Identification Number
Check Digits
Division Schemes
Weighted Schemes
Codabar Scheme
Direct vs Indirect Methods
Detecting/Correcting Errors
ZIP Codes
Bar Codes
UPC Bar Codes
Airline Bar Codes
ZIP Code Bar Codes
Intelligent Mail Bar Codes
Bits, Bytes, & Binary Strings
A binary number is one written in base
2, so the digits are all either 0 or 1.
A single binary digit (a 0 or a 1) is a bit.
A byte is a group of binary digits or bits
(usually eight) operated on as a unit;
bytes are considered as a unit of
memory size.
A binary string (or word) is a list of bits.
Binary Codes
A system for coding data made up of
two states (or symbols): “0” or “1”.
Postnet code, UPC code, Morse code,
Braille, etc.
DVDs, Blu-ray, faxes, high-definition TVs,
cell phones, all use binary codes with
data represented as strings of 0’s and
1’s rather than usual digits 0 through 9
and letters A through Z.
ASCII Code
American Standard Code for Information Interchange
Parity
A bit string has odd parity if the number
of 1s in the string is odd. A bit string
has even parity if the number of 1s in
the string is even.
01100, 000, 11001001 – even parity.
1000011, 1, 00010 – odd parity.
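A minimal sketch of the parity test, assuming the bit string is given as text made of the characters 0 and 1 (the function name is illustrative):

```python
def parity(bits):
    """Return "odd" or "even" according to the number of 1s in the bit string."""
    return "odd" if bits.count("1") % 2 == 1 else "even"
```

Applied to the strings listed above, `parity("11001001")` gives `"even"` and `parity("1000011")` gives `"odd"`.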
Reliable Data Transmission
How to decode?
Binary Linear Code
Strings (words) of 0’s and 1’s with extra
digits for error correction used to send full-text messages.
Words composed of all possible messages
of a given length plus parity-check sum
digits appended to messages; resulting
strings are the code words.
A binary linear code is a set of binary digit
strings where each string has two parts—the
message part and the check-digit part.
Binary Codewords
A binary codeword is a string of binary digits:
e.g. 00110111 is an 8-bit codeword
A binary code is a collection of codewords all with
the same length.
A binary code C of length n is a collection of
binary codewords all of length n and it is called
linear if it is a subspace of {0, 1}^n.
In other words, 0…0 is in C and the sum of two
codewords is also a codeword.
Hamming (7,4) Code
Easy to write formulas for parity bits.
Given message a1a2a3a4 calculate
parity check-sum digits c1c2c3 via:
c1 = (a1 + a2 + a3) mod 2
c2 = (a1 + a3 + a4) mod 2
c3 = (a2 + a3 + a4) mod 2
These give the same result as using the circles!
The right-hand sides are the parity check-sums.
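The three check-sums translate directly into code. This is a small sketch, assuming the message is a 4-character string of 0s and 1s; the function name is made up for illustration.

```python
def hamming74_encode(message):
    """Append the check digits c1 c2 c3 to a 4-bit message a1 a2 a3 a4."""
    a1, a2, a3, a4 = (int(bit) for bit in message)
    c1 = (a1 + a2 + a3) % 2
    c2 = (a1 + a3 + a4) % 2
    c3 = (a2 + a3 + a4) % 2
    return message + f"{c1}{c2}{c3}"
```

For example, the message 1000 has c1 = 1, c2 = 1, c3 = 0, so its code word is 1000110.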
Detecting & Correcting Errors
Valid code words must satisfy parity
check-sums; if not, have an error.
But if there are several errors, a code
word could get transformed into some
other code word.
How many 1-bit errors does it take to
change a legal code word into a
different legal code word?
Hamming Distance
The Hamming distance between two
binary strings is the number of bits in
which the two strings differ.
dist btwn 10 and 01 is 2
dist btwn 10001 and 11001 is 1
dist btwn 00000 and 01101 is 3
dist btwn pixd words is 3
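A minimal sketch of the distance computation, assuming both strings have the same length (the function name is illustrative):

```python
def hamming_distance(u, v):
    """Count the positions in which two equal-length bit strings differ."""
    if len(u) != len(v):
        raise ValueError("strings must have the same length")
    return sum(x != y for x, y in zip(u, v))
```

It reproduces the examples above: `hamming_distance("00000", "01101")` is 3.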
Weight of a Binary Code
Suppose the weight of some binary
code is t; so, it takes at least t 1-bit
changes to convert any code word into
another.
Therefore, we can detect up to t-1
single bit errors.
The Hamming (7,4) code has weight
t=3. Thus using it we can detect 1 or 2
single bit errors.
Weight of a Binary Code
Suppose the weight of some binary
code is t.
Then, can detect up to t-1 single bit
errors,
or, we can correct up to
(t-1)/2 errors (if t is odd),
(t-2)/2 errors (if t is even).
Cannot do both.
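The weight of the Hamming (7,4) code can be checked by brute force. The sketch below assumes the three parity check-sums from the Hamming (7,4) slide, builds all 16 code words, and takes the weight to be the smallest number of 1s in a nonzero code word. Variable and function names are illustrative.

```python
from itertools import product


def hamming74_encode(message):
    """Append the check digits c1 c2 c3 to a 4-bit message a1 a2 a3 a4."""
    a1, a2, a3, a4 = (int(bit) for bit in message)
    c1 = (a1 + a2 + a3) % 2
    c2 = (a1 + a3 + a4) % 2
    c3 = (a2 + a3 + a4) % 2
    return message + f"{c1}{c2}{c3}"


# All 16 code words: one for every possible 4-bit message.
codewords = [hamming74_encode("".join(bits)) for bits in product("01", repeat=4)]

# Weight of the code: fewest 1s in any nonzero code word.
t = min(word.count("1") for word in codewords if "1" in word)

detectable = t - 1          # errors that can be detected
correctable = (t - 1) // 2  # integer division gives (t-1)/2 for odd t, (t-2)/2 for even t
```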
Nearest Neighbor Decoding
Suppose parity-check sums detect an error.
Compute distances between received
word and all codewords.
The codeword that differs in fewest bits
is used in place of received word.
Thus get automatic error correction by
choosing “closest” permissible answer.
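Nearest neighbor decoding can be sketched as a search over the whole code. The example assumes the Hamming (7,4) check-sums from the earlier slide; all names are made up for illustration, and a real decoder would use something faster than comparing against every code word.

```python
from itertools import product


def hamming74_encode(message):
    """Append the check digits c1 c2 c3 to a 4-bit message a1 a2 a3 a4."""
    a1, a2, a3, a4 = (int(bit) for bit in message)
    c1 = (a1 + a2 + a3) % 2
    c2 = (a1 + a3 + a4) % 2
    c3 = (a2 + a3 + a4) % 2
    return message + f"{c1}{c2}{c3}"


def hamming_distance(u, v):
    """Count the positions in which two equal-length bit strings differ."""
    return sum(x != y for x, y in zip(u, v))


CODEWORDS = [hamming74_encode("".join(bits)) for bits in product("01", repeat=4)]


def nearest_codeword(received):
    """Return the code word that differs from the received word in the fewest bits."""
    return min(CODEWORDS, key=lambda word: hamming_distance(word, received))
```

If 1000110 is sent and arrives as 1010110 (third bit flipped), `nearest_codeword("1010110")` recovers 1000110, since every other code word is at distance 2 or more from the received word.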
Types of Codes
Error Detection/Correction Codes
for accuracy of data
Data Compression Codes
for efficiency
Cryptography
for security
Data Compression
Here we want to use less space to express
(approximately) the same info.
Data compression is a process of
encoding data so that the most
frequently occurring data are
represented by the fewest symbols.
Compression Algorithms
Can be lossless – meaning that original
data can be reconstructed exactly – or
lossy – meaning only get approximate
reconstruction of the data.
Examples
ZIP and GIF are lossless
JPEG and MPEG are lossy
Run-Length Encoding (RLE)
Simple form of data compression
(introduced very early, but still in use).
Only useful where there are long runs
of same data (e.g., black and white
images).
Repeated symbols (runs) are replaced
by a single symbol and a count.
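A minimal sketch of run-length encoding on a string, with each run stored as a (symbol, count) pair. The pair representation and function names are assumptions for illustration; actual file formats pack runs differently.

```python
from itertools import groupby


def rle_encode(data):
    """Replace each run of repeated symbols by a (symbol, count) pair."""
    return [(symbol, len(list(run))) for symbol, run in groupby(data)]


def rle_decode(pairs):
    """Rebuild the original string from (symbol, count) pairs."""
    return "".join(symbol * count for symbol, count in pairs)
```

For a row of pixels such as `"WWWWBBBW"`, the encoding is `[("W", 4), ("B", 3), ("W", 1)]`, and decoding returns the original row exactly (lossless).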
Huffman Encoding
Code created using so-called code tree
by arranging chars from top to bottom
according to increasing probabilities.
Uses code tree to both encode and
decode.
Must know:
How to create the code tree.
How to use code tree to encode/decode.
Using Huffman Tree: Assigning Labels
The label that
gets assigned to
a letter is the
sequence of
binary digits
along the path
connecting the
top to the desired
letter.
Creating a Huffman Code Tree
Constructed from a frequency table.
Freq table shows number of times (as a
fraction of total) that each char occurs
in document.
Freq table specific to the document
being compressed, so every doc has its
own code tree.
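The sketch below builds a code tree with the usual Huffman procedure (repeatedly merge the two least likely entries), then reads each label off the path from the top. The frequency table used in the usage note is hypothetical, and the choice of 0 for the first branch and 1 for the second is an assumption; the layout drawn in class may label branches differently.

```python
import heapq
from itertools import count


def huffman_code(freqs):
    """Build a Huffman code (char -> bit string) from a frequency table."""
    tiebreak = count()  # keeps heap entries comparable when probabilities tie
    heap = [(p, next(tiebreak), ch) for ch, p in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        # Merge the two least likely entries into one subtree.
        p1, _, t1 = heapq.heappop(heap)
        p2, _, t2 = heapq.heappop(heap)
        heapq.heappush(heap, (p1 + p2, next(tiebreak), (t1, t2)))
    _, _, tree = heap[0]

    codes = {}

    def walk(node, label):
        if isinstance(node, tuple):      # internal node: branch 0 and 1
            walk(node[0], label + "0")
            walk(node[1], label + "1")
        else:                            # leaf: label is the path from the top
            codes[node] = label or "0"

    walk(tree, "")
    return codes


def huffman_encode(text, codes):
    return "".join(codes[ch] for ch in text)


def huffman_decode(bits, codes):
    """Read bits left to right, emitting a char whenever a full label is seen."""
    inverse = {label: ch for ch, label in codes.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:
            out.append(inverse[current])
            current = ""
    return "".join(out)
```

With the hypothetical table `{"A": 0.5, "B": 0.25, "C": 0.125, "D": 0.125}`, the most frequent letter A gets a 1-bit label, B gets 2 bits, and C and D get 3 bits each.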
Cryptography
The study of methods to make, and
break, secret codes.
Process of coding information to
prevent unauthorized use is called
encryption.
Encryption used for thousands of
years.
Caesar Cipher or Shift Cipher
Identify letters with numbers mod 26:
A → 0, B → 1, C → 2, etc.
Each char (A–Z) is “shifted” by a fixed
amount, d, known to both parties.
To encrypt: shift d letters to “right”
<letter> → (<letter> + d) mod 26.
To decrypt: shift d letters to the “left”
<letter> → (<letter> − d) mod 26
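A minimal sketch of the shift cipher, assuming the message contains only the uppercase letters A to Z (the function name is illustrative). Decrypting is the same operation with shift −d.

```python
def caesar(text, d):
    """Shift every letter A-Z by d positions (mod 26); use -d to decrypt."""
    # ord("A") is 65, so ord(ch) - 65 maps A -> 0, B -> 1, ..., Z -> 25.
    return "".join(chr((ord(ch) - 65 + d) % 26 + 65) for ch in text)
```

With d = 3, `caesar("HELLO", 3)` gives `"KHOOR"`, and `caesar("KHOOR", -3)` shifts back to `"HELLO"`.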
Decimation Cipher
Caesar cipher works by rearranging letters
in a simple way: add fixed number to
each letter and use mod arithmetic.
Decimation cipher permutes letters in a
more complicated way: multiply each letter
by fixed number and use mod arithmetic.
Again, identify the letters with Z26, so
(A = 0, B = 1, … , Z = 25).
Linear Cipher
Let n, m, d be as before:
d = shift; (n · m) ≡ 1 mod 26
Example: n = 3 and d = 5, get m = 9.
Encrypt: x → (n · x + d ) mod 26
Decrypt: y → (m · (y − d )) mod 26
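A minimal sketch of the linear cipher, assuming uppercase A to Z only. The inverse m is found by trying all 26 candidates; function names are made up for illustration. Setting d = 0 leaves only the multiplication step.

```python
def inverse_mod26(n):
    """Return m with (n * m) mod 26 == 1; raises StopIteration if none exists."""
    return next(m for m in range(26) if (n * m) % 26 == 1)


def linear_encrypt(text, n, d):
    """Encrypt: x -> (n * x + d) mod 26."""
    return "".join(chr((n * (ord(ch) - 65) + d) % 26 + 65) for ch in text)


def linear_decrypt(text, n, d):
    """Decrypt: y -> (m * (y - d)) mod 26, where m is the inverse of n."""
    m = inverse_mod26(n)
    return "".join(chr((m * (ord(ch) - 65 - d)) % 26 + 65) for ch in text)
```

With n = 3 and d = 5, `inverse_mod26(3)` is 9, `linear_encrypt("HELLO", 3, 5)` gives `"ARMMV"`, and decrypting returns `"HELLO"`.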
Vigenère Cipher
Starts with key—word, phrase, or random letters.
Letters in key indicate amount to shift the
corresponding letters in the message (as in Caesar
cipher).
Line up letters of key with letters of message; repeat
key as necessary.
“Add” message and key letter by letter (mod 26).
To decrypt, repeat the process but subtract the key from the
encrypted message.
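A minimal sketch of the Vigenère steps, assuming message and key use only uppercase A to Z. The `sign` parameter is an illustrative device: +1 adds the key (encrypt), −1 subtracts it (decrypt).

```python
from itertools import cycle


def vigenere(text, key, sign=1):
    """Add (sign=1) or subtract (sign=-1) the repeated key, letter by letter, mod 26."""
    shifts = cycle(ord(k) - 65 for k in key)  # repeat the key as necessary
    return "".join(
        chr((ord(ch) - 65 + sign * shift) % 26 + 65)
        for ch, shift in zip(text, shifts)
    )
```

With the key LEMON, `vigenere("ATTACKATDAWN", "LEMON")` gives `"LXFOPVEFRNHR"`, and calling it again on that output with `sign=-1` recovers the message.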
Online Data Security
One method uses a bit-by-bit Vigenère
cipher:
Data is represented as a binary number.
Key is a (long, randomly generated)
binary number.
Data is encrypted by adding the key
bit-by-bit (mod 2).
Data is decrypted by adding the key
a second time.
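A minimal sketch of bit-by-bit addition mod 2, assuming data and key are strings of 0s and 1s and the key is at least as long as the data (the function name is illustrative). Since 1 + 1 = 0 mod 2, adding the key twice cancels it.

```python
def add_key_mod2(bits, key):
    """Add the key to the data bit by bit (mod 2); the same call decrypts."""
    if len(key) < len(bits):
        raise ValueError("key must be at least as long as the data")
    return "".join(str((int(b) + int(k)) % 2) for b, k in zip(bits, key))
```

For data 1011 and key 0110, the encrypted string is 1101; adding the key again returns 1011.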
Public-Key Cryptography
Algorithms are defined by keys: if you
know the key, you know the algorithm.
One key published (the public key);
other key kept secret (the private key).
This means one algorithm is public and
the other is secret/private.
Using Public-Key Cryptography
To send a message, encrypt it with the
recipient’s public key.
To read a received message, decrypt it
with your private key.
RSA Algorithm
Two keys: public key and private key.
Either key can encrypt a message: if
one key encrypts a message, the other
key will decrypt it.
Knowing the public key does not, in practice,
allow finding the private key!
RSA Algorithm
To encode/decode messages:
• sending a message
encrypt with recipient’s public key
• reading a received message
decrypt with your private key
The Keys
Each key consists of two numbers: an
exponent and a modulus.
Public key: r and n, respectively.
Private key: s and n, respectively.
The modulus n is same in both.
RSA Cryptography Algorithm
Given key (r,n), encrypt msg as follows:
1. Convert message to string of digits.
2. Break message into uniformly sized
blocks, padding last block with 0’s if needed.
Call the blocks M1, M2,... , Mk .
3. Check that each Mi has no common
divisors with n besides 1.
4. Calculate and send Ri = (Mi)^r mod n.
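Steps 3 and 4 can be sketched as follows, assuming the message has already been converted into a list of integer blocks. The function name and the toy key (r, n) = (3, 85) in the usage note are illustrative; real keys use numbers hundreds of digits long.

```python
from math import gcd


def rsa_encrypt_blocks(blocks, r, n):
    """Encrypt each block Mi as Ri = Mi^r mod n, after checking gcd(Mi, n) = 1."""
    for m_i in blocks:
        if gcd(m_i, n) != 1:
            raise ValueError(f"block {m_i} shares a divisor with n = {n}")
    # pow with three arguments does modular exponentiation efficiently.
    return [pow(m_i, r, n) for m_i in blocks]
```

With the toy key, `rsa_encrypt_blocks([8, 12], 3, 85)` gives `[2, 28]`, while a block such as 10 is rejected because it shares the divisor 5 with 85.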
Key Generation Algorithm
1. Pick distinct primes p, q; put n = pq.
2. Let m be LCM of p − 1 and q − 1.
3. Pick r so it has no common divisors
with m except 1. (That is, r and m
are relatively prime.)
4. Find s so r · s ≡ 1 mod m; i.e., s is
the multiplicative inverse of r (mod m).
5. Public/private keys are (r,n)/(s,n).
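The five steps can be sketched with toy numbers. The function name is made up for illustration, the sketch does not check that p and q are actually prime, and the inverse s is found by simple search, which is only practical for small m.

```python
from math import gcd


def rsa_keys(p, q, r):
    """Return (public key, private key) = ((r, n), (s, n)) for primes p, q."""
    n = p * q                                        # step 1
    m = (p - 1) * (q - 1) // gcd(p - 1, q - 1)       # step 2: LCM of p-1 and q-1
    if gcd(r, m) != 1:                               # step 3
        raise ValueError("r must be relatively prime to m")
    s = next(k for k in range(1, m) if (r * k) % m == 1)  # step 4
    return (r, n), (s, n)                            # step 5
```

For p = 5, q = 17, r = 3 this gives n = 85, m = 16, and s = 11, so the keys are (3, 85) and (11, 85). A block encrypted with one key is recovered with the other: 8^3 mod 85 = 2, and 2^11 mod 85 = 8.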