Base64 and CRC

Base64 Encoding
SMTP only allows 7-bit ASCII data, so how can
you send me a picture of Avril Lavigne, which
is an 8-bit binary JPEG file?
Encode it.
But back to Base64 encoding…
The encoding method used is simple and elegant.
Each group of 3 bytes (24 bits) is split into four 6-bit
values, and each value is sent as one 7-bit ASCII
character.
Why is it called BASE64? Because 6 bits give us
decimal numbers in the range 0-63. By assigning a
character to each decimal value (64 of them), we
can encode any number in the range 0-63 with a
single character. Base 64 requires 64 symbols,
just as decimal (base 10) requires 10 symbols and
hexadecimal (base 16) requires 16 symbols.
The Base64 Alphabet (values given in decimal):

Value Char    Value Char    Value Char    Value Char
  0    A       17    R       34    i       51    z
  1    B       18    S       35    j       52    0
  2    C       19    T       36    k       53    1
  3    D       20    U       37    l       54    2
  4    E       21    V       38    m       55    3
  5    F       22    W       39    n       56    4
  6    G       23    X       40    o       57    5
  7    H       24    Y       41    p       58    6
  8    I       25    Z       42    q       59    7
  9    J       26    a       43    r       60    8
 10    K       27    b       44    s       61    9
 11    L       28    c       45    t       62    +
 12    M       29    d       46    u       63    /
 13    N       30    e       47    v      (pad)  =
 14    O       31    f       48    w
 15    P       32    g       49    x
 16    Q       33    h       50    y
We take 3 bytes and encode to 4 bytes:

3 bytes to encode:   10101111 11001010 11101010
24 bit stream:       101011111100101011101010
Four 6-bit values:   101011   111100   101011   101010
decimal value:       43       60       43       42
Base64 character:    r        8        r        q
We then use the table to send the ASCII codes for each
BASE64 character.
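The 3-byte example above can be reproduced with a short Python sketch. The helper name encode_group is made up for illustration, and the sketch only handles one complete 3-byte group (no padding).

```python
# The 64-character alphabet from the table: A-Z, a-z, 0-9, then + and /.
ALPHABET = (
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789+/"
)


def encode_group(three_bytes):
    """Encode exactly 3 bytes as 4 Base64 characters (illustrative helper)."""
    if len(three_bytes) != 3:
        raise ValueError("expected exactly 3 bytes")
    # Join the three bytes into one 24-bit number.
    stream = int.from_bytes(three_bytes, "big")
    # Cut it into four 6-bit values, most significant first.
    values = [(stream >> shift) & 0b111111 for shift in (18, 12, 6, 0)]
    # Look each 6-bit value up in the alphabet.
    return "".join(ALPHABET[v] for v in values)


group = bytes([0b10101111, 0b11001010, 0b11101010])
print(encode_group(group))  # r8rq
```

The four characters are then sent as ordinary ASCII codes, so the result is safe for a 7-bit channel.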
There is a slight problem when the number of bytes
to be encoded is not an exact multiple of 3.
In this case, zero bits are added to make the last
group of bytes (i.e. 1 or 2 bytes) up to a
multiple of 6 bits.
One or two padding characters (=) are then added
to make the encoded data a multiple of 4
characters.
For example:

4 bytes to encode:   10101111 11001010 11101010 00100011
32 bit stream:       10101111110010101110101000100011
Six 6-bit values:    101011  111100  101011  101010  001000  110000
decimal value:       43      60      43      42      08      48
Base64 characters:   r       8       r       q       I       w
Add padding:         r       8       r       q       I       w       =       =

In this case four zeros are added, then two padding characters.
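As a rough sketch of the padding rule, the function below (the name encode is an assumption, not from the slides) fills a short final group with zero bits and replaces the characters that carry no data with "=". The result is compared with Python's standard base64 module.

```python
import base64

ALPHABET = (
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789+/"
)


def encode(data):
    """Base64-encode a bytes object, padding the last group if needed."""
    out = []
    for i in range(0, len(data), 3):
        chunk = data[i:i + 3]
        missing = 3 - len(chunk)  # 0, 1 or 2 bytes short of a full group
        # Fill the short group with zero bytes so it is 24 bits long.
        stream = int.from_bytes(chunk + b"\x00" * missing, "big")
        chars = [ALPHABET[(stream >> s) & 0x3F] for s in (18, 12, 6, 0)]
        # Characters made only of filler bits are replaced by '='.
        out.append("".join(chars[:4 - missing]) + "=" * missing)
    return "".join(out)


data = bytes([0b10101111, 0b11001010, 0b11101010, 0b00100011])
print(encode(data))                            # r8rqIw==
print(base64.b64encode(data).decode("ascii"))  # r8rqIw==
```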
Example Email Message with GIF attachment - BASE64 encoded

MIME-Version: 1.0
Content-Type: Multipart/mixed; BOUNDARY="Part10510241718.A"

--Part10510241718.A
Content-Type: Text/Plain; charset="us-ascii"

This email contains an attachment - a small GIF file.
Jim
----------------------
--Part10510241718.A
Content-Type: Image/gif; name="pin.gif"
Content-Transfer-Encoding: base64
Content-Disposition: attachment; filename="pin.gif"

R0lGODlhDgARAPIAAAAAAL8AAICAgP8AAP///wAAAAAAAAAAACH5BAEAAAQA
LAAAAAAOABEAAAM/SArRoRAy5yIBMwwynqTb1kjMtHHeFWRal2JUTAZCPJJA
XTdYhOWrX8/3w1mOgqFCwGwyLU4nNPqcRo9LqSUBADs=
--Part10510241718.A
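To see what the receiving mail client does with the attachment, the three Base64 lines from the message can be decoded with Python's standard base64 module. This is only a sketch: the variable names are illustrative, and reading the width and height assumes the usual GIF header layout (6-byte signature followed by two little-endian 16-bit values).

```python
import base64

# The attachment body, copied line by line from the message.
encoded_lines = [
    "R0lGODlhDgARAPIAAAAAAL8AAICAgP8AAP///wAAAAAAAAAAACH5BAEAAAQA",
    "LAAAAAAOABEAAAM/SArRoRAy5yIBMwwynqTb1kjMtHHeFWRal2JUTAZCPJJA",
    "XTdYhOWrX8/3w1mOgqFCwGwyLU4nNPqcRo9LqSUBADs=",
]

# Line breaks are not part of the data, so join the lines before decoding.
gif_bytes = base64.b64decode("".join(encoded_lines))

signature = gif_bytes[:6]                            # file type marker
width = int.from_bytes(gif_bytes[6:8], "little")     # image width in pixels
height = int.from_bytes(gif_bytes[8:10], "little")   # image height in pixels

print(signature, width, height)  # b'GIF89a' 14 17
```

Writing gif_bytes to a file named pin.gif would recreate the original 8-bit binary file.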
CRC - Cyclic Redundancy Check
Errors happen!
One simple method of error checking is to do a checksum. All the bytes in
the message are added up and the result is transmitted with the message.
The receiver does the sum again and compares the result with the
transmitted checksum. This can detect lots of errors, but it is easy to see
that one bit changed in one byte could be cancelled out by one bit
changed in another byte. This is not a very reliable method of error
checking.
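The weakness of a simple checksum can be shown in a few lines of Python. The 8-bit checksum width and the sample bytes are assumptions chosen for illustration.

```python
def checksum(message):
    """Add up all the bytes, keeping only the low 8 bits (assumed width)."""
    return sum(message) % 256


original = bytes([0b00010001, 0b00100010, 0b00110011])

# Two single-bit errors in the same bit position of different bytes:
# the first byte gains 2 (bit changes 0 -> 1),
# the second byte loses 2 (bit changes 1 -> 0).
corrupted = bytes([0b00010011, 0b00100000, 0b00110011])

print(checksum(original))   # 102
print(checksum(corrupted))  # 102, so the errors cancel and go unnoticed
```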
The CRC is extensively used for error checking in many network
protocols.
It is based on some very complex mathematics, concerned
with polynomial arithmetic. If you wish to investigate the
theory behind CRC, this is a good starting point:
http://www.ross.net/crc/links.html
Why polynomial arithmetic? In any number system, numbers
can be considered as polynomials. In our familiar decimal
system, the number 3807 can be expressed as:
3*10^3 + 8*10^2 + 0*10^1 + 7*10^0
Things are actually simplified if we are working in binary, as
the coefficients can only be 0 or 1. So if we consider the
binary number 101101, this is:
1*2^5 + 0*2^4 + 1*2^3 + 1*2^2 + 0*2^1 + 1*2^0 or
x^5 + x^3 + x^2 + 1
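The mapping from a binary number to its polynomial can be sketched as follows. The function name to_polynomial is made up for this example, and powers are written with ^.

```python
def to_polynomial(bits):
    """Write a string of binary digits as a polynomial in x."""
    terms = []
    highest = len(bits) - 1
    for position, bit in enumerate(bits):
        if bit != "1":
            continue  # a 0 coefficient contributes no term
        power = highest - position
        if power == 0:
            terms.append("1")
        elif power == 1:
            terms.append("x")
        else:
            terms.append(f"x^{power}")
    return " + ".join(terms) if terms else "0"


print(to_polynomial("101101"))  # x^5 + x^3 + x^2 + 1
print(to_polynomial("1001"))    # x^3 + 1
```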
The CRC works by division, rather than
addition. The data (the transmitted message)
is considered to be a big binary number, which
could be represented as a polynomial. This
polynomial is divided by another, carefully
chosen, polynomial to give a result which is
used to check the data, in the same way as a
checksum.
By using a division algorithm, this method of
error checking can detect many more errors
than a simple checksum.
Rather than using a straightforward binary
division, the CRC uses modulo-2 arithmetic.
This means effectively doing a normal long
division, but with a few strange rules. In
modulo-2 arithmetic, subtraction and addition
are identical, since there are no “carries”. The
logical function is actually XOR.
Deciding whether the divisor “goes into” the current
part of the dividend depends only on whether the
MSB of that part is 1, like the MSB of the divisor.
So 1111 would “go into” 1000.
Example from the book:
Data to be checked: 101110
Generator polynomial: 1001 (x^3 + 1)
The data is first multiplied by 2^3, since the generator
polynomial is of order 3, done by adding 3 zeros:
101110000
Next this value is divided by the generator polynomial (1001),
by long division, using modulo-2 arithmetic, where subtraction
becomes the XOR function:

          101011
       ---------
1001 | 101110000
       1001
       ----
         101
         000
         ---
         1010
         1001
         ----
           110
           000
           ---
           1100
           1001
           ----
            1010
            1001
            ----
              011   Remainder
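The long division above can be sketched in Python as a loop that brings down one bit at a time and XORs with the generator whenever the leading bit is 1. The function names mod2_remainder and crc are illustrative, and bit strings are used instead of integers to stay close to the worked example.

```python
def mod2_remainder(dividend_bits, generator_bits):
    """Remainder of a modulo-2 long division, as a bit string."""
    generator = int(generator_bits, 2)
    width = len(generator_bits)
    remainder = 0
    for bit in dividend_bits:
        # Bring down the next bit of the dividend.
        remainder = (remainder << 1) | int(bit)
        # If the leading bit is 1 the generator "goes into" it:
        # subtract, which in modulo-2 arithmetic is XOR.
        if remainder >> (width - 1):
            remainder ^= generator
    return format(remainder, f"0{width - 1}b")


def crc(data_bits, generator_bits):
    """Append r zeros (r = order of the generator) and divide."""
    order = len(generator_bits) - 1
    return mod2_remainder(data_bits + "0" * order, generator_bits)


remainder = crc("101110", "1001")
print(remainder)             # 011
print("101110" + remainder)  # 101110011, the value to transmit
```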
The value actually transmitted is the data followed
by the remainder, i.e. in this case: 101110011
When the CRC is calculated at the receiver over
this whole value, the remainder should be zero, as
the remainder has been added to the original data.
At the receiver, the received value 101110011 is divided
by the same generator polynomial:

          101011
       ---------
1001 | 101110011
       1001
       ----
         101
         000
         ---
         1010
         1001
         ----
           110
           000
           ---
           1101
           1001
           ----
            1001
            1001
            ----
              000   Remainder
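The receiver's check can be sketched the same way: divide the received bits by the generator and look for a zero remainder. The function name mod2_remainder and the choice of which bit to corrupt are illustrative.

```python
def mod2_remainder(dividend_bits, generator_bits):
    """Remainder of a modulo-2 long division, as a bit string."""
    generator = int(generator_bits, 2)
    width = len(generator_bits)
    remainder = 0
    for bit in dividend_bits:
        remainder = (remainder << 1) | int(bit)  # bring down the next bit
        if remainder >> (width - 1):             # leading bit is 1
            remainder ^= generator               # modulo-2 subtraction
    return format(remainder, f"0{width - 1}b")


received = "101110011"             # data 101110 followed by remainder 011
print(mod2_remainder(received, "1001"))  # 000, so no error is detected

# Flip one bit in transit (the fourth bit here, chosen arbitrarily).
damaged = "101010011"
print(mod2_remainder(damaged, "1001") != "000")  # True, error is detected
```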
International standards have been established for various different CRC
generators of different bit lengths. For example, this is the CRC-32-IEEE 802.3
polynomial, used for Ethernet:
x^32 + x^26 + x^23 + x^22 + x^16 + x^12 + x^11 + x^10 + x^8 + x^7 + x^5 + x^4 + x^2 + x + 1 (V.42)
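As a quick check of the notation, the exponents of this polynomial can be turned into the bit pattern usually quoted for CRC-32. Python's standard zlib.crc32 uses the same generator, but note that it also reverses the bit order and applies an initial value and a final XOR, so it is not a plain long division like the small example. The test string "123456789" is a conventional check input, not something from the slides.

```python
import zlib

# Exponents of the CRC-32 generator polynomial listed above.
exponents = [32, 26, 23, 22, 16, 12, 11, 10, 8, 7, 5, 4, 2, 1, 0]

# Each term x^e is a 1 bit in position e.
polynomial = sum(1 << e for e in exponents)
print(hex(polynomial))               # 0x104c11db7 (33 bits, order 32)
print(hex(polynomial & 0xFFFFFFFF))  # 0x4c11db7, with the x^32 bit implied

# CRC-32 of a message using the standard library implementation.
print(hex(zlib.crc32(b"123456789")))  # 0xcbf43926
```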
The CRC can always detect burst errors of fewer than r+1 bits (where r is the
order of the generator polynomial). There is also a good probability of longer burst
errors being detected. The CRC can also detect any odd number of bit errors,
provided the generator polynomial has x + 1 as a factor (as x^3 + 1 does).
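These properties can be checked by brute force for the small example (data 101110, generator 1001, so r = 3). The sketch below tries every possible error pattern on the 9-bit transmitted value. The helper names are illustrative, and the odd-error check relies on this generator, x^3 + 1, having x + 1 as a factor.

```python
def mod2_remainder(value, generator):
    """Modulo-2 remainder, working on integers instead of bit strings."""
    width = generator.bit_length()
    while value.bit_length() >= width:
        # Line the generator up under the leading 1 and XOR it away.
        value ^= generator << (value.bit_length() - width)
    return value


def burst_length(error):
    """Distance from the first to the last flipped bit, inclusive."""
    lowest = (error & -error).bit_length() - 1
    return error.bit_length() - lowest


GENERATOR = 0b1001       # x^3 + 1, order r = 3
CODEWORD = 0b101110011   # data 101110 followed by remainder 011
R = 3

missed_bursts = []
missed_odd = []
for error in range(1, 1 << 9):  # every non-zero 9-bit error pattern
    detected = mod2_remainder(CODEWORD ^ error, GENERATOR) != 0
    if burst_length(error) <= R and not detected:
        missed_bursts.append(error)
    if bin(error).count("1") % 2 == 1 and not detected:
        missed_odd.append(error)

print(missed_bursts)  # []
print(missed_odd)     # []

# A burst of r+1 bits can slip through: this one equals the generator.
print(mod2_remainder(CODEWORD ^ 0b1001, GENERATOR))  # 0
```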
The CRC is easy to implement in software and is often implemented in hardware
(using shift registers and xor gates).
The End