Kalpana Coaching Classess BE-SEM-VII-EXTC-DCE-Notes by Rohit Sinha
Ph. Dadar-24330916 Thane-25440393 For private circulation only
Syllabus
Chapter 1: Data compression and encryption
Need for data compression, lossy/lossless compression, compression ratio, run length
encoding (RLE) for text and image compression, relative encoding and its applications in
facsimile data compression and telemetry, scalar quantization.
Chapter 2: Statistical methods
Statistical modeling of information source, coding redundancy, variable size codes,
prefix codes, Shannon-Fano coding, Huffman coding, adaptive Huffman coding,
arithmetic coding and text compression using PPM method.
Chapter 3: Dictionary methods
String compression, sliding window compression, LZ77, LZ78 and LZW algorithms and
applications in text compression, Zip and Gzip, ARC and cyclic redundancy code.
Chapter 4: Image compression
Lossless techniques of image compression, gray codes, two dimensional image
transforms, discrete cosine transform and its applications in lossy image compression,
quantization, zig-zag coding sequences, JPEG and JPEG-LS compression standards,
pulse code modulation and differential pulse code modulation methods of image
compression, video compression and MPEG industry standard.
Chapter 5: Audio compression
Digital audio, lossy sound compression, μ-law and A-law companding, DPCM and ADPCM
audio compression, MPEG audio compression, frequency domain coding, format of
compressed data.
Chapter 6: Conventional encryption
Security of information, security attacks, classical techniques, Caesar cipher, block
cipher principle, design and modes of operation, S-box design, triple DES with two or
three keys, introduction to international data encryption algorithm.
Chapter 7: Number Theory and public encryption
Modular arithmetic, Fermat’s and Euler’s theorems, Chinese remainder theorem,
discrete logarithm, principles of public key cryptosystems, RSA algorithm, key
management, Diffie-Hellman key exchange, elliptic curve cryptography.
Chapter 8: Message authentication
Authentication requirements and functions, message authentication functions (MAC),
hash functions and their security, hash and MAC algorithms, digital signatures and
authentication protocols, digital signature standard and algorithms.
CHAPTER 1
CONVENTIONAL ENCRYPTION
1.1 Cryptography and related terms
Cryptography:
- Cryptography is the practice of storing and communicating data in a form that only
those for whom it is intended can read and process.
- The basic purpose of cryptography is to protect information from unauthorized
individuals who may exploit it for their own benefit and cause loss to the
organization.
- In cryptography the data to be transmitted is encoded into an unreadable format
using certain algorithms, so that it cannot be used or modified to produce
unauthorized effects.
- Practical goal of cryptography:
In practice most cryptographic algorithms can be broken if the attacker has
enough time and resources. Therefore the more realistic goal of cryptography is
to make obtaining the information work intensive for the attacker.
In other words the encryption algorithm should be strong enough that the time
and resources spent by the attacker in breaking the code exceed the actual value
of the information.
The encryption algorithm is also considered secure if the time taken by the
attacker to break the code and obtain the information exceeds the useful
lifetime of the information.
Following figure shows the basic encryption procedure:
The sender generates the message containing the information to be communicated. This
message is in plain text and therefore should not be transmitted on an insecure channel.
Hence the message is encrypted using the encryption algorithm to generate cipher text.
The encryption algorithm uses a secret key, known only to the sender and the intended
receiver, to generate the cipher text. This cipher text can be interpreted only by those
individuals who know how it was encrypted, i.e. who have the decryption algorithm and
the secret key. The intended receiver decrypts the message by running the decryption
algorithm and obtains a readable copy of the message.
Plain text: original message to be transmitted.
Cipher text: encrypted message.
Cipher: algorithm used to convert plain text to cipher text.
Key: secret data used by the sender and the receiver for encryption and decryption purposes.
Cryptography: study of encryption and decryption techniques.
Cryptanalysis: practice of decoding the encrypted message without the knowledge of
the key.
Cryptology: study of both cryptography and cryptanalysis.
Encipher: to encrypt
Decipher: to decrypt
1.2 Information security
There are three aspects of information security:
- Security service
- Security mechanism
- Security attack
Security service:
The security service is something that enhances the security of data processing
systems and information transfers of an organization.
It is used to counter security attacks and it uses many security mechanisms to do so.
The security services defined by ITU-T Recommendation X.800 are:
1. Authentication:
Authentication refers to the authenticity of the contents of the messages being
exchanged as well as that of the communicating entities.
2. Access control:
Access control is the ability to limit and control the access to host systems and
applications via communication links. To achieve this control, each entity trying to
gain access must first be identified, or authenticated, so that access rights can
be provided to the individual.
3. Data confidentiality:
The contents of the message being transferred across the insecure medium
should be readable to only those whom it is intended for and to no other entity.
4. Data integrity:
The contents of the message should not get modified during transit and even if
the message is modified, it should be detected at the receiving end.
5. Non repudiation:
Repudiation disputes arise when one entity denies sending or receiving any
message. The security mechanism should provide means to resolve such disputes.
Security mechanism:
A security mechanism is a mechanism designed to detect, prevent and recover from a
security attack.
No single mechanism supports all the functions required to provide complete security
and therefore many mechanisms work together.
Security attack:
A security attack is any action which compromises the security of information of an
organization.
It is an assault on the system derived from a threat.
The following figures show different types of security attacks:
Security threat:
A threat is potential for violation of security which exists when there is a circumstance,
capability, action or event that could breach security.
In simple words a threat is the vulnerability of the system which may be exploited by
an attacker.
Two types of security attacks:
- Passive attacks
- Active attacks
Passive attacks:
In a passive attack the attacker monitors the transmissions to obtain message content or
monitors traffic flows, but does not modify the message.
Active attacks:
In an active attack the attacker acquires the message and modifies the contents of the
message to obtain unauthorized effects.
Types of active attacks:
Modification of messages in transit:
In this type of attack a part of the message is altered or the message is delayed to
produce an unauthorized effect.
Masquerade:
In masquerade one entity pretends to be another entity to produce an unauthorized
effect.
Replay:
In replay attack a message sequence is captured and then retransmitted to produce an
unauthorized effect.
Denial of service:
Denial of service attack prevents or inhibits the normal use and management of
communication facilities.
1.3 Classifications of cryptographic systems
1) Classification based on type of operations used for transforming plain text into
cipher text:
Substitution cipher:
In substitution cipher each element in the plain text is mapped into (replaced by)
another element to generate the cipher text.
Transposition cipher:
In transposition cipher the elements of the plain text are rearranged to generate
the cipher text.
Product systems:
Product systems involve multiple stages of substitution and transposition.
2) Classification based on number of keys used:
Symmetric, single key, secret key or conventional encryption:
In this encryption method both the sender and the receiver use the same single
key. The key is used for both encryption and decryption purposes.
Asymmetric, two key or public key encryption:
In public key encryption the sender and the receiver use different keys.
3) Classification on the basis of manner in which plain text is processed:
Block cipher:
A block cipher processes the input one block at a time producing an output block
for each input block.
Stream cipher:
Stream cipher processes the input elements continuously producing an output one
element at a time as it goes along.
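The first classification (substitution, transposition, product) can be illustrated with a small Python sketch. It assumes a Caesar shift as the substitution and a two-rail "rail fence" rearrangement as the transposition; both choices and all function names are illustrative assumptions, not ciphers prescribed by these notes.

```python
import string

ALPHABET = string.ascii_uppercase


def caesar_substitute(plain, shift):
    """Substitution: each letter is replaced by the letter `shift` places later."""
    out = []
    for ch in plain:
        if ch in ALPHABET:
            out.append(ALPHABET[(ALPHABET.index(ch) + shift) % 26])
        else:
            out.append(ch)  # leave non-letters unchanged
    return "".join(out)


def rail_fence_transpose(plain):
    """Transposition: the same letters are kept but rearranged (two rails)."""
    return plain[0::2] + plain[1::2]


def product_cipher(plain, shift):
    """Product system: a substitution stage followed by a transposition stage."""
    return rail_fence_transpose(caesar_substitute(plain, shift))


print(caesar_substitute("HELLO", 3))   # letters change, order is kept
print(rail_fence_transpose("HELLO"))   # letters are kept, order changes
print(product_cipher("HELLO", 3))      # both stages applied in sequence
```

Note that the substitution changes which letters appear while the transposition only changes where they appear; the product system combines the two effects.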
1.4 Symmetric cipher
In symmetric cipher encryption or secret key encryption the sender and the receiver
share a secret key between them and all the messages are encrypted and decrypted
using the same secret key.
Following figure shows the symmetric encryption process:
Here a source produces a plain text message of the form: P = [X1, X2, ... , Xm]
Where X1, X2, ... are characters.
A secret key K is generated by the sender and delivered to the receiver securely.
The plain text is encrypted using this secret key to generate the cipher text as:
C = E_K(P)
Where E is the encryption algorithm.
The receiver decrypts the cipher text using the same key to obtain the plain text as:
P = D_K(C)
Where D is the decryption algorithm.
Requirements of symmetric encryption:
1. The encryption algorithm should be strong, i.e. the attacker should not be able to
decrypt the cipher text or discover the key even if he possesses cipher text copies
along with the corresponding plain text copies.
2. Sender and receiver should obtain the copies of secret key in a secure fashion and
must keep the key secure.
3. The algorithm should be computationally secure i.e. :
- The cost of breaking the cipher exceeds the value of the message.
- The time required for breaking the cipher should exceed the useful lifetime of
the message.
Drawbacks of symmetric encryption:
- There is no method which is completely secure for delivering the secret key and
if the attacker obtains a copy of the secret key then all the communication of
the organization will be compromised.
- This method does not provide any mechanism for authentication of the
communicating parties involved and therefore is vulnerable to masquerade
attacks.
1.5 Feistel cipher
Feistel cipher is a product cipher: it uses two basic ciphers in sequence in such a way
that the result is cryptographically stronger than either cipher alone.
This method uses a cipher that alternates substitution and permutation.
Principle of operation:
Feistel cipher works on the principles of diffusion and confusion.
Diffusion:
In diffusion, the statistical nature of plain text is dissipated into long range statistics
of cipher text. This is done by making each bit of the plain text affect many bits of
cipher text.
The purpose of diffusion is to make the statistical relationship between the plain text
and the cipher text as complex as possible to prevent the attacker from deducing the
key.
Confusion:
In confusion, the relationship between statistics of the cipher text and the encryption
key is made as complex as possible using a complex substitution algorithm.
This is done so that even if the attacker has understood the statistics of the cipher
text he will not be able to discover the key due to complex relationship between the key
and the cipher text.
Algorithm:
The inputs to the encryption algorithm are: a plain text block of size 2w bits and a key
K having n subkeys, K = {K_1, K_2, ..., K_n}.
The plain text block is divided into two halves each of length w bits, denoted by R_0 for
the w rightmost bits and L_0 for the w leftmost bits. These two halves pass through n
rounds of processing and are then combined to produce the cipher text block.
Each round i has inputs L_{i-1} and R_{i-1} derived from the previous round and a subkey
K_i derived from K.
In each round a substitution is performed on the left half L_{i-1} by applying a round
function F to the right half R_{i-1} and ex-oring the result with L_{i-1}:
R_i = L_{i-1} XOR F(R_{i-1}, K_i)
The round function has the same structure for each round but is parameterized by the
round key K_i.
Following this substitution, a permutation is performed that consists of the interchange
of the two halves of data, so that L_i = R_{i-1}.
Following fig. shows the Feistel cipher algorithm:
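The round structure described above can be sketched in Python as follows. This is a minimal illustration only: the round function `round_function` is a made-up mixing function (an assumption, not the F of any real cipher), and the subkeys are supplied directly instead of being derived from a master key.

```python
def round_function(right, subkey, w):
    """Hypothetical toy round function F; any function of (R, K) would work."""
    mask = (1 << w) - 1
    return ((right * 31 + subkey) ^ (right >> 3)) & mask


def feistel_encrypt(block, subkeys, w):
    """Encrypt one 2w-bit block with one Feistel round per subkey."""
    mask = (1 << w) - 1
    left, right = (block >> w) & mask, block & mask   # L0 and R0
    for k in subkeys:
        # L_i = R_{i-1};  R_i = L_{i-1} XOR F(R_{i-1}, K_i)
        left, right = right, left ^ round_function(right, k, w)
    # Final swap of the halves, so the same routine can be used to decrypt.
    return (right << w) | left


def feistel_decrypt(block, subkeys, w):
    """Decryption is the same algorithm with the subkeys in reverse order."""
    return feistel_encrypt(block, list(reversed(subkeys)), w)


subkeys = [0x1A2B, 0x3C4D, 0x5E6F, 0x7081]   # illustrative values only
cipher = feistel_encrypt(0x12345678, subkeys, 16)
print(hex(cipher), hex(feistel_decrypt(cipher, subkeys, 16)))
```

A useful property visible here is that F itself never has to be inverted: decryption only re-applies F with the subkeys reversed, which is why the round function can be arbitrarily complex.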
Design principles:
1. Block size:
Increasing the block size increases complexity and thus improves security. But it
slows the cipher.
Typically block size is 64 bits
2. Key size:
Increasing the key size improves security but slows the cipher.
Typically key size is 128 bits.
3. Round function:
Complex functions improve security but slow the cipher.
4. Number of rounds:
Increasing the number of rounds improves security but slows down the cipher.
Typically 16 rounds are used.
5. Complexity of subkey generation:
Complexity of subkey generation improves security and makes the analysis harder.
Data encryption standard (DES)
DES is an encryption technique which encrypts the data in 64 bit blocks using 56 bit
keys.
Following fig. shows the encryption procedure used by DES:


The inputs to the encryption function are a 64 bit block of plain text and a 56 bit
key. Although the key is supplied as 64 bits, only 56 bits are used; the remaining
8 bits can serve as parity bits or be set arbitrarily.
Following processes are involved in encryption of a block of plain text data using
DES:
1. Initial permutation
2. 16 rounds of complex key dependent round function involving substitution and
permutation functions.
3. 32 bit swap
4. Permutation which is inverse of the initial permutation.
Initial permutation:
The initial permutation is defined by the following table:
The table has to be interpreted in the following way:
- The input to the table consists of 64 bits numbered from 1 to 64.
- The 64 entries in the permutation table contain a permutation of the numbers from 1
to 64.
- Each entry in the permutation table indicates the position of a numbered input bit in
the output, which also consists of 64 bits.
Inverse initial permutation:
The inverse initial permutation is defined by the following table:
Single round details:
Following figure shows the details of a single round involved in data processing:
- A 64 bit intermediate value is the input to every round. This value is divided into two
data blocks each of length 32 bits.
- The right hand side block R_{i-1} is passed through an expansion/permutation block which
converts the 32 bit block of data into a 48 bit block.
- The expansion is done according to the following table:
- The 32 bit block of data is expanded into a 48 bit block by repeating some of the bits
from the original block. The repetition of bits is as given in the above table.
- After expansion the 48 bit data block is ex-ored with the 48 bit round key K_i.
- The 48 bit ex-or output block is then mapped into a 32 bit block by a substitution
function involving eight s-boxes.
Following figure shows s-box design:
- Each s-box takes 6 bits of data as input and maps it into 4 bit data.
s-box design:
Following figure shows the design of an s-box: S1
Mapping 6 bits data into 4-bits:
i. The 2 bit number formed by the first and last bits gives the row number to be
referred in the table.
ii. The remaining 4 bits give the column number.
iii. The number at the corresponding row and column when converted into 4 bit binary
equivalent is the 4 bit mapped output.
For example, consider the 6 bit input 110101 applied to S1:
Row number = first and last bits = 11 = 3
Column number = middle four bits = 1010 = 10
Entry of S1 at row 3, column 10 = 3
4 bit output = binary equivalent of 3 = 0011
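The row/column lookup can be checked with a short Python sketch. The table below is the standard DES S1 box as commonly published (the figure itself is missing from these notes, so its inclusion here is an assumption about what the figure showed); the function name `sbox_lookup` is illustrative.

```python
# Standard DES S-box S1: 4 rows (0-3) by 16 columns (0-15).
S1 = [
    [14, 4, 13, 1, 2, 15, 11, 8, 3, 10, 6, 12, 5, 9, 0, 7],
    [0, 15, 7, 4, 14, 2, 13, 1, 10, 6, 12, 11, 9, 5, 3, 8],
    [4, 1, 14, 8, 13, 6, 2, 11, 15, 12, 9, 7, 3, 10, 5, 0],
    [15, 12, 8, 2, 4, 9, 1, 7, 5, 11, 3, 14, 10, 0, 6, 13],
]


def sbox_lookup(six_bits, table=S1):
    """Map a 6 bit input to a 4 bit output using one s-box table."""
    # Row: the 2 bit number formed by the first (MSB) and last (LSB) bits.
    row = (((six_bits >> 5) & 1) << 1) | (six_bits & 1)
    # Column: the 4 bit number formed by the middle four bits.
    col = (six_bits >> 1) & 0xF
    return table[row][col]


out = sbox_lookup(0b110101)
print(out, format(out, "04b"))  # row 3, column 10 of S1
```

For the input 110101 this selects row 3 and column 10, which agrees with the 0011 output quoted in the example.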
- The output of the s-boxes is then subjected to a permutation block which rearranges the
bits in order to increase the complexity of the encryption.
Following table defines the permutation operation:
- The permuted output is then ex-ored with the left hand side input to the round, L_{i-1},
to generate the right hand side output block R_i.
- The input block R_{i-1} becomes the left hand side output of the round, i.e. L_i = R_{i-1}.
Key generation in DES:
DES uses a 64 bit key as input. Out of the 64 bits every 8th bit is ignored and only 56
bits are used as given by the following table:
The resultant 56 bit key is then subjected to a permutation defined by the following
permuted choice 1 (PC-1) table:
The permuted 56 bit key is then divided into two halves C_0 and D_0, each of size 28 bits.
At each round C_{i-1} and D_{i-1} are subjected to a circular left shift of 1 or 2 bits,
as given by the following table:
The shifted values serve as input to the next round. They also serve as input to the
permuted choice 2 (PC-2) table which produces the 48 bit key for the round function.
PC-2 table:
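The circular left shift of the two 28 bit halves can be sketched in Python. The shift amounts used here are the standard DES schedule (the shift table is missing from these notes, so this is an assumption about its contents); PC-1 and PC-2 are deliberately left out because their tables are not reproduced in the text, so the sketch stops at the shifted halves.

```python
# Standard DES left-shift schedule: number of bit positions for rounds 1..16.
SHIFTS = [1, 1, 2, 2, 2, 2, 2, 2, 1, 2, 2, 2, 2, 2, 2, 1]


def rotl28(value, n):
    """Circular left shift of a 28 bit value by n positions."""
    mask = (1 << 28) - 1
    return ((value << n) | (value >> (28 - n))) & mask


def rotate_halves(c0, d0):
    """Return [(C_1, D_1), ..., (C_16, D_16)] starting from C_0 and D_0."""
    halves = []
    c, d = c0, d0
    for shift in SHIFTS:
        c, d = rotl28(c, shift), rotl28(d, shift)
        halves.append((c, d))   # each pair would be fed to PC-2 for that round
    return halves


halves = rotate_halves(0x0F0F0F0, 0xAAAAAAA)   # illustrative C_0 and D_0
print([hex(c) for c, _ in halves[:3]])
```

The shifts add up to 28, so after the sixteenth round each half is back at its starting value, which can serve as a quick sanity check of an implementation.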
DES decryption:
DES uses the same algorithm for decryption of the message except that the order of
application of the keys is reversed.
Triple DES:
DES is vulnerable to brute force attacks and therefore using DES for encryption does
not ensure complete security. Hence to improve the security of encryption, the plain
text is encrypted multiple times using same DES algorithm but with different keys.
In triple DES the plain text is encrypted by subjecting it to DES algorithm thrice.
Triple DES using two keys:
C = E_K1 [D_K2 {E_K1 (P)}]
P = D_K1 [E_K2 {D_K1 (C)}]
Triple DES using three keys:
C = E_K3 [D_K2 (E_K1 (P))]
P = D_K1 [E_K2 (D_K3 (C))]
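The order of operations in triple DES can be checked with a small Python sketch. Real DES is not implemented here: `E` and `D` are a toy invertible one-byte cipher standing in for single DES (an assumption made purely to keep the example short), so only the encrypt-decrypt-encrypt composition is being illustrated.

```python
INV5 = 205  # multiplicative inverse of 5 modulo 256 (5 * 205 = 1025 = 4*256 + 1)


def E(key, x):
    """Toy stand-in for single DES encryption of one byte (NOT real DES)."""
    return (((x ^ key) * 5) + key) % 256


def D(key, y):
    """Inverse of the toy cipher E for the same key."""
    return (((y - key) * INV5) % 256) ^ key


def tdes_encrypt(p, k1, k2, k3):
    """C = E_K3[D_K2(E_K1(P))]; pass k3 = k1 for the two-key variant."""
    return E(k3, D(k2, E(k1, p)))


def tdes_decrypt(c, k1, k2, k3):
    """P = D_K1[E_K2(D_K3(C))]: the stages are undone in reverse order."""
    return D(k1, E(k2, D(k3, c)))


c = tdes_encrypt(0x41, 0x13, 0x57, 0x9B)
print(hex(c), hex(tdes_decrypt(c, 0x13, 0x57, 0x9B)))
```

With all three keys equal, the middle decryption cancels the first encryption and the scheme reduces to a single encryption, which is why the middle stage is a decryption and not a second encryption.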
Block cipher modes of operation:
1. Electronic codebook mode:
In electronic codebook (ECB) mode the plain text is encrypted in 64 bit blocks using
the same encryption key K. The plain text message is divided into 64 bit blocks and if
the size of the last block is less than 64 bits then it is padded. Each 64 bit block is
encrypted independently of the other blocks. Hence, for a given key, each plain text
block always maps to the same cipher text block, like an entry in a codebook.
This method is useful for small amounts of data.
The drawback of this method is that, for a given key, identical plain text blocks always
produce identical cipher text blocks, so repeating patterns in a long message remain
visible to the attacker.
2. Cipher block chaining mode:
- In CBC mode the cipher text output of the previous block is ex-ored with the
current plain text block and the ex-or output is subjected to the encryption block.
- For the first block of data no previous cipher text block exists and therefore an
initial value (initialization vector) is ex-ored with the plain text block.
- Since each cipher text block depends on the previous cipher text block, a block
cannot be decrypted with the key alone; the previous cipher text block (or the
initial value for the first block) is also needed.
- Another advantage of this method is that the same blocks of plain text will produce
different blocks of cipher text and therefore structural analysis of the data is
made much harder.
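The difference between ECB and CBC chaining can be sketched in Python. To keep the example short it works on one-byte "blocks" and uses a toy invertible cipher `E`/`D` in place of DES; the block size, the cipher and the key and initial values are all illustrative assumptions.

```python
def E(key, x):
    """Toy one-byte block cipher standing in for DES (NOT real DES)."""
    return (((x ^ key) * 5) + key) % 256


def D(key, y):
    """Inverse of the toy cipher; 205 is the inverse of 5 modulo 256."""
    return (((y - key) * 205) % 256) ^ key


def ecb_encrypt(plain, key):
    """ECB: every block is encrypted independently with the same key."""
    return bytes(E(key, p) for p in plain)


def cbc_encrypt(plain, key, iv):
    """CBC: C_i = E_K(P_i XOR C_{i-1}), with C_0 taken as the initial value."""
    prev, out = iv, []
    for p in plain:
        prev = E(key, p ^ prev)
        out.append(prev)
    return bytes(out)


def cbc_decrypt(cipher, key, iv):
    """CBC: P_i = D_K(C_i) XOR C_{i-1}."""
    prev, out = iv, []
    for c in cipher:
        out.append(D(key, c) ^ prev)
        prev = c
    return bytes(out)


message = b"AAAA"                       # four identical plain text blocks
print(ecb_encrypt(message, 0x3C).hex())        # identical cipher text blocks
print(cbc_encrypt(message, 0x3C, 0x5A).hex())  # chaining hides the repetition
```

In the ECB output every block is the same byte, while in the CBC output the repetition is no longer visible, even though the same key is used in both cases.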
3. Cipher feedback mode:
- CFB mode converts a block cipher into a stream cipher, so the message does not need to
be padded to a whole number of blocks.
- This mode is suitable for real time applications where s bits of stream data are to be
transmitted immediately.
4. Output feedback mode:
5. Counter mode:
- The advantage of this method is that even if the attacker knows the encryption
algorithm and the secret key, he will not be able to decrypt the cipher text until he
knows the counter value.
Key management in symmetric encryption:
In this method the key distribution center (KDC), which is a highly trusted organization,
generates the secret keys to be used by two communicating entities. The following steps
take place for key distribution:
1. The initiator A wants to establish a data transfer session with B. Hence A sends a
request message to the KDC. Along with the request message a nonce N1 is added, which
can be a time stamp or a counter value depending on the application.
2. The KDC responds with a message in two parts: the first part is encrypted using the
secret key shared between the KDC and A, and the second part is encrypted using the
secret key shared between the KDC and B.
The first part contains the secret session key Ks to be used for the communication,
along with a copy of the request message sent by A so that A can verify that the
request did not get modified during transit.
The second part contains the session key Ks along with the identity of A. It is
encrypted using the key shared between the KDC and B so that once B receives this
part it trusts the key source.
3. A extracts the second part of the message and sends it to B.
4. B extracts the key Ks and sends a nonce N2, encrypted using Ks, to A.
5. A decrypts the nonce N2 and returns a response based on it to B, encrypted using Ks,
so that the identity of A is authenticated to B.
CHAPTER 2
NUMBER THEORY AND PUBLIC KEY ENCRYPTION
2.1 Number theory
Modular arithmetic:
Modulus operator:
- Consider a positive integer ‘n’ and any other integer ‘a’.
When a is divided by n we get remainder ‘r’ and quotient ‘q’ such that: a = nq + r
- When the remainder is required and the quotient is not of much significance, then
the operation can be represented using the modulus operator as: a mod n = r
- The a mod n operation gives the remainder when a is divided by n.
For example:
7 mod 5 = 2
11 mod 7 = 4


Congruent modulo integers:
- Two integers a and b are said to be congruent modulo n if a mod n = b mod n, and
it is represented as: $a \equiv b \pmod{n}$
For example:
$17 \equiv 13 \pmod{4}$
$35 \equiv 52 \pmod{17}$
Rules of modular arithmetic:
1. [(a mod n) + (b mod n)] mod n = (a + b) mod n
2. [(a mod n) - (b mod n)] mod n = (a - b) mod n
3. [(a mod n) x (b mod n)] mod n = (a x b) mod n
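The congruence examples and the three rules can be verified numerically with a short Python sketch (the function names are illustrative). It also shows why the combined result has to be reduced modulo n once more: the sum of two remainders can itself exceed n - 1.

```python
def congruent(a, b, n):
    """True when a and b leave the same remainder on division by n."""
    return a % n == b % n


def check_rules(a, b, n):
    """Check the addition, subtraction and multiplication rules for one (a, b, n)."""
    add_ok = ((a % n) + (b % n)) % n == (a + b) % n
    sub_ok = ((a % n) - (b % n)) % n == (a - b) % n
    mul_ok = ((a % n) * (b % n)) % n == (a * b) % n
    return add_ok and sub_ok and mul_ok


print(congruent(17, 13, 4), congruent(35, 52, 17))
print(check_rules(11, 15, 8))

# Without the final reduction the two sides can differ:
print((11 % 8) + (15 % 8), (11 + 15) % 8)   # 3 + 7 = 10, but 26 mod 8 = 2
```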
Relatively prime numbers:
- Two numbers are said to be relatively prime to each other if there is no factor
common between them other than 1, i.e. if their G.C.D is 1.
- Thus a and b are relatively prime to each other if gcd(a, b) = 1
- Any prime number is relatively prime to all numbers other than its own multiples.
- For example:
25 and 33 are relatively prime to each other.
7 and 21 are not relatively prime to each other.
Euler’s totient function:
- For any natural number n the Euler’s totient function φ(n) is defined as the total
number of natural numbers less than n and relatively prime to n.
- For example let n = 15
The set of natural numbers less than 15 and relatively prime to 15 is:
{1, 2, 4, 7, 8, 11, 13, 14}
φ(15) is the number of elements in this set i.e. 8
Hence φ(15) = 8
- For any prime number n, all the numbers less than n are relatively prime to n.
Hence for any prime number n, φ(n) = n – 1
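The definition of the totient function translates directly into a brute-force Python sketch (function names are illustrative; this direct count is only practical for small n).

```python
from math import gcd


def coprimes_below(n):
    """Natural numbers less than n that are relatively prime to n."""
    return [k for k in range(1, n) if gcd(k, n) == 1]


def totient(n):
    """Euler's totient function, computed by counting as in the definition."""
    return len(coprimes_below(n))


print(coprimes_below(15), totient(15))
print(totient(7))   # for a prime n the result is n - 1
```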
Fermat’s theorem:
Fermat’s theorem states that if ‘p’ is a prime number and ‘a’ is a positive integer not
divisible by p, then:
$a^{p-1} \equiv 1 \pmod{p}$
Proof:
If p is a prime number and a is a positive integer not divisible by p, then according to
modular arithmetic the set of numbers { 0 mod p, a mod p, 2a mod p, ... , (p-1)a mod p }
is identical to the set { 0, 1, 2, ... , p-1 }, possibly in a different order.
Since 0 mod p = 0 the first elements of the two sets are equal.
Now multiplying the remaining elements of the two sets and taking the modulus we get:
$[(a \bmod p)(2a \bmod p) \cdots ((p-1)a \bmod p)] \bmod p = (1 \cdot 2 \cdots (p-1)) \bmod p$
Using the product rule on the LHS:
$(a \cdot 2a \cdots (p-1)a) \bmod p = (1 \cdot 2 \cdots (p-1)) \bmod p$
$a^{p-1} (p-1)! \bmod p = (p-1)! \bmod p$
Canceling (p-1)! on both sides, which is allowed because (p-1)! is relatively prime to p:
$a^{p-1} \bmod p = 1 \bmod p$
or $a^{p-1} \equiv 1 \pmod{p}$
Euler’s theorem:
Euler’s theorem states that for every a and n that are relatively prime:
$a^{\phi(n)} \equiv 1 \pmod{n}$
where φ(n) is Euler’s totient function.
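Both theorems are easy to spot-check numerically. The Python sketch below uses the built-in three-argument `pow` for modular exponentiation and a brute-force totient; the function names are illustrative and the checks are examples, not a proof.

```python
from math import gcd


def totient(n):
    """Count of natural numbers less than n that are relatively prime to n."""
    return sum(1 for k in range(1, n) if gcd(k, n) == 1)


def fermat_holds(a, p):
    """Fermat: a^(p-1) mod p == 1 for prime p and a not divisible by p."""
    return pow(a, p - 1, p) == 1


def euler_holds(a, n):
    """Euler: a^phi(n) mod n == 1 whenever gcd(a, n) == 1."""
    return pow(a, totient(n), n) == 1


print(fermat_holds(7, 19))   # p = 19 is prime, 7 is not a multiple of 19
print(euler_holds(3, 10))    # phi(10) = 4 and 3^4 = 81 = 8*10 + 1
```

For a prime n the totient is n - 1, so Fermat's theorem is the special case of Euler's theorem with n prime.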
2.2 Principles of public key cryptographic systems
Drawbacks of single key encryption:
- Single key encryption uses one key shared by both the sender and the receiver. If
this key is disclosed, all communication between the sender and the receiver
becomes transparent to the attacker.
- This is a symmetric system and therefore it does not prevent the parties from
forging a message and claiming it to be sent by the other party.
Public key encryption:
Public key encryption is based on using different keys for encryption and decryption
purposes.
In public key encryption each communicating party generates a pair of keys. One of the
keys is publicly available and is therefore called the public key KU. The other key is
known only to the respective party and therefore called as private key KR.
The keys are generated in such a way that a message encrypted using the public key can
be decrypted using the private key only, while a message encrypted using the private key
can be decrypted using the public key only.
Public key encryption can be used for both authentication and confidentiality, and it
also eliminates the need for a secure medium for the distribution of secret keys.
Steps involved in public key encryption:
1. Each communicating entity generates a pair of keys to be used for encryption and
decryption of messages.
2. One of the keys is kept secret and is known only to the user. This key is the private
key.
3. The other key is placed in the public register and is accessible to every one. This key
is the public key.
4. Keys are used for encryption and decryption depending on the application.
Data confidentiality using public key encryption:
Confidentiality refers to the security of the information while it is transmitted through
an insecure channel. No other entity except the intended receiver should be able to view
the message.
Following figure shows how data confidentiality is obtained using public key encryption:
A source A produces messages in plain text P = [P1, P2, ......] where the elements P1, P2,
P3, ...... are letters in some finite alphabet.
The receiver of the message B generates a pair of keys, i.e. a private key KR_B known only
to B and a public key KU_B known to everyone including A.
For confidentiality the receiver’s public key is used for encryption. A message
encrypted using the receiver’s public key can be decrypted using the receiver’s private
key only. Since the private key is known to no one else, the message will be secure from
everyone and confidentiality will be achieved.
Therefore A encrypts the plain text message using the receiver’s public key KU_B and
generates the cipher text of the form C = [C1, C2, ......] as:
C = E_KUB[P]
Upon reception B decrypts this message using the private key and generates the plain
text message as:
P = D_KRB[C]
- This method ensures confidentiality but not authentication as anyone having the
public key of B can forge a message masquerading as A.
Authentication using public key encryption:
Authentication refers to the genuineness of the communicating entities. For example if
A and B are communicating, both A and B should be aware of each other’s identities.
Authentication can be implemented using public key encryption in the following manner:
Here the sender A generates a plain text message P and encrypts this message using his
private key KR_A to generate the cipher text C as:
C = E_KRA[P]
Since this message is encrypted using the private key of the sender, it can be
decrypted only using the public key of the sender. Therefore if a communicating party is
able to decrypt the message using the public key of A, the identity of the sender is
authenticated, as no one else can encrypt a message using the private key of A.
Upon reception the receiver decrypts the message as: P = D_KUA[C]
- This method provides authentication but not confidentiality, as the message is
encrypted using the sender’s private key and everyone having the public key can
decrypt the message and view the contents.
Authentication and confidentiality using public key encryption:
Authentication and confidentiality can both be ensured using public key encryption by
subjecting the plain text message to two rounds of encryption as shown in the figure:
As shown in the figure the message is encrypted twice, first using the sender’s private
key and then using the receiver’s public key.
The public key of the receiver is used to ensure confidentiality and the private key of
the sender is used to authenticate the sender.
The cipher text is generated as:
C = E_KUB[E_KRA(P)]
The cipher text is decrypted as:
P = D_KUA[D_KRB(C)]
- The disadvantage of this method is that the complex encryption algorithm has to
be executed twice at each end, which increases the processing time.
Requirements of public key encryption:
1. It should be computationally feasible for all the communicating parties to generate a
key pair (KU, KR).
2. It should be computationally feasible for a sender A knowing the public key of the
receiver B to generate cipher text as C = E_KUB(P).
3. It should be computationally feasible for the receiver B to decrypt the cipher text
and obtain the original message as P = D_KRB(C).
4. It should be computationally infeasible for an attacker who knows KU to find KR.
5. It should be computationally infeasible for an attacker who knows C and KU to find P.
6. Encryption and decryption functions can be applied in any order:
M = E_KUB[D_KRB(M)] = D_KUB[E_KRB(M)] = E_KRB[D_KUB(M)] = D_KRB[E_KUB(M)]
2.3 RSA algorithm:
RSA algorithm is a practical implementation of public key encryption.
It is a block cipher scheme where the plain text and cipher text are integers between 0
and n-1 for some n. Typically n is 1024 bits long.
Here the plain text is encrypted in blocks where the size of each block is k bits, such
that $2^k < n \le 2^{k+1}$.
For a block of plain text M, the cipher text C is generated as: $C = M^e \bmod n$
The cipher text is decrypted as: $M = C^d \bmod n = M^{ed} \bmod n$
Both the sender and the receiver know the values of n and e whereas only the receiver
knows the value of d.
Thus the public key of the receiver is KU = {e, n} and the private key of the receiver is
KR = {d, n}
The RSA algorithm consists of following modules:
I. Key generation:
1. Generate two large random and distinct prime numbers p and q which are
approximately of same size in terms of bit length.
2. Compute n = pq and Ф = (p-1)(q-1).
3. Select a random integer e, 1<e<Ф such that gcd(Ф, e) = 1
4. Compute the unique integer d, 1<d<Ф, such that ed ≡ 1 mod Ф
II. Encryption:
The sender encrypts the message M as:
1. Obtain the KU of the intended receiver.
2. Represent the message M as an integer in the interval 0 to n-1.
3. Compute C = M^e mod n and send it to the intended receiver.
III. Decryption:
The receiver recovers the plain text from the cipher text as:
P = C^d mod n = M^(ed) mod n
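The three modules can be sketched in a few lines of Python. The primes p = 61 and q = 53, the exponent e = 17 and the function names are assumptions chosen only to keep the numbers readable; a real implementation uses large random primes and a padding scheme, which this sketch omits.

```python
from math import gcd

def rsa_keygen(p, q, e):
    """Key generation: return KU = (e, n) and KR = (d, n)."""
    n = p * q
    phi = (p - 1) * (q - 1)
    if gcd(e, phi) != 1:
        raise ValueError("e must satisfy gcd(phi, e) = 1")
    d = pow(e, -1, phi)              # d such that e*d = 1 mod phi (Python 3.8+)
    return (e, n), (d, n)

def rsa_encrypt(M, public_key):
    """Encryption: C = M^e mod n, with M an integer in 0 .. n-1."""
    e, n = public_key
    if not 0 <= M < n:
        raise ValueError("message block must lie in 0 .. n-1")
    return pow(M, e, n)

def rsa_decrypt(C, private_key):
    """Decryption: P = C^d mod n."""
    d, n = private_key
    return pow(C, d, n)

KU, KR = rsa_keygen(61, 53, 17)      # n = 3233, phi = 3120
C = rsa_encrypt(65, KU)
P = rsa_decrypt(C, KR)
print(KU, KR, C, P)                  # (17, 3233) (2753, 3233) 2790 65
```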
- Note: even though p and q should be of similar size, they must not be very close
to each other, because if p ≈ q then p ≈ √n. The value of n is known to everyone,
so anyone could then find p and q by trial and error starting near √n and derive
the keys.
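One systematic way of searching near √n is Fermat's factorisation method, shown here as a sketch of the weakness (it is named plainly because the notes only speak of trial and error). The function name and the example moduli are illustrative assumptions.

```python
from math import isqrt

def fermat_factor(n):
    """Factor an odd composite n = p*q by searching upward from sqrt(n).

    Looks for a with a*a - n a perfect square b*b, so that n = (a-b)*(a+b).
    The search is short when p and q are close together.
    """
    a = isqrt(n)
    if a * a < n:
        a += 1
    while True:
        b2 = a * a - n
        b = isqrt(b2)
        if b * b == b2:
            return a - b, a + b
        a += 1

print(fermat_factor(10403))   # primes 101 and 103 are found on the first try
```

The further apart p and q are, the more values of a have to be tried, which is why nearby primes are avoided.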
2.4 Key management:
There are two main aspects of key management:
- Distribution of public keys
- Use of public key encryption to distribute secret keys
Distribution of public keys:
1. Public announcement of public keys:
In this method each user distributes its public key to recipients or broadcasts it to the
entire community.
The drawback of this method is forgery.
Suppose X is an attacker who sends the following message to B and C after blocking the
message from A.
X to B & C : [IDA, KUX]
Here X sends his own public key while pretending to be A, and can masquerade as A until
discovered by A. Hence in this method anyone can create a key claiming to be someone else
and broadcast it.
2. Publicly available directory:
In this method, the public keys are registered with a public directory. This
assures greater security for the keys.
The directory must be trusted and must have the following properties:
1. It should contain the name and public key entries in the form {IDX, KUX}.
2. The participants should register securely with the directory.
3. The directory should be periodically published.
4. The directory should be electronically accessible.
3. Public key authority:
In this method a highly trusted public key authority controls the distribution of keys.
The public key authority provides all the functionalities of the directory. All the
communicating entities interact with the directory to obtain public keys. The only
requirement of this method is real time access to the directory.
Following figure shows the key distribution procedure by public key authority:
The key distribution takes place in the following steps:
1. A → PKA: Request || T1
The initiator A sends a message to the public key authority containing a request for the
current public key of B and a time stamp T1. The time stamp is used to prevent replay
attacks.
2. PKA → A: EKRAUTH [KUB || Request || T1]
The authority responds with a message that is encrypted using its private key
KRAUTH. This message contains the public key of B and the original message that was
sent by A to the public key authority. The original message is sent back to A so that A
can check it for any modification or replay.
The message is encrypted using the private key of the authority to authenticate the
public key authority and prevent masquerade attacks.
3. A → B: EKUB[IDA || N1]
A stores the public key of B, encrypts a message using this key and sends it to B.
This message contains the identity of A and a nonce N1 which serves as an identifier
for the message.
4. B → PKA: Request || T2
B sends a message to the public key authority requesting the public key of A. This
message contains the identity of A and a time stamp T2.
5. PKA → B: EKRAUTH[KUA || Request || T2]
The public key authority responds with a message encrypted with KRAUTH containing the
public key of A and the original request message along with the time stamp.
6. B → A: EKUA[N1 || N2]
B sends a message to A, encrypted with the public key of A, in
response to message (3). This message contains the original nonce N1 along with a
new nonce N2. The original nonce is sent back to A so that A is assured of the
identity of B. Since N1 was encrypted using the public key of B, only B could have
recovered it, so it is actually B with whom A is communicating.
7. A → B: EKUB[N2]
A sends the nonce N2 back to B to authenticate itself.
4. Public key certificates:
Public key certificates allow key exchange without real time access to public key
authority.
Following figure shows the key exchange procedure with public key certificates:
A public key certificate binds the identity to public key along with other information
such as period of validity, rights of use etc. All the contents of the certificate are
signed by the certificate authority and therefore it can be verified by anyone who
knows the public key of the certificate authority.
Each communicating party sends its public key to the certificate authority securely.
For party A the certificate authority verifies the relevant details and provides a
certificate of the form:
CA = EKRAUTH [IDA, KUA]
Similar certificates are given to all the communicating parties after authentication.
All the communicating parties exchange the certificates instead of exchanging the
public keys.
Whenever a party receives a certificate from another party, it obtains the public
key of the sender by decrypting the certificate using the public key of the
certificate authority. If the certificate is successfully decrypted with the public key of the
certificate authority, the sender of the certificate is authenticated.
Public key distribution of secret keys:
This method assumes that the two communicating parties A and B have already
exchanged the public keys.
The secret key is exchanged in the following steps:
1. A → B: EKUB[N1 || IDA]
A uses the public key of B to encrypt a message to B which contains the identity of A,
IDA, and a nonce N1, which is used to identify this transaction uniquely.
2. B → A: EKUA[N1 || N2]
B sends the response to A containing the nonce N1 and a new nonce N2. This message
is encrypted using the public key of A. B sends the received nonce N1 back to A to
authenticate itself to A.
3. A → B: EKUB[N2]
A sends the nonce N2 back to B to authenticate itself to B.
4. A → B: EKUB[EKRA(Ks)]
A selects a secret key Ks and sends it to B after encrypting it twice. The secret key
is first encrypted using KRA and then using KUB. This ensures authentication as well
as confidentiality.
5. Finally B decrypts the received message C, first with KRB and then with KUA, and
obtains the secret key as:
Ks = DKUA[DKRB(C)]
CHAPTER 3
MESSAGE AUTHENTICATION
3.1 Message authentication
Purpose of message authentication:
There are three main aspects of message authentication:
1. Protecting the integrity of the message.
Messages must be protected from modification during transit, and if any
modification does occur the receiver should be able to detect it and discard the
message.
2. Validating the identity of the originator.
The authentication scheme should ensure that the sender of the message is the same
individual as indicated by the identity in the message.
3. Non repudiation of origin.
The authentication scheme should be able to resolve disputes that arise when a
sender denies having sent a message that carries its identity.
Requirements of authentication:
For any message to be authenticated the following attacks must be prevented:
1. Disclosure
2. Traffic analysis
3. Masquerade
4. Content modification
5. Sequence modification
6. Timing modification
7. Source repudiation
8. Destination repudiation
3.2 Message authentication functions
Message authentication functions are classified as:
- Message encryption
- Message authentication code (MAC)
- Hash function
I. Message encryption:
Here the cipher text of the message serves as its authenticator.
1. Symmetric encryption:
In symmetric encryption a source A transmits a message M to a receiver B after
encrypting it with a secret key K shared between A and B.
Since no other party knows the secret key K, confidentiality is provided. It also
authenticates the two parties for each other. If party B receives a message
encrypted using key K and containing the identity of A, it is assured that it was
generated by A as no other party knows the secret key K.
2. Public key encryption:
Direct use of public key encryption:
In public key encryption sender A generates a message M and encrypts it using
public key KUB of the intended receiver B. Upon reception party B decrypts the
message using its private key KRB.
The direct use of public key encryption provides only confidentiality and not
authentication because an attacker can easily obtain the public key of party B and
forge a message using identity of party A as shown:
Attacker C: EKUB [M, IDA]
Upon reception of such a message party B will not be able to detect that the
message is unauthorized.
Encryption using private key:
Here the sender A transmits a message M to the receiver B after encrypting it
using its private key KRA. Upon reception B decrypts this message using the
public key KUA of A and obtains M.
This method provides authentication because if B is able to decrypt the message
using KUA, it was definitely encrypted using KRA which is known only to A and no
other party. Only A can encrypt a message using its private key and therefore its
authenticity is confirmed.
The drawback of this method is that it does not provide confidentiality because
anyone can obtain the public key KUA of A and decrypt the messages.
Authentication using multiple encryption:
In this method every message is encrypted twice before being transmitted to the
receiver.
Here the sender A first encrypts the message using its private key KRA and then again
using the public key KUB of the receiver.
This method provides authentication and confidentiality both but at the cost of extra
processing time for running the complex encryption algorithm twice.
Drawbacks of using message encryption to provide authentication:
- This method provides only partial authentication: it authenticates the sender of
the message but not its contents. An attacker can obtain a copy
of the cipher text and remove or rearrange some of its bits even if he is not
able to decrypt the message. Such attacks cannot be prevented; the only
remedy is to detect and discard such messages, and this method provides no
mechanism for detecting such unauthorized modifications.
- To provide both authentication and confidentiality, the complex encryption
algorithm has to be used twice, which increases the load on the system and the
processing time.
II. Message authentication code (MAC):
In this method an additional data called as cryptographic checksum or message
authentication code (MAC) is added to the message which serves as its authenticator.
Following figure shows the procedure for authentication using MAC:
The sender A generates a message M to be transmitted to receiver B.
The cryptographic checksum is calculated by subjecting M to a function C called as MAC
function using the secret key K.
MAC = CK (M)
This cryptographic checksum or MAC value is then appended to the original message and
then transmitted to the intended receiver.
The MAC function and the secret key are known only to the two communicating parties
involved.
Upon reception, the receiver separates the message and MAC and then recalculates the
MAC value from M using K. If the received MAC value and the recalculated MAC value
are equal, the message is authenticated otherwise it is discarded.
The message authentication is based on the fact even if an attacker is able to modify
the message, he cannot modify the MAC value accordingly as he does not know the MAC
function or the secret key. If an attacker modifies the message to produce an
unauthorized effect, the recalculated MAC value and the received MAC value will not
match and the message will be discarded at the receiving end.
Requirement of MAC:
1. If an attacker observes M and CK (M), it should be computationally infeasible for
him to construct a message M’ such that: CK (M’) = CK (M).
2. CK (M) should be uniformly distributed in the sense that for randomly chosen
messages M and M’, the probability that CK (M’) = CK (M) is 2^(-n), where n is the
number of bits in the MAC.
3. MAC should depend equally on all bits of the message.
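The append-and-verify procedure can be sketched with Python's standard hmac module. HMAC with SHA-256 is used here as one concrete choice of the MAC function C; this choice, the key, the 32-byte tag length and the helper names are assumptions of the sketch and not part of the method described above.

```python
import hashlib
import hmac

TAG_LENGTH = 32  # bytes in an HMAC-SHA256 tag

def make_mac(key: bytes, message: bytes) -> bytes:
    """MAC = C_K(M), here computed as HMAC-SHA256."""
    return hmac.new(key, message, hashlib.sha256).digest()

def send(key: bytes, message: bytes) -> bytes:
    """Sender: append the MAC to the message."""
    return message + make_mac(key, message)

def receive(key: bytes, packet: bytes):
    """Receiver: split message and MAC, recompute, compare. None means discard."""
    message, tag = packet[:-TAG_LENGTH], packet[-TAG_LENGTH:]
    if hmac.compare_digest(tag, make_mac(key, message)):
        return message
    return None

K = b"shared secret key"
packet = send(K, b"transfer 100")
tampered = b"transfer 900" + packet[len(b"transfer 100"):]

print(receive(K, packet))     # the original message
print(receive(K, tampered))   # None: recomputed MAC does not match
```

hmac.compare_digest is used for the comparison because it takes the same time whether or not the tags match.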
III. Hash function:
Hash function is a public function that maps a message of any length into a fixed length
hash value which serves as its authenticator.
Fig. shows the basic procedure involved in authentication using hash function:
The sender generates the message M and the hash value ‘h’ is calculated by subjecting
M to hash function as: h = H (M)
This value is appended to the message at the source.
The receiver authenticates the message by recomputing the hash value from the
message and then comparing it with the received hash value.
Authentication is based on the fact that an attacker cannot modify the message and also
produce a matching hash value, provided the hash value itself is protected (for example
by encryption or a secret value, as in the implementations listed below). Hence even if
an attacker modifies the message it will be detected at the receiving end as the
calculated and received hash values will not match.
Practical implementations of authentication using hash function:
1. Implementation using symmetric encryption:
2. Implementation using public key encryption:
3. Implementation using public key encryption and a secret data:
Properties of hash function:
1. The hash function produces a fixed length output for variable length input.
2. It can be applied on a block of data of any size.
3. H (x) should be relatively easy to calculate for any x, so that hardware and
software implementation is possible.
4. One way property: For any given value h, it is computationally infeasible to find x
such that H (x) = h.
5. Weak collision resistance: For any block x, it is computationally infeasible to
find y not equal to x such that H(x) = H(y).
6. Strong collision resistance: It is computationally infeasible to find any pair (x,y)
such that H(x) = H(y).
Secure hash algorithm
The secure hash algorithm takes as input a message with a maximum length less than 2^64
bits and produces a 160 bit message digest. The input is processed in 512 bit blocks and
the following steps are involved in the processing:
1. The message is padded so that its length is congruent to 448 modulo 512. Padding
is always added even if the message is of desired length.
The number of padding bits is in the range of 1 to 512 bits and the padding
consists of a single 1–bit followed by the necessary number of 0 bits.
2. A block of 64 bits is appended to the message. This block is treated as an
unsigned 64-bit integer and contains the length of the original message before
padding.
3. A 160 bit buffer is used to hold intermediate and final results of the hash value.
The buffer is represented by five 32-bit registers A, B, C, D and E. These registers
are initialized to the following hexadecimal values:
A = 67452301
B = EFCDAB89
C = 98BADCFE
D = 10325476
E = C3D2E1F0
4. The message is processed in 512 bit or 16-word blocks.
The algorithm consists of a module with four rounds of processing of 20 steps each.
The four rounds have a similar structure but use different primitive
logical functions.
Each round takes as input, the current 512 bit block i.e. Yq and the 160 bit buffer
value ABCDE and updates the contents of the buffer.
5. After all the 512 bit blocks have been processed, the output from the Lth stage is
the 160 bit message digest or the hash value where L is the number of blocks in
the message.
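Steps 1 and 2 (padding and appending the length) can be sketched as follows. The sketch assumes the message is a whole number of bytes, so the single 1 bit and the first seven 0 bits of padding are written as the byte 0x80; the function name sha1_pad is illustrative. The digest itself is taken from Python's hashlib rather than re-implemented.

```python
import hashlib

def sha1_pad(message: bytes) -> bytes:
    """Pad to 448 mod 512 bits, then append the 64-bit original length."""
    bit_length = len(message) * 8
    padded = message + b"\x80"            # a single 1 bit followed by 0 bits
    while len(padded) % 64 != 56:         # 56 bytes = 448 bits modulo 512
        padded += b"\x00"
    return padded + bit_length.to_bytes(8, "big")   # unsigned 64-bit length

padded = sha1_pad(b"abc")
print(len(padded) * 8)                         # 512: exactly one block
print(len(sha1_pad(b"a" * 56)) * 8)            # 1024: padding spills into a second block
print(len(hashlib.sha1(b"abc").digest()) * 8)  # 160 bit message digest
```

A 56-byte message is already 448 bits long, yet a full block of padding is still added, which illustrates that padding is always applied.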
3.3 Digital signatures
Need for digital signatures:
Message encryption and authentication protect two communicating parties against any
third party but do not protect the two parties against each other. Disputes arise
when there is source or destination repudiation. In those situations where the two
communicating parties do not have complete trust in each other, digital signatures are
required.
Properties/requirements of digital signatures:
1. It must verify the date and time of the signature along with verifying the author.
2. It must authenticate the contents at the time of the signature.
3. It must be verifiable by the third party to resolve the disputes.
4. The digital signature must be a bit pattern that depends on the message being
signed.
5. The signature must use some information unique to the sender to prevent forgery
and denial.
6. It should be relatively easy to produce, recognize and verify the digital signature.
7. It must be infeasible to forge a digital signature either by constructing a new
message for an existing digital signature or by constructing a fraudulent digital
signature for a given message.
8. It should be practical to retain a copy of the digital signature in storage.
Arbitrated digital signature techniques:
In arbitrated digital signature techniques, the signed message from the sender X to the
receiver Y goes first to an arbitrator A who subjects this message and its signature to
various tests to check whether the origin and contents are genuine or not. The message
is then dated and sent to Y with an indication that it has been verified by the
arbitrator. The presence of an arbitrator solves the problem of source repudiation.
Following approaches are used in arbitrated digital signatures:
1. Conventional encryption:
a. Where arbitrator can see the message:
X → A: M || EKXA [IDX || H(M)]
A → Y: EKYA [IDX || M || EKXA (IDX || H(M)) || T]
In this method the arbitrator must share a secret key KXA with the sender X and
secret key KYA with Y. Here the arbitrator can see the message. The arbitrator
calculates H(M) from the message received and compares it with received H(M). After
verifying the origin and contents, the arbitrator forwards another message to the
receiver which contains a signature.
The signature consists of the identity IDX and the hash value H(M). The timestamp T
ensures that it is not a replay attack.
Y cannot decrypt the signature but still the message is considered authentic as it has
come through A.
This method requires both X and Y to trust A in the following manner:
- X must trust A not to reveal KXA and not to generate false signatures of the
form EKXA [IDX || H(M)].
- Y must trust A to forward a message only after verifying the hash value and the
signature.
- Both X and Y must trust A to resolve disputes fairly.
b. Arbitrator cannot see the message
X → A: IDX || EKXY[M] || EKXA [IDX || H[EKXY(M)]]
A → Y: EKYA [IDX || EKXY(M) || EKXA[IDX || H[EKXY(M)]] || T]
Here X and Y must share a secret key KXY between them.
In this case the arbitrator cannot see the message.
Drawbacks of using conventional encryption:
- Arbitrator can form an alliance with the sender to deny a signed message.
- Arbitrator can form an alliance with the receiver to forge sender’s signature.
2. Public key encryption:
X → A: IDX || EKRX [IDX || EKUY(EKRX(M))]
A → Y: EKRA [IDX || EKUY[EKRX(M)] || T]
In this case X double encrypts a message M, first with its private key KRX and then
with the receiver’s public key KUY. This is a signed secret version of the message. This
signed version with IDX is encrypted again with KRX and is sent to A along with IDX.
The inner double encrypted message is secure from the arbitrator. However A can
decrypt the outer encryption to assure that the message must have come from X.
The arbitrator A verifies the validity of the private-public key pair of X and if the key
pair is validated, A verifies the message.
After verification, A transmits a message to Y encrypted with KRA. The
message includes IDX, double encrypted message and a timestamp.
Here the message is secret from A. Another advantage of this method is that no
information is shared among the parties before communication which prevents alliances
to defraud.
Digital signature standard:
Digital signature is a public key technique which uses an algorithm designed to provide
the digital signature function.
The DSS approach makes use of a hash function. The hash value of the message is given
as input to a signature function along with a random number K generated for that
particular signature. The signature also depends on the sender’s private key and a set of
parameters which constitute a global public key (KUG).
The output of signature function is a signature consisting of two components labeled as
‘s’ and ‘r’. These two components are appended to the message and the entire block is
transmitted.
Upon reception, the hash value of the message is calculated. The hash value and the
message are given to the verification function which requires the public key of the
sender along with KUG. The output of the verification function is a value equal to the
signature component r if the signature is valid.
Digital signature algorithm:
The strength of digital signature algorithm is based on the difficulty of computing
discrete logarithms.
The DSA consists of following steps:
1. Calculating global public key components:
1. Select a prime number p with a length between 512 and 1024 bits, such that
2^(L-1) < p ≤ 2^L for 512 ≤ L ≤ 1024, where
L is a multiple of 64.
2. Select a 160 bit prime number q such that q is a prime divisor of (p-1).
3. Select g = h^[(p-1)/q] mod p, where h is any integer with 1 < h < p-1 such that g > 1.
The numbers p, q and g form the global public key KUG = {p, q, g}
2. Calculation of private key X of the user:
Select the private key X of the user such that 0 < X < q.
X should be selected randomly or pseudo randomly.
3. Calculating the public key Y of the user:
The public key of the user is calculated using his private key as Y = g^X mod p.
Knowing the value of Y, it is computationally infeasible to find X, since discrete
logarithm is involved.
4. Generating user’s per message secret number K:
It is a random or pseudo random integer K such that 0 < K < q.
It is unique for every signature.
5. Creating a signature:
Creation of a signature requires calculation of two quantities r and s that are
functions of the public key components (p, q, g), user’s private key X, hash code of
the message H(M) and K.
r is calculated as r = (g^K mod p) mod q
s is calculated as s = [K^(-1) (H(M) + Xr)] mod q
The signature is (r, s)
6. Verification:
Verification is done by following calculations:
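1. w = (s’)^(-1) mod q
2. u1 = [H(M’) w] mod q
3. u2 = [(r’) w] mod q
4. v = [(g^u1 Y^u2) mod p] mod q
Here M’, r’ and s’ are the received message and signature components and Y is the
public key of the sender.
If v = r’, then the signature is valid.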
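The signing and verification calculations can be followed on a toy example. The parameters p = 23, q = 11, h = 2 are assumptions picked so the arithmetic can be checked by hand; they are far below the sizes the standard requires and give no security. SHA-1 from hashlib supplies H(M), and the function names are illustrative.

```python
import hashlib
import secrets

# Toy global public key: q = 11 divides p - 1 = 22.
p, q = 23, 11
h = 2
g = pow(h, (p - 1) // q, p)          # g = 4, an element of order q

def hash_to_int(message: bytes) -> int:
    """H(M) as an integer (SHA-1 digest)."""
    return int.from_bytes(hashlib.sha1(message).digest(), "big")

def keygen():
    x = secrets.randbelow(q - 1) + 1  # private key, 0 < x < q
    y = pow(g, x, p)                  # public key Y = g^X mod p
    return x, y

def sign(message: bytes, x: int):
    while True:
        k = secrets.randbelow(q - 1) + 1                          # per-message secret
        r = pow(g, k, p) % q                                      # r = (g^K mod p) mod q
        s = (pow(k, -1, q) * (hash_to_int(message) + x * r)) % q  # s = K^-1 (H(M) + Xr) mod q
        if r != 0 and s != 0:         # retry with a new k in the degenerate cases
            return r, s

def verify(message: bytes, r: int, s: int, y: int) -> bool:
    if not (0 < r < q and 0 < s < q):
        return False
    w = pow(s, -1, q)
    u1 = (hash_to_int(message) * w) % q
    u2 = (r * w) % q
    v = (pow(g, u1, p) * pow(y, u2, p)) % p % q
    return v == r

x, y = keygen()
r, s = sign(b"hello", x)
print(verify(b"hello", r, s, y))      # True
```

With such a small q an altered message or signature can still pass verification by chance, so only the positive case is demonstrated.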
CHAPTER 4
DATA COMPRESSION
4.1 Data compression
The process of converting an input data stream into another data stream having reduced
size is called as data compression.
The input stream could be from a file or buffer in the memory.
Source file: the input file to the encoder.
Compressed file: the output file produced by the encoder which has a smaller size
compared to the source file.
Compressor or encoder:
It is the program that converts the raw data into the input data stream and then
compresses it to create the output stream.
Decoder or decompressor:
It is the program which generates the original data stream from the compressed data
stream.
Note: In general the term CODEC is used for coder-decoder.
General law of data compression:
General law of data compression states that for compression short codes should be
assigned for common events and long codes should be assigned for rare events.
This law is based on eliminating the redundancy in the data to achieve compression.
4.2 Classification of compression algorithms
Lossy and lossless compression techniques:
Lossy compression techniques:
In lossy compression methods, compression is achieved by losing some part of the
information.
In such cases the decompressed data is not identical to original data and some
information is permanently lost and therefore such methods are irreversible
compression methods.
Lossy compression methods are generally used for audio, video and image compression.
Eg. JPEG, MPEG, EZW etc.
Lossless compression techniques:
In lossless compression methods compression is achieved without losing any information
and therefore such methods are used in cases where information cannot be lost like
text files.
Eg. Huffman coding, Shannon Fano coding, Arithmetic coding, LZW etc.
Adaptive and non adaptive compression techniques:
Non adaptive compression techniques:
Non adaptive compression is rigid and does not modify its compression parameters or
tables in response to the different patterns of the input data being compressed.
Such methods are best suited to compress data of a single type or of a definite pattern.
Eg. Huffman compression.
Adaptive compression techniques:
In adaptive compression techniques the compressor examines the input data statistics
and patterns and modifies its parameters and compression tables accordingly.
In other words the compressor adapts itself to varying conditions of input data for
obtaining efficient compression.
Eg. adaptive Huffman coding.
Semi-adaptive method:
A semi-adaptive method uses a two part algorithm where the first part reads the input
stream to collect the statistics of data being processed and the second part does the
actual compression using the statistical information provided by first part.
Symmetric and asymmetric compression techniques:
Symmetric compression techniques:
In symmetric compression techniques same algorithm is used by compressor and the
decompressor but is applied in opposite directions.
Asymmetric compression techniques:
In asymmetric compression techniques different compression algorithms are used by
compressor and decompressor.
4.3 Compression parameters
Compression ratio:
Compression ratio is defined as the ratio of the output stream size to the input stream
size.
For compression C.R. < 1
For expansion C.R. > 1
Compression factor:
Compression factor is defined as the ratio of the size of the input stream to the size of
the output stream.
For compression C.F. > 1 and for expansion C.F. < 1.
Compression gain:
Compression gain is defined as:
Compression gain = 100 loge (reference size / compressed size)
The reference size is either the size of the input stream or the size of the compressed
stream produced by a standard lossless compression method.
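The three parameters can be computed as below. The sizes are made-up example values, the function names are illustrative, and compression gain is taken as 100 times the natural logarithm of reference size over compressed size, which is the usual textbook definition assumed here.

```python
import math

def compression_ratio(input_size, output_size):
    """Output size divided by input size (< 1 means compression)."""
    return output_size / input_size

def compression_factor(input_size, output_size):
    """Input size divided by output size (> 1 means compression)."""
    return input_size / output_size

def compression_gain(reference_size, compressed_size):
    """100 * ln(reference size / compressed size)."""
    return 100 * math.log(reference_size / compressed_size)

# Example: a 1000-byte file compressed to 250 bytes.
print(compression_ratio(1000, 250))            # 0.25
print(compression_factor(1000, 250))           # 4.0
print(round(compression_gain(1000, 250), 2))   # 138.63
```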
4.4 Runlength encoding (RLE)
Runlength encoding is a lossless compression technique used for compression of text and
images.
RLE is useful for compression of those files where the characters are repeated many
times continuously.
In RLE a character string is encoded only if it is repeated more than 3 times and the
compressed data is written in the following format:
( escape character, data character, runlength )
- The escape character i.e. ‘@’ is used to indicate that data has been compressed.
- The data character is the character which is repeated.
- Runlength gives the number of times the character is repeated.
For example consider the following stream of data given as input to the RLE encoder:
aabcxfffffwwww1111111ssw
The compressed output stream will be:
aabcx@f5@w4@17ssw
Note: For encoding a character stream the minimum value of runlength has to be 4
because the encoded form occupies three bytes, which is the same as the
number of bytes occupied by three characters. Hence if a character run of length three
or less is encoded, it will not result in any compression.
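A minimal sketch of the text scheme follows. It assumes that the input never contains the escape character '@' and that the run length is written as a single decimal digit, so longer runs are split at 9; both are simplifications made for this example, and the function names are illustrative.

```python
ESCAPE = "@"
MIN_RUN = 4   # shorter runs are copied unchanged
MAX_RUN = 9   # assumption: run length is stored as one decimal digit

def rle_encode(text: str) -> str:
    out = []
    i = 0
    while i < len(text):
        run = 1
        while i + run < len(text) and text[i + run] == text[i] and run < MAX_RUN:
            run += 1
        if run >= MIN_RUN:
            out.append(f"{ESCAPE}{text[i]}{run}")   # escape, data character, runlength
        else:
            out.append(text[i] * run)
        i += run
    return "".join(out)

def rle_decode(data: str) -> str:
    out = []
    i = 0
    while i < len(data):
        if data[i] == ESCAPE:
            out.append(data[i + 1] * int(data[i + 2]))
            i += 3
        else:
            out.append(data[i])
            i += 1
    return "".join(out)

source = "aabcx" + "f" * 5 + "w" * 4 + "1" * 7 + "ssw"
encoded = rle_encode(source)
print(encoded)                        # aabcx@f5@w4@17ssw
print(rle_decode(encoded) == source)  # True
```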
RLE image compression:
A digital image consists of small dots called as pixels. Pixels are arranged in an array
called as bitmap of the image in the form of scan lines.
RLE image compression is based on the fact that there is a high probability that a
randomly selected pixel will have all the neighboring pixels of similar color.
Each pixel occupies 3 bytes, one byte for each color field in (R, G, B) color space. The R,
G and B fields are encoded as three different data streams using RLE.
Typically each row is encoded separately using runlength encoding.
Compression ratio can be further improved by ignoring shorter runs.
4.5 Relative encoding (differencing)
Relative encoding is used when the elements of the data stream to be encoded have
similar values. In such cases instead of sending each element, the difference between
the elements can be transmitted to save bandwidth.
Differencing is used for telemetry and facsimile applications.
For example consider the following data stream generated by a temperature
measurement telemetry system:
Temperature (°C): 300, 301, 304, 300, 301, 299, 298, 302
The data stream can be encoded by transmitting the relative values considering the
first value as the reference value.
The encoded stream will be: 300, 1, 4, 0, 1, -1, -2, 2
If the difference between the successive values is transmitted, the stream will be
encoded as: 300, 1, 3, -4, 1, -2, -1, 4
If the difference between the reference value and the current value is large, then
actual value is transmitted instead of sending the relative value.
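Both variants of differencing can be sketched on the temperature readings above. The function names are illustrative, and the sketch leaves out the fallback of sending the actual value when a difference is large.

```python
def encode_relative_to_reference(values):
    """Send the first value, then each value minus that reference."""
    reference = values[0]
    return [reference] + [v - reference for v in values[1:]]

def encode_successive_differences(values):
    """Send the first value, then each value minus its predecessor."""
    return [values[0]] + [b - a for a, b in zip(values, values[1:])]

def decode_successive_differences(encoded):
    """Rebuild the readings by accumulating the differences."""
    values = [encoded[0]]
    for diff in encoded[1:]:
        values.append(values[-1] + diff)
    return values

temperatures = [300, 301, 304, 300, 301, 299, 298, 302]
print(encode_relative_to_reference(temperatures))   # [300, 1, 4, 0, 1, -1, -2, 2]
print(encode_successive_differences(temperatures))  # [300, 1, 3, -4, 1, -2, -1, 4]
```

The saving comes from the small differences needing fewer bits than the full readings.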
4.6 Scalar quantization
Scalar quantization is used to compress the data which is in the form of large numbers
as quantized numbers will occupy lesser space. But quantization leads to permanent loss
of information.
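As a small illustration, a uniform quantizer with a fixed step size is sketched below. The step size of 100 and the function names are assumptions; the notes do not prescribe a particular quantizer.

```python
def quantize(value, step):
    """Map a value to the index of the nearest multiple of step."""
    return round(value / step)

def dequantize(index, step):
    """Reconstruct an approximation of the original value."""
    return index * step

samples = [1234, 1289, 40]
indices = [quantize(v, 100) for v in samples]
restored = [dequantize(i, 100) for i in indices]
print(indices)    # [12, 13, 0]  smaller numbers, fewer bits
print(restored)   # [1200, 1300, 0]  the differences from the samples are lost for good
```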
CHAPTER 5
STATISTICAL METHODS OF DATA COMPRESSION
5.1 Statistical modeling of information source
In statistical modeling of an information source, the probabilities of source symbols are
tracked.
The order of the model depends on number of previously occurring symbols taken into
account. With increasing the order, the probabilities obtained become more and more
reliable but the complexity increases.
The overall efficiency of any data compression technique depends on individual
performances of the modeling processes and the encoding methods.
A statistical model used in compression can be shown as:
Here the probabilities of the symbols occurring in the input stream are tracked and
then forwarded to the encoder along with symbols for encoding.
5.2 Information theory
Measurement of information:
The information content of any message mk is measured as:
Ik = log2 (1/pk)
where pk is the probability of occurrence of mk.
The unit of information is bits.
From the above expression it can be concluded that as the probability of occurrence of
a symbol increases, the information content decreases i.e. less frequently occurring
symbols convey more information as compared to more frequently occurring symbols.
Note: for calculations use the formula:
$\log_2 x = \log_{10} x / \log_{10} 2$
Entropy of a source:
Consider a source that generates n different symbols S1, S2, ... , Sn with probabilities
P1, P2, ... , Pn respectively.
The entropy of the source is defined as the average information content of the source.
It gives the minimum average number of bits required to represent each symbol.
It is given by the following expression:
$H(S) = \sum_{k=1}^{n} P_k \log_2 (1 / P_k)$
Writing $I_k = \log_2 (1 / P_k)$ for the information content of S_k, the above expression
can be simplified as:
$H(S) = P_1 I_1 + P_2 I_2 + \dots + P_n I_n$
Entropy is measured in terms of bits/symbol.
Average length of a code:
It is the average number of bits needed per symbol.
It is given by the following expression:
$L_A = \sum_{k=1}^{n} P_k L_k$
where P_k is the probability of occurrence of the kth symbol and L_k is its length in bits.
Redundancy:
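Redundancy is defined as the difference between the largest possible entropy of the
symbol set and its actual entropy. For n symbols the largest possible entropy is
$\log_2 n$, so it is given by the following expression:
$R = \log_2 n - H(S)$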
For data to be compressed efficiently, R should be as small as possible i.e. the number
of bits used to represent a symbol should be very close to the actual information
content of the symbol.
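The quantities of this section can be checked numerically with a short sketch. The probabilities and code lengths below are hypothetical, and the redundancy is computed under the assumption that the largest possible entropy of n symbols is log2 n.

```python
import math


def information(p):
    """Information content, in bits, of a symbol with probability p."""
    return math.log2(1 / p)


def entropy(probs):
    """Average information content of the source, in bits/symbol."""
    return sum(p * information(p) for p in probs if p > 0)


def average_length(probs, lengths):
    """Average number of code bits per symbol."""
    return sum(p * l for p, l in zip(probs, lengths))


def redundancy(probs):
    """Largest possible entropy (log2 n) minus the actual entropy."""
    return math.log2(len(probs)) - entropy(probs)


probs = [0.5, 0.25, 0.125, 0.125]  # hypothetical source
lengths = [1, 2, 3, 3]             # hypothetical code lengths in bits

print(information(0.125))              # 3.0
print(entropy(probs))                  # 1.75
print(average_length(probs, lengths))  # 1.75
print(redundancy(probs))               # 0.25
```

In this example the average code length equals the entropy, which is the best a code that assigns whole bits to symbols can do. The rarest symbols carry the most information (3 bits each).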
5.3 Prefix codes
Prefix property:
Prefix property states that when a certain bit pattern has been assigned as the code
for a symbol then no other code can start with that pattern.
Consider the example where the symbols are assigned codes without following prefix
property:
Symbol   Code
S1       0
S2       01
S3       10
S4       010
If the symbols transmitted are S2 S3 S4, the corresponding data stream will be:
0110010
This data stream can be read as: S2 S3 S4 and also as: S2 S3 S2 S1
To avoid such ambiguities prefix property should be used while developing the code
words for the symbols.
Prefix codes:
A prefix code is a code which satisfies prefix property.
The unary code is a simple example of a prefix code. The unary code of a positive
integer n is defined as (n-1) zeroes followed by a single one or, alternatively, as
(n-1) ones followed by a single zero.
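The prefix property and the ambiguity of the example table can be checked mechanically. The sketch below is illustrative only; the helper names are hypothetical. It lists every way the stream 0110010 can be split into codewords and confirms that unary codes are prefix-free.

```python
def is_prefix_free(codes):
    """True if no codeword is the start of another codeword."""
    words = list(codes.values())
    return not any(
        i != j and words[j].startswith(words[i])
        for i in range(len(words))
        for j in range(len(words))
    )


def parses(bits, codes):
    """Return every way of splitting `bits` into codewords."""
    if not bits:
        return [[]]
    results = []
    for symbol, word in codes.items():
        if bits.startswith(word):
            for rest in parses(bits[len(word):], codes):
                results.append([symbol] + rest)
    return results


def unary(n):
    """(n-1) ones followed by a single zero, for a positive integer n."""
    return "1" * (n - 1) + "0"


ambiguous = {"S1": "0", "S2": "01", "S3": "10", "S4": "010"}
unary_codes = {n: unary(n) for n in range(1, 5)}

print(is_prefix_free(ambiguous))    # False
print(is_prefix_free(unary_codes))  # True
for reading in parses("0110010", ambiguous):
    print(reading)
```

The readings printed for the ambiguous table include both S2 S3 S4 and S2 S3 S2 S1, so the decoder cannot know which message was sent.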
5.4 Shannon-Fano coding:
Shannon-Fano coding produces variable size codes for the symbols occurring with
different probabilities. The coding depends on the probability of occurrence of the
symbol and the general idea is to assign shorter codes for symbols that occur more
frequently and long codes for the symbols occurring less frequently.
Shannon-Fano algorithm:
The algorithm used for generating Shannon-Fano codes is as follows:
1. For a given list of symbols, develop a corresponding list of probabilities so that each
symbol’s relative probability is known.
2. List the symbols in the order of decreasing probability.
3. Divide the symbols into two groups so that the total probabilities of the two groups
are as nearly equal as possible.
4. Assign a value 0 to first group and a value 1 to second group.
5. Repeat steps 3 and 4 on each group, each time partitioning it into two sets with
probabilities as nearly equal as possible, until further partitioning is not possible.
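The steps above can be sketched as a recursive split. This is one possible reading of the algorithm, not a reference implementation: the function name and the example probabilities are assumptions, and ties between equally good split points are resolved by taking the first one.

```python
def shannon_fano(probabilities):
    """Return {symbol: code} for a {symbol: probability} mapping."""
    # Step 2: list the symbols in order of decreasing probability.
    symbols = sorted(probabilities, key=probabilities.get, reverse=True)
    codes = {s: "" for s in symbols}

    def split(group):
        if len(group) < 2:
            return
        total = sum(probabilities[s] for s in group)
        running, best_cut, best_diff = 0.0, 1, float("inf")
        # Step 3: find the cut that makes the two halves most nearly equal.
        for i in range(1, len(group)):
            running += probabilities[group[i - 1]]
            diff = abs(total - 2 * running)
            if diff < best_diff:
                best_cut, best_diff = i, diff
        first, second = group[:best_cut], group[best_cut:]
        # Step 4: append 0 to the first group and 1 to the second.
        for s in first:
            codes[s] += "0"
        for s in second:
            codes[s] += "1"
        # Step 5: repeat on each group.
        split(first)
        split(second)

    split(symbols)
    return codes


# Hypothetical source whose probabilities are powers of two.
source = {"A": 0.25, "B": 0.25, "C": 0.125, "D": 0.125, "E": 0.125, "F": 0.125}
print(shannon_fano(source))
# {'A': '00', 'B': '01', 'C': '100', 'D': '101', 'E': '110', 'F': '111'}
```

Because every split here is perfectly balanced, the average code length (2.5 bits) equals the entropy of this source. With less convenient probabilities the splits are only approximately equal and the code is usually slightly longer than the entropy.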
5.5 Huffman coding
Huffman coding gives a variable size code based on symbol probabilities. This method is
based on reducing the redundancy in the number of bits used for representation of
information.
The general idea is to achieve compression by assigning shorter codes for frequently
occurring symbols and longer codes for symbols occurring less frequently.
Algorithm:
The encoder starts by building a list of the symbols in descending order of
probabilities. It then constructs a tree, with a symbol at every leaf, from bottom to top.
This is done in steps, where at each step the two symbols with the smallest probabilities
are selected, added to the top of the partial tree, deleted from the list and replaced with
an auxiliary symbol representing those two symbols.
When the list is reduced to just one auxiliary symbol, the tree is complete. The tree is
then traversed from the root to the leaves (right to left when the root is drawn on the
right) to determine the codes for the symbols.
Note: when there are more than two nodes having smallest probabilities, select the
nodes which are highest and lowest in the tree and combine them. This will reduce the
total variance of the code.
The Huffman code having smallest variance is preferred. The variance of a code
measures by how much the size of the individual codes deviate from the average size.
The variance of a code is defined as:
$\sigma^2 = \sum_{k=1}^{n} P_k (L_k - L_A)^2$
where
P_k = probability of occurrence of kth symbol
L_k = number of bits used to represent the symbol
L_A = average length
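A compact way to experiment with the construction is to keep the list of symbols in a priority queue. The sketch below is an illustration, not the procedure of the notes verbatim: it breaks ties by insertion order rather than by position in the tree, so it does not implement the minimum variance rule and may or may not return the minimum variance code. The function names and the example probabilities are assumptions.

```python
import heapq
from itertools import count


def huffman_codes(probabilities):
    """Return {symbol: code} for a {symbol: probability} mapping (2+ symbols)."""
    tie = count()  # tie-breaker so the heap never has to compare the dicts
    heap = [(p, next(tie), {s: ""}) for s, p in probabilities.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        # Take the two entries with the smallest probabilities...
        p0, _, group0 = heapq.heappop(heap)
        p1, _, group1 = heapq.heappop(heap)
        # ...and replace them with one entry representing both.
        merged = {s: "0" + c for s, c in group0.items()}
        merged.update({s: "1" + c for s, c in group1.items()})
        heapq.heappush(heap, (p0 + p1, next(tie), merged))
    return heap[0][2]


def average_length(probabilities, codes):
    return sum(p * len(codes[s]) for s, p in probabilities.items())


def variance(probabilities, codes):
    mean = average_length(probabilities, codes)
    return sum(p * (len(codes[s]) - mean) ** 2 for s, p in probabilities.items())


source = {"S1": 0.4, "S2": 0.2, "S3": 0.2, "S4": 0.1, "S5": 0.1}
codes = huffman_codes(source)
print(codes)
print(round(average_length(source, codes), 4))  # 2.2
print(round(variance(source, codes), 4))
```

Every Huffman code for this source has an average length of 2.2 bits/symbol; only the variance depends on how ties are broken.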
Huffman decoding:
- The Huffman table used for coding must be transmitted to the decoder as many
times as it is updated if the technique is adaptive. For static Huffman coding only
one table is sufficient for the decoder.
- The decoder starts at the root of the tree and reads the first bit from the
compressed stream. If the bit is zero the bottom edge is followed otherwise, top
edge of the tree is followed. In the same manner successive bits are read until
the decoder reaches a leaf where it finds a symbol.
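The decoding walk can be sketched with the tree stored as nested dictionaries. The code table below is hypothetical, and which edge is drawn at the top or bottom is only a drawing convention; here the bit itself selects the branch.

```python
def build_tree(codes):
    """Turn a {symbol: code} table into a nested-dict binary tree."""
    root = {}
    for symbol, word in codes.items():
        node = root
        for bit in word[:-1]:
            node = node.setdefault(bit, {})
        node[word[-1]] = symbol  # a leaf holds the symbol itself
    return root


def decode(bits, codes):
    """Read bits one at a time, restarting at the root after each leaf."""
    root = build_tree(codes)
    node, output = root, []
    for bit in bits:
        node = node[bit]
        if not isinstance(node, dict):  # reached a leaf
            output.append(node)
            node = root
    return output


# Hypothetical prefix code table.
table = {"S1": "11", "S2": "00", "S3": "01", "S4": "100", "S5": "101"}
print(decode("1100100101", table))  # ['S1', 'S2', 'S4', 'S5']
```

No separators are needed between codewords because the prefix property guarantees that a leaf is reached exactly at the end of each codeword.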
Drawbacks of Huffman coding:
The symbol probabilities which are the basic requirements are very rarely known in
advance. This makes the algorithm inefficient.
There are two possible solutions to this problem:
- use adaptive method
- use semi adaptive method
5.5 Adaptive Huffman coding
- In adaptive Huffman coding both the compressor and the decompressor start
with an empty Huffman tree. No symbols are assigned codes and every new
symbol is treated as a leaf node with the same weight. As new symbols are added,
the tree is also updated such that the updated tree is also a Huffman tree.
- The first symbol is written on the output stream as it is. This symbol is then
added to the tree and a code is assigned to it. The next time this symbol occurs,
its current code is written on the output stream and its frequency is incremented
by 1. Each time a symbol is processed, it has to be checked whether the tree
satisfies the Huffman properties.
- The Huffman property is that if we scan the tree, the frequency of occurrence
of symbols should decrease from right to left and from top to bottom i.e. the
symbol on top right position will have the highest frequency and the one at the
bottom left will have the lowest frequency.
This property is called the sibling property of the Huffman tree.
Updating the Huffman tree:
The process of updating the tree starts always at the current node which is a leaf ‘S’
with ‘f’ as its frequency of occurrence.
Every iteration has three steps:
1. Compare S to its successors in the tree from left to right and bottom to top. If the
immediate successor has frequency (f+1) or more, then the nodes are still in sorted
order and swapping is not required. Otherwise some successors of S have identical
frequency f or smaller frequency. In such a case S should be swapped with the last
node in this group.
2. Increment the frequency from f to f+1. Also increase the frequency of all its
parents.
3. If S becomes the root, then the process stops otherwise the process repeats with
the parent of node S.
Drawbacks of adaptive Huffman coding:
1. Count overflow:
The frequency counts are accumulated and this field can overflow. Normally the
width of this field is 16 bits and can store a count up to 65535. The count of the
root is monitored every time it is incremented. When the maximum count limit is
reached, all the weights are rescaled with an integer division by 2. This is actually
done by performing an integer division only on the leaf nodes and updating the tree
again. Sometimes it leads to violation of Huffman property and the tree needs to be
updated again.
2. Code overflow:
Code overflow occurs when many symbols are added to the tree and the tree grows longer.
The compressor has to find out the code for an input symbol S in the tree by linear
search method. If S is found in the tree, the compressor moves from node S back to
root thus building the code bit by bit. These bits have to be accumulated as they are
transmitted in the reverse order. When the tree gets longer, the codes get longer
and if the field size is exceeded, the program malfunctions.
3. Another drawback of the Huffman coding is that the codes generated contain
integer number of bits which adds redundancy to the data.
5.6
Arithmetic coding
One of the drawbacks of Huffman coding is that it assigns an integer number of bits to
individual symbols, which adds some coding redundancy. Arithmetic coding overcomes
this drawback by assigning one long code to represent the string of symbols instead of
assigning codes to individual symbols.
Arithmetic coding is also based on the probability model of the symbols to be encoded.
Initially the encoding starts with a code assigned to the first symbol which gets
modified as other symbols are added. The result code when the last symbol is encoded
is the compressed data.
Data is encoded in following steps:
1. Start by defining the current interval as [0, 1).
2. Repeat the following two steps for each symbol S in the input stream:
i. Divide the current interval into sub-intervals whose sizes are proportional to the
symbols' probabilities.
ii. Select the sub-interval for S and define it as the new current interval.
3. When the entire input stream has been processed, the output should be any number
within the current interval.
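The interval narrowing can be followed exactly with rational arithmetic. The sketch below is a teaching illustration under stated assumptions: the probability table is hypothetical, the decoder is told the message length, and exact fractions are used instead of the fixed-precision integers a practical coder would need.

```python
from fractions import Fraction


def build_ranges(probabilities):
    """Assign each symbol a sub-interval of [0, 1) proportional to its probability."""
    ranges, low = {}, Fraction(0)
    for symbol, p in probabilities.items():
        ranges[symbol] = (low, low + p)
        low += p
    return ranges


def encode(message, probabilities):
    """Return the final interval [low, high) for the whole message."""
    ranges = build_ranges(probabilities)
    low, high = Fraction(0), Fraction(1)       # step 1
    for symbol in message:                     # step 2
        width = high - low
        sym_low, sym_high = ranges[symbol]
        low, high = low + width * sym_low, low + width * sym_high
    return low, high                           # step 3: any number in here


def decode(value, length, probabilities):
    """Recover `length` symbols from a number inside the final interval."""
    ranges = build_ranges(probabilities)
    out = []
    for _ in range(length):
        for symbol, (sym_low, sym_high) in ranges.items():
            if sym_low <= value < sym_high:
                out.append(symbol)
                value = (value - sym_low) / (sym_high - sym_low)
                break
    return "".join(out)


# Hypothetical model: a -> [0, 1/2), b -> [1/2, 3/4), c -> [3/4, 1)
model = {"a": Fraction(1, 2), "b": Fraction(1, 4), "c": Fraction(1, 4)}
low, high = encode("abca", model)
print(low, high)              # 11/32 23/64
print(decode(low, 4, model))  # abca
```

The final interval has width 1/64, the product of the probabilities of the four symbols, so about 6 bits are enough to identify a number inside it.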
5.7
Context based text compression (PPM)
In context based compression the probability model of the symbol is generated
depending on frequency of the symbol and the context in which the symbol has occurred
so far. The PPM encoder switches to a shorter context when a longer one results in zero
probability.
PPM starts with an order n context and it searches its data structure for a previous
occurrence of the current context C followed by the next symbol S. If no such
occurrence is found the encoder switches to order n-1 context and then same procedure
is followed.
The encoder reads the next symbol S from the input stream, looks at the current order
n context C and based on the input data that has been encoded previously, it determines
the probability (P) that S will appear in context C.
The encoder then uses adaptive arithmetic encoder to encode the symbol S with
probability P.
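The switch from a longer context to a shorter one can be sketched with a table of counts per context. This is a simplified illustration, not a full PPM coder: the class and method names are hypothetical, the escape symbol and its probability are ignored, and the arithmetic coding stage is left out.

```python
from collections import defaultdict


class ContextModel:
    def __init__(self, max_order):
        self.max_order = max_order
        # counts[context][symbol] = number of times symbol followed context
        self.counts = defaultdict(lambda: defaultdict(int))

    def predict(self, history, symbol):
        """Return (order, probability) from the longest context in which
        `symbol` has been seen, or (-1, None) if it was never seen."""
        for order in range(min(self.max_order, len(history)), -1, -1):
            context = history[len(history) - order:]
            seen = self.counts.get(context)
            if seen and symbol in seen:
                return order, seen[symbol] / sum(seen.values())
        return -1, None

    def update(self, history, symbol):
        """Record `symbol` in every context of order 0 .. max_order."""
        for order in range(min(self.max_order, len(history)) + 1):
            context = history[len(history) - order:]
            self.counts[context][symbol] += 1


text = "abracadabra"
model = ContextModel(max_order=2)
for i, ch in enumerate(text):
    model.update(text[:i], ch)

print(model.predict(text, "c"))  # found in the order-2 context "ra"
print(model.predict(text, "a"))  # falls back to the order-0 context
print(model.predict(text, "z"))  # never seen in any context
```

After "abracadabra" the symbol c has followed the context "ra" before, so it is predicted at order 2, while a has never followed "ra" or "a" and is predicted only from the order-0 counts.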
CHAPTER 6
DICTIONARY BASED METHODS
6.1 Dictionary based methods
Dictionary based methods try to compress variable size strings of information into
tokens using a dictionary.
The dictionary holds strings of symbols and it can be static as well as adaptive.
The adaptive dictionary holds the strings previously found in the input stream
allowing for addition of new strings as the input is being read.
The encoder tries to match a part of the input stream with the words (strings)
stored in the dictionary. If a match is found, the token is written on the output
stream which contains a pointer to that location of the dictionary where the
matched word is stored. This method is also called as string compression.
If a word is found which does not match then it is written as it is on the output
stream followed by a flag character and size of the word.




6.2 Static and adaptive dictionary methods
Static dictionary methods:
1. Static dictionary methods are rigid and the dictionary is not modified according to
the varying input data.
2. The size of the dictionary is fixed and generally very small.
3. Preferred only when the strings encountered in the input stream follow a definite
pattern.
Adaptive dictionary methods:
1. In adaptive dictionary methods the unmatched strings are added to the dictionary
dynamically and hence the dictionary is continuously updated.
2. Here space is allocated for addition of new strings to the dictionary.
3. Preferred when the words randomly appear in the input data and do not fall under
any category.
6.3 LZ-77 (sliding window compression)
The LZ-77 compression method is an adaptive compression method where the
encoder dynamically builds a dictionary from the input data and then uses the
previously occurring strings to compare and compress the new strings.
The amount of compression i.e. the compression ratio depends on:
- Length of the dictionary
- Size of the window used
The encoder maintains a window and shifts the input in that window from right to
left as the symbols are being encoded and that is why this method is called sliding
window.
The sliding window has two parts:
- The left part is called the search buffer and it contains the current dictionary.
It includes the strings that have been input and encoded.
- The right part of the window is called the look ahead buffer and it contains the
strings which are to be encoded.
Typically the size of search buffer is very large as compared to the look ahead
buffer.
For each encoded string a token is written on the output stream.
The LZ-77 token structure is as follows:
(offset, match length, next unmatched symbol)
This token is written on the output stream and the window is shifted to the right by
(match length + 1) positions.
- The first field of the token is the offset field, which gives the location of the
matched string in the dictionary. This field is basically a pointer to the position
in the search buffer where the matched string starts.
The size of the offset field is $\lceil \log_2 S \rceil$ bits, where S is the size of the search buffer.
- The second field of the token is the match length i.e. the number of symbols in
the string which found a match in the dictionary.
The size of this field is $\lceil \log_2 (L-1) \rceil$ bits, where L is the size of the look ahead buffer.
- The third field of the token is the next unmatched symbol which stores the next
symbol in the input stream after the matched string.
The size of this field is $\lceil \log_2 C \rceil$ bits, where C is the size of the alphabet.
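A small encoder and decoder make the token structure concrete. The sketch below is illustrative only: the buffer sizes are assumptions, the search is a plain linear scan, and when several matches have the same length the one farthest back in the search buffer is kept.

```python
def lz77_encode(data, search_size=16, lookahead_size=8):
    """Return a list of (offset, match length, next symbol) tokens."""
    tokens, pos = [], 0
    while pos < len(data):
        best_offset, best_length = 0, 0
        start = max(0, pos - search_size)
        # Keep one symbol in reserve so every token carries a next symbol.
        max_length = min(lookahead_size - 1, len(data) - pos - 1)
        for candidate in range(start, pos):
            length = 0
            while (length < max_length
                   and data[candidate + length] == data[pos + length]):
                length += 1
            if length > best_length:
                best_offset, best_length = pos - candidate, length
        tokens.append((best_offset, best_length, data[pos + best_length]))
        pos += best_length + 1  # shift the window past the encoded string
    return tokens


def lz77_decode(tokens):
    """Rebuild the text by copying from the already decoded output."""
    out = []
    for offset, length, symbol in tokens:
        for _ in range(length):
            out.append(out[-offset])
        out.append(symbol)
    return "".join(out)


tokens = lz77_encode("sir_sid_")
print(tokens)
# [(0, 0, 's'), (0, 0, 'i'), (0, 0, 'r'), (0, 0, '_'), (4, 2, 'd'), (0, 0, '_')]
print(lz77_decode(tokens))  # sir_sid_
```

The token (4, 2, 'd') means: go back 4 symbols, copy 2 symbols ("si"), then append 'd'. The decoder needs no dictionary of its own, only the text it has already produced.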
Drawbacks of LZ -77 compression technique:
1. This method assumes that repeated strings occur close together, within the reach of
the window, which is not always the case in practical applications.
2. Compression ratio can be improved only by increasing the size of search window
which increases the latency.
3. This method is not practically applicable as there is no definite data structure.
6.4 LZ-78
In LZ-78 method a dictionary of previously occurred strings is maintained, the size of
which is limited by the available memory.
This method reduces the token size by having only two fields in the token.
The token structure in LZ-78 method is as shown:
(pointer, next unmatched symbol)
- The LZ-78 token has only two fields as compared to three in LZ-77.
The pointer field points to the memory location in the dictionary at which a match
is found for the current input string.
- The second field of the token stores the value of the symbol occurring
immediately after the string that found a match in the dictionary. In other words
this field stores the symbol next to the string being encoded which when added
to the string will result in a string having no match in the existing dictionary.
Encoding:
The dictionary is empty and starts with a null string at location zero. As the symbols are
input and encoded, dictionary is built by adding new strings at positions starting from 1.
If the current input symbol ‘S’ does not match any of the strings in the dictionary,
then it is added to the next available memory location, the pointer field of the token is
set to zero and the value of the symbol is written in the token.
Otherwise, if the current symbol is present in the dictionary then the next symbol in
the stream is added to this symbol to form a new string and this string is checked for a
match in the dictionary. In this manner symbols are added to the string until there is no
match in the dictionary. At the point when there is no match found in the dictionary, the
location of the recently matched string in the dictionary is written in the pointer field
of the token and the recently added symbol which caused the mismatch is the next
unmatched symbol.
Decoding:
The LZ-78 decoder works by building and maintaining the dictionary in the same way as
the encoder.
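The encoding and decoding procedures can be sketched as follows. This is a minimal illustration with an unbounded dictionary; the function names are hypothetical, and the handling of input that ends in the middle of a match is one possible convention, not something specified in the notes.

```python
def lz78_encode(data):
    """Return a list of (pointer, next symbol) tokens."""
    dictionary = {"": 0}  # location 0 holds the null string
    tokens, current = [], ""
    for symbol in data:
        if current + symbol in dictionary:
            current += symbol  # keep extending the match
        else:
            tokens.append((dictionary[current], symbol))
            dictionary[current + symbol] = len(dictionary)
            current = ""
    if current:
        # Input ended during a match: emit it as (shorter match, last symbol).
        tokens.append((dictionary[current[:-1]], current[-1]))
    return tokens


def lz78_decode(tokens):
    """Rebuild the same dictionary while reading the tokens."""
    entries = [""]
    out = []
    for pointer, symbol in tokens:
        entry = entries[pointer] + symbol
        entries.append(entry)
        out.append(entry)
    return "".join(out)


tokens = lz78_encode("sir_sid_")
print(tokens)
# [(0, 's'), (0, 'i'), (0, 'r'), (0, '_'), (1, 'i'), (0, 'd'), (0, '_')]
print(lz78_decode(tokens))  # sir_sid_
```

The token (1, 'i') stands for the string at dictionary location 1 ("s") followed by 'i', and that new string "si" is stored at the next free location by both the encoder and the decoder.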
Drawbacks of LZ-78 algorithm:
The drawback of the LZ-78 algorithm is the memory size as the frequently encountered
symbols as well as the longer matches have to be stored as entries in the dictionary. If
the dictionary is full, then either the dictionary has to be restarted or some of the
entries have to be deleted.
6.5 LZW
- The LZW compression algorithm eliminates the ‘unmatched symbol’ field from the
token and hence only one field i.e. the pointer to the dictionary has to be
transmitted for each encoded data string. But due to this, every unmatched
symbol has to be encoded separately.
- In LZW, the dictionary is initialized to store all the symbols in alphabet and other
ASCII characters and therefore memory locations 0-255 are occupied.
- The new entries in the dictionary are based on the combinations of existing
symbols which appear in the data stream.
- The decoding is done by building the dictionary in the same manner as for
encoding.
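A minimal LZW sketch is shown below. It assumes that every input character has a code point below 256 and lets the dictionary grow without limit; the function names are hypothetical. The decoder includes the one special case in which a pointer refers to the entry that is still being built.

```python
def lzw_encode(data):
    """Return a list of dictionary pointers."""
    dictionary = {chr(i): i for i in range(256)}  # locations 0-255
    current, output = "", []
    for symbol in data:
        if current + symbol in dictionary:
            current += symbol
        else:
            output.append(dictionary[current])
            dictionary[current + symbol] = len(dictionary)
            current = symbol  # the unmatched symbol starts the next string
    if current:
        output.append(dictionary[current])
    return output


def lzw_decode(codes):
    """Rebuild the dictionary from the pointers alone."""
    if not codes:
        return ""
    dictionary = {i: chr(i) for i in range(256)}
    previous = dictionary[codes[0]]
    out = [previous]
    for code in codes[1:]:
        if code in dictionary:
            entry = dictionary[code]
        else:
            # Pointer to the entry currently being built by the encoder.
            entry = previous + previous[0]
        out.append(entry)
        dictionary[len(dictionary)] = previous + entry[0]
        previous = entry
    return "".join(out)


codes = lzw_encode("abababab")
print(codes)              # [97, 98, 256, 258, 98]
print(lzw_decode(codes))  # abababab
```

Eight input symbols are sent as five pointers. Pointers 256 and 258 refer to the strings "ab" and "aba", which both sides added to the dictionary while processing the earlier symbols.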
Question bank
Chapter #1
Data compression
1. Give the applications of data compression. (4m)
2. Compare lossy and lossless data compression techniques. (5m)
3. Suggest and explain a compression method for the compression of data transmitted
by a remote measurement system which monitors the temperature of a furnace.
(4m)
4. Explain run length encoding. What are the applications of run length encoding? (5-10m)
5. Encode the following data strings using run length encoding:
a) 11abbbbcccccabc
b) @aaaa$55555677777
Also find the compression ratio and compression factor in each case.
6. Compare dictionary based methods and statistical methods of text compression. (5-10m)
7. Write a short note on:
a) Relative encoding / telemetry compression
b) Scalar quantization
Chapter #2
Statistical methods of data compression
1. Write a short note on:
a) Information content of a message
b) Entropy
c) Average length of a code
d) Redundancy
2. Explain Shannon Fano coding. Generate the codes for the following symbols using
Shannon Fano coding:
Symbol   Probability
S1       0.35
S2       0.21
S3       0.15
S4       0.19
S5       0.1
Also find the redundancy in coding. (10m)
3. State advantages and disadvantages of statistical methods for data compression.
4. Compare adaptive and non adaptive compression methods.
5. Explain arithmetic coding technique for data compression.
6. Explain Huffman method of data compression. Consider the following symbols with
the given probabilities:
Symbol   Probability
S1       0.4
S2       0.2
S3       0.2
S4       0.1
S5       0.1
Draw the Huffman trees using normal method and using minimum variance method.
Also find the variance and the coding redundancy in each case.
7. A source emits letters from an alphabet set S = {m, n, o, p, q} such that:
P (m) = P (n) = 0.2, P (o) = 0.4 and P (p) = P (q) = 0.1.
a) Find the entropy of the source.
b) Find the Huffman code using the standard procedure and the minimum variance
method.
c) Find the average length of the code and the coding redundancy for both the
codes.
8. What are the drawbacks of Huffman method? What are the solutions to those
drawbacks? Explain adaptive Huffman method.
9. Compare RLE and Huffman coding for an image where each pixel is represented in 8
bits and 50% of the pixels have a grey level of 127 and remaining 50% of the pixels
have a grey level of 128.
10. A source emits six discrete symbols with probabilities as P (a1) = 0.1, P (a2) = 0.4,
P (a3) = 0.06, P (a4) = 0.1, P (a5) = 0.04 and P (a6) = 0.3. Use Huffman coding to
encode the source. If the encoded string is 010100111100, decode it to find the
original string.
11. A source emits five symbols S1, S2, S3, S4 and S5 with probabilities 0.25, 0.25,
0.25, 0.125 and 0.125 respectively. Find:
a) Entropy of the source
b) Huffman code using standard procedure.
c) Shannon Fano code.
d) Average length of code and redundancy.
12. Encode the following data strings using adaptive Huffman method:
a) sir_sid_easily
b) she_sells_sea_shells
c) zxzyzxzyzx
Also show the decoding.
13. Encode and decode the following data strings using arithmetic coding:
a) swiss_miss
b) assassinimassa
14. Given 3 data symbols a1, a2 and a3 with probabilities 0.001838, 0.975 and 0.023162
respectively. Use arithmetic coding to encode the data string “a2, a2, a1, a3, a3”.
15. Compare arithmetic coding with Huffman coding.
16. “The Huffman coding is not unique”. Explain this with an example.
17. Explain context based coding. What are its advantages?
18. Draw the trie structure for following data strings:
a) zxzyzxxyzx
b) abcbccacbcaabcb
Also show base and vine pointers.
Chapter #3
Dictionary based compression
1. Compare statistical and dictionary based compression techniques. (5m)
2. Suggest a suitable compression technique for each of the following data strings. Also
state the reasons.
a) xyzzyyxzx
b) xxxxyyyyzzzz
c) xzyyzyzzyzzzz
3. Compare LZ-77, LZ-78 and LZW compression techniques.
4. Write a short note on:
a) Zip
b) Gzip
c) CRC
d) Arc
5. Encode the following data strings using LZ-77, LZ-78 and LZW algorithms:
a) sir_sid_is_easily_teases_sea_sea_sick_seals
b) she_sells_sea_shells_at_the_sea_shore
c) alph_eats_alphalpha
6. Explain the concept of static and adaptive dictionary. Explain with a suitable example
the encoding technique using LZ-77.
7. Describe the situations when LZ-77 algorithm is best and worst. Explain the LZ-78
algorithm specifying the improvements over the LZ-77 algorithm.
8. An initial dictionary consists of letters a, b, r, y and z. Encode the following message
with LZW algorithm: “azbarzarrayzbyzbarrayarzbay”.
9. What are the advantages of LZW over other methods?
Chapter #4
Image compression
1. Describe different approaches for image compression.
2. Write a short note on gray codes.
3. Explain the application of DCT in image compression.
4. Explain JPEG compression method used for image compression. How is the JPEG-LS
standard different from JPEG?
5. What is motion compression with respect to image compression? Give the basic
structure of MPEG-1 video standard.
6. What is motion compensation? Explain the working of MPEG in detail.
7. Draw the structured layers of MPEG-1 video stream.
8. Explain the various techniques used in video compression and their underlying
principles.
9. What is the effect of quantization on an image?
10. Write a short note on PCM and DPCM.
11. Explain the various steps involved in the compression of video sequences using MPEG-1
video standard.
Chapter #5
Audio compression
1. Write a short note on lossy sound compression.
2. Describe A-law and μ-law companding.
3. Write a short note on ADPCM and DPCM. What are the advantages of ADPCM over
PCM?
4. What is linear predictive coding? Explain CELP and MELP in detail.
5. Explain the terms “frequency masking” and “temporal masking” in audio compression.
6. Write a short note on the MPEG audio standard.
Chapter #6
Conventional Encryption
1. Write a short note on:
a) Goals of cryptography
b) Security service and security mechanism
c) Security attacks
2. Explain data encryption standard.
3. Explain IDEA.
4. Write a short note on Feistel Cipher. Explain the design principles.
5. Explain CBC, ECB, OFB, CFB and counter mode of operation of block ciphers.
Chapter #7
Number theory and public key encryption
1. Explain CRT with an example.
2. Explain the concept of discrete logarithm.
3. What is the difference between index and discrete logarithm?
4. Compare conventional and public key encryption.
5. Explain RSA with an example.
6. Calculate the private key and public key based on RSA taking 5 and 11 as two prime
numbers. Use these keys to encrypt and decrypt a plain text input of M=17.
Chapter #8
Message authentication
1. Describe the various authentication requirements for communication across a
network. Explain different authentication functions.
2. Write a short note on MAC.
3. Explain MAC based on DES.
4. What is MAC? Where do we use it?
5. What is secure hash algorithm?
6. Differentiate between MAC and hash codes.
7. Write a short note on HMAC.
8. Write a short note on one way hash function.
9. What are the needs and requirements of digital signatures?
10. What are the drawbacks of direct signatures?
11. Explain DSA.