A Unidirectional Approach to Achieving Instant Message Confidentiality [1]
Student: David Ja (davidja@seas.upenn.edu)
Advisor: Dr. Matt Blaze
[1] I would like to thank Micah Sherr for all the guidance and support he has provided throughout this project.
Abstract
Instant message (IM) systems have become a common communication tool. Yet, for all the
effort spent on securing and protecting the privacy of phone calls and faxes, IMs remain
relatively unprotected. While many protocols exist to supplement and secure current IM
systems, their drawbacks have kept them from becoming widely popular. The common fatal
flaws are that they are not simple enough to be widely used and that they require
bilateral cooperation between the communicating parties.
The confusion protocol offers an alternative to existing protocols. It does not encrypt
traffic; instead, it generates noise alongside the actual traffic, so that an eavesdropper
cannot readily tell which traffic is real.
The purpose of this project is to apply the confusion protocol to current IM systems.
The goal is a secure IM system that can replace the current one, built as a library
add-on that implements the confusion protocol on top of existing IM systems. Most
importantly, it will give the user the same flexibility as the current system without the
added complication of encryption.
Related Works
Instant Message Systems
A great deal of research has been done on instant message systems since the early days
of the internet. A modern client-server IM system such as GAIM (an instant message
client that works with multiple protocols) will be the basis for this research project.
Encryption
Although encryption for IM is not common, encryption of internet traffic is not a novel
concept. Implementations of encryption protocols such as PGP and SSL have been
around for more than a decade. A majority of the encryption schemes currently
implemented are no different in structure from the PGP system. PGP offers two primary
services: encryption and digital signature.
Both of PGP’s functions can be used to protect a message: encryption guards against
eavesdropping, and the signature guards against forgery.
Encryption: the encryption used by PGP involves a public key exchange followed by the
exchange of a symmetric key. The symmetric session key is protected by the RSA
algorithm so that it stays secure. Symmetric key encryption is used for the message
itself, instead of RSA alone, because symmetric key encryption is far more efficient.
Digital Signature: a digital signature allows any file to be authenticated. It is
generated from a hash of the message and the private key of the sender. The signature
can be attached to the end of a message to allow for authentication.
SSL provides similar encryption services, relying on a key exchange before switching to
a symmetric key encryption algorithm. Like PGP, it also provides a digital signature service.
Other security protocols not discussed here include PEM, MOSS, S/MIME, PKCS#7, and CMS.
All of these protocols have been used to protect internet traffic such as email.
IM encryption is usually a modified version of these processes. The public key exchange
and symmetric key system used in PGP is often applied to IM with minor
modifications (for example, SecureIM and the encryption built into AIM).
Confusion Project [2]
Confusion is a project that Dr. Blaze’s group worked on in the Distributed Systems Lab.
In the simplest sense, confusion is the use of noise to hide the real message, which in
turn confuses the interceptor.
Current papers suggest looking at the security problem from the eavesdropper’s
perspective, by considering the “fidelity” of a system. These papers suggest that the
confusion concept can degrade the eavesdropper’s ability to compromise a network
system.
The important result of the Confusion proposal is that it can be implemented
unilaterally: it does not require the cooperation of both users to obtain security.
Thus, the Confusion proposal is a strong alternative to the complication of public key
exchange and symmetric key systems, or of other encryption algorithms that require the
users to have agreed beforehand on the encryption scheme and the encryption keys.
Because confusion can be incorporated into existing protocols without changing their
implementations, it can be used on legacy systems.
In short, a confusion-based network protocol aims to keep information confidential even
when the traffic is intercepted.
Current Proposal
The key change in this project compared with current systems is the implementation of
the confusion protocol on top of an existing IM client. The idea is to use confusion as
a way to transfer IMs securely without the overhead of complicated key exchange. This
solves three problems.
First, key exchange requires the user’s computer to actually carry out the encryption
and decryption processes, which means the user must install additional software on the
machine in use. This turns the majority of users away from the security measure, because
it is not user-friendly and the “reward” of extra security does not seem to outweigh the
inconvenience of additional software. The library provided by this project will
be much more user-friendly.
Second, current security implementations are themselves vulnerable to attack. Katz and
Schneier published a paper on one such vulnerability: a chosen ciphertext attack
against the majority of the security protocols discussed above. [3] Confusion offers an
alternative that can provide some measure of data confidentiality without utilizing
cryptography.
Third, and most important, implementing confusion does not require a symmetric
implementation on the receiving side. The receiving user need not do anything
for the traffic to be secure. The confusion protocol is carried out unilaterally: only the
sender has to use the library for the outgoing message to be secure. Unlike traditional
key exchange encryption, the receiver need not even know that the sender is using the
confusion protocol.
These three advantages give a confusion-based system a significant edge
over the current security implementations for IMs.
[2] “The Eavesdropper’s Dilemma”, Eric Cronin, Micah Sherr, and Matt Blaze, submitted for publication.
[3] “A Chosen Ciphertext Attack Against Several E-Mail Encryption Protocols”, Jonathan Katz and Bruce
Schneier, June 23, 2000.
Technical Approach
The construction of this project is broken down into two parts. The first is
the IM system itself. The second is the confusion protocol on top of the IM system.
IM System
The IM system will be no different than the existing model. The goal is to be able
to add the confusion protocol to current IMs without modifying them. The IM protocol
used for testing will be the Jabber protocol implemented on UNIX.
The system used to demonstrate the confusion protocol will be the GAIM
system.
The Jabber protocol offers all the capabilities that other IM protocols do. It uses a
client-server architecture. The client lets the user type and edit messages and send them
off to another client. In a client-server architecture, a message sent from a client goes
to a server, which then routes the message to the destination client. This allows the
server to keep track of user data and to provide additional services such as contact list
management and external access. The real services are all carried out on the server side,
and the client side serves only to interact with the user. The Jabber protocol is
implemented over a TCP network.
The diagram below illustrates the basic client-server architecture.
   C1----S1---S2---C3
         |
   C2----+--G1===FN1===FC1
The symbols are as follows:
o C1, C2, C3 = XMPP clients
o S1, S2 = XMPP servers
o G1 = A gateway that translates between XMPP and the protocol(s)
used on a foreign (non-XMPP) messaging network
o FN1 = A foreign messaging network
o FC1 = A client on a foreign messaging network
"-" represents communications that use XMPP and
"=" represents communications that use any other protocol [4]
XMPP stands for Extensible Messaging and Presence Protocol (the formal name for the
Jabber protocol).
[4] http://www.ietf.org/rfc/rfc3920.txt, the Extensible Messaging and Presence Protocol (XMPP): Core, by
the Jabber Software Foundation
Protocol
The Jabber protocol defines the communication syntax between all the elements of the
IM system. This includes multiple servers, clients and gateways. A gateway is a system
which allows the Jabber system to interact with foreign clients and servers.
At its core, XMPP uses XML as the basis for communication.
The communication syntax relies on the parsing of specific XML tags, which are defined in
RFCs 3920 and 3921 (I will not go into much detail here, as the RFCs are very
extensive).
Communication between server and client is done by establishing a stream
(usually over a TCP connection). Within a stream, multiple XML stanzas
(the XML stanza is the unit of communication) can be sent between the elements on
either end of the stream. Below is an illustration of the stream.
|--------------------|
| <stream>           |
|--------------------|
| <presence>         |
|   <show/>          |
| </presence>        |
|--------------------|
| <message to='foo'> |
|   <body/>          |
| </message>         |
|--------------------|
| <iq to='bar'>      |
|   <query/>         |
| </iq>              |
|--------------------|
| ...                |
|--------------------|
| </stream>          |
|--------------------| [5]
The keyword enclosed between ‘<’ and ‘>’ denotes a specific tag; tags usually come in
pairs.
The Jabber protocol also supports security protocols and other mechanisms that guard
against bad connections and other errors. All are defined in RFCs 3920 and 3921. The part
of the protocol that primarily concerns this project is the <body> ... </body> element,
which encases the actual IM message itself. The confusion protocol is applied only to the
message between those two tags. [6]
[5] http://www.ietf.org/rfc/rfc3920.txt, the Extensible Messaging and Presence Protocol (XMPP): Core, by
the Jabber Software Foundation
Confusion
The confusion protocol will be layered on top of the IM system. The client will utilize a
semantic noise generator to protect every transmission.
A semantic noise generator produces words in a specific language; in
this case, it generates English noise. Because the noise is meant to be
indistinguishable from the message itself, it hides the real transmission. [7]
The key modification to the normal client is that messages sent from the client are
sent along with the noise produced by the semantic noise generator.
Again, the confusion protocol will be implemented as a library. Because it is
implemented only on the sender side, all security actions are carried out unilaterally,
regardless of the receiver.
Implementation
The implementation is broken down into three distinct parts: the noise generation, the
capturing of system calls so that they can be replaced with new library functions, and
the editing of the TTL and MAC of outgoing packets. When the TTL or MAC of a noise packet
is changed appropriately, the network drops the packet before it gets to the end user.
Example TTL:
The TTL (time to live) is essentially the maximum number of “hops” a packet may take on
the way to its destination. If the packet needs more hops than its TTL allows, it is
dropped by the network. Thus, by setting the TTL on the confuse packets LOWER
than the number of hops required to reach the destination, the receiver never
receives the confuse packets. Instead, they are all dropped in transit.
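The TTL rule above can be illustrated with a small simulation. This is only a sketch under stated assumptions: `delivered`, `noise_ttl`, and `set_ttl` are hypothetical helper names, the path is modeled as a simple count of routers, and `set_ttl` changes the TTL for a whole socket, whereas the project edits individual outgoing packets.

```python
import socket


def delivered(ttl: int, routers_on_path: int) -> bool:
    """Simulate forwarding: every router decrements the TTL and drops the
    packet once the value reaches zero."""
    for _ in range(routers_on_path):
        ttl -= 1
        if ttl <= 0:
            return False  # dropped in transit, never reaches the receiver
    return True


def noise_ttl(routers_on_path: int) -> int:
    """Largest TTL that is still dropped before the receiver (assumed policy)."""
    if routers_on_path < 1:
        raise ValueError("need at least one router between sender and receiver")
    return routers_on_path


def set_ttl(sock: socket.socket, ttl: int) -> None:
    """Set the TTL used for all outgoing IPv4 packets of an existing socket."""
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TTL, ttl)
```

With five routers on the path, a packet sent with TTL 64 arrives, while a noise packet sent with `noise_ttl(5)` is dropped at the last router and so is visible only to an eavesdropper located earlier on the path.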
Noise generation
Noise generation relies on the Dadadodo technology. [8] The Dadadodo program, written by
Jamie Zawinski, takes an input text and constructs a random parse tree from it. New text
is generated by a random walk through the parse tree, weighted by the probability
of the connecting words. The important result is that the more extensive the input text,
the better the random output text will be. I will use chapter 1 of Orwell’s famed novel
1984 as an example in this project. However, any body of human-written English text will
work with Dadadodo (for instance, this protocol works equally well on the complete works
of William Shakespeare). The noise generation is done by a function that calls Dadadodo
via popen(). The output of Dadadodo is stored in a 2-dimensional data structure,
where the first dimension indexes the words and the second dimension holds the
characters that form each word. Each word is fixed at 20 characters; if a word
is shorter than that, null characters are used to fill out the end. The fixed size of 20
characters solves the following problem: TCP packets are numbered, so if the
word packets are not all sent with the same size, the sentence can easily be
reconstructed and the security is lost.
Consider the following example:
Real Message: David had Chinese for lunch
Confusion Message: Micah helped quite a bit
If this were sent out on a TCP stream without fixed-size packets, there would be only 4
possible combinations, because “David” can only be followed by “helped quite a bit” or
“had Chinese for lunch”, and the same goes for “Micah”. If all the packets are 20 bytes
in size, then the number of combinations becomes 2^5 = 32. The second problem is the
irregular length of sentences. This is solved by passing the length (number of words) of
the actual message into the noise generation function, so that noise sentences that are
too short are padded with NULL packets and noise sentences that are too long are cut off
at that length.
[6] See the Implementation section.
[7] See the Security section.
[8] http://www.jwz.org/dadadodo/, Dadadodo by Jamie Zawinski
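The count of 32 for the example above can be checked by enumerating every sentence an eavesdropper could assemble when each of the 5 word positions offers two equally sized candidates. This is a small illustrative sketch; the variable names are not taken from the project code.

```python
from itertools import product

real = "David had Chinese for lunch".split()
noise = "Micah helped quite a bit".split()

# One (real word, noise word) pair per word position.
slots = list(zip(real, noise))

# Every way of picking one candidate per position.
candidates = [" ".join(choice) for choice in product(*slots)]
```

Here `len(candidates)` is 2^5 = 32, and the list contains both original sentences as well as mixtures such as "David helped Chinese a lunch".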
The noise generation function calls Dadadodo “number” times (“number” is also passed
in as a variable). The function returns a 3-dimensional character array,
indexed by noise message, word position, and character within the word.
The final size of the returned array is “number” by “word_count” by 20.
The implementation is done in the confuser2.c file.
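The shape of the returned array can be sketched as follows. This is not the code in confuser2.c: `pad_word`, `fit_sentence`, and `generate_noise` are hypothetical names, and `make_sentence` is an assumed stand-in for one call to Dadadodo that returns a list of words.

```python
WORD_SIZE = 20  # fixed word size used throughout the document


def pad_word(word: str, size: int = WORD_SIZE) -> bytes:
    """Cut or null-pad a word so that it occupies exactly `size` bytes."""
    raw = word.encode("ascii", errors="ignore")[:size]
    return raw.ljust(size, b"\0")


def fit_sentence(words, word_count):
    """Truncate a sentence to word_count words, or pad it with empty
    (all-null) words, then fix every word at WORD_SIZE bytes."""
    fitted = list(words[:word_count])
    fitted += [""] * (word_count - len(fitted))
    return [pad_word(w) for w in fitted]


def generate_noise(make_sentence, number, word_count):
    """Return `number` noise sentences, each word_count words of 20 bytes."""
    return [fit_sentence(make_sentence(), word_count) for _ in range(number)]
```

For example, `generate_noise(lambda: "a b c".split(), 3, 5)` yields 3 sentences of 5 words each, where the last two words of every sentence are 20 null bytes.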
Capturing System Calls
All relevant system calls involved in sending packets over a network are replaced by a
new set of system calls. The functions replaced are:
o socket
o setsockopt
o write
o send
o close
The most important rewrites are those of the “write” and “send” system calls. These two
calls are rewritten to parse the outgoing message and separate the real
IM message into one packet per word. The “message body” in the Jabber protocol is
contained between the <body> and </body> tags. The “write” and “send” functions first
detect the start tag and then break the sentence up into words (padding each word to 20
characters and sending it separately) until they reach the end tag.
The code is contained in the libcvore2.c file.
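The splitting step described above might look like the sketch below. It is written in Python for brevity and is not the code in libcvore2.c; `split_message` is a hypothetical name, and the sketch assumes the whole stanza arrives in a single buffer with plain `<body>` tags.

```python
WORD_SIZE = 20


def split_message(data: bytes, size: int = WORD_SIZE) -> list:
    """Split one outgoing buffer into the pieces that would be sent as
    separate packets: prefix, <body>, padded words, </body>, suffix."""
    start_tag, end_tag = b"<body>", b"</body>"
    start = data.find(start_tag)
    end = data.find(end_tag, start + len(start_tag)) if start != -1 else -1
    if start == -1 or end == -1:
        return [data]  # no message body: pass the buffer through unchanged
    words = data[start + len(start_tag):end].split()
    return (
        [data[:start], start_tag]
        + [w[:size].ljust(size, b"\0") for w in words]  # one packet per word
        + [end_tag, data[end + len(end_tag):]]
    )
```

A stanza whose body is “my roommate is asleep.” is split into 8 pieces: 4 for the XML parts and 4 fixed-size word packets.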
Confuse Process
This is the background process that waits for each message, calls the noise generation
function, and sends out the noise and the real message, using either the TTL or the MAC
for confusion. The process first waits for a message; when it receives one, it
notes down the packet information (IP address, etc.) so that confuse packets can be
generated. Upon receiving the <body> tag from the send command, the program
determines whether this is part of a message that has already had noise generated. If
not, the program generates noise messages and stores them with a counter (to keep track
of the noise words sent) and a destination IP address (to tell different calls apart). It
then sends the word through along with a set of generated noise words, and increments
the counter stored with the generated noise (moving on to the next word). The noise
information is stored in a linked list of noise nodes, each defined as a struct. The
program loops continuously, waiting for new messages.
The code is contained in the confuser2.c file.
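The bookkeeping described above can be sketched as follows. This is an illustration only, not the code in confuser2.c: `NoiseNode`, `Confuser`, and `batch_for` are hypothetical names, a dictionary keyed by destination IP stands in for the linked list of structs, and discarding a node after the last word is an assumption.

```python
from dataclasses import dataclass


@dataclass
class NoiseNode:
    dest_ip: str
    sentences: list   # noise sentences, each a list of words
    counter: int = 0  # index of the next noise word to send


class Confuser:
    def __init__(self, make_noise):
        self.make_noise = make_noise  # callable(word_count) -> noise sentences
        self.nodes = {}               # dest_ip -> NoiseNode

    def batch_for(self, dest_ip, real_word, word_count):
        """Return the real word plus one noise word per stored sentence."""
        node = self.nodes.get(dest_ip)
        if node is None:  # first word of a message: generate and store noise
            node = NoiseNode(dest_ip, self.make_noise(word_count))
            self.nodes[dest_ip] = node
        batch = [real_word] + [s[node.counter] for s in node.sentences]
        node.counter += 1  # move on to the next word
        if node.counter >= word_count:
            del self.nodes[dest_ip]  # message complete: forget its noise
        return batch
```

Each call returns the packets for one word position; the noise entries are the ones that would then be sent with the altered TTL or MAC.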
Security
This is a two-part discussion. The first issue is the security of the algorithm. The second
is the cost of the algorithm. I will begin with the latter.
Cost
There are two different types of cost related to this algorithm: first, the
additional bytes sent, which tax the network (the cost to the network), and second, the
latency caused by the additional transmissions.
Consider a Jabber IM message of size m + x + n, where
m is the size of the actual text message (ex. “my roommate is asleep.”)
x is the XML overhead of the Jabber protocol
n is the network overhead (basically everything in the packet other than the data it carries)
Let C be the number of noise packets generated per word
Let T be the time it takes to send one packet
Assumptions made about the messages to simplify the analysis:
1. The message contains words of the English language.
2. Messages have a consistent grammatical structure.
3. Messages do not contain abnormal characters or spacing (ex. ASCII art).
4. The message is sent in one packet.
5. Words are restricted to less than 20 characters.
Latency
The latency comes from the TCP protocol: each transmission waits
for an acknowledgment before the next packet is transmitted. As a result, sending the
additional noise packets increases the total wait time.
For each Jabber IM message, the implementation breaks the message down into all data
before the <body> tag, the <body> tag itself, each word in the actual text message, the
</body> tag, and the rest of the data. This breaks the XML part of the Jabber message
into 4 parts (all data before the <body> tag, the <body> tag itself, the </body> tag, and
the rest of the data) and the text into word packets. Thus, the number of packets sent
instead of the original packet is 4 + m / 6, where the 6 comes from 5, the average length
of an English word, plus one character for the space or punctuation delimiting a word.
The noise packets are all the same size as the word packets (which are padded to 20 bytes
regardless of the original word size). Thus the number of noise packets sent is C * m / 6.
Therefore the final number of packets sent through the network instead of the original
packet is 4 + m / 6 + C * m / 6
Thus, the latency caused by the additional packets is ( 4 + m / 6 + C * m / 6 ) * T
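The packet count and latency formulas can be evaluated directly. The function names below are illustrative, and the sample values (m = 60 bytes, C = 5, T = 10 ms) are assumptions chosen only to give round numbers.

```python
def packets_sent(m: float, C: int) -> float:
    """4 XML pieces, m / 6 real word packets, and C noise packets per word."""
    words = m / 6
    return 4 + words + C * words


def latency(m: float, C: int, T: float) -> float:
    """Total send time when each packet takes T seconds."""
    return packets_sent(m, C) * T
```

For m = 60 and C = 5 this gives 4 + 10 + 50 = 64 packets, so with T = 0.01 seconds the latency is about 0.64 seconds.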
Byte Cost
The byte cost is the cost to the network. Again consider the same packet as
above. The total number of packets sent because of the confusion algorithm is 4 + m / 6
+ C * m / 6.
The network overhead is n, and the network information is duplicated for every packet.
Thus, we send n * ( 4 + m / 6 + C * m / 6 ) bytes of network overhead.
The original packet carries x + m bytes of data. The implementation breaks the packet
down into words and lengthens each word to 20 characters. So the data sent to complete
the original message becomes x + m / 6 * 20
Finally, consider the bytes sent in the noise packets. Each noise packet is 20 bytes and
there are C copies per word. The bytes sent are C * m / 6 * 20
The total number of bytes sent is:
n * ( 4 + m / 6 + C * m / 6 ) + x + m / 6 * 20 + C * m / 6 * 20
or
n * ( 4 + m / 6 + C * m / 6 ) + x + ( m / 6 * 20 ) * ( C + 1 )
The cost ratio is:
( n * ( 4 + m / 6 + C * m / 6 ) + x + ( m / 6 * 20 ) * ( C + 1 ) ) / ( n + x + m )
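A short sketch that evaluates the byte cost and the cost ratio follows. The function names are illustrative, and the sample values (m = 60, x = 100, n = 40, C = 5) are assumptions rather than measurements.

```python
def total_bytes(m: float, x: float, n: float, C: int) -> float:
    """Bytes sent once the message is split into words and confused."""
    words = m / 6
    packets = 4 + words + C * words
    return n * packets + x + (words * 20) * (C + 1)


def cost_ratio(m: float, x: float, n: float, C: int) -> float:
    """Bytes sent with confusion divided by the size of the original packet."""
    return total_bytes(m, x, n, C) / (n + x + m)
```

With these sample values the total is 40 * 64 + 100 + 200 * 6 = 3860 bytes against an original 200 bytes, a ratio of about 19.3. Even with C = 0 the ratio is about 4.3, because of the per-packet overhead and the padding.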
Algorithm
The English language consists of approximately 500,000 words, counting those included in
the Oxford dictionary. If each word were to be confused with 500,000 noise words, then the
message is considered to be “secure”. [9] Unfortunately, with C = 500,000 the cost to the
network would be too high for most networks to handle efficiently, even though the cost
increases only linearly with C.
If the number of noise packets is below the number of possible words in the English
language, then security depends on the number of word combinations that can be used to
hide the real message.
Since the implementation sends C confused words for each word, the number of
combinations is ( C + 1 )^( m / 6 ), without taking into account any structure of the
English language.
[9] Claude E. Shannon, "Communication Theory of Secrecy Systems", Bell System Technical Journal,
vol.28-4, page 656--715, 1949.
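The growth of the combination count can be made concrete with a short calculation. The function names are illustrative; `word_count` plays the role of m / 6, and expressing the count in bits is an added convenience, not part of the original analysis.

```python
import math


def combinations(C: int, word_count: int) -> int:
    """Candidate sentences when each word position has C + 1 candidates."""
    return (C + 1) ** word_count


def search_space_bits(C: int, word_count: int) -> float:
    """The same count expressed as a number of bits."""
    return word_count * math.log2(C + 1)
```

For a 5-word message with one noise copy per word this gives 32 combinations (5 bits); with C = 5 and 4 words it gives 6^4 = 1296.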
English
The structure of the language increases the complexity of the problem. A shrewd
eavesdropper can analyze the combinations of packets sent and automatically eliminate
those that do not follow the norm of the English language. The term “norm” is
used because there is still no perfect method to determine whether a combination of
English words forms a valid sentence. The lack of a perfect grammar checker illustrates
the challenge posed by attempting a perfect, automated analysis of the English language.
However, many probabilistic attempts have been made. Dadadodo, for example,
builds a Markov chain based on existing English text to generate new text that follows
the same probabilistic pattern.
For this confusion implementation, this implies that the measure of security is not
( C + 1 )^( m / 6 ) but ( C + 1 )^( m / 6 ) * x, where x is the fraction of the
combinations that cannot be eliminated automatically by taking the English language into
account (this x is unrelated to the XML overhead x in the cost analysis). The threat of
manual, non-automated elimination is removed because one can choose a value of C large
enough that manual elimination becomes infeasible. Note that x can be a function of C
rather than a constant.
Empirical Results
A derivation of x is beyond the scope of the current project. It is, however, trivially
true that ( C + 1 )^( m / 6 ) * x has a lower bound of ( C + 1 ). Thus, an increase in the
number of confuse packets translates to at least a linear increase in security
( d ( C + 1 ) / d C = 1 ).
However, some empirical results were obtained.
The easiest way to eliminate sentences that do not conform to the norm of the English
language is to build a Markov chain of words based on a pre-existing text, much
like Dadadodo, and then to eliminate the sentence combinations that cannot be
stepped through with the tree built from the existing text.
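The elimination step just described can be sketched with a simple bigram model. This is not the test program used in the project: the function names are hypothetical, and the model merely records which word pairs occur in a reference text instead of weighting them by probability.

```python
from itertools import product


def build_bigrams(text: str) -> set:
    """Collect every pair of adjacent words seen in the reference text."""
    words = text.lower().split()
    return set(zip(words, words[1:]))


def plausible(sentence, bigrams) -> bool:
    """A candidate survives only if every adjacent word pair was seen."""
    return all(pair in bigrams for pair in zip(sentence, sentence[1:]))


def surviving_candidates(slots, bigrams) -> list:
    """Filter all combinations of per-position candidates through the model."""
    return [c for c in product(*slots) if plausible(c, bigrams)]
```

In a toy case where the reference text contains both the real sentence and the noise sentence, only those 2 of the 32 combinations survive. This illustrates why the fraction of combinations that cannot be eliminated depends heavily on how well the eavesdropper’s reference text matches the sender’s language.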
With a limited amount of testing on a very simple program that builds the tree from
chapter 2 of Orwell’s novel, the results have been promising. The program returns the
most likely sentence based on only 5 noise copies and the real message. So far, none of
the results has returned the original message.
The reason for using chapter 2 of Orwell’s novel is the importance of using consistent
language, so that past and current text messages are compatible with one another. If
Shakespeare were used, then most likely none of the combinations, including the
original message, would have passed through the filter.
Naturally, an eavesdropper would use past text collected from the user being
eavesdropped on.
Conclusion
While much additional research is needed to complete the security
analysis of the confusion implementation over IM, the current implementation offers
much promise. The same approach can also be implemented for email and other
forms of network traffic that transport English text. The implementation in this project
is not meant to replace encryption but to offer additional security to current IM systems.
Many assumptions served only to simplify the implementation and could be removed in a
more elaborate system. These include, for example, the 40-word sentence limit, noise
generated in the form of complete sentences, and the abnormal-character restriction, all
of which can be removed with more complex inter-process communication and parsing. The
20-character word limit is a choice based on the fact that the majority of English words
are far shorter than 20 characters.
The longest word recorded in the Oxford dictionary is 52 characters. The filler characters
can also be changed (currently set as space) to more suitable characters depending on the
system.
Overall, the implementation succeeded in its original intent: to generate noise that
hides the real message as it travels across a real-time IM stream. Many implementation
difficulties were successfully worked around, such as the TCP packet
sequencing problem. [10]
With the implementation, the main goal of proving that confusion can be achieved on a
practical legacy system is accomplished. Even though its security value will need more
rigorous analysis, the preliminary results support the confusion protocol for IMs.
[10] Additional explanations are given in the Difficulties section.
References
1. http://www.ietf.org/rfc/rfc2440.txt, the OpenPGP protocol, PGP protocol originally by
P. Zimmermann
2. Eric Cronin, Micah Sherr, and Matt Blaze, “The Eavesdropper’s Dilemma”, submitted
for publication.
3. http://www.jwz.org/dadadodo/, Dadadodo by Jamie Zawinski
4. Jonathan Katz and Bruce Schneier, “A Chosen Ciphertext Attack Against Several E-Mail
Encryption Protocols”, 9th USENIX Security Symposium, June 23, 2000
5. http://www.ietf.org/rfc/rfc3920.txt, the Extensible Messaging and Presence Protocol
(XMPP): Core, by the Jabber Software Foundation
6. Claude E. Shannon, "Communication Theory of Secrecy Systems", Bell System
Technical Journal, vol.28-4, page 656--715, 1949.
Appendix 1
Difficulties
The difficulties this project has faced so far lie in both the theoretical work and the
implementation of the protocol. Many works exist on normal client-server IM
systems, and they provide models for conducting theoretical security analysis. However,
most existing proofs concern the security of encryption, and the same procedures are
difficult to duplicate when proving that this protocol is secure. Unlike the proof for an
encryption algorithm, much depends on the ability to analyze a specific language. The
proof also depends on the ability to generate noise. Thus, the two abilities rely on the
same technology. In a sense, the implementation also increases in strength as the ability
to “decode” the language improves. The traditional theoretical concept of perfect secrecy
is also not applicable to this project. Perfect secrecy is the condition in which the
probability of obtaining a given encrypted text is independent of the message. [11] More
simply, given the encrypted text, no method of recovering the original text does better
than random guessing.
The other difficult problem is that the cost of security cannot be analyzed until
the implementation is completely designed. This problem halted any progress on the
theoretical work until the project was almost complete, because many
design considerations affect the cost of the confusion algorithm (the 20 characters per
word issue, for example).
The third difficulty is the implementation of the protocol, because there are many
unforeseen network issues, such as the TCP numbering of packets, which rules out the
transmission of sentences of uneven length. By contrast, utilizing Dadadodo [12]
proved to be relatively simple.
However, the greatest difficulty is network programming itself. It is
difficult to construct and test programs for a network system. This significantly impeded
progress on this project at the implementation level (we spent 3 weeks on a bug whose
appearance and disappearance we still do not understand).
[11] Claude E. Shannon, "Communication Theory of Secrecy Systems", Bell System Technical Journal,
vol.28-4, page 656--715, 1949.
[12] http://www.jwz.org/dadadodo/, Dadadodo by Jamie Zawinski