P2P Security 1.0 30/September/2003 Niels Olof Bouvin Aarhus Universitet P2P Security - overall topics for this talk o General security concerns relevant to all distributed systems Secure P2P systems and techniques Security Basic security concerns (Lack of) Internet security Cryptography Closing remarks on security Basic security concerns - all the things that can go wrong Dangers of distributed systems Trust o o o o who can you trust? Identity theft pretending to be you (or someone you trust) Privacy preventing others listening in on the conversation Censorship & attacks denying you the right to know (Lack of) Internet security The Internet The Internet is vast and not safe at all o data packets going from machine to machine before they reach you Many standards and protocols established back in safer days o SMTP, NNTP, telnet, ... Today there are plenty of sociopaths, who would delight in destroying your data or machine o see iloveyou, Code Red, SQL Slammer, SoBig.F, Swen, etc. etc. o not to mention industrial espionage etc. Spammers, anyone? o it has been claimed that recent worms are behind DDoS attacks against antispam sites... Who can you trust? o Surely you can trust well-established Web sites? Several important open source ftp servers have been 'owned' over the years thus leaving black hats free to insert code of their own in the cvs trees... This also happened for Microsoft last year (apparently without any ensuing nastiness)... And of course numerous sites have been hacked for credit card numbers etc. Email Email remains (despite spammers) the most successful CSCW and file sharing application on the planet The vast majority of email is not encrypted o neither in transit nor at source/destination Standards for email encryption exist o PGP (Pretty Good Privacy o S/MIME These are generally somewhat cumbersome and are (therefore) not widely used Email o Email headers can be easily spoofed faking the sender of an email is trivial One of the reasons behind the "success" of email worms users naturally trust email received from friends, colleagues, and family... Similar points can be raised for instant messaging File systems File systems are generally, by default, not encrypted Thus not funny at all to loose the company laptop PC... Without encryption your data is only as safe as the hard disk it resides on Cryptography Cryptography Fact: Messages can be intercepted. But intercepted data is worthless, if the interceptor cannot read it o (the people involved are traditionally known as Alice, Bob, and Carol (the latter usually trying to intercept and decrypt messages) Cryptography is very old (at least as old as the Roman Empire), and has been based on a long number of techniques o letter substitution or various permutations (e.g. ROT-13) o one time pads o complex machines (e.g. Enigma) o etc. Today cryptography is based on advanced, hard-to-solve mathematical problems Regardless of the method used, a key is used to signify how the plain text is transformed into cipher text Symmetric cryptography o o o The same key is used to encrypt and decrypt the message Advantages symmetric cryptography is fast Disadvantages the key must be securely exchanged between Alice and Bob if the key is compromised, the entire communication is instantly readable Asymmetric cryptography o o o o Keys come in pairs: a public key known to all a private (or secret) key known only by the individual user A message encrypted with the public key can be decrypted only by the private key So if Alice encrypts a message with Bob's public key, only Bob can decrypt it with his private key A message encrypted with the private key can be decrypted only by the public key So if Alice encrypts a message with her private key, all can verify (using Alice's public key) that Alice is the author Asymmetric cryptography o o o o Advantages as the private key is never shared, the system is secure the system can also be used to authenticate (or "digitally sign") messages Disadvantages only as secure as the private key... much slower than symmetric cryptography Establishing trust How does Alice know Bob is really Bob, and not Carol claiming to be Bob? Asymmetric cryptography often relies on CAs - Certification Authorities o these, using out-of-band methods, establish the correct identity of Bob, and assigns a (signed) certificate to Bob o Alice can then verify that some CA has vouchsafed Bob, and if she trusts the CA, she can trust Bob A problem with these certificates is the cost... A less centralised approach is taken by PGP (Pretty Good Privacy), where Bob relies on associates to confirm his identity o if Alice knows (and trusts) any of these associates, she can trust Bob's identity Symmetric/asymmetric cryptography Not an either/or situation - often used in combination Asymmetric cryptography is used for the initial communication to establish identity and (securely) exchange a randomly generated symmetric key used for the rest of the session This is the method used by SSL (Secure Socket Layer), which handles e.g. https o o o the Web server provides the Web browser with its CA signed certificate (this makes man-in-the-middle attacks harder) the Web browser generates a random key, encrypts it with the Web server's public key, and returns it to the Web server as only the Web server can decrypt the key, the Web server and Web browser can now initiate a symmetric encrypted session Secure hashes Secure (or cryptographic) hashes are used to verify the integrity of a message Most common are MD5 (128 bit) and SHA-1 (160 bit) It is computationally infeasible to create two different messages with identical secure hash codes (it requires brute force and 2^128 or 2^160 are big) Thus, if the (MD5/SHA-1) hash code of a message is known, we can check whether the message has been modified by computing the hash code of the message Given the quality of the secure hash, it is just as good to encrypt the (compact) hash code with your private key for authentication as encrypting the entire message Closing remarks on security Security - a purely technical problem? Security can be addressed through a number of technical means However, these valiant efforts are all for naught o in the face of inexperience and general cluelessness The most successful hackers have operated, not through absurd Hollywood computer guru excellence, but through social engineering o hacking is considerably easier if you can get people to tell you their password Script-kiddies usually operate through kits exploiting well-published security holes o leaving their victims with a valuable lesson if nothing else Most worms spread because people apparently just can't help clicking on attachments... o where is the problem? With Outlook or the users? o "Machine destroying email" used to be an Internet urban legend... but no more (this is progress?) Security in P2P P2P systems previously seen... Groove Various techniques for anonymity P2P systems previously seen... Gnutella/Napster/KaZaA/... - the vast majority of file sharing tools o o No security to speak of No user authentication in some cases for quite obvious reasons... but users are not anonymous... o o No on-the-wire encryption eavesdropping is trivial on most file sharing networks, as a number of people have discovered to their cost No content integrity checking the retrieved file is not what it seems... Freenet No authentication (to real world identities) as such, but can authenticate pseudonyms, allowing e.g. only the original author to update a document Each resource in a Freenet node space is encrypted and integrity checked with SHA-1 hash Network traffic not encrypted, but as the resources are encrypted, this is less of a problem Routing is performed in a way to foil surveillance JXTA o o o o o Per default: authentication at the client not on-the-wire encryption not strong checking of group membership However secure pipes can be used Arbitrary (as in "write your own") group membership checking can done Groove Security concerns for the corporate world - or this place for that matter o o o o o Most are placed behind a firewall excellent protection for most attacks unless black hats somehow get inside can make it quite difficult to collaborate with people on the other side of the firewall As email pass through the firewall, it is a popular choice for collaboration and sharing but as noted before email is usually not safe mail filters check for spam, worms and trojans, but cannot be completely updated at all times Groove - P2P for the corporation o o o o o Aims to provide a secure shared space between coworkers Encryption is the default and quite transparent to the user thus avoiding the hassle of PGP & S/MIME in email Provides chat with threaded persistent discussions shared writing space file sharing calendar o (according to their Web site, they now also integrate with Microsoft Office) Identity in Groove - a system with many, many keys o o o o o o A user in Groove has an account (possibly on a number of devices) with A pass phrase to login to the client and to generate the master symmetric key The user can have a number of identities, each with an asymmetric key pair for signing and verification an asymmetric key pair for encryption a digital fingerprint a Diffie-Hellman key pair per member per shared space Shared spaces - mutual trusting o o A mutual trusting shared space has a symmetric key used by all members to verify/authenticate messages a symmetric key used by all members to encrypt messages All space data is stored on all clients using a symmetric storage key per member per shared space (encrypted with the user's master key) There is not guarantee for identity - users could in principle spoof identities when posting to the group Shared spaces - mutual trusting messages o o o o o o A delta (a chunk of data) is sent from Bob to the group containing: a header a body encrypted using the group's symmetric encryption key a digest (SHA-1) of header and body authenticated using the group's symmetric authentication key Upon receiving this delta, Alice can verify the integrity of the delta (and thus ensuring that the delta was sent by a group member) decrypt the delta store the delta locally using her own encryption Shared spaces - mutual suspicious Mutual suspicious shared spaces are used in situation where the identity of the individual user must be ensured Each member has a Diffie-Hellman key pair per other member D-H allows Bob to compute a shared symmetric key with Alice using Bob's private DH key and Alice's public D-H key (and vice versa) By using this symmetric encryption communication between Alice and Bob is secured Shared spaces - mutual suspicious messages o o A delta is sent from Bob to the group containing: a header a body encrypted using the group's symmetric encryption key o digests (SHA-1) of header and body authenticated using each D-H derived key pair, including Bob/Bob Upon retrieval, Alice can then verify that the delta was indeed sent by Bob Initiating membership of a shared space o o o o o o o Bob is chair and sends Alice an invitation to a shared space cryptographic context (parameters for D-H) Bob's public keys Alice then replies with a one-time key encrypted in Bob's public key Alice's D-H public key Bob can then decrypt the one-time key, use it to send an encrypted message only Alice can respond to and thus verifying that she intends to join the space send the group keys to Alice send Alice's D-H public key to the other members of the space Leaving a shared space If Carol leaves the shared space, the group key is recomputed, so that she no longer can access the shared space's material Group keys are not recomputed, when people join the space, as they should have access to the history of the group Dependency on keys are stored, so that other members (who perhaps were off-line) can still access data Communication in Groove Usually peers communicate using point-to-point connections Peers are not always online or immediately available (e.g. they could be behind a firewall) Relays are used for tunnelling through firewalls, handle fan-out of messages, and do storing and forwarding Various techniques for anonymity Mix networks - defeating traffic analysis Mix networks are used to ensure that a sender and receiver cannot both be known A mix network consists of a number of known mixes - routers with asymmetric key pairs A sender chooses a path through the mix network (m_1, ..., m_n), and encrypts the message (with final destination) with m_n's public key, encrypts this message (with m_(n1)->m_n) with m_(n-1)'s public key and so on The message is then sent to m_1, who decrypts the message using its private key, and sends it to the next mix, who repeats the process Only m_1 knows the sender and only m_n knows the receiver and neither knows the route of the message (not even their own position on the path) Crowds - defeating Web browsing tracking A number of members participate in a crowd, and they are known to each other If a member, Bob, wishes to retrieve a Web page, Bob sends a request for the URL to a random member, Carol (using symmetric encryption). Carol can then choose to retrieve the Web page or forward the request to another crowd member, Alice, and so on. Eventually a member chooses to retrieve the Web page, and the Web page is returned along the request's path Summary and pointers A number of proven technologies exists, which can enable safe computing The success of these technologies hinges on ease-of-use, as they are (in themselves) quite complicated There are two directions in the area o ensuring anonymity and privacy on the Internet o ensuring identity and integrity in a working environment Next time: o Accountability o Reputation can get complicated if users are anonymous... Project description time! You must formulate a P2P project for your group You will present the result of this project to the rest of us during the first half of December (and deliver a report about it to Klaus and I by the middle of January) Project topics o o o o o o o o Anything goes! As long as it is P2P in nature it is sufficiently ambitious (neither too much nor too little) You can develop new routing and searching algorithms create a neato P2P application extend JXTA in some direction create a P2P framework The Way It Ought Be Done And Not The Way JXTA Did It etc., etc. Only one requirement prior to starting: send an URL to a short project description (no more than a page) to Klaus and I and get it approved Created by JackSVG