Scalable Cryptographic Authentication for High Performance Computing

Andrew Prout, William Arcand, David Bestor, Chansup Byun, Bill Bergeron, Matthew Hubbell, Jeremy Kepner, Peter Michaleas, Julie Mullen, Albert Reuther, Antonio Rosa

2012 IEEE High Performance Extreme Computing Conference
10 - 12 September 2012

This work is sponsored by the Department of the Air Force under Air Force contract FA8721-05-C-0002. Opinions, interpretations, conclusions and recommendations are those of the author and are not necessarily endorsed by the United States Government.

Outline
• What is the LLGrid
• The Problem: External services authentication
• The Solution: Cryptographic authentication
• Results

HPEC 2012 - 2 AJP 9/12/2012

LLGrid System Architecture
• LLGrid is a ~500 user ~2000 processor system
• World’s only desktop interactive supercomputer
– Dramatically easier to use than any other supercomputer
– Highest fraction of staff (20%) using supercomputing of any organization on the planet
• Foundation of Supercomputing in Massachusetts
[Figure: LLGrid system architecture. Labels: Users, Service Nodes, Compute Nodes, Network Storage, Resource Manager, Configuration Server, LAN Switch, Cluster Switch, LLAN, To Lincoln LAN]

LLGrid Usage
• Desktop Computing
– CPU-time <20 minutes
• Classic Supercomputing
– Wall-clock time >3 hours
• Interactive Supercomputing
– Between desktop and classic supercomputing
– Shortens the “time to insight”
– Ten development turns/day instead of one turn/week
[Figure: All jobs run on LLGrid. Total Job duration (seconds) versus Processors used by Job, for TX-2500 (952 Cores), TX-X (220 Cores) and TX-3d (540 Cores); regions labeled Desktop Computing, Interactive Supercomputing and Classic Supercomputing]

Challenges with Interactive Supercomputing
• As the line between a shared supercomputer and a “really powerful
personal computer” blurs, users expect to have access to network resources (storage, svn, cvs, etc.).
Challenge: Users expect seamless access to other network resources from the HPC.
• However, these commands raise security concerns.
– They store passwords as plain text on the HPC central storage.
– Password synchronization has made this password very sensitive.
[Figure: the password “S3cr3t”]
Challenge: Ensure seamless access without putting the user’s “one common password” at risk.

Cryptographic Authentication
• Cryptographic authentication of clients using X509 PKI certificates has long been part of the SSL and TLS standards.
• The root of trust will certify that a specific keypair belongs to a specific user or process.
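As a rough illustration of the server side of certificate-based client authentication, the following minimal sketch uses Python's standard `ssl` module to build a TLS context that refuses clients lacking a certificate signed by a trusted root. The function name `make_client_auth_server_context` and the `ca_file` parameter are hypothetical names chosen for this example; the slides do not describe how the actual servers were configured.

```python
import ssl
from typing import Optional


def make_client_auth_server_context(ca_file: Optional[str] = None) -> ssl.SSLContext:
    """Build a server-side TLS context that requires a client certificate.

    ca_file is a hypothetical path to the root-of-trust certificate (PEM).
    """
    # PROTOCOL_TLS_SERVER selects the server role in the TLS handshake.
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    # CERT_REQUIRED makes the handshake fail unless the client presents
    # a certificate that chains to a trusted root.
    ctx.verify_mode = ssl.CERT_REQUIRED
    if ca_file is not None:
        # Trust only the given root of trust for client certificates.
        ctx.load_verify_locations(cafile=ca_file)
    # A real server would also call ctx.load_cert_chain(...) with its own
    # certificate and key before accepting connections.
    return ctx


if __name__ == "__main__":
    context = make_client_auth_server_context()
    print(context.verify_mode)
```

With such a context the password never crosses the wire: the client proves possession of its private key during the handshake, and the server learns the identity from the certificate.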
[Figure: message sequence between User and Server: Connection Request; Authentication Request; Signed Authentication Response and copy of PKI certificate; Access Granted: Welcome Andy!]

Challenges with Cryptographic Authentication
• Cryptographic authentication depends on both the security of the user’s private key and access to it.
– Storing the private key on central storage is little different than storing a user’s password.
Challenge: Where to store the private key?
No guarantee the key won’t be lost, copied or left unprotected.
• One traditional solution is to store the key on the client system and forward authentication requests back to the user’s system.
– Could be on the client system or in a smart card.
– However, this fails if the user disconnects from the HPC.
[Figure: the user’s connection disappears (“Poof!”)]
Forwarding requests back doesn’t work for semi-interactive computing or background jobs.
• Connecting smart cards to the HPC is not practical.
– Some network-attached key storage devices exist, but their practical benefit in this scenario is questionable.
• We implemented a virtual smart card to run on each node.
– Allows for keys to be used on any node, connected or disconnected.
– Allows for different keys on each node.

Virtual Smart Card Defined
• Uses the smart card communication API: PKCS#11.
• Authenticates users and allows authorized users to perform cryptographic operations.
• Protects private keys from being copied, even by authorized users of the key.
• High throughput capability and low latency.
– Physical smart cards have a latency of approximately 800-900 ms.

The keyd Daemon: A Virtual Smartcard
• We created the keyd daemon to be the brains of our virtual smartcard.
– Runs as its own user account.
– Has access to all the keys.
• We then created a library that conformed to the PKCS#11 standard and could talk to this daemon.
– Loaded by applications running as an HPC user.
– Connects through a Unix socket.
– User credentials are passed through the socket.
Secure, provided you trust your Linux kernel.
[Figure: Keyd daemon and PKCS#11 library]
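One way a daemon can learn who is on the other end of a Unix socket is the Linux `SO_PEERCRED` socket option, where the kernel itself reports the peer's process ID, user ID and group ID. The sketch below is an assumption about how such a check could look, not a description of keyd's actual code; the helper name `peer_credentials` is hypothetical, and the example is Linux-only.

```python
import os
import socket
import struct
from typing import Tuple


def peer_credentials(conn: socket.socket) -> Tuple[int, int, int]:
    """Return (pid, uid, gid) of the process at the other end of a Unix socket."""
    # Linux's struct ucred holds a signed pid followed by unsigned uid and gid.
    fmt = "iII"
    raw = conn.getsockopt(socket.SOL_SOCKET, socket.SO_PEERCRED, struct.calcsize(fmt))
    pid, uid, gid = struct.unpack(fmt, raw)
    return pid, uid, gid


if __name__ == "__main__":
    # A connected pair stands in for the daemon and the client library.
    daemon_end, client_end = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    with daemon_end, client_end:
        pid, uid, gid = peer_credentials(daemon_end)
        print(f"peer pid={pid} uid={uid} gid={gid}")
```

Because the kernel fills in these values, the client cannot claim to be a different user, which matches the slide's caveat that the scheme is only as trustworthy as the kernel.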
• The SVN client can then load the PKCS#11 library and use the keys to authenticate to the SVN server.
– Other applications can be enabled in the future.

Configuring SVN for TLS Client Auth
• The SVN server was configured to accept the LLGrid’s root of trust.
• The SVN client on the LLGrid was configured to load the keyd daemon PKCS#11 library.
– One configuration entry: ssl-pkcs11-provider=libkeyd_pkcs11
[Figure: message sequence involving the User, SVN, the Keyd Daemon and the SVN Server: Connection Request; Authentication Request; Signed Authentication Response and copy of PKI certificate]

X509 PKI Certificate Enrollment
• Keypair generation and X509 PKI certificate creation are performed during user account creation.
– LLGrid administrators act as the root of trust.
• We developed scripts that execute parallel key generation across nodes in the cluster.
– Each certificate asserts both the user identity and the node identity to meet the guidelines to be used for either server or client TLS authentication.
[Figure: Keypair & Certificate Generation. Time (seconds) versus Nodes, with Serial and Parallel series]

Results
• Created a general purpose key storage and certificate management solution for HPC.
– Keys are not managed by the end user, ensuring a low risk of compromise requiring revocation.
• Demonstrated that it can be used to enable single sign-on integration to systems outside of the HPC.
– Mitigated security concerns over passwords being stored on the LLGrid central storage.
– Avoided the issue of periodic password changes impacting batch processing.

Future Work
• Future work will look to use these PKI certificates to secure inter-node web services communication.
– Certificates are valid for both TLS client and server authentication.

Questions?