Access to MIDAS AWS Twitter Data

This document describes how MIDAS researchers can request access to the archive of 1% API sample Twitter data that is maintained on AWS EC2 cloud resources.

The first step is to request an account on the AWS Twitter data analysis compute instance by sending an email to the system maintainer, Doug Roberts, droberts@rti.org. Access to this compute instance is controlled via SSH key pairs. The following sections describe how to generate a key pair and how to log on to the instance. Once the requestor’s public SSH key has been sent to the system maintainer, a user account will be created for the requestor, and instructions will be sent on how to access the data and the keyword search code. Note that this AWS instance is a Linux host.

SSH Login to MIDAS-AWS Twitter instance

SSH key-based login authenticates you without transmitting a password over the network, so the password cannot be intercepted in transit. Combined with proper configuration on the server, this reduces the chance of unauthorized access. This is the authentication method used for the MIDAS-AWS Twitter instance. If you are familiar with SSH keys, you can skip the next section.

SSH Keys Basics
(https://wiki.archlinux.org/index.php/SSH_Keys; https://help.ubuntu.com/community/SSH/OpenSSH/Keys)

SSH keys serve as a means of identifying yourself to an SSH server using public-key cryptography and challenge-response authentication. One immediate advantage this method has over traditional password authentication is that you can be authenticated by the server without ever having to send your password over the network.

SSH keys always come in pairs, one private and the other public. The private key is known only to you and should be safely guarded. By contrast, the public key can be shared freely with any SSH server to which you would like to connect.
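To make the idea of a key pair concrete, the following minimal shell sketch creates a throwaway pair in a scratch directory and checks that the public half can be re-derived from the private half. It assumes the OpenSSH ssh-keygen tool is installed. The file name yourid, the scratch directory, and the empty passphrase are illustrative assumptions only; a real key should be protected with a passphrase.

```shell
#!/bin/sh
# Sketch only: builds a disposable key pair to show how the two halves relate.
# "yourid" and the scratch directory are placeholders, not the real account key.
demo_dir=$(mktemp -d)
key="$demo_dir/yourid"

# -t rsa: key type; -N "": empty passphrase (demo only); -q: quiet; -f: output file
ssh-keygen -t rsa -N "" -q -f "$key"

# Two files now exist: the private key (yourid) and the public key (yourid.pub)
ls "$demo_dir"

# Re-derive the public key from the private key
ssh-keygen -y -f "$key" > "$demo_dir/derived.pub"

# Compare key type and key data only; the trailing comment field may differ
if [ "$(cut -d' ' -f1,2 "$key.pub")" = "$(cut -d' ' -f1,2 "$demo_dir/derived.pub")" ]; then
    echo "public key matches private key"
else
    echo "public key does not match private key"
fi
```

The public key can always be regenerated from the private key in this way, but the private key cannot be computed from the public key, which is why only the public file is safe to share.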
When an SSH server has your public key on file and sees you requesting a connection, it uses your public key to construct and send you a challenge. This challenge is like a coded message, and it must be met with the appropriate response before the server will grant you access. What makes this coded message particularly secure is that it can only be understood by someone with the private key. While the public key can be used to encrypt the message, it cannot be used to decrypt that same message. Only you, the holder of the private key, will be able to correctly understand the challenge and produce the correct response.

This challenge-response phase happens behind the scenes and is invisible to the user. As long as you hold the private key, which is typically stored in the ~/.ssh/ directory, your SSH client should be able to reply to the server with the appropriate response.

Because private keys are considered sensitive information, they are often stored on disk in an encrypted form. In this case, when the private key is required, a passphrase must first be entered in order to decrypt it. While this might superficially appear the same as entering a login password on the SSH server, the passphrase is only used to decrypt the private key on the local system. It is not, and should not be, transmitted over the network.

Creating the key files on a Linux client

Create a key pair on a Linux machine using the command

    ssh-keygen -t rsa

When prompted for a file name, enter the location and name for the key, for example “yourid”. This generates two files, the private key “yourid” and the public key “yourid.pub”, in the location you specified. If you accept the default instead, the files are written to ~/.ssh/id_rsa and ~/.ssh/id_rsa.pub; a bare file name such as “yourid” is created in the current working directory. Provide the “yourid.pub” file to the MIDAS-AWS server admin (Doug) so that he can add you as a user on the AWS instance.

Logging in from a Linux PC

If the key files were properly generated and located, you should be able to log in as indicated below.
    ssh -l yourid 23.21.236.206

To see diagnostic and error messages, use the verbose (-v) option:

    ssh -v yourid@23.21.236.206

Using an SSH config file to point to the proper key file for authentication

If the SSH client fails to find your private key file, it is a good idea to create a ~/.ssh/config file that tells the client which key file to use. A sample config file is shown below. If you already have a config file with entries for other keys, append the information about the new key file to it.

    # .ssh/config
    # host example
    Host 23.21.236.206
        User yourid
        IdentityFile ~/.ssh/keyfile

Restrict the key file so that only you can read and write it, using the following command:

    chmod 600 ~/.ssh/keyfile

In the sample above, type the keywords (Host, User, IdentityFile) as is and substitute the remaining values as applicable in your case. If this is successful, you should be able to log in by typing the command

    ssh -l yourid 23.21.236.206

Working from a Windows PC

Tools required:
1. PuTTYgen to generate the key pair. You can download PuTTYgen at (http://www.chiark.greenend.org.uk/~sgtatham/putty/download.html). Note that it also comes packaged with WinSCP.
2. WinSCP to transfer files between the local PC and the remote server.
3. PuTTY for logging in to the remote server via SSH.

Generating key pairs from a Windows machine

Instructions for generating the key pairs can be found at: http://the.earth.li/~sgtatham/putty/0.53b/htmldoc/Chapter8.html. Pay attention to section 8.2.10, "Public key for pasting into authorized_keys file". This is the best way to capture your public key for sending to the remote administrator.

Start the PuTTYgen tool on your machine:
1. Click the “Generate” button.
2. Move the mouse around as instructed to generate some randomness.
Once the key is generated, PuTTYgen displays the key details.

Saving the Private Key:
Enter a “Key passphrase” for extra security, then click the “Save private key” button.
Save your private key file in a location of your choice as “yourid.ppk”.

Saving the Public Key:
The public key needs to be sent to the remote administrator in a suitable format; please refer to section 8.2.10. Do not use the “Save public key” button, because the file it writes is in a different format from the one used in authorized_keys. Instead, select the entire contents of the text box "Public key for pasting into authorized_keys file", copy and paste it into a text file called “yourid.pub” (using Notepad, TextPad, or WordPad), and send it to your administrator.

Using PuTTYgen to convert the private key

If you have a key generated on a Linux machine that you plan to use on your PC, you can use PuTTYgen to convert the private key file to the ppk format suitable for your PC.
1. Start PuTTYgen, either by selecting the tool under WinSCP or by running the puTTYgen.exe file.
2. Click the “Load” button.
3. Browse to the location of the private key file, selecting the “All Files” option.
4. Click the “Open” button.
5. Click OK.
6. Click “Save private key”.
7. At this stage you can enter a passphrase, or choose to have none by clicking “Yes”.
8. Choose a suitable name and location for your private key, with a “ppk” extension, and click the “Save” button.

Logging in from a Windows machine

Logging in to the MIDAS-AWS Twitter instance (23.21.236.206):
1. Start PuTTY.
2. Enter the IP address in the “Host Name” box in the “Session” category.
3. Select Connection > SSH > Auth.
4. Browse to the location of the private key (yourid.ppk) and click “Open”.
5. Type in your username (provided by the MIDAS-AWS admin) and you should be authenticated.
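For Linux clients, the SSH config file setup described in the section on using a config file can also be scripted. The following is a minimal sketch, not a required procedure. The SSH_DIR variable is a hypothetical addition that defaults to a scratch directory so the script can be tried safely; yourid and keyfile are the same placeholders used earlier in this document.

```shell
#!/bin/sh
# Sketch only: add a Host entry for the MIDAS-AWS instance to an SSH client config.
# SSH_DIR is a hypothetical override; it defaults to a scratch directory.
# Run with SSH_DIR="$HOME/.ssh" to apply the change to your real configuration.
SSH_DIR="${SSH_DIR:-$(mktemp -d)}"
CONFIG="$SSH_DIR/config"
HOST_IP="23.21.236.206"

mkdir -p "$SSH_DIR"
chmod 700 "$SSH_DIR"
touch "$CONFIG"

# Append the entry only if this host is not already configured
if ! grep -qxF "Host $HOST_IP" "$CONFIG"; then
    cat >> "$CONFIG" <<EOF

# MIDAS-AWS Twitter instance
Host $HOST_IP
    User yourid
    IdentityFile ~/.ssh/keyfile
EOF
fi

# Owner-only read/write on the config and, if present, on the private key
chmod 600 "$CONFIG"
if [ -f "$SSH_DIR/keyfile" ]; then
    chmod 600 "$SSH_DIR/keyfile"
fi

echo "SSH config written to $CONFIG"
```

Running the script a second time leaves the file unchanged, because the Host line is checked before anything is appended. Replace yourid and keyfile with your own username and private key file name before using it.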