Segurança em Sistemas Informáticos
MESTRADO INTEGRADO EM ENGENHARIA INFORMÁTICA

ANONYMITY ON THE INTERNET OVER PROXY SERVERS

Fábio Rodrigues – ei08116@fe.up.pt
Matej Bulić – ei12010@fe.up.pt

7 December 2012

Index

Project Description
Objectives of The Project
Paper Description and Explanation
Program Description
Program Functionalities
General Idea About How the Program Works
Program Implementation
Work Distribution
Conclusion

Project Description

The project was done for the SSIN course (Segurança em Sistemas Informáticos) at FEUP (Faculdade de Engenharia da Universidade do Porto), as one of the components that determine the student's final grade. From the themes proposed by the teacher, our group chose “Anonymity on the Internet”, more specifically “Anonymity on the Internet Over Proxy Servers”.
We decided to compile the results of our research in the form of a paper, and to show what we learned from that research by implementing a proxy server from scratch in the Java programming language.

Objectives of The Project

The project has two main objectives.

The first objective is to research the subjects of “Anonymity on the Internet” and “Proxy Servers” extensively and to use that knowledge to write an easy-to-understand paper for readers who are not familiar with advanced internet and networking concepts. The paper is intended to give a basic understanding of why internet anonymity is important and how to achieve it, and to explain in more detail one of the most commonly used methods of obtaining anonymity: proxy servers.

The second objective is to apply the knowledge acquired from our research by implementing a proxy server that provides basic anonymity to its users, along with some basic proxy functionality beyond anonymity (such as forbidding access to specific sites).

Paper Description and Explanation

The paper is a compilation of our research on anonymity on the internet and proxy servers. It is divided into several chapters to make it easier to read.

Introduction
This chapter introduces the paper. It briefly explains how the internet works and why anonymity on the internet is needed. It also introduces the concept of a proxy server.

Privacy and Anonymity in the Internet
This chapter explains how websites obtain user-related information and how that information can be used to track the user's identity. It lists a few precautions users can take to protect themselves and their identity.

Proxy Servers
This chapter explains in detail what a proxy server is, its general purposes and how it works.
This chapter includes three sub-chapters that detail three types of proxy servers: HTTP proxy servers, anonymous proxy servers and CGI proxy servers.

Solutions to Increase Anonymity
This chapter presents several ways, other than the use of proxy servers, to increase a user’s degree of anonymity on the internet.

Conclusion
Our personal thoughts about the theme of the project are listed here. This chapter also summarizes the main points covered in the paper.

References
The references used to write the paper.

Program Description

The program is a JAR file (an archive file format typically used to aggregate many Java class files and associated metadata and resources into one file to distribute application software or libraries on the Java platform) that contains the proxy server we developed. In short, our program is an HTTP web proxy implemented from scratch by ourselves.

We developed the proxy from scratch for several reasons:

● Many HTTP proxies are paid;
● The open source proxies are either too complex or badly documented;
● If we develop from the start, it’s easier to debug possible problems;
● We can control how every module of the program works if done by us from the start;
● We learn more from our own work than from the work of others.

Although the proxy logic is our own, the program uses the Apache HttpComponents library to ease the parsing and handling of requests. The Apache HttpComponents project is responsible for creating and maintaining a toolset of low-level Java components focused on HTTP and associated protocols. To learn more about this library, please consult the following link: http://hc.apache.org/

Program Functionalities

Our main focus was to make our proxy able to hide its clients, so the first functionality we implemented was the handling and forwarding of requests by the server. After that, we tried to add several more functionalities to make the proxy as complete as possible.
Here is a full list of the main functionalities of our proxy:

● Forwards and processes all of the client's GET requests, which lets clients browse the web anonymously;
● Can be used to bypass country restrictions anonymously;
● Supports many concurrent clients (multi-threading);
● Java Swing interface with a built-in log and options menu;
● Modular design that makes future developments easier to implement;
● Can block specific sites (e.g. facebook.com) listed in the “restrictions.txt” text file.

To test our proxy we used Google’s browser, Google Chrome. Every browser can be set to work through a proxy by opening the browser options and changing the connection settings to the proxy server’s IP address and port. We chose Chrome because we can simply right-click the browser shortcut and append the following to its target path: “--proxy-server=localhost:6000”. This tells Chrome to use the server at the address “localhost” (our own machine, for testing purposes) on port 6000. After that, we just need to tell our proxy server to listen for requests on that same port (6000).

If more information is needed, a screencast of the application is available at the following link: http://youtu.be/n4KCQ9NzPKc

General Idea About How the Program Works

The program was built with a modular design so that we could add functionalities after completing the core. In this case, the core of the program is the socket.

In our program, the server runs on a specific computer and has a socket that is bound to a specific port number. The server just waits, listening on the socket for a client to make a connection request.

Picture 1: Schematics of a connection request made by the client

On the client side, the client knows the hostname of the machine on which the server is running and the port number on which the server is listening.
To make a connection request, the client tries to rendezvous with the server on the server's machine (identified by its IP address) and port. The client also needs to identify itself to the server, so it binds to a local port number that it will use during this connection. This port is assigned by the system.

If everything goes well, the server accepts the connection. Upon acceptance, the server gets a new socket bound to the same local port, with its remote endpoint set to the address and port of the client. It needs a new socket so that it can continue to listen on the original socket for connection requests while tending to the needs of the connected client.

On the client side, if the connection is accepted, a socket is successfully created and the client can use it to communicate with the server. The client and server can now communicate by writing to and reading from their sockets.

Picture 2: Schematics of the connection made when the client’s request is accepted

Once the socket was working, we added classes to handle GET, POST and CONNECT requests. Only the class that handles GET requests is fully functional, although considerable effort went into making the other two work. In fact, we managed to log in to Facebook through our proxy, but that required either leaving the password hardcoded in our code or using cookies. Since every site uses a different form for its POST requests, it is almost impossible to build a proxy that manages to log in to every website.

The class that handles CONNECT requests is meant to be used when connecting to another proxy. Due to lack of time, only its basis is implemented and it was not fully tested, so it is not a working feature of the final product.

Support for several clients, in other words multi-threading, was added afterwards in order to handle many concurrent requests. Instead of processing one request at a time, the socket class launches an HTTP request handler thread every time a request is made.
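The accept-and-dispatch pattern described here can be sketched roughly as follows. This is a minimal illustration and not the project's actual code: the names AcceptLoopSketch, serve, handle and roundTrip are hypothetical, the handler only echoes one line instead of forwarding an HTTP request, and a plain thread per connection is assumed.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch; not the report's ProxyServer/ProxyThread classes.
class AcceptLoopSketch {

    // Listening socket; passing port 0 lets the operating system pick a free port.
    private final ServerSocket listener;

    AcceptLoopSketch(int port) throws IOException {
        listener = new ServerSocket(port);
    }

    int port() {
        return listener.getLocalPort();
    }

    // Accepts connections and hands each accepted socket to its own thread,
    // so the listener can go straight back to waiting for the next client.
    void serve() {
        while (true) {
            try {
                Socket client = listener.accept(); // blocks until a client connects
                new Thread(() -> handle(client)).start();
            } catch (IOException e) {
                return; // accept() fails once the listener is closed; stop serving
            }
        }
    }

    // Stand-in for real request handling: reads one line and echoes it back.
    private static void handle(Socket client) {
        try (Socket c = client;
             BufferedReader in = new BufferedReader(
                     new InputStreamReader(c.getInputStream(), StandardCharsets.ISO_8859_1));
             PrintWriter out = new PrintWriter(
                     new OutputStreamWriter(c.getOutputStream(), StandardCharsets.ISO_8859_1), true)) {
            out.println("echo: " + in.readLine());
        } catch (IOException e) {
            // The connection was dropped; the handler thread simply ends.
        }
    }

    // Starts a server, sends one line as a client and returns the server's reply.
    static String roundTrip(String message) {
        try {
            AcceptLoopSketch server = new AcceptLoopSketch(0);
            new Thread(server::serve).start();
            try (Socket socket = new Socket(InetAddress.getLoopbackAddress(), server.port());
                 PrintWriter out = new PrintWriter(
                         new OutputStreamWriter(socket.getOutputStream(), StandardCharsets.ISO_8859_1), true);
                 BufferedReader in = new BufferedReader(
                         new InputStreamReader(socket.getInputStream(), StandardCharsets.ISO_8859_1))) {
                out.println(message);
                return in.readLine();
            } finally {
                server.listener.close(); // unblocks accept() and ends serve()
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("hello"));
    }
}
```

One thread per connection keeps the sketch short; a real server under heavy load would more likely draw handler threads from a bounded pool.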
When the request has been fulfilled, the thread terminates.

The last thing we did was the interface. We built a simple interface for the server using Java Swing, the primary Java GUI widget toolkit. The interface’s job is to:

● Provide an easy way to start the program and specify its parameters (in this case, the port on which our server’s socket will listen), without resorting to the command line or recompiling the code in an IDE;
● Provide a way for the server host to see the requests being handled, by displaying a log of requests in the interface.

Program Implementation

To better explain how our program was made, we created a simplified UML class diagram of it:

Picture 3: Simplified UML diagram of our proxy implementation

MainWindow
This is the first class of the program. Here we define the graphical elements of the interface, such as frames, panels, buttons, text boxes and a text area that serves as a log. The log is updated with information when clients make requests to the proxy server. When the user enters the port number in the respective field and presses the OK button, a thread running the ProxyServer class is created. Having the interface in one thread and the listening socket in another lets us update both independently (e.g. we don’t need to wait for a client request to be fulfilled on the socket thread before updating the log with the site the client is trying to access).

ProxyServer
This class creates a server socket on the port specified by the host and starts it. It contains an infinite loop that only stops when the program is closed. When a client makes a request to the server’s socket port, an instance of the ProxyThread class is launched in a new thread.

ProxyThread
This class is the heart of the program. It accesses the input and output streams of the server’s socket and processes the information accordingly. The input stream is the data entering the server through the socket.
The output stream is the data leaving the server through the socket. In this case, the input stream carries the requests made by clients to the server and the responses of the final server after the proxy has forwarded the client’s request. The output stream carries the requests forwarded to final servers and the responses sent to the client, that is, the final server’s response forwarded back to the original client. It is also here that the restrictions file is checked to verify that the website the client is trying to reach is not blocked. If access is blocked, no answer is returned to the original client. This class also determines what type of request is being made (GET, POST or CONNECT) and calls the respective class to handle it.

ReadSiteRestrictionsFile
This auxiliary class reads, line by line, the file named “restrictions.txt” located in the same directory as the proxy server application. This lets the host specify a list of sites, one per line, that the proxy should block.

HttpGetMethod
This class handles HTTP GET requests. It connects to the URL of the website, receives the website’s response through the socket’s input stream and redirects that response to the socket’s output stream in order to forward it to the client that made the original request. Some precautions were taken, such as ensuring that the bytes in the streams have the correct encoding, in this case ISO-8859-1.

HttpConnectMethod
This class is supposed to handle HTTP CONNECT requests. This method may be used to establish a TCP connection to a nominated host, or with a proxy that can dynamically switch to being a tunnel (e.g. SSL tunneling). Since our proxy only supports HTTP and not HTTPS, and because of the lack of time, we did not finish implementing and testing this class. Nevertheless, we decided to leave it in the program because considerable time was spent trying to make it work.

HttpPostMethod
This class is supposed to handle HTTP POST requests.
We were able to log in to Facebook accounts with our real usernames and passwords, using the specific form used by the site's login system. Since every site has a different form with different parameter names, it is almost impossible to make a generic class that works for all POST requests. Demonstrating this functionality would have required leaving a real account hardcoded in the program, so we decided not to finish implementing it; lack of time was again a factor.

Work Distribution

Here we specify what each student did for this project.

Fábio Rodrigues:

● All of the slides of the first presentation;
● Research into the best ways to implement a proxy;
● Full implementation of the proxy;
● Slides presented in the second presentation (slides 11 to 25);
● Writing of this document (final report).

Matej Bulić:

● Research needed to write the paper;
● Writing of the full paper;
● Slides presented in the second presentation (slides 1 to 10).

Conclusion

We feel that we fulfilled the objectives of the project, given the short time we had to work on it. We learned more about how anonymity can be achieved on the internet (or at least how to improve it) and about the implementation of a proxy. We also had a chance to practice our programming skills in socket implementation, HTTP request handling, multi-threading and graphical user interface creation.

Much more could be improved given more time, so here is a list of possible future developments:

● Implement support for POST requests to the most commonly used sites (e.g. Gmail, Facebook, webmail);
● Finish implementing and testing the support for the CONNECT method, to enable the program to connect to other proxies and so increase our anonymity;
● Combine different kinds of proxy into one (e.g. FTP proxy, HTTPS proxy, proxies to play games, etc.);
● Save logs in a database for statistical purposes;
● Support for YouTube videos.