HASTAC Website Protection System
…because "Humans are smarter than any computer"

Software Requirements Specification

Academic Supervisor: Dr. Ron Rymon
Presented By: Ronen Mendezitsky, Alon Weiss

Table of contents

Overview
    Motivation
    The Product
    General Goals
The Problem
The Target Audience – User Characteristics
Alternative Solutions to the Problem
    Direct
    Indirect
The Proposed Solution
    The Users
    The Environment
    Hardware
    The Process
Use cases
Requirements
    Domain Definitions
    HASTAC Website Protection System - Installation Procedure
    HASTAC Website Protection System - System Configurations
    HASTAC Website Protection System - Functionality and Operation
The Challenges
Criteria for Success
Initial Plan
References

Overview

Motivation
To decrease the network traffic caused by brute-force attempts to break into password-protected websites, and to substantially reduce the number of accounts compromised by these attacks.
The Product
An online security system displayed on the login page of any password-protected website as an extra security measure. The system adds an input field in which the user must answer a randomly chosen question about a generated image; the question itself is embedded in that image. Each generated image contains a large amount of detail, including randomly generated images and words.

General Goals
Decrease the number of brute-force hacking attempts by requiring more human thought in the login procedure.
Allow easy integration into existing systems, using as few resources as possible.
Allow future upgrades to the system through plug-ins.

The Problem
As activity on the web increases and buying products, services and information online becomes quick and comfortable, more and more websites are being created to provide paid online services. This development has caused a rapid increase in attempts to hack into these websites.

Website owners are affected not only by compromised accounts but even more by the hacking attempts themselves. Brute-force attempts in particular increase the bandwidth load on web servers, can reduce the speed and performance of the website, and can go as far as causing a DoS (Denial of Service), which can cost site owners a significant loss of activity on their website.

One of the main reasons for this increase in malicious activity is the development of automated brute-forcing programs. These programs use various techniques to find valid username and password pairs of active accounts. They rely on ready-made or automatically generated wordlists in a fixed "username: password" structure, which matches the login scheme used by most password-protected websites today, whether a form or a regular pop-up login page.
These simple-to-operate programs allow people with no experience or knowledge of network and website security to attack password-protected websites at the push of a button. The attempts generate very high network activity on the website, which raises webmasters' costs even when no account is actually compromised.

In addition to these automated brute-force programs, internet surfers today can easily find their way into one of the many communities whose sole purpose is website hacking. This puts the "art" of brute forcing in the hands of any capable person and poses a day-to-day problem for webmasters all over the world.

The Target Audience – User Characteristics
The World Wide Web contains a vast number of password-protected websites (both internet and intranet), and new websites offering paid online services launch every day. All of them are potential victims of brute-force attacks. Our proposed solution is easy to integrate and will decrease the number of brute-force attempts against them.

Alternative Solutions to the Problem
Over the years there have been many attempts to solve the problem of brute-force attacks on websites. Any new security measure can, conceptually, be cracked eventually, so even existing solutions are updated periodically to keep up with brute forcing and other kinds of attack. In this section we review several of these security measures, both direct and indirect.

Direct:

Product: Strongbox
Vendor: Ray Morris (bettercgi.com)
Link: http://www.bettercgi.com/strongbox/
Price: $150 per site (one-time)

Strongbox is one of many products that use a 5-letter image-based code (a CAPTCHA) for protection. "A CAPTCHA™ is a program that can generate and grade tests that most humans can pass, but current computer programs can't pass." It also features an anti-slurp mechanism to disable website leeching.

Product: T4wsentry.pl
Vendor: Fisher Technologies, Inc.
Link: http://www.tools4webmasters.com/t4wsentry.htm
Price: $65 per site (one-time)

T4wsentry.pl is a Perl script that requires the user to log in from a specific page in order to access the restricted area of the website. The login page has username, password and CAPTCHA image fields. The server-side script also monitors simultaneous access by more than one surfer under the same username. Once logged in, the user receives a cookie with a unique identifier, so the rest of the communication can take place over regular HTTP. The script runs on the UNIX platform and is widely used in conjunction with the Apache web server.

Product: Pennywize
Vendor: Zarvon P/L
Link: http://www.pennywize.com/
Price: $30-$170 (monthly rate)

Pennywize monitors every hit on the "members only" pages and tests whether a single account has had too many hits from different IP addresses. Once it finds such an account, the account is automatically disabled for a predefined period of time. It also detects IP addresses that have made many failed attempts to access a single account, or multiple accounts, within a predefined time interval. In the latest version (currently 3.0) Pennywize also handles proxy-based attacks.

Product: BotDetect
Vendor: LANAP software
Link: http://www.lanapsoft.com
Price: $60-$100 per site (one-time)

BotDetect is a CAPTCHA solution that supports up to 50 different CAPTCHA algorithms with variable length and image size, and produces several file formats (jpeg, gif, png and others). It is a Windows-based solution that supports only ASP/ASP.NET on IIS servers.
Indirect

Community portals and services
Community portals and services band together a large group of sites and provide ready-made protection for all of them. One example of this type of protection is Microsoft's Passport service, which provides access to many services using a single ID and password. This is also referred to as "Single-Sign-In". Another example is Adult Verification Systems (AVS), which group a large number of websites under a single username and password.

Digital Rights Management (DRM)
With the advent and popularity of Peer-to-Peer networks and micro-payment technologies, some copyright holders publish their content on a non-secured website or P2P network. The files cannot be played without a special license, which is purchased at the vendor's website. This renders brute-force attacks on the website useless, and hackers try to 'crack' the protected files instead.

The Proposed Solution

The Users
The target audience for "HASTAC" (Humans Are Smarter Than Any Computer) is the webmaster community of ASP.NET-based websites (31% of the web servers worldwide, as shown below). Only basic skills are required to deploy the system (usually a single click for auto-deployment). Once installed, the system is fully automated: any user can use it to access the website through a verified authentication operation, and it requires no maintenance.

We have created a prototype that generates over 100 Challenge & Response pairs per second at a low CPU usage level (around 12%). This suggests that the production system should scale to even the busiest of websites.

Our market research found only one competing production-level product for the ASP.NET platform, and it provides a solution of lesser quality.
[Figure: web server market share by platform (Apache, Microsoft, Sun, NCSA, Other). Source: http://news.netcraft.com/archives/web_server_survey.html]

Scenario #1
A webmaster of a single website that has no protection and a lot to secure requires authentication for access to his sensitive content.
Regular solution: Dividing the web pages into a "secured zone" and an "unsecured zone".
Our solution: Adding another authentication field to the login page, to ensure that no bots hack their way in using brute-force attacks.

Scenario #2
A group of webmasters wishes to create a single sign-in solution for their websites.
Regular solution: Use existing Active Directory or similar methods. All servers connect to a centralized account server.
Our solution: Adding another authentication field to the login page, to ensure that no bots hack their way in using brute-force attacks. The centralized account server is not accessed until the challenge has been answered correctly, thus saving precious resources.

Scenario #3
A specific service requires high-fidelity human authentication, such as e-voting systems, polls, form filling (registration, 'contact us', questionnaires etc.) and public & free e-mail services, all to prevent mass junk data from being stored or sent using the service.
Regular solution: Use a standard CAPTCHA mechanism.
Our solution: Tighten the security with a better-performing CAPTCHA, which is virtually unbreakable by standard OCR tools.
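The ordering described in Scenario #2, where the account server is consulted only after the challenge has been answered correctly, can be illustrated with a minimal Python sketch. This is not the HASTAC implementation (the product targets ASP.NET). The names ChallengeStore, login and check_account are hypothetical, an in-memory dictionary stands in for the challenge database, and both the single-use behaviour and the case-insensitive comparison are assumptions made for the example.

```python
import secrets
from typing import Callable, Dict, Tuple


class ChallengeStore:
    """In-memory stand-in for the challenge & response table (hypothetical)."""

    def __init__(self) -> None:
        self._answers: Dict[str, str] = {}

    def issue(self, answer: str) -> str:
        """Store the expected answer and return an opaque challenge id."""
        challenge_id = secrets.token_hex(16)
        self._answers[challenge_id] = answer
        return challenge_id

    def verify(self, challenge_id: str, response: str) -> bool:
        """Check a response once; the record is removed whether or not it matches."""
        expected = self._answers.pop(challenge_id, None)
        if expected is None:
            return False
        # Assumption: answers are compared case-insensitively, ignoring outer spaces.
        return secrets.compare_digest(
            expected.strip().lower().encode(),
            response.strip().lower().encode(),
        )


def login(
    store: ChallengeStore,
    challenge_id: str,
    response: str,
    username: str,
    password: str,
    check_account: Callable[[str, str], bool],
) -> Tuple[bool, str]:
    """Verify the challenge first; call the account backend only if it passes."""
    if not store.verify(challenge_id, response):
        return False, "challenge failed"  # account server never contacted
    if not check_account(username, password):
        return False, "bad credentials"
    return True, "ok"
```

Here check_account stands for whatever call reaches the centralized account server. Because it is invoked only after store.verify succeeds, automated attempts that cannot solve the challenge never generate load on that server.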
The Environment
The product is designed for web servers (both corporate intranets and the Internet) that:
a) run the Microsoft Windows Server operating system
b) use SQL Server for the user base
c) have the .NET Framework 2.0 installed and running
d) have IIS 5.0 or 6.0 installed and configured

Hardware
In our preliminary test we generated over 100 challenge images per second on an average home PC, with an average CPU load of 12% (equivalent to 100 people trying to log in to the site every second).
The minimum requirements for our system are imposed by the software that runs on the server. Naturally, these requirements vary and depend on the number of users and on the memory, CPU and storage space available.

The Process
When installed, the CAPTCHA module is integrated into the login mechanism and adds an extra authentication step to verify that a human is on the other end. When a user logs in to the system, the additional CAPTCHA image is displayed, and the challenge must be solved before any further verification that requires database access takes place.

Use cases

General Flow (flowchart):
1. Start: the user freely browses the unsecured zone.
2. Does the user wish to browse the secured zone? No: the user keeps browsing the unsecured zone. Yes: continue.
3. Existing user? Yes: log in. No: register.
4. Log-in completed. Failed: the user is rejected. Succeeded: the user is accepted and logged in successfully.
5. Registration completed. Failed: the user is rejected. Succeeded: check the system policy; either the account is activated automatically or the account requires activation.
6. The user freely browses the secured zone.

User Registration and Activation (flowchart):
User registration:
1. Display the registration form.
2. Show the challenge.
3. Verify the response. Failed: registration does not proceed. Succeeded: continue.
4. Check the system policy: either send an activation link or complete the registration.
User activation:
1. The user receives the activation link.
2. The user follows the link.
3. Activate the account.
4. Complete the registration.

Log-in Flow (flowchart):
1. The user wishes to log in and enters the log-in page.
2. Cookie exists? Exists: fill in the username & password fields.
3. Show the challenge and request the user name and password.
4. Verify the response. Failed: the log-in does not proceed. Successful: complete the log-in process.

Challenge and Response generation (flowchart):
Challenge & Response generation:
1. The user wishes to register or log in.
2. Check the database for a pre-generated C&R. Exists: load the C&R from the database. Doesn't exist: create a C&R on the fly.
3. According to the system policy, start background C&R generation.
4. Send the challenge to the user.
Challenge & Response verification:
1. Verify the user's response against the database.
2. Remove the record from the database.
3. Return the verification result.

Requirements

Ability to pre-generate challenge & response pairs, in order to ease the load on the server during high-traffic periods (such as the holiday season). The pre-generation can be done either on another computer or during idle CPU cycles.
Importance: Critical

Variable Complexity – Be able to change the complexity of the generated image:

Parameter               Value Range
Number of shapes        3-10
Number of colors        3-10
Background patterns     Selection from a list
Optical distortions     0%-100%
Word length             3-20
Salting/Graining        0%-100%

Importance: Critical

Use many types of fonts, shapes, colors and backgrounds to make it harder for off-the-shelf brute-force CAPTCHA solvers to penetrate the website.
Importance: Critical

Fast response to ill-formatted inputs (short passwords, no usernames, bad answers etc.) – Employ some form of heuristics on the client page before sending potentially bad data.
Importance: Critical

Plug-in architecture to allow future development of new Challenge and Response types.
Importance: Critical

Larger Challenge & Response Space than current solutions – Provide a larger C&R space than the existing solutions.
Importance: Critical

Copyright notice – Embed a copyright signature on each generated image, in order to discourage surfers from solving CAPTCHAs harvested from the original site to aid in auto-filling forms.
Importance: Medium

Statistics – Be able to provide meaningful statistics on the performance of the system, such as the graph shown below.
Importance: Medium

[Figure: pie chart "Successful login attempts between 31/12/2005-31/12/2006": first attempt 60%, second attempt 25%, third attempt 10%, failed attempts 5%]

Provide a Back-Office / web service for configuration and/or an XML configuration file. The system should be highly configurable while still being simple to operate.
Importance: Medium

Extensibility to other platforms and generators – The solution should be modular and enable future enhancements (both hardware and software), and should also operate on the popular Apache/PHP environments.
Importance: Low

Domain Definitions
a) Secured zone: A set of web pages and content to which access is limited to registered members of the website.
b) Unsecured zone: A set of web pages and content that is available to the general public.
c) User: A person who wishes to enter the secured zone of a website. When a user wants to access the website he can either:
   a. Access the unsecured zone.
   b. Authenticate himself and access the secured zone. (This procedure is referred to as a "Login Transaction".) The authentication consists of entering an account name and password, and additionally answering a computer-generated challenge.
d) Administrator: Typically the owner of the website. Manages the rules of the login process to the secured zones of the website and integrates the "HASTAC website security" software component into the website.
e) Challenge generation process: A process in which the software generates a number of challenge and response pairs and stores them in the database.

HASTAC Website Protection System - Installation Procedure
This process consists of four stages:
a) Setting up the database. This is done by "attaching" an existing, blank database and creating administrative accounts.
b) Copying the binary file (HASTAC.DLL) and the back-office software to the website server.
c) Integrating the HASTAC component into the login page/component.
d) Configuring the system.
(Explained in the next section)

HASTAC Website Protection System - System Configurations
HASTAC is controlled from a back-office system that will include 5 tabs:
a) Main
b) Configuration
c) Tools
d) Statistics
e) About

Main
The 'Main' section shows general information about the system:
a) System status
b) System time
c) Number of login transactions
d) Number of pre-generated challenge and response pairs
In addition, it allows the administrator to change the system status and reset the login transaction counters, and it provides a shortcut for generating more challenge and response pairs.

Configuration
The 'Configuration' section allows the administrator to define different aspects of the challenge and response generation process.

Number of login attempts: Number of allowed login attempts before a redirection to an "Access Forbidden" page occurs.
Same difficulty for all attempts: When checked, the same difficulty level is applied to all login attempts. Otherwise, every login attempt is configurable separately.
Number of shapes: Average number of shapes shown in each generated picture.
Number of colors: Average number of colors shown in each generated picture.
Background patterns: A variety of background patterns available for each generated picture.
Optical distortion: Distorts the generated image to make the words harder to read, in order to prevent OCR mechanisms from cracking the words inside the generated image.
Word length: Varied length of the words generated on the images.
Salting: Additional image processing that adds salt (noise) to the picture in order to make OCR technology difficult to use.
Image file formats: Select from a variety of image file formats.
Copyright notice: A fixed string that will be shown on each generated picture, on a random edge.

Tools
The 'Tools' section allows the administrator to pre-generate challenge & response pairs in off-peak hours, to ease the CPU load in peak hours.
It also allows changing the 'pre-generated images' cache size.

Clicking the "Pre-generate Challenge & Responses Now" button starts a background process on the server, which generates Challenge & Response pairs.

The timetable allows the generation process to be scheduled automatically for each day of the week and each hour of the day. On websites with high traffic this will alleviate CPU usage during the 'rush hours'.
A red box signifies that the generation process will take place.
A clear box signifies that the generation process will take place only when the Challenge and Response pair cache is empty.
Websites with limited storage, light traffic and/or low CPU usage may want to leave the boxes unchecked.

The "Generate up to # Challenge & Response pairs" field specifies the maximum number of Challenge and Response pairs to pre-generate. Given that an average file is 4 KB in size, 1 MB of storage space is equivalent to 256 Challenge and Response pairs.

Statistics
The 'Statistics' section displays meaningful information about successful login attempts, login traffic and daily login statistics.

The "Successful login attempts" chart displays the percentage of logins to the site that succeed on each attempt. The statistics shown, for example, indicate that 60% of the users manage to log in on their first attempt, 25% on their second attempt and 10% on their third attempt, while the remaining 5% fail to log in.

The "Image Traffic So Far" field displays the amount of traffic used by the HASTAC Protection system.

The "Day login statistics" chart shows the number of login attempts for every day in the current month.

The "Most active accounts" list shows the top 10 most active accounts.
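The storage estimate and the timetable rule from the Tools section can be expressed as a short Python sketch. It is only an illustration under stated assumptions: the function names, the (weekday, hour) representation of the timetable and the check against the configured maximum are hypothetical and are not taken from the HASTAC back-office.

```python
from datetime import datetime
from typing import Set, Tuple

AVERAGE_PAIR_SIZE_KB = 4  # average file size quoted in the Tools section

# Assumed timetable encoding: (weekday with Monday = 0, hour 0-23) for each red box.
Slot = Tuple[int, int]


def pairs_per_storage(storage_mb: int, pair_size_kb: int = AVERAGE_PAIR_SIZE_KB) -> int:
    """Number of pre-generated pairs that fit in the given storage (1 MB = 1024 KB)."""
    return (storage_mb * 1024) // pair_size_kb


def should_generate(now: datetime, red_slots: Set[Slot],
                    cached_pairs: int, max_pairs: int) -> bool:
    """Decide whether background generation should run at this moment."""
    if cached_pairs >= max_pairs:
        return False  # the "Generate up to #" limit has been reached
    if (now.weekday(), now.hour) in red_slots:
        return True   # red box: generation is scheduled for this hour
    return cached_pairs == 0  # clear box: generate only when the cache is empty


if __name__ == "__main__":
    print(pairs_per_storage(1))  # 1 MB of storage at 4 KB per pair
```

With the 4 KB average, pairs_per_storage(1) reproduces the figure of 256 pairs per megabyte given above.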
HASTAC Website Protection System - Functionality and Operation
When a user tries to access the protected area of the website, a challenge image is displayed. The user has to enter his account name and password, plus an answer in the additional challenge field. (See also the flow charts in the Use cases section.)

The Challenges
There are several challenges facing the development of the HASTAC Website Protection System.
1. The suggested algorithm has no parallel in the website protection field and has to be written from scratch.
2. The Question and Answer space has to be as large as possible.
3. The system has to consume as little bandwidth as possible (preferably less than 5 kilobytes for each Challenge Image).
4. CPU utilization should be minimal, even during the generation process.
5. SQL database access and HDD I/O should be minimal.
6. Image manipulation algorithms should be developed to render OCR useless.
7. The system has to be user friendly, both for the user and for the website administrator.
8. The system should be upgradable with plug-ins.

Criteria for Success
Success: Meeting all the requirements described above.
Failure: Poor integration, poor Challenge & Response quality or excessive resource usage; bad plug-in support.

Initial Plan
The plan for the HASTAC Website Protection System Top Level Design is as follows:
1. Research and develop the HASTAC algorithm.
2. Research brute-force techniques used against CAPTCHA-protected websites.
3. Investigate integration methods with current ASP.NET websites.
4. Build the administration interface ("Back-Office") for the system.
5. Define the main software modules and their integration.
6. Perform stress-testing on the algorithm.

References
1. http://www.bettercgi.com/strongbox/ (StrongBox)
2. http://www.tools4webmasters.com/t4wsentry.htm (T4WSentry)
3. http://www.lanapsoft.com (BotDetect)
4. http://www.pennywize.com (PennyWize)
5. http://www.captcha.net/ (The CAPTCHA Project)
6. http://www.firepages.com.au/captcha.htm (A different approach to CAPTCHA, based on image recognition and counting)
7. http://news.netcraft.com/archives/web_server_survey.html (Web-server platform statistics)
8. http://video.google.com/videoplay?docid=-8246463980976635143 (Dr. Luis von Ahn of Carnegie Mellon University talking about current CAPTCHA technology and its disadvantages)