Requirements Analysis Document for IT Management and Automation of Automated Solutions, Inc.

Developed by:
Tito Esteves (Project Manager)
Lenny Simon
Yunier Rodriguez
Peter Greko

Group #: 1
Advisor: Dr. S. Masoud Sadjadi
School of Computing and Information Sciences
Florida International University
Contact Information: sadjadi@cs.fiu.edu
More information: http://www.cs.fiu.edu/~sadjadi
April 18, 2009

Table of Contents

1. Introduction
  1.1 Purpose of the system
  1.2 Audience
  1.3 Scope of the system
  1.4 Objectives and success criteria of the project
  1.5 Definitions, acronyms, and abbreviations
  1.6 References
  1.7 Overview
2. Current System
  2.1 Web Hosting
    2.1.1 Data integrity and uptime, key to the business
    2.1.2 Server Farm
    2.1.3 Issues and problems
  2.2 Office Departments
    2.2.1 Issues and problems
    2.2.2 New Customer Process Flow
    2.2.3 Customer Issues / Helpdesk Process Flow
3. Proposed System
  3.1 Overview
    3.1.1 Cost Benefits of the Proposed System
    3.1.2 Server Farm
    3.1.3 Office Departments
  3.2 Functional Requirements
    3.2.1 Remote Control
    3.2.2 Auditing and Asset Management
    3.2.3 Monitoring
    3.2.4 Patch Management
    3.2.5 Backup and Disaster Recovery
    3.2.6 Endpoint Security
    3.2.7 User State Management
    3.2.8 Help Desk
  3.3 Nonfunctional requirements
    3.3.1 Usability
    3.3.2 Reliability
    3.3.3 Performance
    3.3.4 Supportability
  3.4 Agent Groups
    3.4.1 Organization
    3.4.2 Organization Units
    3.4.3 Subnets
    3.4.4 Tree
  3.5 Agent Roles
    3.5.1 DHCP/DC/DNS Win2k3 Servers
    3.5.2 IIS Win2k3 Servers
    3.5.3 MS SQL Win2k3 Servers
    3.5.4 Exchange Win2k3 Servers
    3.5.5 Automation Win2k3 Servers
    3.5.6 Marketing, Accounting, Human Resources, and Sales XP Desktops
    3.5.7 Tech Support XP Desktops
    3.5.8 Executives XP Laptops
    3.5.9 Legal XP Desktops
    3.5.10 Customer Service XP Desktops
  3.6 Mapping Functions to Agent Roles
    3.2.1 Roles and Functions
    3.2.2 Roles with Detailed Backup Functions
    3.2.3 Reasoning for Tape Backup vs. NAS Backup
    3.2.4 Patch Management
    3.2.5 Backup & Disaster Recovery
    3.2.6 Endpoint Security
    3.2.7 User State Management
    3.2.8 Help Desk
4. Glossary

1. Introduction

Over the past few years, Automated Solutions has been plagued with numerous IT problems, ranging from system crashes to unexpected data loss. This past November, several of Automated Solutions' hosting customers lost their data, and the system administrators failed to fully restore it from the available backups. The administrators had never had time to verify the backups. In addition, the backup plan at the time consisted of a full backup on the first of the month and differential backups on the 7th, 14th, and 21st of each month. When it came time to restore the systems, some of the backups were corrupted. As a consequence of the inefficient backup scheme and system implementation, restores took a very long time. Because of the failure to restore customers' data from backup, Automated Solutions lost hundreds of customers. This is one of many problems that have taken place over the past years. Given the current economic conditions, Automated Solutions simply cannot afford to lose any more customers.

At Automated Solutions it has been common practice to deal with problems in a break-and-fix manner. The system administrators react to problems as they occur instead of taking measures to prevent them. To be clear, it is not that the system administrators do not care; they simply do not have enough time given the current system implementation.

1.1 Purpose of the System

The purpose of the proposed system is to cut the operating and salary costs for Automated Solutions while increasing the efficiency and quality of the IT management services.
To achieve this, we will implement a complete IT Automation system for the administrators to use throughout the company.

1.2 Audience

The audience of this Requirements Analysis Document is the system administrators, project managers, CIO, and system designers of Automated Solutions.

1.3 Scope of the System

The services described in this document will vastly improve the IT management duties and structure of the model currently in place. Several of these services are detailed below.

Items that are within the scope of the IT Automation System:

- Patch Management of the computers
  - Automatic and Recurring Patch Scans
  - Patch Approval
  - Automated Patch Deployment
- Inventory and auditing of the computers
  - Complete PC Inventory and Configuration
  - Fast and Easy Deployment
  - Automatic and Scheduled Computer Audits
  - Centralized Inventory Repository
- Network Monitoring and Alerts
  - Monitor the Windows Event log
  - Alert on hardware and software changes
  - Alert on specific file changes and protection violations
  - Know if disk space is running low on computers
  - Monitor computer online/offline status
  - Know if a server goes down
  - Know when traveling users with notebooks connect
  - Alert message and recipient configuration
- Reporting
  - Hardware and Software Inventory
  - Changes to Computer Hardware or Software
  - Disk Utilization
  - License Usage and Compliance
  - Network Usage and Statistics
  - Server and Workstation Uptime History
  - Help Desk Trouble Tickets
  - Computer Logs and Status
  - Security Patch and Update Status
- Backups
  - Backing up the data, whether via NAS or tape, is within the scope of the system, but off-site handling and care of the data for disaster recovery is handled by a 3rd party.
- Security
  - Anti-virus and spyware protection
- Remote Desktop Management
  - Access computers from anywhere with a web browser
  - Transfer files between local and remote computers

Items that are out of the scope of the IT Automation System:

- Maintenance and monitoring of:
  - A/C units in the data center or office
  - Fire extinguisher system in the data center (FM-200 Fire Suppression System)
  - Diesel generator
  - UPS backup
- Overall security of the building (ID badges, video surveillance, etc.)
- Disaster recovery, which is handled by a 3rd party

1.4 Objectives and success criteria of the project

The success of the project relies heavily on solving the issues spelled out in this document. Several issues need to be addressed for both the financial success and the growth of the company. These issues are hindering the company's profitability and must be solved to allow future growth. A more streamlined system that solves the issues for both the hosting services and the in-house web design and programming side needs to be implemented. The objectives are as follows:

1. Streamline the administrative tasks performed by the current IT support professionals.
2. Provide better backup solutions that can support the level of service that is needed.
3. Provide security methods to secure both the Hosting division and the Web Design and Programming division.
4. Provide quick response time and availability of support personnel for problem solving and trouble tickets.
5. Solve all the issues that are labeled later in this document in sections 2.1.3 and 2.2.1.
6. Provide the ability to handle growth with the resources provided.
7. Provide the ability to control software and hardware for company accounting and equipment tracking.
8. Provide the ability to track server and client resources.
9. Provide the ability to easily report on production machines for business decision making.
1.5 Definitions, acronyms, and abbreviations

The following are detailed explanations of the terms used in this document.

Abbreviation | Definition | Description
IIS | Internet Information Services | The web server that Microsoft offers in its Windows Server product line.
C# | C# programming language | Another application development language by Microsoft.
NAS | Network-Attached Storage | A self-contained computer connected to a network, with the sole purpose of supplying file-based data storage services to other devices on the network.
.NET 2003 | A programming framework | A framework of programming languages by Microsoft for creating applications and web interfaces.
Admin | Administrator | Someone who administers the network; a general term for someone who handles the computing equipment for the organization.
Tape | Tape Backup | Refers to a device that has a tape drive. Information is stored on the tape during a one-time operation, usually a backup of all the data from a hard drive or other storage device.
Backup | Backing up data | The action of copying all the contents of a particular storage device onto another storage medium for record storage and for disaster recovery situations that require a previous version of the data.
IT | Information Technology | A term used to describe the use of technology that helps with job workflow and efficiency, usually electronic devices associated with a computer.

1.6 References

The references for this project come from meetings with the client, in which the client stated all the problems and requirements; based on those issues we came up with solutions. These solutions also come from detailed studies of the client's current technology infrastructure and the capabilities that can be improved. Another valuable reference is consultation with employees, who describe the real issues currently happening that management sometimes does not get to see.
1.7 Overview

There have been several issues with the current IT infrastructure that have affected both new customer acquisition and customer retention. Issues with backups and server uptime are continual problems on the hosting side. Availability of administrators and overworked support staff are problems on the programming and support side. The need to address these issues is spelled out in this document in a way that is easy to read and understand. The services described in this document will vastly improve the IT management duties and structure of the current system infrastructure at Automated Solutions. Several of these services are detailed in this document.

2. Current System

The current IT structure is that of a medium-sized hosting company consisting of 2 divisions: the Web Hosting division and the Web Design and WEB 2.0 Programming division.

2.1 Web Hosting

Automated Solutions supplies .NET 2003 framework hosting servers for companies that use .NET with C# for their backend systems. Automated Solutions offers several hosting packages. These packages are very competitive and give the customer many options, which are often combined with services from the in-house programming and design group.

2.1.1 Data integrity and uptime, key to the business

Next to uptime, the integrity of the customer's data is one of the most important parts of the business. Many databases stored on the servers are relied upon by customers to run their businesses. How that data is handled can mean life or death in today's highly competitive industry, especially against corporate hosting companies such as GoDaddy. Offering superior customer response time and service is crucial to success.
2.1.2 Server Farm

255 Windows 2003 servers:
- 220 Windows 2003 IIS Servers
- 10 Windows 2003 MS SQL Servers
- 10 Windows 2003 MS Exchange Servers
- 10 Windows 2003 Domain Controllers / DHCP Servers
- 5 Windows 2003 DNS Servers

Figure 2.1.2a

2.1.3 Issues and problems

With the current system described above, there are several problems that the system administrators have to deal with on a daily basis. These problems are listed below.

Issue 1: Backups of customers' web sites are currently kept on RAID (Redundant Array of Independent Disks) level 0, which is based on a method called striping: the data is broken into smaller chunks that are placed across all available disk drives. Although reading and writing speeds are fast, RAID 0 provides no redundancy, and hard drives are failing. Because the data is spread across all drives, data is being lost whenever a drive fails.

Issue 2: Keeping inventory of the computers has been an ongoing and nearly impossible task. Administrators not only have to create little labels and stick them on each computer, but the Excel document that is currently used to keep track of the computers is edited by many people. This creates chaos, since many people gather the data and their formatting styles (e.g. fonts, colors, etc.) are not consistent.

Issue 3: It is time consuming and hard work to keep track of all the servers that have been patched. Currently administrators are each assigned a certain number of servers for managing patches and updates, but this takes too much time away from their other duties.

Issue 4: The current backup solution only supports restores to a point one week or one month back. In the beginning, sites were mostly static pages with little change in content. Now, with WEB 2.0 applications, dynamic content is used often and changes are made daily. When a disaster (e.g. an HDD crash) happens, a month's worth of information may be lost, and this has in turn cost the company loyal customers.

Issue 5: Conventional KVM switches are used to log into servers. Many trips to the server racks are made throughout the workday to make changes. This can be tedious due to onsite security measures.

Issue 6: Web host availability, performance, and security are not where they need to be, and several security exploits have succeeded against the servers.

Issue 7: The datacenter has automatic updates enabled by default for ease of use. Sometimes the patches break other applications. This was also the reason why many of the websites were defaced recently.

Issue 8: There is no detailed monitoring and logging on servers or other computers at this time. There is a Network Operations Center, but many logs do not get saved. A detailed monitoring system would greatly help in avoiding problems, especially after the defacing incident.

Issue 9: Some of the systems are slow, crash frequently, and do not have updates. There is no accurate accounting of all the equipment. Additionally, Automated Solutions is looking to expand its existing system with new equipment and peripherals.

Issue 10: Licensing, auditing, and inventory are not in place in Automated Solutions' current setup. Automated Solutions is currently not in compliance with licensing. If Automated Solutions were audited by the government or by software companies, it could face very severe fines.

Issue 11: Problems are dealt with as they occur, instead of being prevented from occurring. In IT management it is better to be proactive than reactive.

2.2 Office Departments

The rest of the back office overhead is handled here, along with the call center and programmers.

- 80 Windows XP Desktops
- 10 Windows XP Laptops

Figure 2.3a

2.2.1 Issues and problems

Issue 12: HR/Accounting/Marketing/Sales have been saving their documents to their local computers. This information is not currently being backed up. This is a serious risk in case of hard drive failure or if files or data are erased.
Several employees' hard drives are failing, so some sort of centralized storage with backup is needed.

Issue 13: Employees are always calling the operations department about basic PC issues. Most of the time an operations technician has to be sent to the physical computer to fix the issue.

Issue 14: Operations receives many calls about employees' computers that do not boot up correctly, reporting that the boot loader is missing or corrupt. Nine times out of ten this is because the employee's hard drive has filled up. If Operations could get an alert or some other notification when HDD usage hits 90%, the technician would have time to replace, repair, or reformat the hard drive before it fails. It would also avoid wasting the employee's valuable time on such a simple PC problem.

Issue 15: Employees in the marketing department use programs such as Adobe Illustrator, Photoshop, Flash, and other memory-intensive applications. There have been several instances in which employees added their own memory modules to their desktop computers, which in time burned out the motherboard. They did not even attempt to check that the memory was compatible with the motherboard. Our technicians had to rebuild the entire computer. Employees also add their own video cards and other components that are not needed for their job duties.

Issue 16: There have been issues with employees in the office being able to set whatever wallpaper or screensaver they like on their desktop. Some have been offensive to others, and management has had to hear about it. A method of controlling this is needed due to past threats of sexual harassment lawsuits.

Issue 17: Presently, employees leave their computers on when they are away from them, including when they go home. This creates two problems: unnecessary power is used, and workstations left logged on are a security concern.
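The disk-space alert requested in Issue 14 can be illustrated with a minimal Python sketch. This is an assumption about how such a check might look, not a description of the IT Automation product: the function names and the `ALERT_THRESHOLD` constant are hypothetical, and only the 90% figure comes from this document.

```python
import shutil

# 90% of capacity, the figure requested in Issue 14.
ALERT_THRESHOLD = 0.90


def usage_fraction(used_bytes: int, total_bytes: int) -> float:
    """Return the fraction of the disk that is in use (0.0 to 1.0)."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return used_bytes / total_bytes


def needs_alert(used_bytes: int, total_bytes: int,
                threshold: float = ALERT_THRESHOLD) -> bool:
    """True when disk usage has reached or passed the alert threshold."""
    return usage_fraction(used_bytes, total_bytes) >= threshold


def check_path(path: str) -> bool:
    """Check the volume that holds `path` using the standard library."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free
    return needs_alert(usage.used, usage.total)


if __name__ == "__main__":
    # Hypothetical usage: an agent would run this on a schedule and
    # notify Operations instead of printing.
    if check_path("."):
        print("Disk usage is at or above 90%: notify Operations")
```

In a real deployment the check would run inside the monitoring agent and raise a ticket or email; printing is used here only to keep the sketch self-contained.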
In addition to hosting, which provides most of the base cash revenue, website design and WEB 2.0 programming bring in a considerable amount of revenue. For this side of the business, optimization and crisis elimination are needed. Many "crisis" situations take place, causing a considerable financial drain to fix the "problem of the week". In the past, the usual "throw money at the problem" solution has never yielded long-term results. There are project managers who work with the customer and with several programmers/designers to complete the tasks at hand.

2.2.2 New Customer Process Flow

There are 2 main ways of signing up and processing new customers.

1. Phone sales
The call center runs from 9AM to 4PM Eastern Standard Time. Agents take orders by phone and set up automatic payments through a credit card ordering system run by a 3rd party.

2. Online website signups
Signups go through the main website via the following process:
- An initial search for available domains is done
- The domain is purchased through the DNS server and DNS provisioning service
- A hosting package is selected
- Online merchant payment and auto payment are set up
- Access controls are configured and the username and password are sent
- Contact information is given, along with automated customer assistance for the initial login

2.2.3 Customer Issues / Helpdesk Process Flow

There are 2 main ways for customers to get help with hosting issues: helpdesk phone support and online chat/email support. Both require onsite resources and office space.

1. Helpdesk Workflow
- The caller queue assigns calls to a support specialist
- The support specialist logs the call into their own personal Excel spreadsheet
- The support specialist works with the server team to try to resolve the customer's issue

2. Online Support Workflow
- The support specialist works with the chat interface and help@automatedSolutions.com
- The support specialist answers questions and works with specific server administrators to complete the tasks
Reasons for Auditing / Automation

Auditing is often overlooked as a vital tool for preventing system intrusion and compromise. Implementing auditing and IT automation will provide fast, reliable reporting for the computing infrastructure, which would be a tremendous help to the administrators in the data center. We examine the need for auditing and automation for each of the two business units listed above.

Web Hosting

Currently administrators do not have an easy method of tracking and managing backups. Being able to perform incremental backups would be a huge advantage; the inability to do so often causes issues when systems go down. In addition, combining full backups with incremental and differential backups would greatly reduce the network bandwidth and server resources needed to perform backups. Administrators are also not verifying backups, because of the time constraints of managing the systems without any type of automation and auditing. As a result, recoveries sometimes fail due to corrupt backups. Another critical matter that can be addressed with IT automation is software licensing. Automated Solutions does not have a system for knowing which licenses are on which computers. This in turn causes problems with inventory.

Web design and programming

With automation, desktop environments can be standardized and policies can be put into place to enforce settings such as the approved company wallpaper. Another issue that can be addressed by IT automation is power efficiency. Through IT automation, power settings and policies can be put into place so that hard drives and monitors go into sleep mode after a predetermined amount of time. Policies can also be set so that when a PC is idle for a few minutes the workstation is locked, requiring a password to be entered in order to use it.
Having an auditing system in place to monitor system changes, such as new applications being installed or changes made to a system, would be of great help to the administrators. When any such event occurs, our proposed system would email the administrator, or could even deal with the issue dynamically using a scripting tool.

3. Proposed System

3.1 Overview

3.1.1 Cost Benefits of the Proposed System

Currently, Automated Solutions has over twenty administrators on the payroll constantly monitoring the servers to make sure they are up and running. Their job is to make sure that the servers in the server farm are updated with the latest patches, that backups of customers' data are consistent and valid, and most of all that the servers and network are running properly. Most of the money spent on salaries at Automated Solutions goes to the administrators. An average of $350,000 a year is currently being spent on the administrators. Most of the administrators work part-time, but a good number are full-time as well as on-call. Implementing an IT automation solution would not only make the administrators' jobs easier, so that they can attend to more system-critical issues, but would also allow Automated Solutions to cut back on the cost of administrator salaries. Figure 3.1.1a is a chart that shows how the total of the administrators' salaries has increased over the years:

Figure 3.1.1a

Since 2005 the number of staffed administrators has grown from 3 to 22, and the combined yearly salary has increased from $35,000 to $350,000. It is understandable that as the company grows there is a need for more staff, but with IT automation the servers they administer could be managed by half as many administrators. We believe that the number of administrators can be cut by over 77%, with salary savings of 47%.
Implementing an IT automation approach can cut the cost of salaries as well as give the administrators more time to handle other critical issues such as escalations and helpdesk tickets.

3.1.2 Server Farm

515 Windows 2008 servers:
- 465 Windows 2008 IIS Servers
- 10 Windows 2008 MS SQL Servers
- 10 Windows 2008 MS Exchange Servers
- 10 Internal Windows 2008 Domain Controller / DNS Servers
- 10 Windows 2008 DHCP Servers
- 5 External Windows 2008 DNS Servers
- 5 IT Automation Servers

Figure 3.1.2a

3.1.3 Office Departments

80 XP Desktops and 10 XP Laptops:
- 3 Marketing XP Desktops
- 7 Accounting XP Desktops
- 5 Human Resources Desktops
- 15 Sales XP Desktops
- 20 Tech Support Desktops
- 10 Executives XP Laptops
- 5 Legal XP Desktops
- 25 Customer Service XP Desktops

3.2 Functional Requirements

3.2.1 Remote Control

Administrators have to take time out of their busy schedules to physically go to the computers that are having issues. Doing this takes time away from other issues that need more attention. A solution that enables them to remotely access any computer in the server farm or in the office would greatly reduce their troubleshooting time and allow them to focus on more critical needs in the workplace. Administrators are frequently walking from server to server or workstation to workstation fixing issues, and other employees who need them rarely find them in their offices. This has caused several prolonged server outages (server downtime) when administrators were elsewhere and could not be found.

3.2.2 Auditing & Asset management

Administrators have to keep track of all the computers that are currently on the network, including their upgrades and physical location. Currently, they use an Excel spreadsheet that lists all of the computers and their components.
This Excel file is updated by most of the administrators, which causes problems with formatting and with keeping track of the most up-to-date copy of the inventory file. A solution that lets the administrators easily list and monitor the computers would save time and reduce their frustration. An inventory will be maintained of all machines that are on the network and can be successfully scanned by the automation software. Additional audits, such as audits of installed software, will be conducted by the same criteria. This will greatly help in maintaining both software licensing information and hardware inventories.

3.2.3 Monitoring

Administrators who keep a close watch on network and computer performance have a difficult time doing so. Often they have to log into each individual server and manually look at the Event Viewer to determine whether the server is up and running without any problems. A solution that keeps track of hard drive usage and detects when a new computer is added to the network would help administrators troubleshoot issues more effectively. Using the monitoring functions, the IT Automation system will be able to alert when any hardware or software change happens on a system that can be configured for monitoring. These monitored computers will send alerts to a central source for monitoring and management. The network will be monitored by the IT automation software through agents on the devices that are on the network and are allowed to be monitored. This monitoring will be capable of sending alerts when problems with devices on the network occur, which is particularly useful for the IIS server farm that the company uses. The automation package will be able to reach selected machines that are running a Windows operating system and upload the appropriate log files. These logs will be monitored by the IT Automation software, which will be able to recognize problems and other events of interest.
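The hardware and software change alerts described in this section can be pictured as a comparison between two audit snapshots. The Python sketch below is only an illustration under stated assumptions: the document does not specify how the IT Automation software detects changes, and the function names and the sample software names are hypothetical.

```python
from typing import Dict, List, Set


def diff_inventory(previous: Set[str], current: Set[str]) -> Dict[str, List[str]]:
    """Compare two audit snapshots of one machine.

    Each snapshot is a set of item names (installed software or
    hardware components). Returns the items that appeared and the
    items that disappeared between the two audits.
    """
    return {
        "added": sorted(current - previous),
        "removed": sorted(previous - current),
    }


def has_changes(diff: Dict[str, List[str]]) -> bool:
    """True when the comparison found anything worth an alert."""
    return bool(diff["added"] or diff["removed"])


if __name__ == "__main__":
    # Hypothetical snapshots from two consecutive scheduled audits.
    last_audit = {"IIS", "Antivirus Client", "Office Suite"}
    this_audit = {"IIS", "Antivirus Client", "Unapproved Game"}
    changes = diff_inventory(last_audit, this_audit)
    if has_changes(changes):
        print("Alert:", changes)
```

Sorting the results keeps the alert text stable from run to run, which makes repeated alerts easier to compare.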
Specific alerts can be created with the IT Automation system to identify specific file changes and protection violations. This will allow advanced programmable monitoring of any critical functions that need attention. Other alerting functions include notification of low disk space on the servers and workstations that can be monitored. In addition, the online/offline status of the computers will be monitored, with an alert for any outages of network appliances.

3.2.4 Patch Management

Keeping track of and deploying patches and updates to the computers on the network has been a difficult task for the administrators. Some computers do not get updated because the administrators do not have an effective or organized way of doing so. Being able to deploy patches across a selected group of computers would make the administrators' jobs much more efficient. A patch management structure will be put in place that automates the patching of machines once updates to selected software and operating systems have been chosen. Automated updates that are centrally administered and controlled will be handled by the new IT Automation package. Its functionality and performance are explained in this document. Selected and tested patches will be delivered to their target machines by an automated system.

3.2.5 Backup & Disaster Recovery

Being able to track and change the backup/recovery scheme, and to validate the backups, would ensure that the backups are valid so that administrators do not have to worry when it is time to restore.

3.2.6 Endpoint Security

Being able to effectively deploy and manage anti-virus, spyware, and root-kit protection on the server farm computers as well as the office computers would ensure optimal protection throughout the workplace.
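One common way to validate a backup, as called for in section 3.2.5, is to record a checksum when the backup is written and recompute it before the backup is relied upon. The Python sketch below assumes this approach; the document does not state which validation method the IT Automation system uses, and the function names are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 digest of a file, read in 1 MiB chunks
    so that large backup files do not have to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_backup(backup_path: Path, expected_digest: str) -> bool:
    """True when the backup file still matches the digest that was
    recorded at the time the backup was taken."""
    return sha256_of(backup_path) == expected_digest
```

A matching checksum shows that the file has not been corrupted since it was written. It does not prove that the backup can be restored, so periodic test restores would still be needed.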
3.2.7 User State Management

To reduce the electric bill and increase the life span of the computers' components, the administrators need a solution for adjusting power settings. Being able to manage network drive and printer mappings easily would also help the administrators keep those policies consistent and manageable.

3.2.8 Help Desk

Administrators spend most of their time handling Help Desk tickets from the tech support staff that deal with server-related issues. An organized ticketing system would increase the administrators' productivity. Administrators are looking for an email alerting feature within the ticketing system so that they can be notified quickly when an issue occurs. Help Desk trouble tickets and remote control applications shall be implemented by the IT Automation system to help the helpdesk with problem and ticket resolution.

3.3 Nonfunctional requirements

3.3.1 Usability

With all the changes in automation and systems integration, users and employees must be able to accomplish the same tasks that they did before, and the usability of their systems should improve once all the improvements are done. Many users are asking that the network and shared files be streamlined and made more user-friendly. The IT Automation system must be intuitive and easy to use so that problems can be solved quickly; fast resolution during downtime is critical for customer satisfaction and for meeting the service level agreement stated in each web package. Customer satisfaction and customer enthusiasm about hosting and about selecting the company's web designers and programmers are the key to success and maximum profits. The IT automation system should be available via a website that is remotely accessible by the administrators, whether they are at home or at work. A standard Internet connection and a login for the administrators are required.
3.3.2 Reliability

Desktop users are requesting that, with all the changes, their computers become more reliable for doing their work. They need their desktops not to freeze due to memory issues or full hard drives. Uptime is critical for the server farm. Reliability is a very important part of this formula, and components are often scrutinized for their reliability over their "bang for buck" features. Reputation is everything, so the reliability of the IT automation system is critical. The system needs to reliably deliver patch updates and anti-virus updates, reliably remote control servers, and update software when vulnerabilities are detected. If the server running the IT automation system is down, administrators would be notified via email and/or SMS. There should be no time during the week when the system is down or when the administrators are unable to access it.

3.3.3 Performance

The biggest issue employees have, and the one for which they most request a solution, is that their computers must perform to the level required for the work that needs to be done. Users want a better way to handle viruses, spyware, and all the software updates and scanning in order to improve the performance of their desktops. To deliver on these tasks in a timely manner, it is critical that the server(s) running the IT automation system be able to withstand a high amount of traffic and use the system's resources effectively.

3.3.4 Supportability

Users and employees also request that their desktops not be down and that they not be left without an understanding of the new improvements. They want to be comfortable with the new features and automation on their systems, so they want support and help whenever they need it or when their desktop is having issues. When administrators access the IT automation system remotely, they should be able to use a standard web browser. The system should support all Windows and Mac based platforms.
3.4 Agent Groups

3.5 Agent Roles
Below are several tables explaining the different procedure classes used in the roles.

Backup Procedure Classes
Class A: In this class, a full backup will be performed on the 1st of each month, differential backups on the 7th, 14th and 21st of each month, and incremental backups on all other days. These backups will be stored on tapes.
Class B: In this class, a full backup will be performed on the 2nd of each month, differential backups on the 8th, 15th and 22nd of each month, and incremental backups on all other days. These backups will be stored on Network-Attached Storage (NAS) devices.
Class C: In this class, a full backup will be performed on the 2nd of each month, and differential backups on the 8th, 15th and 22nd of each month. These backups will be stored on Network-Attached Storage (NAS) devices.
Table 3.5a – Explanation of the different Backup Procedure Classes

Patch Management Procedure Classes
Class A: In this class, patch management and updates will be performed on test servers by the administrators and then applied to the machines as they become available. On the test servers, the updates will be set to download but not to install. The administrators will follow a six-step process to ensure that the patches are applied in an organized and timely manner:
Notification – Administrators will be made aware of new security updates or service packs.
Assessment – Administrators will identify which Windows 2008 servers on the network require the security update or service pack.
Obtainment – Administrators will acquire the security update or service pack from Microsoft.
Testing – Administrators will test the security updates or service packs before they are applied to the other Windows 2008 servers on the network to ensure that no issues occur.
Deployment – Administrators will deploy those security updates or service packs in a timely manner.
Validation – Administrators will then make sure that the security updates or service packs have been fully installed on the Windows 2008 servers.
Class B: In this class, patch management and updates will be performed by the users themselves and applied to the machines.
Table 3.5b – Explanation of the different Patch Management Procedure Classes

Security Procedure Classes
Class A: In this class, virus scanning will be done twice a week, on Mondays and Fridays at 4:00am, to keep the machines free of malicious programs. Anti-virus definition updates will be done daily at 1:30am on one of the automation servers and then pushed down to the machines that need updating. Some services should be disabled by default for security reasons, including: Messenger, Alerter, License Logging Service.
Class B: In this class, virus scanning will be done once a week, on Wednesdays at 3:00am, to keep the machines free of malicious programs. Anti-virus definition updates will be done daily at 12:00am on one of the automation servers and then pushed down to the machines that need updating.
Table 3.5c – Explanation of the different Security Procedure Classes

Remote Control Procedure Classes
Class A: In this class, remote control capability will be available so that administrators can access servers remotely in case they need to restart services manually or update mailbox settings, and so that tech support technicians can fix office computers’ problems remotely without having to go physically to the computers.
Table 3.5d – Explanation of the different Remote Control Procedure Classes

Audit Inventory Procedure Classes
Class A: In this class, auditing will be performed so that administrators can keep track of the hardware components on each server as well as the software installed. It will also assure management that the hardware components have not changed and that no prohibited software has been installed by employees.
Class B: In this class, no auditing will be performed, as there is no need to keep track of hardware components.
Table 3.5e – Explanation of the different Audit Inventory Procedure Classes

Monitoring Procedure Classes
Class A: In this class, disk space will be monitored to make sure that sufficient space is available on all the servers. Alerts will also be put in place to monitor some of the services that are required for the server to run properly. Administrators will then be notified if a service is down or if the server is unreachable.
Class B: In this class, monitoring of applications and network usage will be performed automatically, and alerts will be displayed for different purposes. For example, monitoring can assist energy efficiency by turning monitors off when users are idle, and can apply the same wallpaper to all computers in the same role. Another example is monitoring network and bandwidth usage to make sure employees are not streaming content or downloading music or video.
Table 3.5f – Explanation of the different Monitoring Procedure Classes

3.5.1 Internal DC/DNS Windows 2008 servers
This role will have 10 internal Windows 2008 Domain Controller/DNS servers that will provide domain controller functionality as well as DNS to the internal network. This role will have the following automation functionalities:
This role will initiate a Class A backup procedure to ensure Active Directory, DNS and the schema database are backed up. Using a Class A backup procedure ensures the data will be easily restorable while using the least amount of tape media when a restore needs to be done.
Security is important on Windows 2008 servers handling Active Directory and DNS, so a Class A security procedure will be initiated.
Here are several services that should be enabled in order for the internal Windows 2008 DC/DNS servers to function properly: DHCP Server, DNS Server, IIS Admin Service.
The internal Windows 2008 DC/DNS servers will have a Class A remote control procedure so the administrators can access these servers remotely in case they need to restart services manually, make changes to Active Directory, or update DNS settings.
A Class A patch management procedure will be initiated on the internal DC/DNS servers to make sure that the patches and updates are tested and deployed in an organized and timely manner.
Auditing of the 10 Windows 2008 DC/DNS servers with a Class A auditing procedure will help the administrators keep track of the hardware components on each server as well as the software installed.
Since Active Directory and DNS are stored on these internal Windows 2008 DC/DNS servers, a Class A monitoring procedure will be used.

3.5.2 IIS Windows 2008 servers
This role will have 465 Windows 2008 Internet Information Services (IIS) servers that will store all of the customers’ web site files. Some of the web site file types that will be stored are .HTML, .ASP, .JPG, .GIF and other web-related files. This role will have the following automation functionalities:
This role will initiate a Class A backup procedure to ensure IIS settings and the customers’ web files are backed up. Using a Class A backup procedure ensures the data will be easily restorable while using the least amount of tape media when a restore needs to be done.
Security is extremely important on Windows 2008 servers storing customers’ web sites and other files, so a Class A security procedure will be initiated.
Here are several services that should be enabled in order for the Windows 2008 IIS servers to function properly: IIS Admin Service, World Wide Web Publishing Service, ASP.NET State Service.
The IIS servers will have a Class A remote control procedure so the administrators can access these servers remotely in case they need to restart services manually or view/manage customer web site files.
A Class A patch management procedure will be initiated on the IIS servers to make sure that the patches and updates are tested and deployed in an organized and timely manner.
Auditing of the 465 Windows 2008 IIS servers with a Class A auditing procedure will help the administrators keep track of the hardware components on each server as well as the software installed.
Since all of the customers’ web site files are stored on these Windows 2008 IIS servers, a Class A monitoring procedure will be used.

3.5.3 External Windows 2008 DNS servers
This role will have 5 external Windows 2008 DNS servers that will provide DNS lookups for the hosting customers. This role will have the following automation functionalities:
This role will initiate a Class A backup procedure to ensure the customers’ DNS settings are backed up. Using a Class A backup procedure ensures the data will be easily restorable while using the least amount of tape media when a restore needs to be done.
Security is important on Windows 2008 servers handling DNS, so a Class A security procedure will be initiated.
Here are several services that should be enabled in order for the external Windows 2008 DNS servers to function properly: DNS Server, IIS Admin Service.
The external Windows 2008 DNS servers will have a Class A remote control procedure so the administrators can access these servers remotely in case they need to restart services manually, update DNS settings, or manually edit the zone files for the hosting customers.
A Class A patch management procedure will be initiated on the external DNS servers to make sure that the patches and updates are tested and deployed in an organized and timely manner.
Auditing of the 5 Windows 2008 DNS servers with a Class A auditing procedure will help the administrators keep track of the hardware components on each server as well as the software installed.
Since DNS settings are stored on these external Windows 2008 DNS servers, a Class A monitoring procedure will be used.

3.5.4 Windows 2008 DHCP servers
This role will have 10 Windows 2008 DHCP servers that will provide all of the Windows 2008 servers in the server farm with IP addresses:
This role will initiate a Class A backup procedure to ensure DHCP settings, subnets and network settings are backed up. Using a Class A backup procedure ensures the data will be easily restorable while using the least amount of tape media when a restore needs to be done.
Security is important on Windows 2008 servers handling DHCP, so a Class A security procedure will be initiated.
Here are several services that should be enabled in order for the Windows 2008 DHCP servers to function properly: DHCP Server, IIS Admin Service.
The Windows 2008 DHCP servers will have a Class A remote control procedure so the administrators can access these servers remotely in case they need to restart services manually, update DHCP settings, or manually edit/configure subnets.
A Class A patch management procedure will be initiated on the DHCP servers to make sure that the patches and updates are tested and deployed in an organized and timely manner.
Auditing of the 10 Windows 2008 DHCP servers with a Class A auditing procedure will help the administrators keep track of the hardware components on each server as well as the software installed.
Since DHCP settings are stored on these Windows 2008 DHCP servers, a Class A monitoring procedure will be used.
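The server roles above all rely on the Class A backup procedure from Table 3.5a, so it may help to see the three backup classes written out as data. The Python sketch below is illustrative only: the names `BACKUP_CLASSES`, `backup_type`, and `restore_chain` are hypothetical, the restore logic looks only within a single month, and it assumes that each incremental backup captures changes since the most recent backup of any kind.

```python
from typing import List, Optional, Tuple

# Transcribed from Table 3.5a:
# class -> (full-backup day, differential days, incrementals on all other days?)
BACKUP_CLASSES = {
    "A": (1, (7, 14, 21), True),   # stored on tape
    "B": (2, (8, 15, 22), True),   # stored on NAS
    "C": (2, (8, 15, 22), False),  # stored on NAS, no incrementals
}


def backup_type(backup_class: str, day: int) -> Optional[str]:
    """Return the kind of backup that runs on `day` of the month, if any."""
    full_day, diff_days, incrementals = BACKUP_CLASSES[backup_class]
    if day == full_day:
        return "full"
    if day in diff_days:
        return "differential"
    return "incremental" if incrementals else None


def restore_chain(backup_class: str, day: int) -> List[Tuple[int, str]]:
    """List the backups to apply, oldest first, to restore the state as of `day`."""
    full_day, diff_days, _ = BACKUP_CLASSES[backup_class]
    if day < full_day:
        raise ValueError("no full backup has run yet this month")
    chain = [(full_day, "full")]
    start = full_day
    earlier_diffs = [d for d in diff_days if d <= day]
    if earlier_diffs:
        # A differential holds everything since the full backup, so only the
        # latest one is needed.
        start = max(earlier_diffs)
        chain.append((start, "differential"))
    # Incrementals taken after that point must all be applied in order.
    for d in range(start + 1, day + 1):
        if backup_type(backup_class, d) == "incremental":
            chain.append((d, "incremental"))
    return chain
```

Under these assumptions, `restore_chain("A", 9)` yields the full backup from the 1st, the differential from the 7th, and the incrementals from the 8th and 9th, whereas `restore_chain("C", 9)` stops at the differential from the 8th because Class C has no incrementals.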
3.5.5 Windows 2008 Exchange servers
This role will have 10 Windows 2008 Exchange servers that will provide email services to all of the hosting customers:
This role will initiate a Class A backup procedure to ensure customers’ mail accounts and email messages are backed up. Using a Class A backup procedure ensures the data will be easily restorable while using the least amount of tape media when a restore needs to be done.
Security is extremely important on these Windows 2008 Exchange servers handling customers’ email, so a Class A security procedure will be initiated.
Here are several services that should be enabled in order for the Windows 2008 Exchange servers to function properly: Internet Mail Service, IIS Admin Service.
The Windows 2008 Exchange servers will have a Class A remote control procedure so the administrators can access these servers remotely in case they need to restart services manually or update mailbox settings.
A Class A patch management procedure will be initiated on the Exchange servers to make sure that the patches and updates are tested and deployed in an organized and timely manner.
Auditing of the 10 Windows 2008 Exchange servers with a Class A auditing procedure will help the administrators keep track of the hardware components on each server as well as the software installed.
Since email messages are stored on these Windows 2008 Exchange servers, a Class A monitoring procedure will be used.

3.5.6 Windows 2008 SQL servers
This role will have 10 Windows 2008 SQL servers that will provide database services and store the web hosting customers’ databases.
This role will initiate a Class A backup procedure to ensure customers’ databases are backed up. Using a Class A backup procedure ensures the data will be easily restorable while using the least amount of tape media when a restore needs to be done.
Security is extremely important on these Windows 2008 SQL servers handling customers’ databases, so a Class A security procedure will be initiated.
Here are several services that should be enabled in order for the Windows 2008 SQL servers to function properly: SQL Server, IIS Admin Service.
The Windows 2008 SQL servers will have a Class A remote control procedure so the administrators can access these servers remotely in case they need to restart services manually or update mailbox settings.
A Class A patch management procedure will be initiated on the SQL servers to make sure that the patches and updates are tested and deployed in an organized and timely manner.
Auditing of the 10 Windows 2008 SQL servers with a Class A auditing procedure will help the administrators keep track of the hardware components on each server as well as the software installed.
Since the customers’ databases are stored on these Windows 2008 SQL servers, a Class A monitoring procedure will be used.

3.5.7 Windows 2008 Automation servers
This role will have 5 Windows 2008 Automation servers that will handle the IT automation functions.
This role will initiate a Class A backup procedure to ensure all automation server settings/configurations are backed up. Using a Class A backup procedure ensures the data will be easily restorable while using the least amount of tape media when a restore needs to be done.
Security is extremely important on these Windows 2008 automation servers handling the IT automation functions of the network, so a Class A security procedure will be initiated.
Here are several services that should be enabled in order for the Windows 2008 automation servers to function properly: Routing and Remote Access, IIS Admin Service.
The Windows 2008 automation servers will have a Class A remote control procedure so the administrators can access these servers remotely in case they need to restart services manually or make any server-level changes.
A Class A patch management procedure will be initiated on the automation servers to make sure that the patches and updates are tested and deployed in an organized and timely manner.
Auditing of the 5 Windows 2008 automation servers with a Class A auditing procedure will help the administrators keep track of the hardware components on each server as well as the software installed.
Since the IT automation services/functions are stored on these Windows 2008 automation servers, a Class A monitoring procedure will be used.

3.5.6 Accounting, Human Resources, and Legal XP Desktops
This role will have 7 Accounting, 5 Human Resources, and 5 Legal XP desktops. All these computers will have the same hardware configuration and similar types of software. This role will also have the same power policies. In addition, this role will have the following automation functionalities:
This role will have a Class B backup procedure to ensure that all the data on these machines is backed up frequently and that, when the time comes to restore any data, it can be done efficiently and in the fewest steps.
This role will have a Class B security procedure. The employees of the accounting, human resources, and legal departments are not very technical, so automatic virus scanning and definition updates will help keep the machines protected.
This role will have a Class A remote control procedure because the machines need to be reachable remotely in case support needs to be done on them.
This role will have a Class B patching procedure because there is no need to automate patching and it is relatively easy to roll back any patch that causes problems.
This role will have a Class A auditing procedure. These machines need to have a specific hardware configuration, and hardware auditing will make sure those configurations are not changed.
This role will have Class B monitoring so that users of these machines do not install any other software on them and so that the same power policies apply to all the machines.

3.5.7 Tech Support, Customer Service, and Sales XP Desktops
This role will have 20 Tech Support, 25 Customer Service, and 15 Sales XP desktops. This role will have the following automation functionalities:
This role will have a Class C backup procedure to back up important data on these machines. The data in this role is not as critical as in others, so it will only have full backups and differential backups on the days specified; no incremental backups will be done for this role.
This role will have a Class B security procedure. Virus scanning and definition updates will be done automatically without user input to keep these machines protected.
This role will have a Class A remote control procedure because the machines need to be reachable remotely in case support needs to be done on them.
This role will have a Class B patching procedure; this role is not critical enough to warrant tested patching, so patching is manual.
This role will have a Class A auditing procedure because hardware configurations must not change from what they originally are.
This role will have Class B monitoring because it helps ensure that power policies are followed and that employees make no prohibited use of the machines.

3.5.8 Marketing XP Desktops
This role will have 3 Marketing XP desktops. These desktops will have a better hardware configuration than all the other departments’ XP desktops because marketing computers run memory-intensive software like Photoshop and Illustrator. This role will have the following automation functionalities:
This role will have a Class B backup procedure to back up critical data such as graphic files from the marketing department.
This role will have a Class B security procedure. Virus scanning and definition updates will be done automatically without user input to keep these machines protected.
This role will have a Class A remote control procedure because the machines need to be reachable remotely in case support needs to be done on them.
This role will have a Class A patching procedure. The data and work on these machines are important enough that it would be a problem if the machines were down for a long time, so patching will be done only after the patches have been approved as safe to install.
This role will have a Class A auditing procedure because it is essential to make sure there is enough disk space to store graphic files, which are usually large, and enough memory to handle the workload these machines have.
This role will have Class B monitoring to make sure rules are being followed by employees.

3.5.9 Executives XP Laptops
This role has 10 Executive XP laptops. This role will have the following automation functionalities:
This role will have a Class B backup procedure because the executives using these laptops want their work and data to be securely backed up.
This role will have a Class B security procedure. Virus scanning and definition updates will be done at a designated time when users are less likely to be using the computers. For this laptop role, this means that executives will have to keep the machines online at those times.
This role will have a Class A remote control procedure in case the executives are away and support needs to fix anything on the laptops.
This role will have a Class A patching procedure because executives cannot afford for their laptops to be down when they need them for presentations and other important work. Testing patches before applying them to these laptops will ensure no complications arise from patching problems.
This role will have Class B auditing because this role contains only laptops, which come preconfigured, and no hardware changes will be made to them.
Also, these laptops are used by executives and not employees, so there is no risk of prohibited hardware changes being done on them by employees.
This role will have Class B monitoring so that power policies are enforced, especially since these laptops also run on batteries.

3.6 Mapping Functions to Agent Roles
3.6.1 Roles and Functions

Roles                                                  Backup   Security  Remote Control  Patch Mgt/Testing  Audit Inventory  Monitoring
                                                       (Class)  (Class)   (Class)         (Class)            (Class)          (Class)
Windows 2008 IIS Servers                               A        A         A               A                  A                A
Internal Windows 2008 DC/DNS Servers                   A        A         A               A                  A                A
External Windows 2008 DNS Servers                      A        A         A               A                  A                A
Windows 2008 DHCP Servers                              A        A         A               A                  A                A
Windows 2008 SQL Servers                               A        A         A               A                  A                A
Windows 2008 Exchange Servers                          A        A         A               A                  A                A
Windows 2008 Automation Servers                        A        A         A               A                  A                A
Accounting, Human Resources, and Legal XP Desktops     B        B         A               B                  A                B
Tech Support, Customer Service, and Sales XP Desktops  C        B         A               B                  A                B
Marketing XP Desktops                                  B        B         A               A                  A                B
Executive XP Laptops                                   B        B         A               A                  B                B

Table 3.6.1a: Each letter indicates the class of features and automation functions that will be applied to each individual role based on the procedure classes’ tables.

3.6.2 Roles with Detailed Backup Functions
Table 3.6.2a below shows Automated Solutions’ current backup roles as well as a calendar showing when certain backups are performed.
Server Farm Backup Roles                             FB    DB                Tape  NAS  Local
Copper Servers                                       1st   7th, 14th, 21st   Yes
Bronze Servers                                       1st   7th, 14th, 21st   Yes
Silver Servers                                       1st   7th, 14th, 21st   Yes
Platinum Servers                                     1st   7th, 14th, 21st   Yes
MS SQL Servers                                       1st   7th, 14th, 21st   Yes
MS Exchange Servers                                  1st   7th, 14th, 21st   Yes
DNS/DC Servers                                       1st   7th, 14th, 21st   Yes

Office Backup Roles
Accounting, Human Resources, and Legal Computers                                        Yes
Tech Support, Customer Service, and Sales Computers                                     Yes
Marketing Computers                                                                     Yes
Executives Laptop Computers                                                             Yes

Table 3.6.2a – Shows the current backup roles in the server farm and office
Yes Yes Yes Yes Yes Yes Yes - -

FB – Full Backup (complete backup of all files)
Full backup is performed on the 1st of every month at 2am for the Data Center computers.
DB – Differential Backup (catches all files that have changed since the last full backup)
Differential backup is performed on the 7th, 14th, and 21st of each month.

Currently, the data on the servers is backed up 4 times a month: first a full backup, and then 3 separate differential backups throughout the rest of the month. The issue with this is that if a restore is needed for the 9th, the 16th, or any other day on which a backup isn’t performed, only the data up to the most recent differential backup is restorable. For example, if the data needs to be restored as of the 9th, the full backup is restored first, then the differential backup from the 7th, but no data from the 8th to the 9th is recoverable. Adding incremental backups between the differential backups would likely be the best way to close this gap.

Office computers are currently saving their data to their local hard drives. This is an issue in case of hard drive failure on those particular computers. Also, files currently cannot be shared among the rest of the employees in the department. Using NAS for those office computers would allow the files to be saved off of the local computers, where the data can be accessed by the employees in the departments.
Also, saving files to the NAS would alleviate the issue of losing the data in case of a hard drive failure on those local computers.

Figure 3.6.2a below shows the current backup roles in a calendar format:
Figure 3.6.2a

Table 3.6.2b shows Automated Solutions’ proposed backup roles as well as a calendar showing when certain backups would be performed.

Server Farm Backup Roles                             FB    IB    DB                Tape  NAS
Windows 2008 IIS Servers                             1st   AOD   7th, 14th, 21st   Yes
Internal Windows 2008 DC/DNS Servers                 1st   AOD   7th, 14th, 21st   Yes
External Windows 2008 DNS Servers                    1st   AOD   7th, 14th, 21st   Yes
Windows 2008 DHCP Servers                            1st   AOD   7th, 14th, 21st   Yes
Windows 2008 Exchange Servers                        1st   AOD   7th, 14th, 21st   Yes
Windows 2008 SQL Servers                             1st   AOD   7th, 14th, 21st   Yes
Windows 2008 Automation Servers                      1st   AOD   7th, 14th, 21st   Yes
- Yes Yes Yes Yes Yes Yes Yes

Office Backup Roles
Accounting, Human Resources, and Legal Computers     2nd   AOD   8th, 15th, 22nd         Yes
Tech Support, Customer Service, and Sales Computers  2nd         8th, 15th, 22nd         Yes
Marketing Computers                                  2nd   AOD   8th, 15th, 22nd         Yes
Executives Laptop Computers                          2nd   AOD   8th, 15th, 22nd         Yes

Table 3.6.2b – Shows the proposed backup roles in the server farm and office
Yes Yes Yes Yes

FB – Full Backup (complete backup of all files)
Full backup is performed at 2am on the 1st of every month for the Data Center computers and at 2am on the 2nd of every month for the Office computers.
IB – Incremental Backup (catches files that have changed since the last backup)
AOD – All Other Days
The incremental backup will run on all of the days on which neither a FB (Full Backup) nor a DB (Differential Backup) is run.
Days of the month it will run: 3rd-6th, 8th-13th, 15th-20th, 22nd-31st
DB – Differential Backup (catches all files that have changed since the last full backup)
Differential backup is performed on the 7th, 14th, and 21st of each month.

Figure 3.6.2b below shows the proposed backup roles in a calendar format:
Figure 3.6.2b

3.6.3 Reasoning for Tape Backup vs.
NAS Backup
Tape backup is best for archiving the data. Disk (NAS – Network-Attached Storage) is best for making copies of the entire disk image, OS and applications. Since the data is most important for the servers, tape is the best choice for customer accounts as well as the Mail, SQL and DNS/DC servers. NAS backup will use SCSI drives with RAID.

4. Glossary
Terminology – Description
Automation Servers: Servers that are used to manage workstations, servers, and other network devices.
Domain Controllers: A domain controller (DC) is a server that is running a version of Microsoft Windows Server (2000, 2003, or 2008) and has the Active Directory service installed. It responds to security authentication requests (logging in, checking permissions, etc.) within the Windows Server domain.
Domain Name System: The Domain Name System (DNS) is a hierarchical naming system for computers, services, or any resource participating in the Internet. DNS translates human-friendly computer hostnames into IP addresses.
SQL Server: Any database management system (DBMS) that implements the SQL query language. In the context of Automated Solutions, we will be using Microsoft SQL Servers.
Exchange Server: Exchange Server is a messaging and collaborative software product developed by Microsoft. Exchange's major features consist of electronic mail, calendaring, contacts and tasks; support for mobile and web-based access to information; and support for data storage.
SCSI: Small Computer System Interface, or SCSI, is a set of standards for physically connecting and transferring data between computers and peripheral devices. SCSI is most commonly used for hard disks and tape drives, but it can connect a wide range of other devices, including scanners and CD drives.
RAID: Redundant Array of Inexpensive Disks, or RAID, is now used as an umbrella term for computer data storage schemes that can divide and replicate data among multiple hard disk drives. When multiple physical disks are set up to use RAID technology, they are said to be in a RAID array. This array distributes data across multiple disks, but the array is seen by the computer user and operating system as one single disk.