Mosaic Technology’s IT Director’s Series

Exchange Data Management: Why Tape, Disk, and Archiving Fall Short

Mosaic Technology Corporation * Salem, NH (603) 898-5966 * Bellevue, WA (425) 462-5004
www.mosaictec.com

Contents

Introduction
Tape Backup Challenges
    Full Database Recovery
    Mailbox Recovery
    Message Recovery
Pros and Cons of Tape Backup Methods
Challenges of Disk-Based Backup
    Backup to Disk
    Continuous Data Protection
    Snapshot-Based Backup
    Application Intelligence
Pros and Cons of Disk-Based Backup Methods
E-mail Archiving
    Benefits of E-mail Archiving
    Challenges of E-mail Archiving
The Challenge of Managing Multiple Products
Next Generation Data Management for Exchange
Conclusion

Introduction

Email has become the number one method of business communication, exceeding even the telephone in importance within an organization. It is a key application in corporate data centers, and email servers contain an increasing percentage of corporate data assets. As a result, email is now the mission-critical application for enterprises. With this growth in importance comes a challenge for IT administrators: ensuring that in the event of errors or failures, email can be recovered and restored as soon as possible.

For many years, traditional backup products tried to address the challenges that come with protecting and recovering e-mail messages. The advent of Exchange 2000 and 2003 brought increases in Message Store size and a scalability challenge for many organizations. Traditional tape backup methods can no longer keep up with the increased size of Message Stores and the limited backup window. Recent regulations (SEC, HIPAA, Sarbanes-Oxley, etc.) that require organizations to archive e-mail add to the problem.

This paper reviews traditional data protection and e-mail archiving methods and describes their pros and cons with respect to managing Exchange. Mimosa Systems believes that the challenges you face in managing Exchange can be solved with a new breakthrough approach to Exchange data management.

Tape Backup Challenges

For the majority of organizations, tape backup is the preferred solution to protect Exchange. With tape you can use NTBACKUP.EXE, the free backup utility provided by Microsoft, or you can use one of the popular third-party backup products. These tape backup products perform online backup of Exchange and stream data from the Exchange database to tape media at high speeds.
Using the Exchange Backup API (ESE, the Extensible Storage Engine), third-party products perform a full backup of Exchange, including all databases and log files. With Exchange 5.5, there was one Exchange database per server, so by default a full backup protected the entire 5.5 Server. Exchange 2000 and 2003 introduced the storage group concept. Exchange 2000/2003 Servers can have up to four storage groups, and each storage group can contain up to five databases. Exchange 2000/2003 full backups are performed at the storage group level because all the databases in a storage group share the same log files. As with Exchange 5.5, it is important to copy the complete log files with each backup.

Figure 1: Traditional Tape Backup Schedule.

Standard practice dictates a weekly full backup of Exchange. Each day an incremental Exchange backup is performed to copy new log files (Figure 1). After the backup completes, the log files are truncated. Optionally, a differential backup makes a copy of the new log files created since the last full backup, but does not truncate the logs. In the event of a recovery, the full backup tape is restored, followed by the incremental backup tapes. Recovery time increases with the number of tapes to restore, so many users prefer a daily full backup of Exchange if their backup window permits.

Full Database Recovery

Traditional tape backup methods allow for full recovery of the Exchange database(s). In the event of a hardware failure or a corrupt database, the entire Exchange database can be restored from tape (Figure 2). Recovery time depends on the amount of data contained in the Exchange database. As a rule of thumb, recovery takes two to three times as long as the backup. For a typical 40 GB Exchange database, backup time is one to two hours and recovery time is two to six hours, depending on the number of log files.
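The backup schedule and the recovery rule of thumb described above can be sketched in a few lines of Python. This is an illustrative sketch only and is not taken from any backup product: the names Backup, restore_chain, and estimate_recovery_hours are hypothetical, the schedule is assumed not to mix incremental and differential backups, and a backup taken on the day of the failure is assumed to have completed.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Backup:
    day: int   # days since the start of the schedule (hypothetical field)
    kind: str  # "full", "incremental", or "differential"


def restore_chain(backups: List[Backup], failure_day: int) -> List[Backup]:
    """Return the backups to restore, in order, after a failure."""
    usable = sorted((b for b in backups if b.day <= failure_day),
                    key=lambda b: b.day)
    full_positions = [i for i, b in enumerate(usable) if b.kind == "full"]
    if not full_positions:
        raise ValueError("no full backup exists before the failure")
    start = full_positions[-1]
    chain = [usable[start]]
    later = usable[start + 1:]
    incrementals = [b for b in later if b.kind == "incremental"]
    differentials = [b for b in later if b.kind == "differential"]
    if incrementals:
        # Every incremental since the last full backup must be replayed.
        chain.extend(incrementals)
    elif differentials:
        # A differential holds all logs since the full; only the newest is needed.
        chain.append(differentials[-1])
    return chain


def estimate_recovery_hours(backup_hours: float) -> Tuple[float, float]:
    """Apply the paper's rule of thumb: recovery takes 2x to 3x the backup time."""
    return (2.0 * backup_hours, 3.0 * backup_hours)


# Weekly full on day 0, daily incrementals on days 1 to 6, failure on day 4.
weekly = [Backup(0, "full")] + [Backup(d, "incremental") for d in range(1, 7)]
tapes_to_restore = restore_chain(weekly, failure_day=4)
```

For the 40 GB example above, a one-hour backup gives an estimated two to three hours of recovery and a two-hour backup gives four to six hours, which matches the two-to-six-hour range quoted in the text. The chain also shows why recovery time grows through the week: a failure on day 4 requires the full backup plus four incremental tapes.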
For most organizations, going without e-mail service for more than two hours is unacceptable. One alternative is to use a Recovery Server. A Recovery Server is a complete Exchange Server running in standby, ready to take over e-mail services when the primary Exchange Server fails. New in Exchange 2003 is the Recovery Storage Group. A Recovery Storage Group is a storage group that is available to take over temporary e-mail services when a primary storage group fails.

It is essential that you protect Exchange from a total system failure and can restore a complete Exchange database. However, the most common errors that impact Exchange are not system hardware failures but human errors. A common system administrator error is to inadvertently delete a user mailbox. If a large number of mailboxes are active, it is easy to confuse mailboxes and delete one by mistake. The Exchange Backup API supports full Exchange database backup only and does not support mailbox restore; therefore, some interesting methods have been developed to recover a lost mailbox.

Figure 2: Typical Exchange Full Database Recovery.

Mailbox Recovery

With Exchange 5.5, a common method to recover a mailbox is to restore a full database backup to a Recovery Server. This requires a dedicated server, which may be cost prohibitive for some organizations. Microsoft added a Mailbox Recovery feature in Exchange 2000 and 2003. It lets you configure a set period of time for deleted mailboxes to remain on the Exchange Server. Normally a setting of 30 days gives you plenty of time to recover a mailbox deleted by mistake.

Beginning with Exchange 5.5, third-party backup vendors devised a method of mailbox backup by introducing a second backup pass at the mailbox level. This method uses the Microsoft Message API (MAPI) and is commonly referred to as “Brick-level Backup”.
There is a drawback to this method: the second pass over the Exchange database places a very large burden on the Exchange Server CPU. A brick-level backup takes four to eight times as long as a full backup. In many cases, the brick-level backup alone can take longer than a 24-hour backup window. Due to these penalties, the majority of users do not perform brick-level backups. They rely on the Mailbox Recovery feature to restore deleted mailboxes in Exchange 2000/2003 and use a Recovery Server in Exchange 5.5 (Figure 3).

Figure 3. Typical Exchange Mailbox Recovery

Message Recovery

It is common for end users to delete e-mails by mistake. The Outlook client provides a Deleted Items folder for message recovery, but if this folder is emptied, help is required. No practical tape backup method exists for message-level recovery. It is not supported by the Exchange Backup API, and third-party brick-level backup methods are too slow to be practical. General practice is to configure the Exchange Server Deleted Items retention period to 30 days to allow for message recovery. After 30 days, messages are simply not recoverable. Backup tapes can be restored to a Recovery Server for message recovery, but this time-consuming process is only practical in special situations or for special individuals (e.g. company executives, legal searches, etc.).

Pros and Cons of Tape Backup Methods

The advantage of tape backup is that the technology is mature, performs well, and is fully supported by Microsoft for full Exchange backups. If the Exchange database is not too large (< 40 GB), backups can be performed in a reasonable amount of time, allowing for daily full backups. For large databases, incremental backups can be used, but this increases total recovery time. Another advantage is that tape backups can be transported offsite for disaster recovery.
A disadvantage of daily full backups or daily incremental backups is that up to 23 hours of data can potentially be lost. The period of time during which data is not being protected defines your Recovery Point Objective (RPO). Depending on the needs of your organization, this may be unacceptable.

A second disadvantage of tape backup is the slow recovery time. Depending on your backup scheme, tape recovery can take hours, making it hard to meet your Recovery Time Objective (RTO). If brick-level backups are not performed, mailbox or message-level restores can take many more hours. Depending on the needs of your organization, this may also be unacceptable.

Pros
• Mature and stable technology
• Suitable for small databases
• Offsite storage for DR

Cons
• 23 hour RPO
• Slow total recovery (RTO)
• Not suitable for large databases
• Limited mailbox recovery
• Complex and error prone

Figure 4. Pros and Cons of Traditional Tape Backup

Challenges of Disk-Based Backup

The disadvantages of tape backup are reasonably controlled with Exchange 5.5 due to its single-database-per-server architecture. The size of each Exchange Server is limited by processing power and storage capacity, which limits the backup window and recovery time. Many organizations deployed additional 5.5 servers to compensate for email growth and shrinking backup windows. This added the cost and complexity of dealing with multiple Exchange Servers.

With the introduction of Exchange 2000 and 2003, Exchange Server Message Stores have grown significantly to support more mailboxes per server, which has increased the backup window and recovery time. New methods for Exchange data protection are necessary to allow for acceptable recovery times.

Backup to Disk

A common method to reduce the backup and recovery time is to use disk as the backup target in place of traditional tape media.
Leading third-party backup vendors offer this capability for large Exchange databases whose backup window is too small to complete full backups to tape. These solutions successfully reduce the time to perform a full backup so that it fits the backup window. Full database recovery performance is also improved. By leveraging new “cheap disk” technology, these disk-based backup solutions are adopted by users with very large Exchange databases who want fast backup and recovery.

Disk-based backup products improve total recovery time, but they continue to use the Exchange Backup API, and MAPI for brick-level backups. These data streams are optimized for linear tape and do not take advantage of the random read/write capability of disk. This adds overhead and does not optimize the data for quick restores. Disk-based backup methods work the same way as tape backup and do not reduce the 23-hour window during which Exchange data is unprotected.

Continuous Data Protection

Continuous Data Protection (CDP) is disk-based backup designed to protect Exchange in near real-time. It uses a replication agent (device driver) that sits on the Exchange Server (very low in the processing stack) and creates a 100% copy (mirror) of the Exchange Server. After the initial mirror, the software replicates block-level changes to the files. This allows for real-time, up-to-the-minute backup of the Exchange Server.

The host mirror can be installed locally or remotely depending on the disaster recovery strategy. In the event of system hardware failure (or site disaster), the host mirror can take over Exchange services. A CDP solution is complex and is used by organizations that require the highest availability of Exchange services.

Full tape backups are still necessary to maintain a point-in-time copy of Exchange. For example, should the Exchange database be corrupted, a tape backup is necessary for recovery because the host mirror is also corrupted.
Full backups can be performed on the “split-mirror” which avoids any backup window problems.

Snapshot-Based Backup

Snapshot-based backup methods are an alternative to CDP and provide multiple recovery points. Snapshots use a low-level agent that sits on the Exchange Server and protects data using a “copy-on-write” method. As new blocks are written to the primary volume, the original blocks are first copied to the snapshot volume. This snapshot volume can be on the same local array or on a remote array. Snapshots protect only changed data and do not protect the data on the primary volume.

For logical errors, snapshots are an efficient method of data protection. Snapshots can be taken hourly, for example, and reduce the RPO for Exchange. The Exchange Server can be “rolled back” to an earlier point in time to recover from a database corruption event. For protection from a system crash, full tape backups remain necessary to protect data on the primary volume. The backup can be performed using the snapshot to reduce the backup window.

Application Intelligence

The major disadvantage of CDP and snapshot-based backup methods is they use low-level agents that operate in the Exchange Server to intercept blocks as they are written to disk. Because they operate at such a low level, they have no understanding of the data the blocks contain. If corrupt data is introduced on the Exchange server, it is simultaneously introduced on the disk-based replica, rendering it useless.

This concept of understanding the data is called Application Intelligence. The tape-based and disk-based methods that use the Exchange Backup API are application intelligent and can detect data corruption. They can also be used to perform mailbox or message level recovery. The replication technologies that copy the blocks for CDP and snapshot cannot detect data corruption and are only useful for full system recovery.
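The copy-on-write method described under Snapshot-Based Backup can be illustrated with a toy model. The sketch below is a simplified assumption of how such a scheme behaves, not the implementation of any real snapshot product: the class CowVolume and its methods are hypothetical, blocks are plain strings, and everything lives in memory.

```python
from typing import Dict, List


class CowVolume:
    """Toy volume with copy-on-write snapshots (illustrative only)."""

    def __init__(self, blocks: List[str]) -> None:
        self.blocks = list(blocks)                 # primary volume
        self.snapshots: List[Dict[int, str]] = []  # saved original blocks

    def take_snapshot(self) -> int:
        """Start a new snapshot; it holds no data until blocks change."""
        self.snapshots.append({})
        return len(self.snapshots) - 1

    def write(self, index: int, data: str) -> None:
        """Copy the original block to each snapshot lacking it, then write."""
        for saved in self.snapshots:
            if index not in saved:
                saved[index] = self.blocks[index]
        self.blocks[index] = data

    def read_snapshot(self, snap_id: int, index: int) -> str:
        """Read a block as it was when the snapshot was taken."""
        return self.snapshots[snap_id].get(index, self.blocks[index])

    def rollback(self, snap_id: int) -> None:
        """Restore the primary volume to the snapshot's point in time."""
        for index, original in self.snapshots[snap_id].items():
            self.blocks[index] = original
        # Snapshots taken after this point no longer apply.
        del self.snapshots[snap_id + 1:]
        self.snapshots[snap_id].clear()


volume = CowVolume(["header", "mailbox-a", "mailbox-b"])
hourly = volume.take_snapshot()
volume.write(1, "corrupted")   # a logical error reaches the primary volume
volume.rollback(hourly)        # roll back to the earlier point in time
```

Two points from the text show up in the model. The snapshot stores only the blocks that changed, so unchanged blocks are read from the primary volume and are lost if that volume is lost, which is why full backups remain necessary. And the write path copies blocks without inspecting them, so the snapshot mechanism has no way to tell a corrupt block from a valid one.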
Pros and Cons of Disk-Based Backup Methods

Disk-based backups use storage media that performs faster than traditional tape storage media. As a simple substitute for tape media, disk can dramatically improve full Exchange database backup and recovery times. CDP delivers the highest availability and is valuable to organizations that require near real-time recovery for Exchange. Snapshot-based backup methods offer an efficient means to reduce the RPO for Exchange.

CDP and snapshot-based backup methods are deficient because they require a low-level host agent to be installed on Exchange to intercept blocks of data. They are also not application intelligent. This renders them ineffective if data corruption occurs and makes them inflexible for mailbox or message-level recovery. No single disk-based backup method can protect Exchange in all recovery situations: crash recovery, corruption recovery, mailbox recovery and message recovery. Combining multiple tape and disk-based recovery methods adds complexity and increases the chance of error.

Pros
• Reduce Exchange RTO
• CDP offers near real-time recovery
• Reduce Exchange RPO
• Useful for large databases
• Avoid backup window problems

Cons
• Low-level agents installed on Exchange Server
• No mailbox or message recovery
• No data corruption detection
• Full tape backup still required
• Complex and error prone

Figure 5. Pros and Cons of Disk-Based Backup Methods

E-mail Archiving

E-mail archiving applications have received much attention in recent years due to the number of organizations fined for not properly preserving e-mail records. Because e-mail contains valuable company information, it must be preserved just like paper company records. Financial institutions that are governed by SEC Rule 17a-4 are required to archive e-mail.
All public corporations are also required to preserve e-mail records according to the Sarbanes-Oxley Act of 2002, as are other organizations in healthcare, government and bioscience.

Benefits of E-mail Archiving

The primary benefit of e-mail archiving is the preservation of company e-mail in a separate database that is designed to be scalable and to allow fast search and retrieval. Because all incoming messages are indexed, full-text searches can quickly scan the entire archive for messages or attachments. These searches can be performed by authorized staff to comply with a legal or regulatory investigation, or by company employees searching for old e-mail. E-mail preservation and fast accessibility are two major benefits of e-mail archiving that go beyond the functionality that Exchange offers.

E-mail archiving also provides Mailbox Extension. Mailbox Extension is Hierarchical Storage Management (HSM) technology that removes attachments from Exchange and replaces them with a “short-cut”. The originals are stored in the archive and are accessible by users from Outlook. The major benefit of Mailbox Extension is that attachments that normally occupy storage on Exchange can be relocated to the archive, and the Exchange database size can be reduced dramatically, by as much as 80%. A smaller Exchange database means you can increase mailbox quotas (or eliminate them altogether) and reduce the use of .pst files. Users can take advantage of the extra mailbox space to store old e-mail, a feature referred to as the “infinite mailbox”.

Challenges of E-mail Archiving

The basic challenge of e-mail archiving is that it must be managed in addition to protecting Exchange. It is a standalone application. The time spent managing e-mail archiving comes on top of the workload required to manage Exchange backup and recovery. E-mail archiving maintains an archive of Exchange data. However, this data is only used for archive search and retrieval. It is not used for data protection.
Traditional tape backup methods are still required for full Exchange database protection.

A second challenge of e-mail archiving is that it uses the Microsoft Message API (MAPI) to read Exchange and extract message information. MAPI, as discussed previously, is slow and CPU intensive. Exchange Server performance suffers any time MAPI is used to read its database. Archive Window is a new term that represents the impact e-mail archiving has on Exchange. Best practice is to run MAPI on the weekend to capture messages for archive and not burden the Exchange Server during the week when usage is high.

Because the archive server only reads Exchange once a week, messages deleted from the Exchange Server during the week are not captured. Therefore, e-mail archiving for compliance relies on Exchange Journaling to capture e-mail during the week. Exchange Journaling is a native feature of Exchange that makes a copy of all e-mail for any assigned mailbox. It stores the e-mail in a separate journal folder. Journaling is a reliable method to capture 100% of e-mail, but it introduces its own CPU and storage burden on Exchange. For this reason Exchange Administrators are very wary about using Journaling.

Pros
• Reduces compliance risk
• Preserves e-mail
• Full-text search of e-mail
• Reduces Exchange database size
• Reduces need for mailbox quotas and .pst files

Cons
• Stand-alone application
• Additional hardware and software costs
• Burdens Exchange CPU with MAPI; archive window problem
• Requires Journaling which introduces additional burden on Exchange
• No integration with backup

Figure 6. Pros and Cons of E-mail Archiving

The Challenge of Managing Multiple Products

Managing Exchange data using multiple data management products creates its own challenges.
Depending on the size of your Exchange database(s) and the RTO and RPO requirements of your organization, you may need multiple products to meet your service level agreements. The most obvious problem is the increase in management complexity: multiple products, multiple consoles, multiple servers, multiple storage areas and multiple storage formats. You will be challenged to manage all these products, and in the event of a system crash, you will be challenged to perform a system recovery in the least amount of time. That is not an easy task!

Another disadvantage of multiple data management products is the cumulative demand they place on your Exchange Server. Depending on the method used to interface with Exchange, that demand can be heavy; MAPI in particular places a significant burden on the Exchange Server CPU. Your Exchange Server needs to be running at full capacity 24x7 and performs better with less CPU competition from third-party products. Each third-party product installs its own agent on Exchange to manage data transfer. Multiple agents add risk and complexity to your Exchange Server.

Finally, using multiple data management products makes it impossible to manage Exchange data for retention and disposition. New legal and regulatory requirements make it necessary to preserve e-mail for a specific time period, and to dispose of e-mail when it is no longer needed. When e-mail data exists in multiple locations and in multiple formats, it is difficult to manage. Each independent product protects its own copy of the data and has no notion of being integrated with other products for the single purpose of retention and disposition. This situation places your organization at risk of expensive legal discovery and regulatory penalty.
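The retention and disposition problem described above can be made concrete with a short sketch. It is an illustration under stated assumptions, not a description of any product: the function copies_due_for_disposal, the store names, and the one-year retention period are all hypothetical. The point is that a message past its retention period is only truly disposed of when every independent copy has been removed.

```python
from datetime import date, timedelta
from typing import Dict, List, Tuple

# Each store lists the messages it holds as (message id, received date).
StoreContents = Dict[str, List[Tuple[str, date]]]


def copies_due_for_disposal(stores: StoreContents,
                            retention_days: int,
                            today: date) -> Dict[str, List[str]]:
    """Map each expired message id to every store still holding a copy."""
    cutoff = today - timedelta(days=retention_days)
    due: Dict[str, List[str]] = {}
    for store_name, messages in stores.items():
        for message_id, received in messages:
            if received < cutoff:
                due.setdefault(message_id, []).append(store_name)
    return due


# Hypothetical stores: the same message held by three separate products.
stores = {
    "weekly full tape": [("msg-1", date(2003, 1, 15))],
    "disk snapshot": [("msg-1", date(2003, 1, 15)),
                      ("msg-2", date(2005, 6, 1))],
    "e-mail archive": [("msg-1", date(2003, 1, 15)),
                       ("msg-2", date(2005, 6, 1))],
}
expired = copies_due_for_disposal(stores, retention_days=365,
                                  today=date(2005, 7, 1))
```

In this example the expired message still exists in three places, each managed through a different product and storage format, so disposing of it means three separate operations with no shared record that all of them were completed.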
Product                   Full      Mailbox   Message   Search /   Retention /  Mailbox
                          Recovery  Recovery  Recovery  Discovery  Disposition  Extension
Full Tape Backup          Yes       No        No        No         No           No
Brick-level Tape Backup   No        Yes       Yes       No         No           No
Continuous Replication    Yes       No        No        No         No           No
Copy-on-Write Snapshot    Yes       No        No        No         No           No
E-mail archive            No        No        Yes       Yes        Yes          Yes

Figure 7. Comparison of Exchange Data Management Products

Next Generation Data Management for Exchange

Mimosa Systems designed Mimosa NearPoint for Microsoft® Exchange Server to address the many challenges that face Exchange data management. NearPoint dramatically improves recovery time and reduces the recovery point with near continuous data protection. It also provides e-mail archiving in a single integrated solution. Mimosa did not believe that a separate stand-alone application was required for e-mail archiving. In fact, data protection and e-mail archive applications can be served simultaneously with a single Indexed Object Repository. Administrators, users and auditors all enjoy self-service access to the Indexed Object Repository using standard Microsoft Outlook and OWA interfaces. Complete mailboxes can be browsed and searched at any point in time to find lost e-mail or to retrieve complex e-mail history information for legal discovery.

The Mimosa NearPoint software solution, based on standard Microsoft technologies, runs on a standard Intel server. It is application intelligent, providing deep integration with Exchange and Outlook. NearPoint is disk-based and leverages commodity storage, such as SATA RAID and NAS appliances, to provide near real-time data protection and access to archived data from multiple Exchange Servers. Deployment is easy since there is zero footprint on Exchange Servers and desktops.
Mimosa provides users and auditors instant search and access to e-mail that has been protected, archived, and extended via a standard Microsoft Outlook or Outlook Web Access user interface.

Figure 8. Mimosa NearPoint Architecture

Conclusion

The challenges of managing Exchange are complex and involve the use of multiple independent and costly data management products. Traditional tape backup methods provide full Exchange database protection; however, they are limited by slow recovery time, a 23-hour recovery point and, for all practical purposes, no mailbox or message-level recovery. Disk-based backup methods reduce Exchange recovery time and the recovery point, but they are kernel intrusive, suffer from data corruption problems and do not support mailbox or message-level recovery. Depending on your particular recovery requirements, multiple backup products may be necessary to achieve your overall recovery service objectives, with added complexity and cost. E-mail archiving is another MAPI-driven application that increases the load on your Exchange Server and further increases the demand on already stretched budgets and administrators’ time.

Mimosa NearPoint for Microsoft Exchange Server is a next generation data management solution for Microsoft Exchange that combines immediate fine-grained recovery, mailbox extension and archiving in a single solution. NearPoint delivers, in a single integrated solution, features that are typically found in multiple products.

Simplifying IT Infrastructure

Mosaic Technology, Salem, New Hampshire headquarters

For over 20 years Mosaic Technology has provided IT Infrastructure solutions to companies around the world. We help companies evaluate their IT environments and develop solutions that meet IT and business needs. We deliver solutions that are easy to implement, easy to use, and deliver immediate and long-term ROI.
Quality Partnerships and Vendor Independence

Mosaic partners with companies that deliver quality and value. Our product portfolio includes a variety of best-of-breed options, often from competing manufacturers. This lets us keep an independent approach to technology and ensures our clients receive the best solution for their needs and budget.

In order for us to deliver solutions that fit, we first study and understand our clients’ environment. Our sales and technical staff work with you through a collaborative assessment process. Together, we identify strengths and weaknesses within your IT Infrastructure. Only then do we recommend appropriate solutions that are a fit for, and will improve, your organization.

Simplifying Infrastructure: Solutions and Support

Simple is better. Whether designing a tiered storage implementation or redesigning your backup strategy, simplification makes sense. With Mosaic you get a complete portfolio of proven technologies that will streamline your operations. Our areas of expertise include:

• Storage – Primary, Secondary and Archival
• Backup, Recovery and Archiving
• Servers
• Networking
• Software Solutions
• Professional Services
• Maintenance
• Flexible Leasing Options
• Legacy Systems and Products

** Mosaic Value **

We give you independent technical input to your IT planning and execution. We work with you to understand and identify solutions that work for you. Our product portfolio and technical resources let us be product neutral. So solutions we propose are always driven by your needs, not by a specific manufacturer or technology.

Legacy Support: Used Systems and Trade-ins

Mosaic maintains a dynamic refurbished business. We can meet your needs or support your legacy systems with quality used systems or components. And we can give you aggressive trade-in value on your older or ‘coming-off lease’ equipment.

On-Site Assessments

The best way to experience the Mosaic Value is to see for yourself.
Every IT department is on a continual improvement path. We can give you an independent set of eyes on specific areas of your operations. It won’t cost you a dime and you will get fresh, independent, vendor-neutral input into your backup, storage, network, or related area. Give us a call to make an appointment: East Coast (603) 898-5966 | West Coast (425) 462-5004.