Depending on the Exchange version you are deploying, you have either two server roles or a single server role. Exchange Server 2013 includes two server roles, the Client Access and Mailbox server roles. The Client Access server role facilitates client connectivity, while the Mailbox server role stores the data that users access on a daily basis. Therefore, designing these server roles correctly is critical to your Exchange Server 2013 design. Exchange Server 2016 combines the Mailbox and Client Access server roles into a single server role, called the Mailbox server role.

With Exchange 2013/2016 you can deploy a solution that leverages mailbox resiliency with multiple database copies deployed across datacenters, implements single item recovery for data recovery, and has the flexibility in storage design to allow you to deploy on storage area networks utilizing Fibre Channel or SATA class disks, or on direct attached storage utilizing SAS or SATA class disks, with or without RAID protection.

However, in order to design your solution, you need to understand the following criteria:

- User profile - the message profile, the mailbox size, and the number of users
- High availability architecture - the number of database copies you plan to deploy, whether the solution will be site resilient, and the desired number of mailbox servers
- Server's CPU platform
- Storage architecture - the disk capacity / type and storage solution
- Backup architecture - whether to use hardware or software VSS and the frequency of the backups, or leverage the Exchange native data protection features
- Network architecture - the utilization, throughput, and latency aspects

Previous versions of Exchange were somewhat rigid in terms of the choices you had in designing your server roles. The flexibility in the architecture with Exchange 2013/2016 allows you the freedom to design the solution to meet your needs.
Prior to making any decisions, please review the following topics from the Exchange Online Help:

- High Availability and Site Resilience
- Backup, Restore and Disaster Recovery

After you have determined the design you would like to implement, you can follow the steps in the Exchange 2013 Server Role Design Example article to calculate your solution's CPU, memory, and storage requirements, or you can leverage the Exchange Server Role Requirements Calculator.

The calculator is broken out into the following sections (worksheets):

- Input
- Role Requirements
- Activation Scenarios
- Distribution
- Volume Requirements
- Backup Requirements
- Log Replication Requirements
- Storage Design

Important: The data points provided in the calculator are an example configuration. As such, any data points entered into the Input worksheet are specific to that particular configuration and do not apply to other configurations. Please ensure you are using the correct data points for your design.

Input

When you launch the Exchange Server Role Requirements Calculator, you are presented with the Input worksheet. This worksheet is broken down into 5 key areas. This section is where you enter all the relevant information regarding your intended design, so that the calculator can generate what you need in order to achieve it.

Note: There are many input factors that need to be accounted for before you can design your solution. Each input factor is briefly listed below; there are additional notes within the calculator that explain them in more detail.

Environment Configuration

Within Step 1 you will enter the appropriate information concerning your messaging environment's configuration - the high availability architecture and database copy configuration, and the data and I/O configuration.

Note: For optimal sizing, choose a multiple of the total number of database copies you have selected for the number of mailbox servers.

Exchange Environment Configuration

1.
What Exchange server version are you deploying for your solution? You can select either 2013 or 2016.
2. What server architecture are you deploying for your global catalogs? You can deploy either the 32-bit or 64-bit architecture for your Active Directory servers. The architecture you deploy will affect your core ratio planning. For more information, please see http://technet.microsoft.com/en-us/library/dd346701.aspx.
3. Do these servers only have the Mailbox server role installed? For Exchange 2013, you can choose to deploy dedicated server roles or multi-role servers. Having the two server roles colocated affects your design in the areas of load balancing client requests, memory utilization, and CPU utilization.
4. Will these servers be deployed as guest machines in a virtualized environment? There is CPU overhead that must be accounted for in the design when deploying guest machines. For Hyper-V deployments the overhead is about 10%. Check with your hypervisor vendor to determine their overhead and adjust the “Hypervisor CPU Adjustment Factor” accordingly.
5. Are you deploying a database availability group (DAG)? Deploying the solution as a DAG provides you additional flexibility and resiliency choices, like having multiple mailbox database copies, leveraging flexible mailbox protection features in lieu of traditional backups, and flexibility in your storage architecture (e.g. RAID or JBOD).
6. How many mailbox servers are you going to deploy within the primary datacenter? If you enter more than a single server (remember a DAG requires at least two and can support a maximum of 16), the calculator will evenly distribute the user mailboxes across the total number of mailbox servers and make performance and capacity recommendations for each server, as well as for the entire environment.
As for the secondary datacenter, the calculator will determine the number of mailbox servers you need to deploy there based on the requirements (number of databases, number of copies, etc.).
7. How many DAGs are you planning to deploy in the environment? If you enter more than a single DAG, then the calculator will distribute the user mailboxes across the total number of DAGs and make performance and capacity recommendations for each server and each DAG, as well as for the entire environment.

Site Resilience Configuration

1. Are you deploying the DAG in a site resilient configuration? A DAG can be stretched across 2 or more datacenters (the calculator only allows for 1 datacenter) without requiring the AD site or network subnet to be stretched.
2. What user distribution model will you be leveraging in your site resilient architecture? When planning a site resilience model with Exchange, keep in mind there are two variables that need to be considered: the datacenter model and the user distribution model. For the datacenter model, Exchange requires both datacenters to be in an Active/Active configuration. This means that both datacenters participating in the DAG solution must have active, reachable infrastructure and have the ability to support active load at any time. For the user distribution model, the design can support both Active/Passive and Active/Active user distribution. The calculator supports three scenarios for these user distribution models:
   1. Active/Passive User Distribution Model - An Active/Passive user distribution architecture simply has database copies deployed in the secondary datacenter, but no active mailboxes are hosted there and no database copies will be activated there during normal runtime operations. However, the datacenter supports both single cross-datacenter database switchovers and failovers, and full datacenter activation.
   2.
Active/Active User Distribution Model - An Active/Active user distribution architecture has the user population dispersed across both datacenters (usually evenly), with each datacenter being the primary datacenter for its specific user population. In the event of a failure, the user population can be activated in the secondary datacenter (either via a cross-datacenter single database switchover or failover, or via full datacenter activation). There are two types of Active/Active user distribution models:
      1. Active/Active (Single DAG) - This model stretches a DAG across the two datacenters and has active mailboxes located in each datacenter. A corresponding passive copy is located in the alternate datacenter. This scenario does (potentially) have a single point of failure: the WAN connection. Loss of the WAN connection will result in the mailbox servers in one of the datacenters going into a failed state from a failover cluster perspective (due to loss of quorum):

         ---DC1---          ---DC2---
         DAG1               DAG1
         Active Copies      Passive Copies
         Passive Copies     Active Copies

      2. Active/Active (Multiple DAGs) - This model leverages multiple DAGs to remove single points of failure (e.g., the WAN). In this model, there are at least two DAGs, with each DAG having its active copies in the alternate datacenter:

         ---DC1---                        ---DC2---
         DAG-A (Active/Passive Copies)    DAG-A (Passive Copies)
         DAG-B (Passive Copies)           DAG-B (Active/Passive Copies)

3. In your site resilient architecture, how far behind can you get in terms of log shipping between datacenters? The effect of the RPO is to evaluate the non-contiguous peak hours (defined in Step 5), say 8am and 4pm, and determine the resulting throughput requirement, assuming that you can take the time in between 8 and 4 to catch up (within the specified RPO, of course). Allowing replication to get behind has two outcomes:
   1. Active Manager is less likely to choose a database copy that has a high copy queue length (unless more viable alternatives aren't available).
   2.
If the copy queue length is greater than the target server's AutoDatabaseMountDial setting, the database will not automatically mount once activated. Manually mounting that database will result in the loss of data that had not been copied.
4. Will you activation block the mailbox servers in the secondary datacenter? In certain situations (e.g. a highly utilized network connection), you may want to control whether cross-site failovers occur automatically. This can be controlled by placing an activation block on the remote mailbox servers, thereby preventing Active Manager from selecting those copies during a failover.
5. When deploying an Active/Active (Single DAG) architecture, do you want to deploy dedicated disaster recovery Mailbox servers in the alternate datacenter? If deploying a single DAG Active/Active solution, you can choose to have dedicated DR Mailbox servers deployed in the secondary datacenter to be used in the event of disaster, or utilize the existing mailbox servers that are hosting active mailboxes.

Mailbox Database Copy Configuration

1. How many highly available (HA) mailbox database copy instances per database do you plan to deploy within a DAG? Enter the number of highly available database copies you plan to have within the environment. This value excludes lagged database copies, but does account for both the active and any passive HA database copies you plan to deploy. For optimal sizing, choose a multiple of the total number of mailbox servers you have selected.
2. How many lagged database copy instances per database do you plan to deploy within a DAG? Lagged database copies are an optional feature that can provide protection against certain disaster scenarios (like logical corruption). Lagged database copies should not be considered an HA database copy, as the replay will delay the availability of the database for use once activated.
While technically there is no limit to how many lagged copies you can deploy within a DAG, the calculator limits you to a maximum of 2 copies.
3. How many highly available mailbox database copy instances per database do you plan to deploy within the secondary datacenter within a DAG? If you are deploying a site resilient solution, you can choose to have a portion of the total HA database copies deployed in the secondary datacenter.
4. How many lagged database copy instances per database do you plan to deploy within the secondary datacenter within a DAG? If you are deploying a site resilient solution, you can choose to have a portion or all of your lagged database copies deployed in the secondary datacenter.

Lagged Database Copy Configuration

1. Will you deploy the lagged database copies on a dedicated server? Using a dedicated server for lagged database copies makes them easier to manage. For DAGs where the lagged database copies are evenly distributed across all the DAG mailbox servers, you will need to use Suspend-MailboxDatabaseCopy with the -ActivationOnly flag to prevent them from being mounted, but there are scenarios that can clear this setting. With a dedicated server you can activation block the entire server, and the setting is persistent. The choice can also affect your storage design in terms of choosing RAID or JBOD. Unless you have multiple lagged copies, lagged copies should be placed on storage that is utilizing RAID to provide additional protection. The calculator will determine the appropriate number of lagged copy servers you need to deploy based on the requirements (number of databases, number of copies, etc.).
2. How long will you delay transaction log replay on your lagged copy? This parameter is used to specify the amount of time that the Microsoft Exchange Information Store service should wait before replaying log files that have been copied to the lagged database. The maximum amount of replay delay you can set is 14 days.
The value you specify here will influence the log capacity requirements for all copies and the amount of time required to mount a lagged copy.
3. How long will you delay transaction log truncation on your lagged copy? This parameter is used to specify the amount of time that the Microsoft Exchange Replication service should wait before truncating log files that have been copied to the lagged database. The time period begins after the log has been successfully replayed into the lagged copy. The maximum allowable setting for this value is 14 days. The minimum allowable setting is 0, although setting this value to 0 effectively eliminates any delay in log truncation activity. The value you specify here will influence the log capacity requirements for all copies.
4. Will the Replay Lag Manager be enabled? In Exchange 2013, the ReplayLagManager is disabled by default, whereas it is enabled by default in Exchange 2016. Lagged copies can now care for themselves by invoking automatic log replay to play down the log files in certain scenarios:
   - When a low disk space threshold is reached
   - When the lagged copy has physical corruption and needs to be page patched
   - When there are fewer than three available healthy copies (active or passive only; lagged database copies are not counted) for more than 24 hours

In Exchange 2010, page patching wasn't available for lagged copies. In Exchange 2013 and later versions, page patching is available for lagged copies through this automatic play down feature. If the system detects that page patching is required for a lagged copy, the logs are automatically replayed into the lagged copy to perform page patching. Lagged copies also invoke this auto replay feature when a low disk space threshold has been reached, and when the lagged copy has been detected as the only available copy for a specific period of time.
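The three play down triggers described above can be summarized as a simple decision rule. The following Python sketch is only an illustration of that rule as stated in this article; it is not how Exchange implements Replay Lag Manager, and the class, field, and function names (LaggedCopyState, should_play_down, and so on) are hypothetical names invented for this example.

```python
from dataclasses import dataclass


@dataclass
class LaggedCopyState:
    """Hypothetical snapshot of a lagged copy's health (illustrative only)."""
    low_disk_space: bool           # a low disk space threshold has been reached
    needs_page_patching: bool      # physical corruption requires page patching
    healthy_ha_copies: int         # healthy active/passive copies; lagged copies not counted
    hours_below_copy_count: float  # how long the healthy copy count has been too low


def should_play_down(state: LaggedCopyState,
                     min_healthy_copies: int = 3,
                     grace_hours: float = 24.0) -> bool:
    """Return True if any of the three documented play down triggers applies."""
    too_few_copies = (state.healthy_ha_copies < min_healthy_copies
                      and state.hours_below_copy_count > grace_hours)
    return state.low_disk_space or state.needs_page_patching or too_few_copies


# Example: two healthy HA copies for 30 hours triggers play down.
example = LaggedCopyState(False, False, healthy_ha_copies=2, hours_below_copy_count=30)
print(should_play_down(example))
```

The defaults of three copies and 24 hours mirror the thresholds quoted in the list above; they are parameters in the sketch only so the rule is easy to read.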
Note: The calculator prohibits Replay Lag Manager when dedicated lagged copy servers are deployed, or when there are fewer than 3 HA copies and multiple databases / volume is not the volume architecture.

Exchange Data Configuration

1. What will be the Data Overhead Factor? Microsoft does not recommend utilizing this factor at this time.
2. How many mailboxes do you move per week? In terms of transactions, you have to take into account how many mailboxes you will be moving either to this server or within this server, as transactions totaling the size of the mailbox will always be generated at the target database.
3. Are you going to deploy a Dedicated Restore Volume? A dedicated restore volume is used as a staging point for the restoration of data or could be used during maintenance activities; if one is selected, then additional capacity will not be factored into each database Volume.
4. What percentage of disk space do you want to ensure remains free on the Volume? Most operations management programs have capacity thresholds that alert when a Volume is more than 80% utilized. This value allows you to ensure that each volume has a certain percentage of disk space available so that the volume is not designed and implemented at maximum capacity.
5. Do you have log shipping compression enabled within the DAG? By default, each DAG is configured to compress and encrypt the socket connection used to ship logs across different IP subnets (you can disable these features altogether or enable them for all communications regardless of subnet).
6. What is your compression rate? The compression capability that is obtained for the socket connection used to ship logs will vary with each customer, based on the data contained in the transaction log files.
By default, Microsoft recommends using a value of 30%; however, you can determine this value by analyzing your environment (e.g., once Exchange is deployed you could evaluate the throughput rate with compression disabled and then compare it with the rate when compression is enabled).

Database Configuration

1. Do you want to follow Microsoft's recommendations regarding maximum database size? For standalone mailbox server role solutions, Microsoft recommends that the database size should not exceed 200GB. For solutions leveraging mailbox resiliency, Microsoft recommends that the database size should not exceed 2TB. Neither of these is by any means a hard limit, but a recommendation based on the impact database size has on recovery times. If you want to follow Microsoft's recommendation, then select Yes. Otherwise, select No.
2. Do you want to specify a custom Maximum Database Size? If you selected No for the previous field, then you need to enter a custom maximum database size.
3. Would you like the calculator to determine the optimum number of databases for the design? By default the calculator will determine the optimum number of databases for the architecture. If you want to have a defined number of databases, select No for "Automatically Calculate Number of Databases" and enter a custom number of databases.
4. Do you want to specify the number of databases that should be deployed? If you selected "No" for the previous field, then you need to enter the number of databases you would like to have deployed within your mailbox server or DAG architecture.
5. Do you want to design your database infrastructure such that you deploy the correct number of databases to ensure symmetrical distribution during server failure events?
If possible, the calculator will deploy the correct number of databases such that you can achieve a symmetrical distribution of the active copies across the remaining server infrastructure as the DAG experiences server failures. Note that to use this option, you must allow the calculator to automatically calculate the required number of databases.

IOPS Configuration

1. What will be the I/O Overhead Factor? Microsoft recommends using 20% to ensure adequate headroom in terms of I/O to allow for abnormal spikes in I/O that may occur from time to time.
2. What additional I/O requirements do you need to factor into the solution for each mailbox server's storage design? For example, let's say the solution requires 500 IOPS for the mailboxes and you have decided you want to ensure there is extra I/O capacity for additional products (e.g. antivirus) that generate load during the peak user usage window. So you enter 300 IOPS in this input factor. The result is that, from a host perspective, the solution needs to achieve 800 IOPS. Determining this value may require additional testing by comparing a baseline system against a system that has the I/O generating application installed and running.

Transport Configuration

1. What will be the message queue expiration value? By default, the MessageTimeOut expiration property is set to two days.
2. What will be the Safety Net expiration value? By default, the SafetyNetHoldTime property is set to two days. This value should be equal to or higher than ReplayLagTime.

Mailbox Configuration

Within Step 2 you will define your user profile for up to four different tiers of user populations.

1. How many mailboxes will you deploy in the environment? If deploying a single server environment, this is how many mailboxes you will deploy on this server. If you are deploying multiple servers, then this is how many mailboxes you will deploy in the environment. If you are deploying multiple DAGs, then this is how many mailboxes you will deploy across all of the DAGs.
For example, if you choose to deploy 5 servers, and want 3000 mailboxes per server, then enter 15000 here. Or if you plan to deploy 2 DAGs, each with 6 servers, and you entered 24000 total mailboxes, then 12000 mailboxes will be deployed per DAG.
2. What is the solution's projected growth in terms of number of mailboxes over its lifecycle? Enter the total percentage by which you believe the number of mailboxes will grow during the solution's lifecycle. For example, if you believe the solution will increase by 30% over the lifecycle of the design and you are starting out with 1000 mailboxes, then at the end of the lifecycle, the solution will have 1300 mailboxes. The calculator will utilize the projected growth plus the number of mailboxes to ensure that the capacity and performance requirements can be sustained throughout the solution's lifecycle.
3. How much mail do the users send and receive per day on average? The usage profiles found here are based on the work done around the memory and processor scalability requirements.
4. What is the average message size? For most customers the average message size is around 75KB.
5. What is the initial mailbox size? Enter the average initial mailbox size for the mailbox tier that is being migrated to the Exchange platform. This value is only used for the Mailbox Modeling prediction formulas.
6. What will be the prohibit send & receive mailbox size limit? If you want to adequately control your capacity requirements, you need to set a hard mailbox size limit (prohibit send and receive) for the majority of your users.
7. If deploying a personal archive mailbox, what will be the personal archive quota limit? If you want to adequately control your capacity requirements, you need to set a hard mailbox size limit (prohibit send and receive) for the majority of your users.
8. What is the deleted item retention period? Enter the deleted item retention period you plan to utilize within the environment.
The default retention period is 14 days; however, when enabling Single Item Recovery you should adjust this to match your policy concerning deleted item recovery, so that you do not need to go to backup media to recover deleted items.
9. Are you deploying Single Item Recovery? Single Item Recovery ensures that all deleted and modified items are preserved for the duration of the deleted item retention window. By default in Exchange, this is not enabled. When enabled, this feature increases the capacity requirements for the mailbox.
10. Will you have calendar version logging enabled? By default, all changes to a calendar item are recorded in the user's mailbox so that older versions of meeting items are kept for 31 days and can be used to repair the calendar in the event of an issue. This data is stored in the mailbox's dumpster folder. When enabled, this feature increases the capacity requirements for the mailbox.
11. How many users within this tier should have the IOPS and Megacycles Multiplication Factor applied? By default, when you specify an IOPS or Megacycles Multiplication Factor, all users within the mailbox tier will have the multiplication factor applied; however, you may only want a percentage of users within the tier to get that multiplication factor.
12. Do you want to include an IOPS Multiplication Factor in the prediction or custom I/O profile? The IOPS Multiplication Factor can be used to increase the IOPS/mailbox footprint for mailboxes that require additional I/O (for example, these mailboxes may use third-party mobile devices). The way this value is used is as follows: (IOPS value * Multiplication Factor) = new IOPS value.
13. Do you want to include a Megacycle Multiplication Factor when sizing the CPU requirements for the mailbox tier? The Megacycles Multiplication Factor can be used to increase the CPU cost/mailbox footprint for mailboxes that perform more CPU work than a typical mailbox (for example, these mailboxes may use third-party mobile devices).
The way this value is used is as follows: (Megacycles value * Multiplication Factor) = new Megacycles value.
14. Do your Outlook Online Mode clients have versions of Windows Desktop Search older than 4.0 or third-party desktop search engines deployed? The addition of these indexing tools to the online mode clients incurs additional read I/O penalties on the mailbox server storage subsystem. Care should be taken when enabling these desktop search engines. Windows Desktop Search 4.0 and later utilizes synchronization protocols that are similar to how Outlook operates in cached mode to index the mailbox contents, and thus has a very minor impact in terms of disk read I/O.
15. Are you planning to use the I/O prediction formula or define your own IOPS profile to design toward? This question asks whether you want to override the calculator in determining the IOPS / mailbox value. By default, the calculator will predict the IOPS / mailbox value based on the number of messages per mailbox and the user memory profile. For some customers that want to design toward a specific I/O profile, this option will not be viable. Therefore, if you want to design toward a specific I/O profile, select No.
16. What is your custom IOPS profile / mailbox? Only enter a value in this field if you selected "No" for the "Predict IOPS Value" question.
17. What will be the database read:write ratio for your custom IOPS profile? Only adjust this value if you selected "No" for the "Predict IOPS Value" parameter. When IOPS prediction is enabled, the calculator will calculate the read:write ratio based on the user profile.

Backup Configuration

Within Step 3 you will define your backup model and your tolerance settings, as well as choose whether to isolate the transaction logs from the database.

1. What backup methodology will be used to back up the solution?
You have several options for a backup methodology, including leveraging a VSS solution (hardware or software based) or leveraging the native data protection features that Exchange provides. The solution you choose will depend on many factors. For example, if you are deploying the mailbox resiliency and single item recovery features, you may be able to forgo a traditional backup architecture in favor of leveraging Exchange as its own backup. Or if you still require a backup (e.g. for legal/compliance reasons), then you need to deploy a VSS solution. The type of VSS solution you deploy will depend on your storage architecture. Hardware VSS solutions are available with storage area networks. Software VSS solutions can be leveraged against either storage area networks or direct attached storage architectures. Also, the backup methodology will affect the Volume design; for example, hardware VSS solutions require a Volume architecture that is 2 Volumes / Database.
2. What will be the backup frequency? You can choose Daily Full, Weekly Full with Daily Differential, Weekly Full with Daily Incremental, or Bi-Monthly Full with Daily Incremental. The backup frequency will affect the Volume design and the disk space requirements (e.g. if performing daily differentials, then you need to account for 7 days of log generation in your capacity design).
3. How many times can you operate without log truncation? Select how many times you can survive without a full backup or an incremental backup (the minimum value is 1). For example, if you are performing a weekly full backup and daily differential backups, the only time log truncation occurs is during the full backup. If the full backup fails, then you have to wait an entire week to perform another full backup or perform an emergency full backup. This parameter allows you to ensure that you have enough capacity to not have to perform an immediate full backup.
If you are leveraging the native data protection features within Exchange as your backup mechanism, then you should enter 3 here to ensure you have enough capacity to allow for 3 days' worth of log generation to occur as a result of potential log replication issues.
4. How long can you survive a network outage? When a network outage occurs, log replication cannot occur. As a result, the copy queue length will increase on the source; in addition, log truncation cannot occur on the source. For geographically dispersed DAG deployments, network outages can seriously affect the solution's usefulness. If the outage is too long, log capacity on the source may become compromised and, as a result, capacity must be increased or a manual log truncation event must occur. Once that happens, the remote copies must be reseeded. The Network Failure Tolerance parameter ensures there is enough capacity on the log Volumes so that you can survive an excessive network outage.

Storage Configuration

Within Step 4 you will define your storage configuration.

Storage Options

1. Do you want to consider storage designs that leverage JBOD? JBOD storage refers to placing a database and its transaction logs on a single disk without leveraging RAID. In order to deploy this type of storage solution for your mailbox server environment, you must have 3 or more HA database copies and have a Volume architecture that is equal to 1 Volume / Database. If you select Yes for this input, the calculator will attempt to design the solution so that it can be deployed on JBOD storage. Please note, however, that other factors may alter the viability of JBOD (e.g. deploying a single lagged database copy on the same mailbox servers hosting your HA database copies).
2. Do you want to deploy multiple databases per volume? In Exchange 2010, JBOD configurations recommended a single database per volume.
Due to the optimizations made in Exchange 2013 and later, you can now deploy multiple databases per volume in JBOD configurations.
3. Do you want the number of copies per volume to match the copy count? Typically, the number of copies you deploy on a single volume matches the number of copies you deploy per database. However, you may find that, due to your environment (e.g., capacity constraints), it is better to deploy half the number of copies per volume.
4. Do you want the calculator to automatically determine the number of volumes that should be deployed on each Exchange server? By default, the calculator will recommend a certain number of volumes to be configured on each Exchange server based on a number of factors. As a result, the calculator can recommend more volumes than you have available in your server configuration. If you have a defined number of volumes you would like to support for placement of Exchange data, you can select No for this question and then enter the number of volumes in the “Number of Exchange Data Volumes per Server” field.
5. How many AutoReseed volumes per server should be deployed? When leveraging AutoReseed, you will want at least one AutoReseed volume per server.

Primary Datacenter Disk Configuration

1. What are the disk capacities and types you plan to deploy? For each type of Volume (system, database, log, and restore volume) you plan to deploy, select the appropriate capacity and disk type model.

Secondary Datacenter Disk Configuration

1. What are the disk capacities and types you plan to deploy? For each type of Volume (system, database, log, and restore volume) you plan to deploy, select the appropriate capacity and disk type model.

Processor Configuration

Within Step 5, you will define the number of processor cores you have deployed for each mailbox server within your primary and secondary datacenters, as well as enter the SPECint2006 Rate Value for the system you have selected.
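For virtualized deployments, the SPECint2006 rate value entered in this step has to be scaled down from the hypervisor host to the guest's virtual processors. The Python sketch below illustrates that scaling under the assumption that the per-virtual-processor value is X/(N*Y), where X is the host's SPECint2006 rate value, N is the number of physical cores in the host, and Y is 1 for a 1:1 virtual-to-physical processor ratio or 2 for up to 2:1. The function name and parameter names are hypothetical and are not part of the calculator.

```python
def guest_specint_rate(host_rate: float, physical_cores: int,
                       vcpu_ratio: int, guest_vcpus: int) -> tuple[float, float]:
    """Return (per-vCPU SPECint2006 rate, rate for the whole guest).

    host_rate      -- X: SPECint2006 rate value of the hypervisor host
    physical_cores -- N: physical cores in the hypervisor host
    vcpu_ratio     -- Y: 1 for 1:1, 2 for up to 2:1 virtual-to-physical processors
    guest_vcpus    -- virtual processors assigned to the Mailbox server guest
    """
    if physical_cores <= 0 or guest_vcpus <= 0:
        raise ValueError("core and vCPU counts must be positive")
    if vcpu_ratio not in (1, 2):
        raise ValueError("vcpu_ratio must be 1 or 2")
    per_vcpu = host_rate / (physical_cores * vcpu_ratio)
    return per_vcpu, per_vcpu * guest_vcpus


# Example inputs: a host rated 693 with 32 cores, no oversubscription, 4-vCPU guest.
per_vcpu, guest_rate = guest_specint_rate(693, 32, 1, 4)
print(round(per_vcpu, 2), guest_rate)  # 21.66 86.625
```

The guest value is computed from the unrounded per-vCPU figure, which avoids compounding a rounding error when the result is multiplied by the number of virtual processors.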
When you enable virtualization, you must be sure to configure the processor architecture correctly. In particular, you must enter the correct number of processor cores that the guest machine will support, as well as the correct SPECint2006 rating for these virtual processor cores. To calculate the SPECint2006 rate value, you can utilize the following formula:
X/(N*Y) = per virtual processor SPECint2006 Rate value
Where X is the SPECint2006 rate value for the hypervisor host server
Where N = the number of physical cores in the hypervisor host
Where Y = 1 if you will be deploying 1:1 virtual processor-to-physical processor on the hypervisor host
Where Y = 2 if you will be deploying up to 2:1 virtual processor-to-physical processor on the hypervisor host
For example, let’s say I am deploying an HP ProLiant DL380p G8 (2.90 GHz, Intel E5-2690) system which includes two sockets, each containing a 16-core processor; the SPECint2006 rate value for the system is 693. If I am deploying my Mailbox server role as a guest machine using Hyper-V, and following best practices where I do not oversubscribe the number of virtual CPUs to physical processors, then
693/(32*1) = 21.65625 (approximately 21.66)
Since each Mailbox server will have a maximum of 4 virtual processors, this means the SPECint2006 rate value I would enter into the calculator would be 21.65625*4 = 86.625.
In addition, if you are deploying your Exchange servers as guest machines, you can specify the Hypervisor CPU Adjustment Factor to take into account the overhead of deploying guest machines.
Server Configuration
1. How many processor cores, and with what megacycle capability, are you planning to deploy in each server? For each server type (primary datacenter, secondary datacenter, and lagged copy server) you plan to deploy, select the number of processor cores and the server’s SPECint2006 rate value.
To determine your SPECint2006 Rate Value:
1. Open a web browser and go to www.spec.org.
2. Click on Results, highlight CPU2006 and then select Search CPU2006 Results.
3. Under Available Configurations, select SPECint2006 Rates and click Go. Under Simple Request, enter the search criteria (e.g., Processor matches x5550).
4. Find the server and processor you are planning to deploy and take note of the result value. For example, let's say you are deploying an HP DL380p G8 server with two 16-core processors (2.9 GHz, E5-2690); the SPECint_rate2006 results value is 693.
Alternatively, you can use Scott Alexander’s fantastic Processor Query Tool to get the per-server score and processor core count for your hardware platform.
Log Replication Configuration
Within Step 6, you will define your hourly log generation rate, the network link, and the network link latency you expect to have within your site resilient architecture.
1. How many transaction logs are generated for each hour in the day? Enter the percentage of transaction logs that are generated for each hour in the day by measuring an existing Exchange 2003 or Exchange 2007 server in your environment. If the existing messaging environment is not using Exchange, then evaluate the messaging environment and enter the rate of change per hour here.
Now you may be wondering how you can collect this data. We've written a simple VBS script that will collect all files in a folder and output them to a log file. You can use Task Scheduler to execute this script at certain intervals in the day (e.g., every 15 minutes). Once you have generated the log file for a 24 hour period, you can import it into Excel, massage the data (i.e., remove duplicate entries) and determine how many logs are generated for each hour. If you do this for each storage group, you will be able to determine your log generation rate for each hour in the day. This script is named collectlogs.vbsrename (just rename it to collectlogs.vbs) and you can find it here: Collectlogs VBS script
Network Configuration
1.
What type of network link will you be using between the servers? Select the appropriate network link you will be using between the two datacenters. 2. What is the latency on the network link? Enter in the latency (in milliseconds) that exists on the network link. Environment Customizations Within Step 7, you can define the server names and database naming prefix. Role Requirements This section provides the solution's I/O, capacity, memory, and CPU requirements. Based on the above input factors the calculator will recommend the following architecture, broken down into four sections: Environment Configuration Active Database Copy Configuration Server Configuration Log, Disk Space, and IO Requirements Processor Core Ratio Requirements This table identifies the required number of processor cores required to support the activated databases. This table is only populated if you populate the processor core megacycle information on the Input tab. The Recommended Minimum Number of Global Catalog Cores identifies the minimum number of processor cores required to sustain the load for global catalog related activities and is based on the number of processor cores required to support the activated databases. Client Access Server Requirements This table identifies the memory and CPU requirements for dedicated Client Access servers if you choose to not co-locate the server roles. This table is only populated if you populate the processor core megacycle information on the Input tab. The Recommended Minimum Number of Client Access Processor Cores identifies the minimum number of processor cores required per server to sustain the load for client related activities. The Recommended Minimum Number of Client Access Server RAM Configuration identifies the minimum amount of memory required per server to sustain the load for client related activities. This number is scaled to a multiple that can be typically installed in a server. 
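As an illustration, the per-virtual-processor SPECint2006 calculation described for the Processor Configuration input can be sketched in a few lines of Python. The function and parameter names are hypothetical and are not part of the calculator; the figures are the ones from the worked example (a host rate value of 693, 32 physical cores, no oversubscription, 4 virtual processors per guest).

```python
def guest_specint_rate(host_rate, physical_cores, vcpu_ratio, guest_vcpus):
    """Return the SPECint2006 rate value to enter for one guest machine.

    host_rate      -- X, the SPECint2006 rate value of the hypervisor host
    physical_cores -- N, the number of physical cores in the host
    vcpu_ratio     -- Y, 1 for 1:1 or 2 for up to 2:1 virtual-to-physical
    guest_vcpus    -- virtual processors assigned to the guest (assumed name)
    """
    if vcpu_ratio not in (1, 2):
        raise ValueError("Y must be 1 or 2")
    per_vcpu = host_rate / (physical_cores * vcpu_ratio)  # X / (N * Y)
    return per_vcpu * guest_vcpus


example = guest_specint_rate(693, 32, 1, 4)
print(example)  # 86.625
```

Carrying the unrounded per-processor value (21.65625) through the multiplication is what yields 86.625; rounding to 21.66 first would give 86.64.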
Environment Configuration
The Environment Configuration table identifies the number of mailboxes being deployed in each datacenter, as well as how many mailbox servers and lagged copy servers you will deploy in each datacenter. In addition, if you choose to not co-locate server roles, this table will also identify the minimum number of dedicated Client Access servers you should deploy in each datacenter (taking into account the worst case failure mode of two simultaneous server failures).
User Mailbox Configuration
The Mailbox Configuration table provides you with:
The Number of Mailboxes that you entered in the Input section (this value will include the projected growth).
The Number of Mailboxes / Database provides a breakdown of how many mailboxes from each mailbox tier will be stored within a database.
The User Mailbox Size within Database is the actual mailbox size on disk that factors in the prohibit send/receive limit, the number of messages the user sends/receives per day, the deleted item retention window (with or without calendar version logging and single item recovery enabled), and the average database daily churn per mailbox. It is important to note that the Mailbox size on disk is actually higher than your mailbox size limit; this is to be expected.
The Transaction Logs Generated / Mailbox value is based on the message profile selected and the average message size and indicates how many transaction logs will be generated per mailbox per day. The log generation numbers per message profile account for:
o Message size impact. In our analysis of the databases internally we have found that 90% of the database is the attachments and message tables (message bodies and attachments). So if the average message size doubles (from 75 to 150), the worst case scenario would be for the log traffic to increase by 1.9 times. Thereafter, as message size doubles, the impact doubles.
o Amount of data sent/received.
o Database health maintenance operations.
o Records Management operations.
o Data stored in the mailbox that is not a message (tasks, local calendar appointments, contacts, etc.).
o Forced log rollover (a mechanism that periodically closes the current transaction log file and creates the next generation).
The IOPS / Mailbox value is the calculated IOPS / Mailbox value that is based on the number of messages per mailbox, the user memory profile, and desktop search engine choices. If you chose to enter a specific IOPS / mailbox value rather than allowing the calculator to determine the value based on the above requirements, then this value will be that custom value.
The Read:Write ratio / Mailbox value defines the ratio of the mailbox's IOPS that are read I/Os. This information is required to accurately design the storage subsystem I/O requirements.
Database Copy Instance Configuration
This table highlights how many HA mailbox database copy instances and lagged database copy instances your solution will have within each datacenter for a given DAG.
Database Configuration
The Database Configuration table provides you with:
The Number of Databases is the calculated number of databases required to support the mailbox population within a standalone server or DAG.
The Recommended Number of Mailboxes / Database is the calculated number of mailboxes per database ensuring that the database size does not go above the recommended database size limit.
The Available Database Cache / Mailbox value is the amount of database cache memory that is available per mailbox. A large database cache ensures that read I/Os can be reduced.
Database Copy Configuration
The Database Copy Configuration table provides you with the number of database copies being deployed within each server and the total number of database copies within the DAG.
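The 1.9 times worst-case figure quoted above for the message size impact can be reproduced with simple arithmetic, assuming that only the 90% of the database made up of the message and attachment tables scales with message size while the remaining 10% stays constant. This is a sketch of that reasoning for a single doubling only; the function name is hypothetical, and the calculator's own per-profile log numbers are not derived from this code.

```python
def log_traffic_multiplier(size_factor, message_share_pct=90):
    """Worst-case log traffic multiplier when average message size grows.

    size_factor       -- new average message size / old average message size
    message_share_pct -- assumed percentage of the database that is message
                         bodies and attachments (90 in the text above)
    """
    fixed_pct = 100 - message_share_pct  # portion unaffected by message size
    return (fixed_pct + message_share_pct * size_factor) / 100


print(log_traffic_multiplier(2))  # 1.9
```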
Server Configuration The Server Configuration table provides you with the following: The Recommended RAM Configuration for the primary datacenter mailbox servers, secondary datacenter mailbox servers, and lagged copy servers. This is the amount of RAM needed to support the number of maximum activated database copies on a given server, in addition to, the number of mailboxes based on their memory profile. The number of processor cores utilized during the worst failure mode scenario. The CPU Utilization value is the expected CPU Utilization for a fully utilized server based on the megacycles associated with the user profile and the number of database copies. Depending on the environment, this will either be for a standalone server hosting 100% active databases, or a server participating in a DAG that is dealing with a single or double server failure event (or secondary datacenter activation). It is recommended that servers not exceed 80% utilization during peak period. The CPU utilization value is determined by taking the CPU Megacycle Requirements and dividing it by the total number of megacycles available on the server (which is based on the CPU and number of cores). If the calculator highlights the CPU utilization with a red background, then this means the design may not be able to sustain the load - either you must change the design (number of mailboxes, number of copies, etc.) or change the server CPU platform. The CPU Megacycle Requirements value defines the amount of megacycles the primary datacenter servers must be able to sustain when either all mailbox databases are active or the number of mailbox database copies that are activated based on a single server or double server failure event. For secondary servers hosting HA copies, this value defines the amount of megacycles required to support the activation of all databases after datacenter activation. 
For lagged copy servers, this value defines the amount of megacycles required to support all of the passive lagged copies.
The Server Total Available Adjusted Megacycles value defines the total available megacycles the server platform is capable of delivering at 100% CPU utilization. This value has been normalized against the baseline server platform (DL380p G8 with 2 GHz E5-2650 processors).
The Possible Storage Architecture outlines whether the solution could utilize RAID or JBOD for the primary datacenter servers, secondary datacenter servers, and lagged copy servers. JBOD is only considered under the following conditions (this assumes you configured the calculator to consider JBOD):
o In order to deploy on JBOD in the primary datacenter servers: You need a total of 3 or more HA copies within the DAG. If you are mixing lagged copies on the same server that is hosting your HA copies (i.e., not using dedicated lagged copy servers), then you need at least 2 lagged copies if you are deploying the 1 database / volume architecture.
o For the secondary datacenter servers to use JBOD: You should have at least 2 HA copies in the secondary datacenter. That way, loss of a copy in the secondary datacenter doesn't result in requiring a reseed across the WAN or loss of data (in the datacenter activation case). If you are mixing lagged copies on the same server that is hosting your HA copies (i.e., not using dedicated lagged copy servers), then you need at least 2 lagged copies.
o For dedicated lagged copy servers: You should have at least 2 lagged copies within a datacenter in order to use JBOD when deploying the 1 database / volume architecture. Otherwise, loss of a disk results in loss of your lagged copy (and whatever protection mechanism it was providing).
The Recommended Transport Database Location identifies whether the transport database should be deployed on a dedicated disk, or whether it can be deployed on the system disk.
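The CPU Utilization check described in this section can be sketched as follows: divide the CPU Megacycle Requirements by the Server Total Available Adjusted Megacycles and compare the result against the recommended 80% peak ceiling. The function name and the sample megacycle figures are illustrative assumptions, not values produced by the calculator.

```python
PEAK_UTILIZATION_LIMIT = 0.80  # recommended ceiling during peak period


def cpu_utilization(required_megacycles, available_megacycles):
    """Return (utilization, within_limit) for one server.

    required_megacycles  -- CPU Megacycle Requirements for the failure mode
    available_megacycles -- Server Total Available Adjusted Megacycles
    """
    if available_megacycles <= 0:
        raise ValueError("available megacycles must be positive")
    utilization = required_megacycles / available_megacycles
    return utilization, utilization <= PEAK_UTILIZATION_LIMIT


# Hypothetical numbers, for illustration only.
util, ok = cpu_utilization(24000, 32000)
print(f"{util:.0%} utilized, within limit: {ok}")
```

A result above the limit suggests the same remedies the text lists: change the design (number of mailboxes, number of copies) or move to a different server CPU platform.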
The CPU Utilization / DAG table provides you with information on the expected theoretical CPU utilization during various modes:
Normal Run Time (where the active copies are distributed according to ActivationPreference=1)
Single Server Failure (redistribution of active copies based on a single server failure event)
Double Server Failure (redistribution of active copies based on a double server failure event)
Site Failure (datacenter activation)
Worst Failure Mode (in some cases, this value will equal one of the previous scenarios; it could also be a scenario like Site Failure + 1 server failure; the worst failure mode is what is used to calculate memory and CPU requirements)
Transaction Log Requirements
The Transaction Log Requirements table provides you with:
The User Transaction Logs Generated / Day indicates how many transaction logs will be generated during the day for each active database, each server, within the DAG, and within the environment.
The Average Mailbox Move Transaction Logs Generated / Day indicates how many transaction logs will be generated during the day for each active database, each server, within a DAG, and within the environment. This number is an estimate that assumes an equal percentage of mailboxes will be moved each day, as opposed to moving all mailboxes on the same day.
The Average Transaction Logs Generated / Day is the total number of transaction logs that are generated per day for each active database, each server, within a DAG, and within the environment (includes user generated logs and mailbox move generated logs).
Disk Space Requirements
The Disk Space Requirements table provides you with:
The Transport Database Space Required is the amount of space required to support the transport database on a Mailbox server. The value is derived from the message profile, the message queue expiration, and the Safety Net hold time.
The Database Space Required is the amount of space required to support each database and its corresponding copies.
This value is derived from the mailbox size on disk, the data overhead factor, and whether a dedicated restore Volume is available. This row also shows you the space requirements for each server (based on the total number of database copies), each DAG, and within the environment.
The Log Space Required is the amount of space required to support each database log stream and the corresponding copies. This value takes into account the number of mailboxes moved per week (assumes worst case and that all mailboxes are moved on the same day), the type of backup frequency in use, the number of days that can be tolerated without log truncation and the number of transaction logs generated per day. This row also shows you the space requirements for each server (based on the total number of database copies), each DAG, and within the environment.
The Database Volume Space Required is the Volume size required to support the database (and potentially its log stream). This calculation takes the total disk space required for the database and adds to it the size of a database plus 110% (if a dedicated restore Volume does not exist) for offline maintenance operations, an additional 10% of the database size for content indexing (if enabled), and includes an amount of free space to ensure the Volume is not 100% utilized (based on Volume Free Space Percentage). This row also shows you the space requirements for each server (based on the total number of database copies), each DAG, and within the environment.
The Log Volume Space Required is the Volume size required to support the database's log stream. This field lists the amount of space required to support the transaction logs for a given set of databases and includes an amount of free space to ensure the Volume is not 100% utilized (based on Volume Free Space Percentage). This row also shows you the space requirements for each server (based on the total number of database copies), each DAG, and within the environment.
The Restore Volume Space Required is the amount of space needed to support a restore Volume if the option was selected in the Input Factor section; this will include space for up to 7 databases and 7 transaction log sets. Each server will be provisioned with a restore Volume. This row also shows you the space requirements for each server (based on the total number of database copies), each DAG, and within the environment.
Host IO and Throughput Performance Requirements
The Host IO and Throughput Performance Requirements table provides you with:
The Total Required Database IOPS is the amount of read and write host I/O the database disk set must sustain during peak load (this does not factor in any RAID penalties). This row also shows you the IOPS requirements for each server (based on the total number of database copies), each DAG, and within the environment.
The Total Required Log IOPS is the amount of read and write host I/O that will occur against the transaction log disk set. This row also shows you the IOPS requirements for each server (based on the total number of database copies), each DAG, and within the environment.
The Database Read I/O Percentage defines the percentage of database required IOPS that are read I/Os. This information is required to accurately design the storage subsystem I/O requirements.
The amount of throughput required for Background Database Maintenance operations.
Special Notes
The Special Notes table will provide you with additional information about your design:
When to use GPT disks (when a Volume size is greater than 2TB).
Activation Scenarios
If you are deploying a highly available and/or site resilient architecture, then this section will break down the failure scenarios. The section is broken up into two scenarios:
1. Scenario 1 - Deploying a DAG architecture within a single datacenter or deploying it in a site resilient Active/Passive user distribution model.
2.
Scenario 2 - Deploying a site resilient DAG architecture with an Active/Active user distribution model.
Important: For the purposes of this calculator, the term "primary datacenter" refers to the datacenter that is preferred for hosting the active copies for a given set of databases, while the term "secondary datacenter" refers to the disaster recovery datacenter that is used for datacenter activation and cross-site database failover events.
Single Datacenter and Active/Passive Environments
The DAG Member Layout table identifies the number of Active Mailbox servers (those that are hosting active mailboxes within the primary datacenter), the Disaster Recovery Mailbox Servers (those that host passive database copies in the second datacenter), and any Lagged Copy Mailbox servers you may be deploying.
There are two tables that provide data around the Active Database configuration, one for the primary datacenter, which outlines the single or double server events, and one for the secondary datacenter, which outlines the activation of that datacenter when the primary datacenter is lost. Both tables provide you with:
The Number of Active Databases (Normal Run Time) value defines the number of active databases hosted on each server when there are no server outages. Unlike Exchange 2007, Exchange is no longer bound by an active/passive high availability model. Instead, each server within a DAG can host active mailbox database copies. The calculator distributes the number of unique databases across the primary datacenter servers within the DAG, ensuring an equal distribution of mailbox database copies are activated on each server. This row exposes the total number of mailboxes that are accessible on each server as a result of the activated database copies. In addition, this row highlights the total number of databases deployed within each datacenter, in the event that cross-site database *overs are allowed to occur.
The Number of Active Databases (After First Server Failure) value defines the number of active databases hosted on each server when there is a single server outage. As a result of the single server outage, the database copies that were activated on the failed server are equally redistributed across all remaining server nodes. This row exposes the total number of mailboxes that are accessible on each server as a result of the activated database copies. In addition, this row highlights the total number of databases deployed within each datacenter, in the event that cross-site database *overs are allowed to occur. The Number of Active Databases (After Double Server Failure) value is populated when you have at least 3 HA mailbox copies and at least 4 mailbox servers within your design. It defines the number of active databases hosted on each server when there are two server outages. As a result of the double server outage, the database copies that were activated on the failed servers are equally redistributed across all remaining server nodes. This row exposes the total number of mailboxes that are accessible on each server as a result of the activated database copies. In addition, this row highlights the total number of databases deployed within each datacenter, in the event that cross-site database *overs are allowed to occur. Active/Active Environments This section breaks out the architecture into two perspectives, the layout of Datacenter 1 and the layout of Datacenter 2 with respect to the DAG architecture. Recall that 1. Active/Active (Single DAG) - This model stretches a DAG across the two datacenters and has active mailboxes located in each datacenter. A corresponding passive copy is located in the alternate datacenter. This scenario does have a single point of failure (potentially), the WAN connection. 
Loss of the WAN connection will result in the mailbox servers in one of the datacenters going into a failed state from a failover cluster perspective (due to loss of quorum):
---DC1---         ---DC2---
DAG1              DAG1
Active Copies     Passive Copies
Passive Copies    Active Copies
2. Active/Active (Multiple DAGs) - This model leverages multiple DAGs to remove single points of failure (e.g., the WAN). In this model, there are at least two DAGs, with each DAG having its active copies in the alternate datacenter:
---DC1---                        ---DC2---
DAG-A (Active/Passive Copies)    DAG-A (Passive Copies)
DAG-B (Passive Copies)           DAG-B (Active/Passive Copies)
The DAG Member Layout table identifies the number of Active Mailbox servers (those that are hosting active mailboxes within the primary datacenter), the Disaster Recovery Mailbox Servers (those that host passive database copies in the second datacenter), and any Lagged Copy Mailbox servers you may be deploying.
There are two tables that provide data around the Active Database configuration, one for the primary datacenter, which outlines the single or double server events, and one for the secondary datacenter, which outlines the activation of that datacenter when the primary datacenter is lost. Both tables provide you with:
The Number of Active Databases (Normal Run Time) value defines the number of active databases hosted on each server when there are no server outages. Unlike Exchange 2007, Exchange is no longer bound by an active/passive high availability model. Instead, each server within a DAG can host active mailbox database copies. The calculator distributes the number of unique databases across the primary datacenter servers within the DAG, ensuring an equal distribution of mailbox database copies are activated on each server. This row exposes the total number of mailboxes that are accessible on each server as a result of the activated database copies.
In addition, this row highlights the total number of databases deployed within each datacenter, in the event that cross-site database *overs are allowed to occur.
The Number of Active Databases (After First Server Failure) value defines the number of active databases hosted on each server when there is a single server outage. As a result of the single server outage, the database copies that were activated on the failed server are equally redistributed across all remaining server nodes. This row exposes the total number of mailboxes that are accessible on each server as a result of the activated database copies. In addition, this row highlights the total number of databases deployed within each datacenter, in the event that cross-site database *overs are allowed to occur.
The Number of Active Databases (After Double Server Failure) value is populated when you have at least 3 HA mailbox copies and at least 4 mailbox servers within your design. It defines the number of active databases hosted on each server when there are two server outages. As a result of the double server outage, the database copies that were activated on the failed servers are equally redistributed across all remaining server nodes. This row exposes the total number of mailboxes that are accessible on each server as a result of the activated database copies. In addition, this row highlights the total number of databases deployed within each datacenter, in the event that cross-site database *overs are allowed to occur.
Distribution
The calculator includes a new worksheet, Distribution. Within the Distribution worksheet, you will find the layout we recommend based on the database copy layout principles. The Distribution worksheet includes several options to help you with designing and deploying your database copies:
You can determine what the active copy distribution will be like as server or WAN failures occur within your environment.
You can choose the location for your file share witness (primary, secondary or tertiary datacenter).
You can export a set of CSV files and PowerShell scripts that perform the following actions:
o Diskpart.ps1 (uses Servers.csv) - Formats the physical disks, and mounts them as mount points under an anchor directory on each server.
o CreateDAG.ps1 (uses DAGInfo.csv) - Creates the DAG and sets the DAG associated properties like Witness and AutoReseed settings.
o CreateMBDatabases.ps1 (uses MailboxDatabases.csv) - Creates the mailbox database copies with activation preference value of 1.
o CreateMBDatabaseCopies.ps1 (uses MailboxDatabaseCopies.csv) - Creates the mailbox database copies with the appropriate activation preference values across the server infrastructure.
Important: The database copy layout the tool provides assumes that each server and its associated database copies are isolated from every other server and its copies. It is important to take into account failure domain aspects when planning your database copy layout architecture so that you can avoid having multiple copy failures for the same database.
Volume Requirements
The Volume Requirements section is really a continuation of the Storage Requirements section. It outlines what we believe is the appropriate Volume design based on the input factors and the analysis performed in the previous sections.
Note: The term Volume utilized in the calculator refers only to the representation of the disk that is exposed to the host operating system. It does not define the disk configuration.
The Volume Design highlights the Volume architecture chosen for this server solution. The architecture is derived from the backup type, backup frequency, and high availability architecture that were chosen in the Storage Requirements section.
There are four types of Volume architecture that can be leveraged within Exchange 2013 and later: Multiple DBs / Volume 1 Volume / Database 2 Volumes / Database 2 Volumes / Backup Set Multiple DBs / Volume A Multiple DBs / Volume architecture enables you to support multiple databases (mixtures of active and passive copies of different databases) on the same JBOD disk, thereby leveraging larger disks in terms of capacity and IOPS as efficiently as possible. Typically the number of copies you deploy on a single disk matches the number of copies you have for a database (for example, if you are deploying 4 copies of each database, the recommendation is to deploy 4 copies on a JBOD disk). Like with the 1 Volume / Database architecture, the database and its search index and transaction logs are all deployed on the volume. 1 Volume / Database A single Volume per Database architecture means that both the database and its corresponding log files are placed on the same Volume. In order to deploy a Volume architecture that only utilizes a single Volume per database, you must have a Database Availability Group that has 2 or more copies and not be utilizing a hardware based VSS solution. Some of the benefits of this strategy include: Simplified storage administration. Fewer Volumes to manage. Potentially reduce the number of backup jobs. Flexibility to isolate the performance between Databases when not sharing spindles between Volumes. Some of the concerns with this strategy include: Limits the ability to take hardware based VSS backup and restores (e.g., clone snapshots). See Best Practices for Using Volume Shadow Copy Service with Exchange Server 2003 for more VSS details. 2 Volumes / Database With Exchange 2013 and later, in the maximum case of 100 Databases, the number of Volumes you provision will depend upon your backup strategy. 
If your recovery time objective (RTO) is very small, or if you use VSS clones for fast recovery, it may be best to place each Database on its own transaction log Volume and database Volume. Because doing this will exceed the number of available drive letters, volume mount points must be used. Some of the benefits of this strategy include: Enables hardware-based VSS at a database level, providing single database backup and restore. Flexibility to isolate the performance between databases when not sharing spindles between Volumes. Increased reliability. A capacity or corruption problem on a single Volume will only impact one database. This is an important consideration when you are not leveraging the built-in mailbox resiliency features. Some of the concerns with this strategy include: 50 databases require 100 Volumes which could exceed some storage array maximums. A separate Volume for each database causes more Volumes per server increasing the administrative costs and complexity. 2 Volumes / Backup Set A backup set is the number of databases that are fully backed up in a night. A solution that performs a full backup on 1/7th of the databases nightly (i.e. using a weekly or bi-monthly full backup with daily incrementals or differentials) can reduce complexity by placing all of the databases to be backed up on the same log and database Volume. This can reduce the number of Volumes on the server. Some of the benefits of this strategy include: Simplified storage administration. Fewer Volumes to manage. Potentially reduce the number of backup jobs. Some of the concerns with this strategy include: Limits the ability to take hardware based VSS backup and restores (e.g., clone snapshots). See Best Practices for Using Volume Shadow Copy Service with Exchange Server 2003 for more VSS details. A capacity or corruption problem on a single Volume could impact more than one Database. 
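To compare the four Volume architectures side by side, the following sketch counts the database and log Volumes a single server would need under each one. It is a simplification under stated assumptions: `copies_per_volume` and `backup_sets` are hypothetical inputs, restore and AutoReseed Volumes are ignored, and the calculator itself weighs additional factors.

```python
import math


def volumes_required(databases, architecture, copies_per_volume=4, backup_sets=7):
    """Count database/log Volumes on a server for a given architecture.

    databases         -- database copies hosted on the server
    copies_per_volume -- assumed copies per disk for Multiple DBs / Volume
    backup_sets       -- assumed number of backup sets (7 if 1/7th of the
                         databases receive a full backup each night)
    """
    if architecture == "Multiple DBs / Volume":
        return math.ceil(databases / copies_per_volume)
    if architecture == "1 Volume / Database":
        return databases          # database and logs share one Volume
    if architecture == "2 Volumes / Database":
        return 2 * databases      # separate database and log Volumes
    if architecture == "2 Volumes / Backup Set":
        return 2 * backup_sets    # one database and one log Volume per set
    raise ValueError(f"unknown architecture: {architecture}")


print(volumes_required(50, "2 Volumes / Database"))  # 100
```

The printed value matches the concern noted above that 50 databases require 100 Volumes under the 2 Volumes / Database architecture.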
Results Pane

Based on the above input factors the calculator will recommend the following architecture:

Volume Design

The Volume Design table highlights the recommended Volume architecture.

Volume Configuration

The Volume Configuration table highlights the number of databases that should be placed on a single Volume. This is derived from the Volume Architecture model. This section also documents how many Volumes will be required for the entire solution, broken out by Database and Log sets, the number of restore Volumes per server, and the number of spare volumes (or disks) that should be deployed to support Auto-Reseed.

Database and Log Configuration

The Database and Log Configuration table outlines the number of databases (or copies) per server, the number of mailboxes per database, the size of each database, and the transaction log size required for each database.

Database and Log Volume Design

The Database and Log Volume Design table outlines the physical Volume layout and follows the recommended number of databases per Volume approach based on the Volume Architecture model. It also documents the Volume size required to support the layout (this is where we factor in the additional capacity for content indexing, the Volume Free Space Percentage, and whether you are using a Restore Volume), as well as the transaction log Volume.

Important: The DB and Log Volume Design table identifies databases by a unique number. However, database copies are distributed across the servers, and thus these numbers hold no significance and are used solely as an example to show a server's Volume layout.

Backup Requirements

The Backup Requirements section is really a continuation of the Role Requirements section. It outlines what we believe is the appropriate backup design based on the input factors and the analysis performed in the previous sections.
Backup Configuration

The Backup Configuration table outlines the number of databases that will be placed within a single Volume, the backup methodology, and the frequency at which the backups will occur.

Backup Frequency Configuration

The Backup Frequency Configuration section outlines how you should perform the backups for each server, utilizing either a daily full backup or a weekly or bimonthly full backup frequency.

Log Replication Requirements

The Log Replication Requirements section is another continuation of the Role Requirements section. It outlines what we believe is the throughput required to replicate the transaction logs to each target database copy in the secondary datacenter.

Peak Log and Content Index Replication Throughput Requirements

The Peak Log and Content Index Replication Throughput Requirements table provides you with:

The Peak Log & Content Index Throughput Required / Database is the total throughput required for a single log stream and content index. This value is based on the peak log generation hour.
The Peak Log & Content Index Throughput Required Between Datacenters / DAG is the total throughput required to replicate the transaction logs and content index to all database copies (lagged and non-lagged) that exist within the alternate datacenter for the database availability group.
The Peak Log & Content Index Throughput Required Between Datacenters / Environment is the total throughput required to replicate the transaction logs and content index to all database copies (lagged and non-lagged) that exist within the alternate datacenter for all database availability groups.

RPO Log and Content Index Replication Throughput Requirements

In terms of log replication, RPO describes how far behind you can get in log shipping. The lower the RPO (a value of 0 or 1 essentially means you want to only lose the open log file), the higher the bandwidth you need, because you cannot get behind in log replication.
The higher the RPO (approaching 24), the less bandwidth is needed, because you expect to be behind (up to x hours) in log replication and to catch up at some point in the day.

The RPO Log and Content Index Replication Throughput Requirements table provides you with:

The RPO Log & Content Index Throughput Required / Database is the throughput required, per database, to replicate the transaction logs and content index based on the RPO to the mailbox servers that are located within the secondary datacenter.
The RPO Log & Content Index Throughput Required Between Datacenters / DAG is the RPO total throughput required to replicate the transaction logs and content index to all database copies (lagged and non-lagged) that exist within the alternate datacenter for the database availability group.
The RPO Log & Content Index Throughput Required Between Datacenters / Environment is the RPO total throughput required to replicate the transaction logs and content index to all database copies (lagged and non-lagged) that exist within the alternate datacenter for all database availability groups.

Chosen Network Link Suitability

The Chosen Network Link Suitability table indicates whether the chosen network link has sufficient capacity to sustain the peak replication throughput requirements and/or the RPO replication throughput requirements. If the network link cannot sustain the log replication traffic, then you will need to either upgrade the network link to the recommended network link throughput, or adjust the design appropriately.

Recommended Network Link

The Recommended Network Link table recommends an appropriate network link if the chosen network link does not have sufficient capacity to sustain log replication for the solution for both the peak and RPO throughput requirements.

Note: The Network Link recommendations do not take into account database seeding or any other data that may also utilize the link.
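The relationship between peak throughput, RPO throughput, and link suitability can be illustrated with a small Python sketch. This is a simplified model, not the formula the calculator uses. It assumes you have a hypothetical list of log volume generated per hour (in MB) and treats the RPO requirement as the highest average rate over any window of RPO hours. All function and parameter names are invented for the example.

```python
def peak_throughput_mbps(hourly_log_mb):
    """Megabits/s needed to keep pace with the busiest log generation hour."""
    return max(hourly_log_mb) * 8 / 3600


def rpo_throughput_mbps(hourly_log_mb, rpo_hours):
    """Megabits/s under the simplified RPO model described above.

    An RPO of 0 or 1 collapses to the peak hour; an RPO of 24 collapses
    to the daily average. The window wraps around the end of the day.
    """
    n = len(hourly_log_mb)
    window = min(max(1, rpo_hours), n)
    best_mb_per_hour = 0.0
    for start in range(n):
        total = sum(hourly_log_mb[(start + i) % n] for i in range(window))
        best_mb_per_hour = max(best_mb_per_hour, total / window)
    return best_mb_per_hour * 8 / 3600


def link_is_suitable(link_mbps, required_mbps):
    """True when the link can sustain the required replication rate."""
    return link_mbps >= required_mbps


if __name__ == "__main__":
    # Hypothetical profile: logs are generated only during an 8-hour workday.
    profile = [0] * 8 + [900] * 8 + [0] * 8
    print("peak  :", peak_throughput_mbps(profile))
    print("RPO 12:", rpo_throughput_mbps(profile, 12))
    print("RPO 24:", rpo_throughput_mbps(profile, 24))
```

In this model a link that cannot sustain the peak rate may still satisfy a relaxed RPO, which mirrors the distinction the suitability table draws between the two requirements. As the note above says, seeding and other traffic on the link are not accounted for.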
Storage Design

The Storage Design worksheet is designed to take the data collected from the Input worksheet and Storage Requirements worksheet and help you determine the number of physical disks needed to support the databases, transaction logs, and Restore Volume configurations.

Storage Design Input Factors

In order to determine the physical disk requirements, you must enter some basic information about your storage solution.

RAID Parity Configuration

For the Database/Log RAID Parity Configuration table you need to select the type of RAID building block your storage solution utilizes. For example, some storage vendors build the underlying storage in sets of data+parity (d+p) groups. A RAID-5 3+1 configuration means that 3 disks will be used for capacity and 1 disk will be used for parity, even though parity is distributed across all the disks. So if you had a capacity requirement that would utilize 15 disks, then you would need to deploy five 3+1 groups to build that RAID-5 array.

RAID-1/0 supports 1d+1p, 2d+2p, and 4d+4p groupings.
RAID-5 supports 3d+1p through 20d+1p groupings (though storage solutions could support more than that).
RAID-6 supports 6d+2p groupings.

Database/Log RAID Rebuild Overhead

When a disk is lost, the disk needs to be replaced and rebuilt. During this time, the performance of the RAID group is affected, which in turn can affect user actions. Therefore, to ensure that RAID rebuilds do not affect the overall performance of the mailbox server, Microsoft recommends that you provision sufficient overhead into the performance calculations when designing for RAID parity. Most RAID-1/0 implementations will suffer a 25% performance penalty during a rebuild. Most RAID-5 and RAID-6 implementations will suffer a 50% performance penalty during a rebuild.
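The data+parity (d+p) arithmetic from the RAID Parity Configuration discussion above can be sketched in a few lines of Python. The function name and signature are hypothetical, and the sketch assumes that the capacity requirement is expressed as a number of data disks and that groups are always deployed whole.

```python
import math


def raid_groups(data_disks_needed, data_per_group, parity_per_group):
    """Return (groups, physical_disks) for a d+p building block.

    data_disks_needed -- disks' worth of usable capacity required
    data_per_group    -- the "d" in d+p
    parity_per_group  -- the "p" in d+p
    """
    # Groups cannot be split, so round up to a whole number of groups.
    groups = math.ceil(data_disks_needed / data_per_group)
    physical_disks = groups * (data_per_group + parity_per_group)
    return groups, physical_disks


if __name__ == "__main__":
    print("RAID-5 3+1  :", raid_groups(15, 3, 1))
    print("RAID-6 6+2  :", raid_groups(15, 6, 2))
    print("RAID-1/0 4+4:", raid_groups(15, 4, 4))
```

For the 15-disk capacity requirement in the text, the RAID-5 3+1 case works out to five groups, which under these assumptions means 20 physical disks once parity is included.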
The calculator defaults to the following Microsoft recommendations, which are adjustable: For RAID-1/0 implementations, ensure that you factor in an additional 35% performance overhead. For RAID-5/RAID-6 implementations, ensure that you factor in an additional 100% performance overhead. In addition, you should consult with your storage vendor to determine the appropriate RAID rebuild penalty.

Database RAID Configuration

By default, for RAID storage solutions, the calculator will recommend either RAID-1/0 or RAID-5 by evaluating capacity and I/O factors and determining which configuration utilizes the fewest disks while satisfying the requirements. If you would like to override this and force the calculator to utilize a particular RAID configuration for your databases (e.g., RAID-0 or RAID-6), select "Yes" to this option and then select the appropriate RAID configuration in the cell labeled "Desired RAID Configuration." Note that while you can potentially override the database RAID configuration, you cannot do so for the log RAID configuration - that will always be RAID-1/0.

Note: The calculator prevents the use of RAID-5 or RAID-6 with 5.2K, 5.4K, 5.9K and 7.2K disk types, due to performance implications.

Restore Volume RAID Configuration

You can select the type of parity you will be utilizing and the RAID configuration you will be deploying for your Restore Volume.

Results Pane

The Storage Design Results section outputs the recommended configuration for the solution. The recommendations made are for implementing the solution potentially on RAID and JBOD storage.

RAID Storage Architecture

The RAID Storage Architecture table outlines which servers (primary datacenter servers, secondary datacenter servers, or lagged copy servers) should be deployed on RAID storage.
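As a rough illustration of how the default RAID rebuild overhead values described earlier in this section might feed into a performance target, here is a minimal Python sketch. It assumes the overhead is applied as a simple multiplier on the required IOPS, which is an interpretation rather than the calculator's documented formula. The function and dictionary names are hypothetical.

```python
# Default overhead percentages quoted in the text; they are adjustable.
DEFAULT_REBUILD_OVERHEAD_PCT = {
    "RAID-1/0": 35,
    "RAID-5": 100,
    "RAID-6": 100,
}


def iops_with_rebuild_overhead(required_iops, raid_type, overhead_pct=None):
    """Scale an IOPS requirement by the rebuild overhead (assumed multiplicative).

    Pass overhead_pct to override the default, e.g. with a figure
    supplied by your storage vendor.
    """
    if overhead_pct is None:
        overhead_pct = DEFAULT_REBUILD_OVERHEAD_PCT[raid_type]
    return required_iops * (100 + overhead_pct) / 100


if __name__ == "__main__":
    for raid in ("RAID-1/0", "RAID-5", "RAID-6"):
        print(raid, iops_with_rebuild_overhead(1000, raid))
```

Under this reading, a parity RAID design has to be sized for double the steady-state IOPS, which helps explain why the recommended configuration can differ depending on whether capacity or I/O is the limiting factor.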
The RAID Storage Architecture / Server table recommends the optimum RAID configuration and number of disks for each Volume (database, log and restore Volume) for each mailbox server, ensuring that performance and capacity requirements are met within the design.

JBOD Storage Architecture

The JBOD Storage Architecture table outlines which servers (primary datacenter servers, secondary datacenter servers, or lagged copy servers) could be deployed on JBOD storage. The JBOD Storage Architecture / Server table recommends the optimum JBOD configuration and number of disks for each Volume (database, log and restore Volume) for each mailbox server, ensuring that performance and capacity requirements are met within the design.

Total Disks Required

By default, the calculator will determine the storage architecture that reduces the total number of disks required to support the design, while still minimizing single points of failure by utilizing RAID and/or JBOD based on the decisions found in the "RAID Storage Architecture" and "JBOD Storage Architecture" tables. However, you can change the storage architecture to be built entirely on RAID or entirely on JBOD by selecting the appropriate value in the "Storage Architecture will be Deployed:" drop-down. JBOD is only available if the design supports it as a possible solution, and keep in mind that certain scenarios (e.g., a single database copy in a datacenter) may result in a single point of failure.

The Storage Configuration table will output the total number of disks required for each mailbox server that requires RAID or JBOD storage, and will identify the total number of disks requiring RAID or JBOD storage in each datacenter.

Conclusion

Hopefully you will find this calculator invaluable in helping to determine your Exchange server role requirements. If you have any questions or suggestions, please email strgcalc@microsoft.com.