When you launch the Exchange Server Role

Depending on the Exchange version you are deploying, you either have two server roles or a
single server role:


Exchange Server 2013 includes two server roles, the Client Access and Mailbox server
roles. The Client Access server role facilitates client connectivity, while the Mailbox
server role stores the data that users will ultimately access on a daily basis. Therefore,
ensuring that you design these server roles correctly is critical to your Exchange Server
2013 design.
Exchange Server 2016 combines the Mailbox and Client Access server roles into a single
server role, called the Mailbox server role.
With Exchange 2013/2016 you can deploy a solution that leverages mailbox resiliency and has
multiple database copies deployed across datacenters, implements single item recovery for
data recovery, and has the flexibility in storage design to allow you to deploy on storage area
networks utilizing fibre-channel or SATA class disks or on direct attached storage utilizing SAS or
SATA class disks with or without RAID protection. But, in order to design your solution, you
need to understand the following criteria:
User profile - the message profile, the mailbox size, and the number of users
High availability architecture - the number of database copies you plan to deploy,
whether the solution will be site resilient, the desired number of mailbox servers
Server's CPU platform
Storage architecture - the disk capacity / type and storage solution
Backup architecture - whether to use hardware or software VSS and the frequency of
the backups, or leverage the Exchange native data protection features
Network architecture - the utilization, throughput, and latency aspects
Previous versions of Exchange were somewhat rigid in terms of the choices you had in
designing your server roles. The flexibility in the architecture with Exchange 2013/2016 allows
you the freedom to design the solution to meet your needs. Prior to making any decisions,
please review the following topics from the Exchange Online Help:
High Availability and Site Resilience
Backup, Restore and Disaster Recovery
After you have determined the design you would like to implement, you can follow the steps in
the Exchange 2013 Server Role Design Example article to calculate your solution's CPU,
memory, and storage requirements, or you can leverage the Exchange Server Role
Requirements Calculator.
The calculator is broken out into the following sections (worksheets):
Input
Role Requirements
Activation Scenarios
Distribution
Volume Requirements
Backup Requirements
Log Replication Requirements
Storage Design
Important: The data points provided in the calculator are an example configuration. As such,
any data points entered into the Input worksheet are specific to that particular configuration
and do not apply to other configurations. Please ensure you are using the correct data points
for your design.
Input
When you launch the Exchange Server Role Requirements Calculator, you are presented with
the Input worksheet. This worksheet is broken down into 5 key areas. This section is where you
enter in all the relevant information regarding your intended design, so that the calculator can
generate what you need in order to achieve it.
Note: There are many input factors that need to be accounted for before you can design your
solution. Each input factor is briefly listed below; there are additional notes within the
calculator that explain them in more detail.
Environment Configuration
Within Step 1 you will enter in the appropriate information concerning your messaging
environment's configuration - the high availability architecture and database copy
configuration, the data and I/O configuration.
Note: For optimal sizing, choose a multiple of the total number of database copies you have
selected for the number of mailbox servers.
Exchange Environment Configuration
1. What Exchange server version are you deploying for your solution? You can select either
2013 or 2016.
2. What server architecture are you deploying for your global catalogs? You can deploy
either the 32-bit or 64-bit architecture for your Active Directory servers. The
architecture you deploy will affect your core ratio planning. For more information,
please see http://technet.microsoft.com/en-us/library/dd346701.aspx.
3. Do these servers only have the Mailbox server role installed? For Exchange 2013, you
can choose to deploy dedicated server roles or multi-role servers. Having the two server
roles co-located affects your design in the areas of load balancing client requests, memory
utilization, and CPU utilization.
4. Will these servers be deployed as guest machines in a virtualized environment? There is
CPU overhead when deploying guest machines that must be accounted for in the
design. For Hyper-V deployments the overhead is about
10%. Check with your hypervisor vendor to determine their overhead and adjust the
“Hypervisor CPU Adjustment Factor” accordingly.
5. Are you deploying a database availability group (DAG)? Deploying the solution as a DAG
provides you additional flexibility and resiliency choices like having multiple mailbox
database copies, leveraging flexible mailbox protection features in lieu of traditional
backups, and flexibility in your storage architecture (e.g. RAID or JBOD).
6. How many mailbox servers are you going to deploy within the primary datacenter? If
you enter more than a single server (remember a DAG requires at least two and can
support a maximum of 16), the calculator will evenly distribute the user mailboxes
across the total number of mailbox servers and make performance and capacity
recommendations for each server, as well as for the entire environment. As for the
secondary datacenter, the calculator will determine the number of mailbox servers you
need to deploy there based on the requirements (number of databases, number of
copies, etc.).
7. How many DAGs are you planning to deploy in the environment? If you enter more than
a single DAG, then the calculator will distribute the user mailboxes across the total
number of DAGs and make performance and capacity recommendations for each server
and each DAG, as well as for the entire environment.
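The even distribution described in items 6 and 7 can be sketched as follows. This is a minimal illustration, not the calculator's actual logic: the function name is hypothetical, and rounding up when the numbers do not divide evenly is an assumption.

```python
import math
from typing import Tuple

MAX_SERVERS_PER_DAG = 16  # DAG limit stated in item 6


def mailboxes_per_server(total_mailboxes: int, dag_count: int,
                         servers_per_dag: int) -> Tuple[int, int]:
    """Return (mailboxes per DAG, mailboxes per server).

    Hypothetical helper: spreads mailboxes evenly and rounds up
    (an assumption) when the totals do not divide cleanly.
    """
    if dag_count < 1 or servers_per_dag < 1:
        raise ValueError("dag_count and servers_per_dag must be at least 1")
    if servers_per_dag > MAX_SERVERS_PER_DAG:
        raise ValueError("a DAG supports a maximum of 16 mailbox servers")
    per_dag = math.ceil(total_mailboxes / dag_count)
    per_server = math.ceil(per_dag / servers_per_dag)
    return per_dag, per_server
```

For instance, 24000 mailboxes spread over 2 DAGs of 6 servers each works out to 12000 mailboxes per DAG and 2000 per server.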
Site Resilience Configuration
1. Are you deploying the DAG in a site resilient configuration? A DAG can be stretched
across 2 or more datacenters (the calculator only allows for 2 datacenters, a primary and
a secondary) without requiring the AD site or network subnet to be stretched.
2. What user distribution model will you be leveraging in your site resilient architecture?
When planning a site resilience model with Exchange, keep in mind there are two
variables that need to be considered: datacenter model and user distribution model. For
the datacenter model, Exchange requires both datacenters to be in an Active/Active
configuration. This means that both datacenters participating in the DAG solution must
have active, reachable infrastructure and have the ability to support active load at any
time. For the user distribution model, the design can support both Active/Passive and
Active/Active user distribution. The calculator supports three scenarios for these user
distribution models:
1. Active/Passive User Distribution Model - An Active/Passive user distribution
architecture simply has database copies deployed in the secondary datacenter,
but no active mailboxes are hosted there and no database copies will be
activated there during normal runtime operations. However, the datacenter
supports both single cross-datacenter database *overs, and full datacenter
activation.
2. Active/Active User Distribution Model - An Active/Active user distribution
architecture has the user population dispersed across both datacenters (usually
evenly) with each datacenter being the primary datacenter for its specific user
population. In the event of a failure, the user population can be activated in the
secondary datacenter (either via cross-datacenter single database *over or via full
datacenter activation). There are two types of Active/Active user distribution models:
1. Active/Active (Single DAG) - This model stretches a DAG across the two
datacenters and has active mailboxes located in each datacenter. A
corresponding passive copy is located in the alternate datacenter. This
scenario does have a single point of failure (potentially), the WAN
connection. Loss of the WAN connection will result in the mailbox
servers in one of the datacenters going into a failed state from a failover
cluster perspective (due to loss of quorum):
---DC1---           ---DC2---
DAG1                DAG1
Active Copies       Passive Copies
Passive Copies      Active Copies
2. Active/Active (Multiple DAGs) - This model leverages multiple DAGs to
remove single points of failure (e.g., the WAN). In this model, there are
at least two DAGs, with each DAG having its active copies in the alternate
datacenter:
---DC1---                         ---DC2---
DAG-A (Active/Passive Copies)     DAG-A (Passive Copies)
DAG-B (Passive Copies)            DAG-B (Active/Passive Copies)
3. In your site resilient architecture, how far behind can you get in terms of log shipping
between datacenters? The effect of the RPO is to evaluate the non-contiguous peak
hours (defined in Step 5), say 8am and 4pm, and determine the resulting throughput
requirement, assuming that you can take the time in between 8 and 4 to catch up
(within the specified RPO, of course). By allowing replication to get behind there are two
outcomes: 1. Active Manager is less likely to choose a database copy that has a high
copy queue length (unless no more viable alternatives are available). 2. If the copy
queue length is greater than the target server's AutoDatabaseMountDial setting, the
database will not automatically mount once activated. Manually mounting that
database will result in the loss of data that had not been copied.
4. Will you activation block the mailbox servers in the secondary datacenter? In certain
situations (e.g. highly utilized network connection), you may want to control whether
cross-site failovers occur automatically. This can be controlled by placing an activation
block on the remote mailbox servers, thereby preventing Active Manager from selecting
those copies during a failover.
5. When deploying an Active/Active (Single DAG) architecture, do you want to deploy
dedicated disaster recovery Mailbox servers in the alternate datacenter? If deploying a
single DAG Active/Active solution, you can choose to have dedicated DR Mailbox servers
deployed in the secondary datacenter to be used in the event of disaster or utilize the
existing mailbox servers that are hosting active mailboxes.
Mailbox Database Copy Configuration
1. How many highly available (HA) mailbox database copy instances per database do you
plan to deploy within a DAG? Enter in the number of highly available database copies
you plan to have within the environment. This value excludes lagged database copies,
but does account for both the active and any passive HA database copies you plan to
deploy. For optimal sizing, choose a multiple of the total number of mailbox servers you
have selected.
2. How many lagged database copy instances per database do you plan to deploy within a
DAG? Lagged database copies are an optional feature that can provide protection
against certain disaster scenarios (like logical corruption). Lagged database copies
should not be considered an HA database copy as the replay will delay the availability of
the database for use once activated. While technically there is no limit to how many
lagged copies you can deploy within a DAG, the calculator limits you to a maximum of 2
copies.
3. How many highly available mailbox database copy instances per database do you plan
to deploy within the secondary datacenter within a DAG? If you are deploying a site
resilient solution, you can choose to have a portion of the total HA database copies deployed
in the secondary datacenter.
4. How many lagged database copy instances per database do you plan to deploy within
the secondary datacenter within a DAG? If you are deploying a site resilient solution,
you can choose to have a portion or all of your lagged database copies deployed in the
secondary datacenter.
Lagged Database Copy Configuration
1. Will you deploy the lagged database copies on a dedicated server? Using a dedicated
server for lagged database copies certainly makes it easier to manage. For DAGs where
the lagged database copies are evenly distributed across all the DAG mailbox servers,
you will need to use the Suspend-MailboxDatabaseCopy cmdlet with the -ActivationOnly parameter to
prevent them from being mounted, but there are scenarios that can clear this. With a
dedicated server you can activation block the entire server and the setting is persistent.
The choice can also affect your storage design in terms of choosing RAID or JBOD. Unless
you have multiple lagged copies, lagged copies should be placed on storage that is
utilizing RAID to provide additional protection. The calculator will determine the
appropriate number of lagged copy servers you need to deploy based on the
requirements (number of databases, number of copies, etc.).
2. How long will you delay transaction log replay on your lagged copy? This parameter is
used to specify the amount of time that the Microsoft Exchange Information Store
service should wait before replaying log files that have been copied to the lagged
database. The maximum amount of replay delay you can set is 14 days. The value you
specify here will influence the log capacity requirements for all copies and the amount
of time required to mount a lagged copy.
3. How long will you delay transaction log truncation on your lagged copy? This parameter
is used to specify the amount of time that the Microsoft Exchange Replication service
should wait before truncating log files that have been copied to the lagged database.
The time period begins after the log has been successfully replayed into the lagged copy.
The maximum allowable setting for this value is 14 days. The minimum allowable setting
is 0, although setting this value to 0 effectively eliminates any delay in log truncation
activity. The value you specify here will influence the log capacity requirements for all
copies.
4. Will the Replay Lag Manager be enabled? In Exchange 2013, the ReplayLagManager is
disabled by default, whereas it is enabled by default in Exchange 2016. Lagged copies can now
care for themselves by invoking automatic log replay to play down the log files in certain
scenarios:
- When a low disk space threshold is reached
- When the lagged copy has physical corruption and needs to be page patched
- When there are fewer than three available healthy copies (active or passive only; lagged
database copies are not counted) for more than 24 hours
In Exchange 2010, page patching wasn't available for lagged copies. In Exchange 2013 and later
versions, page patching is available for lagged copies through this automatic play down feature.
If the system detects that page patching is required for a lagged copy, the logs are automatically
replayed into the lagged copy to perform page patching. Lagged copies also invoke this auto
replay feature when a low disk space threshold has been reached, and when the lagged copy
has been detected as the only available copy for a specific period of time.
Note: The calculator prohibits replay lag manager when dedicated lagged copy servers are
deployed, or when there are fewer than 3 HA copies and multiple databases / volume is not the
volume architecture.
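The three play down triggers listed above can be expressed as a simple predicate. This is a sketch for reasoning about the rules only; the class and field names are hypothetical and do not correspond to any Exchange API.

```python
from dataclasses import dataclass


@dataclass
class LaggedCopyState:
    """Hypothetical snapshot of the conditions described in the text."""
    low_disk_space: bool            # low disk space threshold reached
    needs_page_patch: bool          # physical corruption detected
    healthy_ha_copies: int          # active or passive only, lagged copies excluded
    hours_below_three_copies: float # how long fewer than 3 healthy copies have existed


def should_play_down(state: LaggedCopyState) -> bool:
    """Return True when any documented trigger for automatic log replay applies."""
    too_few_copies = (state.healthy_ha_copies < 3
                      and state.hours_below_three_copies > 24)
    return state.low_disk_space or state.needs_page_patch or too_few_copies
```

A lagged copy with two healthy HA copies for 30 hours would trigger play down under this sketch, while the same condition lasting only 12 hours would not.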
Exchange Data Configuration
1. What will be the Data Overhead Factor? Microsoft does not recommend utilizing this
factor at this time.
2. How many mailboxes do you move per week? In terms of transactions, you have to take
into account how many mailboxes you will either be moving to this server or within this
server, as transactions totaling the size of the mailbox will always get generated at the
target database.
3. Are you going to deploy a Dedicated Restore Volume? A dedicated restore volume is
used as a staging point for the restoration of data or could be used during maintenance
activities; if one is selected then additional capacity will not be factored into each
database Volume.
4. What percentage of disk space do you want to ensure remains free on the Volume? Most
operations management programs have capacity thresholds that alert when a Volume is
more than 80% utilized. This value allows you to ensure that each volume has a certain
percentage of disk space available so that the volume is not designed and implemented
at maximum capacity.
5. Do you have log shipping compression enabled within the DAG? By default, each DAG is
configured to compress and encrypt the socket connection used to ship logs across
different IP subnets (you can disable these features altogether or enable them for all
communications regardless of subnet).
6. What is your compression rate? The compression capability that is obtained for the
socket connection used to ship logs will vary with each customer, based on the data
obtained in the transaction log files. By default, Microsoft recommends using a value of
30%, however, you can determine this value by analyzing your environment (e.g., once
Exchange is deployed you could evaluate the throughput rate with compression
disabled and then compare it with the rate when compression is enabled).
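To illustrate the free space input in item 4, the sketch below shows how a reserved free space percentage inflates the volume size needed for a given amount of data. The function name is hypothetical and the formula is an assumption about how such headroom is commonly computed, not the calculator's documented formula.

```python
def required_volume_capacity_gb(data_gb: float, free_space_pct: float) -> float:
    """Return the volume size needed so that free_space_pct of it stays free.

    Assumed formula: capacity = data * 100 / (100 - free_space_pct).
    """
    if not 0 <= free_space_pct < 100:
        raise ValueError("free_space_pct must be in the range [0, 100)")
    return data_gb * 100 / (100 - free_space_pct)
```

Under this assumption, 800 GB of database and log data with 20% reserved free space calls for a 1000 GB volume.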
Database Configuration
1. Do you want to follow Microsoft's recommendations regarding maximum database size?
For standalone mailbox server role solutions, Microsoft recommends that the database
size should not be more than 200GB in size. For solutions leveraging mailbox resiliency,
Microsoft recommends that the database size should not exceed 2TB. Neither of these
is by any means a hard limit, but a recommendation based on the impact database size
has on recovery times. If you want to follow Microsoft's recommendation, then select
has to recovery times. If you want to follow Microsoft's recommendation, then select
Yes. Otherwise, select No.
2. Do you want to specify a custom Maximum Database Size? If you selected No for the
previous field, then you need to enter in a custom maximum database size.
3. Would you like the calculator to determine the optimum number of databases for the
design? By default the calculator will determine the optimum number of databases for
the architecture. In the event that you may want to have a defined number of
databases, select No to "Automatically Calculate Number of Databases" and enter in a
custom number of databases.
4. Do you want to specify the number of databases that should be deployed? If you
selected "No" to the previous field, then you need to enter in the number of databases
you would like to have deployed within your mailbox server or DAG architecture.
5. Do you want to design your database infrastructure such that you deploy the correct
number of databases to ensure symmetrical distribution during server failure events? If
possible, the calculator will deploy the correct number of databases such that you can
achieve a symmetrical distribution of the active copies across the remaining server
infrastructure as the DAG experiences server failures. Note that to use this option, you
must allow the calculator to automatically calculate the required number of databases.
IOPS Configuration
1. What will be the I/O Overhead Factor? Microsoft recommends using 20% to ensure
adequate headroom in terms of I/O to allow for abnormal spikes in I/O that may occur
from time to time.
2. What additional I/O requirements do you need to factor into the solution for each
mailbox server's storage design? For example, let's say the solution requires 500 IOPS
for the mailboxes and you have decided you want to ensure there is extra I/O capacity
to support additional products (e.g. antivirus) that generate load during the peak user
usage window. So you enter 300 IOPS in this input factor. The result is that from a host
perspective, the solution needs to achieve 800 IOPS. This may require additional testing
by comparing a baseline system against a system that has the I/O generating application
installed and running.
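The arithmetic behind the two IOPS inputs can be sketched as follows. The function name is hypothetical, and applying the overhead factor only to the mailbox IOPS (not to the additional IOPS) is an assumption made for illustration.

```python
def host_iops_target(mailbox_iops: float, additional_iops: float = 0.0,
                     overhead_pct: float = 0.0) -> float:
    """Return the IOPS the host must sustain.

    Assumption: the I/O Overhead Factor scales mailbox IOPS only, and the
    additional I/O requirement is then added on top.
    """
    return mailbox_iops * (100 + overhead_pct) / 100 + additional_iops
```

With no overhead factor, 500 mailbox IOPS plus 300 additional IOPS gives the 800 IOPS from the example in item 2; adding a 20% overhead factor under the stated assumption raises the target to 900 IOPS.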
Transport Configuration
1. What will be the message queue expiration value? By default, the MessageTimeOut
expiration property is set to two days.
2. What will be the Safety Net expiration value? By default, the SafetyNetHoldTime
property is set to two days. This value should equal or be higher than ReplayLagTime.
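The relationship between the Safety Net hold time and the replay lag can be checked with a one-line comparison. This is only a sketch; the function name is hypothetical and the values would come from your own configuration.

```python
from datetime import timedelta


def safety_net_covers_replay_lag(safety_net_hold_time: timedelta,
                                 replay_lag_time: timedelta) -> bool:
    """Return True when the Safety Net hold time is equal to or higher than the replay lag."""
    return safety_net_hold_time >= replay_lag_time
```

For example, the default two-day hold time would not cover a seven-day replay lag, so the hold time would need to be raised to at least seven days.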
Mailbox Configuration
Within Step 2 you will define your user profile for up to four different tiers of user populations.
1. How many mailboxes will you deploy in the environment? If deploying a single server
environment, this is how many mailboxes you will deploy on this server. If you are
deploying multiple servers, then this is how many mailboxes you will deploy in the
environment. If you are deploying multiple DAGs, then this is how many mailboxes you
will deploy across all of the DAGs. For example, if you choose to deploy 5 servers, and
want 3000 mailboxes per server, then enter 15000 here. Or if you plan to deploy 2
DAGs, each with 6 servers, and you entered 24000 total mailboxes, then 12000
mailboxes will be deployed per DAG.
2. What is the solution's projected growth in terms of number of mailboxes over its
lifecycle? Enter in the total percentage by which you believe the number of mailboxes
will grow during the solution's lifecycle. For example, if you believe the solution will
increase by 30% over the lifecycle of the design and you are starting out with 1000
mailboxes, then at the end of the lifecycle, the solution will have 1300 mailboxes. The
calculator will utilize the projected growth plus the number of mailboxes to ensure that
the capacity and performance requirements can be sustained throughout the solution's
lifecycle.
3. How much mail do the users send and receive per day on average? The usage profiles
found here are based on the work done around the memory and processor scalability
requirements.
4. What is the average message size? For most customers the average message size is
around 75KB.
5. What is the initial mailbox size? Enter the average initial mailbox size for the mailbox tier
that is being migrated to the Exchange platform. This value is only used for the Mailbox
Modeling prediction formulas.
6. What will be the prohibit send & receive mailbox size limit? If you want to adequately
control your capacity requirements, you need to set a hard mailbox size limit (prohibit
send and receive) for the majority of your users.
7. If deploying a personal archive mailbox, what will be the personal archive quota limit? If
you want to adequately control your capacity requirements, you need to set a hard
mailbox size limit (prohibit send and receive) for the majority of your users.
8. What is the deleted item retention period? Enter in the deleted item retention period
you plan to utilize within the environment. The default retention period is 14 days,
however, you should adjust this to match your policy concerning deleted item recovery
when enabling Single Item Recovery to eliminate going to backup media to recover
deleted items.
9. Are you deploying Single Item Recovery? Single Item Recovery ensures that all deleted
and modified items are preserved for the duration of the deleted item retention
window. By default in Exchange, this is not enabled. When enabled, this feature
increases the capacity requirements for the mailbox.
10. Will you have calendar version logging enabled? By default, all changes to a calendar
item are recorded in the mailbox of a user to keep older versions of meeting items for
31 days and can be used to repair the calendar in the event of an issue. This data is
stored in the mailbox's dumpster folder. When enabled, this feature increases the
capacity requirements for the mailbox.
11. How many users within this tier should have IOPS and Megacycles Multiplication Factor
applied? By default, when you specify an IOPS or Megacycles Multiplication Factor, all
users within the mailbox tier will have the multiplication factor applied; however you may
only want a percentage of users within the tier to get that multiplication factor.
12. Do you want to include an IOPS Multiplication Factor in the prediction or custom I/O
profile? The IOPS Multiplication Factor can be used to increase the IOPS/mailbox
footprint for mailboxes that require additional I/O (for example, these mailboxes may
use third-party mobile devices). The way this value is used is as follows: (IOPS value *
Multiplication Factor) = new IOPS value.
13. Do you want to include a Megacycle Multiplication Factor when sizing the CPU
requirements for the mailbox tier? The Megacycles Multiplication Factor can be used to
increase the CPU cost/mailbox footprint for mailboxes that perform more CPU
work than a typical mailbox (for example, these mailboxes may use third-party mobile
devices). The way this value is used is as follows: (Megacycles value * Multiplication
Factor) = new Megacycles value.
14. Do your Outlook Online Mode clients have versions of Windows Desktop Search older
than 4.0 or third-party desktop search engines deployed? The addition of these indexing
tools to the online mode clients incurs additional read I/O penalties to the mailbox server
storage subsystem. Care should be taken when enabling these desktop search engines.
Windows Desktop Search 4.0 and later utilizes synchronization protocols that are similar
to how Outlook operates in cached mode to index the mailbox contents, and thus has a
very minor impact in terms of disk read I/O.
15. Are you planning to use the I/O prediction formula or define your own IOPS profile to
design toward? This question asks whether you want to override the calculator in
determining the IOPS / mailbox value. By default, the calculator will predict the IOPS /
mailbox value based on the number of messages per mailbox, and the user memory
profile. For some customers that want to design toward a specific I/O profile, this option
will not be viable. Therefore, if you want to design toward a specific I/O profile, select
No.
16. What is your custom IOPS profile / mailbox? Only enter a value in this field if you
selected "No" to the "Predict IOPS Value" question.
17. What will be the database read:write ratio for your custom IOPS profile? Only adjust this
value if you selected "No" to the "Predict IOPS Value" parameter. When IOPS prediction
is enabled, the calculator will calculate the read:write ratio based on the user profile.
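Two of the calculations described in this step, projected growth (item 2) and the multiplication factor applied to a percentage of users (items 11 and 12), can be sketched as below. The function names are hypothetical, and the way the partial percentage is blended into a tier total is an assumption rather than the calculator's documented method.

```python
import math


def projected_mailboxes(current: int, growth_pct: float) -> int:
    """Mailbox count at the end of the lifecycle, rounded up."""
    return math.ceil(current * (100 + growth_pct) / 100)


def tier_iops(mailboxes: int, iops_per_mailbox: float,
              factor: float = 1.0, pct_users_with_factor: float = 100.0) -> float:
    """Total IOPS for a tier when only some users get the multiplication factor.

    Per affected mailbox: (IOPS value * Multiplication Factor) = new IOPS value.
    """
    boosted = mailboxes * pct_users_with_factor / 100
    normal = mailboxes - boosted
    return normal * iops_per_mailbox + boosted * iops_per_mailbox * factor
```

Starting from 1000 mailboxes with 30% growth yields 1300 mailboxes. The same pattern in `tier_iops` applies to the Megacycles Multiplication Factor by substituting a megacycles value for the IOPS value.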
Backup Configuration
Within Step 3 you will define your backup model and your tolerance settings, as well as choose
whether to isolate the transaction logs from the database.
1. What backup methodology will be used to back up the solution? You have several
options for a backup methodology, including leveraging a VSS solution (hardware or
software based) or leveraging the native data protection features that Exchange
provides. The solution you choose will depend on many factors. For example, if you are
deploying the mailbox resiliency and single item recovery features, you may be able to
forgo a traditional backup architecture in favor of leveraging Exchange as its own
backup. Or if you still require a backup (e.g. legal/compliance reasons), then you need to
deploy a VSS solution. The type of VSS solution you deploy will depend on your storage
architecture. Hardware VSS solutions are available with storage area networks. Software
VSS solutions can be leveraged against either storage area networks or direct attached
storage architectures. Also, the backup methodology will affect the Volume design; for
example, hardware VSS solutions require a Volume architecture that is 2 Volumes /
Database.
2. What will be the backup frequency? You can choose Daily Full, Weekly Full with Daily
Differential, Weekly Full with Daily Incremental, or Bi-Monthly Full with Daily
Incremental. The backup frequency will affect the Volume design and the disk space
requirements (e.g. if performing daily differentials, then you need to account for 7 days
of log generation in your capacity design).
3. How many times can you operate without log truncation? Select how many times you
can survive without a full backup or an incremental backup (the minimum value is 1).
For example, if you are performing a weekly full backup and daily differential backups,
the only time log truncation occurs is during the full backup. If the full backup fails, then
you have to wait an entire week to perform another full backup or perform an
emergency full backup. This parameter allows you to ensure that you have enough
capacity to not have to perform an immediate full backup. If you are leveraging the
native data protection features within Exchange as your backup mechanism, then you
should enter 3 here to ensure you have enough capacity to allow for 3 days' worth of log
generation to occur as a result of potential log replication issues.
4. How long can you survive a network outage? When a network outage occurs, log
replication cannot occur. As a result, the copy queue length will increase on the source;
in addition, log truncation cannot occur on the source. For geographically dispersed DAG
deployments, network outages can seriously affect the solution's usefulness. If the
outage is too long, log capacity on the source may become compromised and as a result,
capacity must be increased or a manual log truncation event must occur. Once that
happens, the remote copies must be reseeded. The Network Failure Tolerance
parameter ensures there is enough capacity on the log Volumes so that you can survive
an excessive network outage.
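The interaction between backup frequency and failure tolerance can be approximated as follows. This is a rough sketch: the names are hypothetical, the mapping assumes that full and incremental backups truncate logs while differential backups do not, and multiplying by the tolerance value is an assumption about how the capacity is derived.

```python
# Days of log generation that accumulate between truncation events
# (assumed: full and incremental backups truncate logs, differentials do not).
DAYS_BETWEEN_TRUNCATION = {
    "Daily Full": 1,
    "Weekly Full with Daily Differential": 7,
    "Weekly Full with Daily Incremental": 1,
    "Bi-Monthly Full with Daily Incremental": 1,
}


def log_days_to_provision(frequency: str, failure_tolerance: int) -> int:
    """Days of transaction logs the log capacity should be able to hold."""
    if failure_tolerance < 1:
        raise ValueError("the minimum failure tolerance is 1")
    return DAYS_BETWEEN_TRUNCATION[frequency] * failure_tolerance
```

Under these assumptions, a weekly full with daily differentials and a tolerance of 2 would call for 14 days of log capacity.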
Storage Configuration
Within Step 4 you will define your storage configuration.
Storage Options
1. Do you want to consider storage designs that leverage JBOD? JBOD storage refers to
placing a database and its transaction logs on a single disk without leveraging RAID. In
order to deploy this type of storage solution for your mailbox server environment, you
must have 3 or more HA database copies and have a Volume architecture that is equal
to 1 Volume / Database. If you select yes for this input, the calculator will attempt to
design the solution so that it can be deployed on JBOD storage. Please note that other
factors may alter the viability of JBOD, however (e.g. deploying a single lagged database
copy on the same mailbox servers hosting your HA database copies).
2. Do you want to deploy multiple databases per volume? In Exchange 2010, JBOD
configurations recommended a single database per volume. Due to the optimizations
made in Exchange 2013 and later, you can now deploy multiple databases per volume in
JBOD configurations.
3. Do you want the number of copies per volume to match the copy count? Typically, the
number of copies you deploy on a single volume matches the number of copies you
deploy per database. However, you may find that due to your environment (e.g.,
capacity constraints), it may be better suited to deploy half the number of copies per
volume.
4. Do you want the calculator to automatically determine the number of volumes that
should be deployed on each Exchange server? By default, the calculator will recommend
a certain number of volumes to be configured on each Exchange server based on a
number of factors. As a result, the calculator can recommend more volumes than you
have available in your server configuration. If you have a defined number of volumes
you would like to support for placement of Exchange data, you can select No to this
question and then enter the number of volumes in the “Number of Exchange Data Volumes
per Server” field.
5. How many AutoReseed volumes per server should be deployed? When leveraging
AutoReseed, you will want at least one AutoReseed volume per server.
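The JBOD prerequisites from item 1 can be captured in a small check. This is a simplified sketch with hypothetical names; it encodes only the conditions stated in this document (3 or more HA copies, 1 Volume / Database, and the caveat about a single lagged copy sharing servers with HA copies) and ignores the other factors the calculator weighs.

```python
def jbod_is_candidate(ha_copies: int, volumes_per_database: int,
                      lagged_copies_on_ha_servers: int = 0) -> bool:
    """Return True when the stated JBOD prerequisites are met."""
    if ha_copies < 3 or volumes_per_database != 1:
        return False
    # A single lagged copy hosted alongside HA copies should sit on RAID,
    # which rules out JBOD in this simplified model.
    return lagged_copies_on_ha_servers != 1
```

A design with 3 HA copies and database plus logs on one volume passes; dropping to 2 HA copies, or adding one lagged copy on the same servers, does not.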
Primary Datacenter Disk Configuration
1. What are the disk capacities and types you plan to deploy? For each type of Volume
(system, database, log, and restore volume) you plan to deploy, select the appropriate
capacity and disk type model.
Secondary Datacenter Disk Configuration
1. What are the disk capacities and types you plan to deploy? For each type of Volume
(system, database, log, and restore volume) you plan to deploy, select the appropriate
capacity and disk type model.
Processor Configuration
Within Step 5, you will define the number of processor cores you have deployed for each
mailbox server within your primary and secondary datacenters, as well as enter the
SPECint2006 Rate Value for the system you have selected.
When you enable virtualization, you must be sure to configure the processor architecture
correctly. In particular, you must enter the correct number of processor cores that the guest
machine will support, as well as the correct SPECInt2006 rating for these virtual processor
cores. To calculate the SPECInt2006 rate value, you can use the following formula:
X/(N*Y) = per virtual processor SPECInt2006 Rate value
Where X is the SPECInt2006 rate value for the hypervisor host server
Where N = the number of physical cores in the hypervisor host
Where Y = 1 if you will be deploying 1:1 virtual processor-to-physical processor on the
hypervisor host
Where Y = 2 if you will be deploying up to 2:1 virtual processor-to-physical processor on the
hypervisor host
For example, let’s say I am deploying an HP ProLiant DL380p G8 (2.90 GHz, Intel E5-2690)
system which includes two sockets, each containing a 16-core processor; the SPECInt2006
rate value for the system is 693.
If I am deploying my Mailbox server role as a guest machine using Hyper-V, and following best
practices where I do not oversubscribe the number of virtual CPUs to physical processors,
then
693/(32*1) = 21.66 (21.65625 before rounding)
Since each Mailbox server will have a maximum of 4 virtual processors, this means the
SPECInt2006 rate value I would enter into the calculator would be 21.65625*4 = 86.625.
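The arithmetic above can be sketched in a few lines of Python. This is only an illustration of the X/(N*Y) formula; the function names are hypothetical and are not part of the calculator.

```python
def per_vcpu_specint_rate(host_rate: float, physical_cores: int, vcpu_ratio: int = 1) -> float:
    """Apply X / (N * Y): host rate value divided by (physical cores * vCPU ratio)."""
    if physical_cores <= 0:
        raise ValueError("physical_cores must be positive")
    if vcpu_ratio not in (1, 2):
        # The text only defines Y for 1:1 and up to 2:1 deployments.
        raise ValueError("vcpu_ratio must be 1 or 2")
    return host_rate / (physical_cores * vcpu_ratio)


def guest_specint_rate(host_rate: float, physical_cores: int,
                       guest_vcpus: int, vcpu_ratio: int = 1) -> float:
    """Rate value to enter for a guest with the given number of virtual processors."""
    return per_vcpu_specint_rate(host_rate, physical_cores, vcpu_ratio) * guest_vcpus


# Values from the example: host rate 693, 32 physical cores, 1:1 ratio, 4 vCPUs.
print(guest_specint_rate(693, 32, 4))  # 86.625
```

Multiplying the unrounded per-processor value (21.65625) by 4 is what yields 86.625; multiplying the rounded 21.66 would give 86.64.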
In addition, if you are deploying your Exchange servers as guest machines, you can specify the
Hypervisor CPU Adjustment Factor to take into account the overhead of deploying guest
machines.
Server Configuration
1. How many processor cores are you planning to deploy in each server, and what is
their megacycle capability? For each server type (primary datacenter, secondary datacenter,
and lagged copy server) you plan to deploy, select the number of processor cores and
the server’s SPECint2006 rate value. To determine your SPECint2006 Rate Value:
1. Open a web browser and go to www.spec.org.
2. Click on Results, highlight CPU2006 and then select Search CPU2006 Results.
3. Under Available Configurations, select SPECint2006 Rates and click Go. Under
Simple Request, enter the search criteria (e.g., Processor matches x5550).
4. Find the server and processor you are planning to deploy and take note of the
result value. For example, let's say you are deploying an HP DL380p G8 server with
two 16-core processors (2.9 GHz, E5-2690); the SPECint_rate2006 results value is
693.
Alternatively, you can use Scott Alexander’s fantastic Processor Query Tool to get the
per-server score and processor core count for your hardware platform.
Log Replication Configuration
Within Step 6, you will define your hourly log generation rate, the network link, and the
network link latency you expect to have within your site resilient architecture.
1. How many transaction logs are generated for each hour in the day? Enter in the
percentage of transaction logs that are generated for each hour in the day by measuring
an existing Exchange 2003 or Exchange 2007 server in your environment. If the existing
messaging environment is not using Exchange, then evaluate the messaging
environment and enter in the rate of change per hour here.
Now you may be wondering how you can collect this data. We've written a simple VBS script
that lists all files in a folder and writes the list to a log file. You can use Task Scheduler to
execute this script at certain intervals in the day (e.g. every 15 minutes). Once you have
generated the log file for a 24-hour period, you can import it into Excel, massage the data (i.e.
remove duplicate entries) and determine how many logs are generated for each hour. If you do
this for each storage group, you will be able to determine your log generation rate for each
hour in the day. This script is named collectlogs.vbsrename (just rename it to collectlogs.vbs)
and you can find it here: Collectlogs VBS script
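As an alternative to massaging the data in Excel, the de-duplication and per-hour percentage steps can be sketched in Python. This is a minimal sketch that assumes you have already parsed the collected listing into (file name, creation time) pairs; the input shape and the function name are assumptions, not the format produced by the VBS script.

```python
from collections import Counter
from datetime import datetime
from typing import Iterable, List, Tuple


def hourly_log_percentages(entries: Iterable[Tuple[str, datetime]]) -> List[float]:
    """Return 24 values: the percentage of unique log files created in each hour."""
    # The same log file shows up in several scheduled runs, so keep one entry per name.
    unique = {name: created for name, created in entries}
    total = len(unique)
    if total == 0:
        return [0.0] * 24
    per_hour = Counter(created.hour for created in unique.values())
    return [100.0 * per_hour.get(hour, 0) / total for hour in range(24)]


# Hypothetical sample: one duplicate entry, two logs at 09:xx and two at 14:xx.
sample = [
    ("E0000001.log", datetime(2024, 1, 1, 9, 5)),
    ("E0000001.log", datetime(2024, 1, 1, 9, 5)),
    ("E0000002.log", datetime(2024, 1, 1, 9, 40)),
    ("E0000003.log", datetime(2024, 1, 1, 14, 10)),
    ("E0000004.log", datetime(2024, 1, 1, 14, 20)),
]
percentages = hourly_log_percentages(sample)
print(percentages[9], percentages[14])  # 50.0 50.0
```

The resulting 24 percentages correspond to the hourly values the calculator asks for in this step.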
Network Configuration
1. What type of network link will you be using between the servers? Select the appropriate
network link you will be using between the two datacenters.
2. What is the latency on the network link? Enter in the latency (in milliseconds) that exists
on the network link.
Environment Customizations
Within Step 7, you can define the server names and database naming prefix.
Role Requirements
This section provides the solution's I/O, capacity, memory, and CPU requirements.
Based on the above input factors the calculator will recommend the following architecture,
broken down into four sections:
Environment Configuration
Active Database Copy Configuration
Server Configuration
Log, Disk Space, and IO Requirements
Processor Core Ratio Requirements
This table identifies the number of processor cores required to support the activated
databases. This table is only populated if you populate the processor core megacycle
information on the Input tab.
The Recommended Minimum Number of Global Catalog Cores identifies the minimum
number of processor cores required to sustain the load for global catalog related
activities and is based on the number of processor cores required to support the
activated databases.
Client Access Server Requirements
This table identifies the memory and CPU requirements for dedicated Client Access servers if
you choose to not co-locate the server roles. This table is only populated if you populate the
processor core megacycle information on the Input tab.
The Recommended Minimum Number of Client Access Processor Cores identifies the
minimum number of processor cores required per server to sustain the load for client
related activities.
The Recommended Minimum Number of Client Access Server RAM Configuration
identifies the minimum amount of memory required per server to sustain the load for
client related activities. This number is scaled to a multiple that can be typically
installed in a server.
Environment Configuration
The Environment Configuration table identifies the number of mailboxes being deployed in
each datacenter, as well as how many mailbox servers and lagged copy servers you will deploy
in each datacenter. In addition, if you choose to not co-locate server roles, this table will also
identify the minimum number of dedicated Client Access servers you should deploy in each
datacenter (taking into account worst case failure mode of two simultaneous server failures).
User Mailbox Configuration
The Mailbox Configuration table provides you with:
The Number of Mailboxes that you entered in the Input section (this value will include
the projected growth).
The Number of Mailboxes / Database provides a breakdown of how many mailboxes
from each mailbox tier will be stored within a database.
The User Mailbox Size within Database is the actual mailbox size on disk that factors in
the prohibit send/receive limit, the number of messages the user sends/receives per
day, the deleted item retention window (with or without calendar version logging and
single item recovery enabled), and the average database daily churn per mailbox. It is
important to note that the Mailbox size on disk is actually higher than your mailbox size
limit; this is to be expected.
The Transaction Logs Generated / Mailbox value is based on the message profile
selected and the average message size and indicates how many transaction logs will be
generated per mailbox per day. The log generation numbers per message profile
account for:
o Message size impact. In our analysis of the databases internally we have found
that 90% of the database is the attachments and message tables (message
bodies and attachments). So if the average message size doubles (from 75 KB to
150 KB), the worst case scenario would be for the log traffic to increase by 1.9
times. Thereafter, as message size doubles, the impact doubles.
o Amount of data sent/received.
o Database health maintenance operations.
o Records Management operations.
o Data stored in mailbox that is not a message (tasks, local calendar appts,
contacts, etc).
o Forced log rollover (a mechanism that periodically closes the current transaction
log file and creates the next generation).
The IOPS / Mailbox value is the calculated IOPS / Mailbox value that is based on the
number of messages per mailbox, the user memory profile, and desktop search engine
choices. If you had chosen to enter in a specific IOPS / mailbox value rather than
allowing the calculator to determine the value based on the above requirements, then
this value will be that custom value.
The Read:Write ratio / Mailbox value defines the ratio of the mailbox's IOPS that are
read I/Os. This information is required to accurately design the storage subsystem I/O
requirements.
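To make the message size impact described under Transaction Logs Generated / Mailbox concrete, here is a small Python sketch. It reads the rule as "the first doubling from the baseline message size multiplies log traffic by 1.9, and each further doubling multiplies it by 2"; that reading, and the function name, are assumptions rather than the calculator's actual implementation.

```python
def log_traffic_multiplier(doublings: int) -> float:
    """Worst-case log traffic multiplier after the average message size doubles N times."""
    if doublings < 0:
        raise ValueError("doublings must be zero or greater")
    if doublings == 0:
        return 1.0  # baseline message size, no increase
    # First doubling costs 1.9x; every subsequent doubling doubles the impact.
    return 1.9 * 2 ** (doublings - 1)


for n in range(4):
    print(n, log_traffic_multiplier(n))
```

Under this reading, two doublings of the baseline message size give a worst-case multiplier of 3.8.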
Database Copy Instance Configuration
This table highlights how many HA mailbox database copy instances and lagged database copy
instances your solution will have within each datacenter for a given DAG.
Database Configuration
The Database Configuration table provides you with:
The Number of Databases is the calculated number of databases required to support the
mailbox population within a standalone server or DAG.
The Recommended Number of Mailboxes / Database is the calculated number of
mailboxes per database ensuring that the database size does not go above the
recommended database size limit.
The Available Database Cache / Mailbox value is the amount of database cache memory
that is available per mailbox. A large database cache ensures that read I/Os can be
reduced.
Database Copy Configuration
The Database Copy Configuration table provides you with the number of database copies being
deployed within each server and the total number of database copies within the DAG.
Server Configuration
The Server Configuration table provides you with the following:
The Recommended RAM Configuration for the primary datacenter mailbox servers,
secondary datacenter mailbox servers, and lagged copy servers. This is the amount of
RAM needed to support the maximum number of activated database copies on a given
server, in addition to the number of mailboxes based on their memory profile.
The number of processor cores utilized during the worst failure mode scenario.
The CPU Utilization value is the expected CPU Utilization for a fully utilized server based
on the megacycles associated with the user profile and the number of database copies.
Depending on the environment, this will either be for a standalone server hosting 100%
active databases, or a server participating in a DAG that is dealing with a single or
double server failure event (or secondary datacenter activation). It is recommended that
servers not exceed 80% utilization during peak period. The CPU utilization value is
determined by taking the CPU Megacycle Requirements and dividing it by the total
number of megacycles available on the server (which is based on the CPU and number
of cores). If the calculator highlights the CPU utilization with a red background, then this
means the design may not be able to sustain the load - either you must change the
design (number of mailboxes, number of copies, etc.) or change the server CPU
platform.
The CPU Megacycle Requirements value defines the amount of megacycles the primary
datacenter servers must be able to sustain when either all mailbox databases are active
or the number of mailbox database copies that are activated based on a single server or
double server failure event. For secondary servers hosting HA copies, this value defines
the amount of megacycles required to support the activation of all databases after
datacenter activation. For lagged copy servers, this value defines the amount of
megacycles required to support all of the passive lagged copies.
The Server Total Available Adjusted Megacycles value defines the total available
megacycles the server platform is capable of delivering at 100% CPU utilization. This
value has been normalized against the baseline server platform (DL380p G8 2GHz E5-2650 processors).
The Possible Storage Architecture outlines whether the solution could utilize RAID or
JBOD for the primary datacenter servers, secondary datacenter servers, and lagged copy
servers. JBOD is only considered under the following conditions (this assumes you
configured the calculator to consider JBOD):
o In order to deploy on JBOD in the primary datacenter servers: You need a total of 3 or
more HA copies within the DAG. If you are mixing lagged copies on the same server
that is hosting your HA copies (i.e. not using dedicated lagged copy servers), then you
need at least 2 lagged copies if you are deploying the 1 database / volume
architecture.
o For the secondary datacenter servers to use JBOD: You should have at least 2 HA
copies in secondary datacenter. That way loss of a copy in the secondary datacenter
doesn't result in requiring a reseed across the WAN or loss of data (in the datacenter
activation case). If you are mixing lagged copies on the same server that is hosting
your HA copies (i.e. not using dedicated lagged copy servers), then you need at least 2
lagged copies.
o For dedicated lagged copy servers: You should have at least 2 lagged copies within a
datacenter in order to use JBOD when deploying the 1 database / volume
architecture. Otherwise loss of disk results in loss of your lagged copy (and whatever
protection mechanism that was providing).
The Recommended Transport Database Location identifies whether the transport
database should be deployed on a dedicated disk, or whether it can be deployed on the
system disk.
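The JBOD conditions listed under Possible Storage Architecture can be expressed as simple checks. The sketch below is only a restatement of those conditions in Python; the class, field, and function names are hypothetical, and the calculator may weigh additional factors.

```python
from dataclasses import dataclass


@dataclass
class CopyLayout:
    ha_copies_in_dag: int        # total HA copies within the DAG
    ha_copies_secondary: int     # HA copies in the secondary datacenter
    lagged_copies: int           # lagged copies in the datacenter being evaluated
    lagged_on_ha_servers: bool   # True if lagged copies share servers with HA copies
    one_db_per_volume: bool      # True for the 1 database / volume architecture


def primary_jbod_possible(layout: CopyLayout) -> bool:
    """Primary datacenter: 3+ HA copies, plus 2+ lagged copies when mixing on 1 DB / volume."""
    if layout.ha_copies_in_dag < 3:
        return False
    if layout.lagged_on_ha_servers and layout.one_db_per_volume:
        return layout.lagged_copies >= 2
    return True


def secondary_jbod_possible(layout: CopyLayout) -> bool:
    """Secondary datacenter: 2+ HA copies there, plus 2+ lagged copies when mixing."""
    if layout.ha_copies_secondary < 2:
        return False
    if layout.lagged_on_ha_servers:
        return layout.lagged_copies >= 2
    return True


def lagged_server_jbod_possible(layout: CopyLayout) -> bool:
    """Dedicated lagged copy servers: 2+ lagged copies when using 1 DB / volume."""
    if layout.one_db_per_volume:
        return layout.lagged_copies >= 2
    return True  # assumption: the text states no condition for other architectures


layout = CopyLayout(ha_copies_in_dag=3, ha_copies_secondary=1, lagged_copies=1,
                    lagged_on_ha_servers=False, one_db_per_volume=True)
print(primary_jbod_possible(layout), secondary_jbod_possible(layout),
      lagged_server_jbod_possible(layout))  # True False False
```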
The CPU Utilization / DAG table provides you with information on the expected theoretical CPU
utilization during various modes:
Normal Run Time (where the active copies are distributed according to
ActivationPreference=1)
Single Server Failure (redistribution of active copies based on a single server failure
event)
Double Server Failure (redistribution of active copies based on a double server failure
event)
Site Failure (datacenter activation)
Worst Failure Mode (in some cases, this value will equal one of the previous scenarios, it
could also be a scenario like Site Failure + 1 server failure; the worst failure mode is
what is used to calculate memory and CPU requirements)
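As an illustration of how the per-mode values relate, the following Python sketch divides the megacycle requirement of each mode by the server's available megacycles, picks the worst mode, and compares it with the 80% guidance mentioned earlier. The megacycle figures and names are invented for the example and do not come from the calculator.

```python
from typing import Dict, Tuple


def cpu_utilization_pct(required_megacycles: float, available_megacycles: float) -> float:
    """CPU utilization = required megacycles / total available megacycles, as a percentage."""
    if available_megacycles <= 0:
        raise ValueError("available_megacycles must be positive")
    return 100.0 * required_megacycles / available_megacycles


def evaluate_modes(required_by_mode: Dict[str, float], available_megacycles: float,
                   threshold_pct: float = 80.0) -> Tuple[Dict[str, float], str, bool]:
    """Return utilization per mode, the worst mode, and whether it exceeds the threshold."""
    results = {mode: cpu_utilization_pct(req, available_megacycles)
               for mode, req in required_by_mode.items()}
    worst_mode = max(results, key=results.get)
    return results, worst_mode, results[worst_mode] > threshold_pct


# Hypothetical per-server megacycle requirements for each mode.
required = {
    "Normal Run Time": 18000,
    "Single Server Failure": 24000,
    "Double Server Failure": 36000,
}
results, worst_mode, over_threshold = evaluate_modes(required, available_megacycles=40000)
print(results)
print(worst_mode, over_threshold)  # Double Server Failure True
```

In this made-up case the double server failure mode reaches 90%, so the design or the CPU platform would need to change.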
Transaction Log Requirements
The Transaction Log Requirements table provides you with:
The User Transaction Logs Generated / Day indicates how many transaction logs will be
generated during the day for each active database, each server, within the DAG, and
within the environment.
The Average Mailbox Move Transaction Logs Generated / Day indicates how many
transaction logs will be generated during the day for each active database, each server,
within a DAG, and within the environment. This number assumes
that an equal percentage of mailboxes will be moved each day, as opposed to moving all
mailboxes on the same day.
The Average Transaction Logs Generated / Day is the total number of transaction logs
that are generated per day for each active database, each server, within a DAG, and within
the environment (includes user generated logs and mailbox move generated logs).
Disk Space Requirements
The Disk Space Requirements table provides you with:
The Transport Database Space Required is the amount of space required to support the
transport database on a Mailbox server. The value is derived from the message profile,
the message queue expiration, and the Safety Net hold time.
The Database Space Required is the amount of space required to support each database
and its corresponding copies. This value is derived from the mailbox size on disk, the
data overhead factor, and whether a dedicated restore Volume is available. This row also
shows you the space requirements for each server (based on the total number of
database copies), each DAG, and within the environment.
The Log Space Required is the amount of space required to support each database log
stream and the corresponding copies. This value takes into account the number of
mailboxes moved per week (assumes worst case and that all mailboxes are moved on
the same day), the type of backup frequency in use, the number of days that can be
tolerated without log truncation and the number of transaction logs generated per day.
This row also shows you the space requirements for each server (based on the total
number of database copies), each DAG, and within the environment.
The Database Volume Space Required is the Volume size required to support the
database (and potentially its log stream). This calculation takes the total disk space
required for the database and adds to it the size of a database plus 110% (if a dedicated
restore Volume does not exist) for offline maintenance operations, an additional 10% of
the database size for content indexing (if enabled), and includes an amount of free
space to ensure the Volume is not 100% utilized (based on Volume Free Space
Percentage). This row also shows you the space requirements for each server (based on
the total number of database copies), each DAG, and within the environment.
The Log Volume Space Required is the Volume size required to support the database's
log stream. This field lists the amount of space required to support the transaction logs
for a given set of databases and includes an amount of free space to ensure the Volume
is not 100% utilized (based on Volume Free Space Percentage). This row also shows you
the space requirements for each server (based on the total number of database copies),
each DAG, and within the environment.
The Restore Volume Space Required is the amount of space needed to support a restore
Volume if the option was selected in the Input Factor section; this will include space for
up to 7 databases and 7 transaction log sets. Each server will be provisioned with a
restore Volume. This row also shows you the space requirements for each server (based
on the total number of database copies), each DAG, and within the environment.
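The Database Volume Space Required description can be turned into a rough calculation. The sketch below is one possible reading, not the calculator's formula: it assumes the offline maintenance allowance is 110% of a single database, that the content indexing allowance is 10% of the total database space on the volume, and that the free space percentage is applied by dividing by (1 - free space fraction). All names are hypothetical.

```python
def database_volume_space_gb(total_db_space_gb: float, single_db_size_gb: float,
                             free_space_pct: float,
                             dedicated_restore_volume: bool = False,
                             content_indexing: bool = True) -> float:
    """Estimate the volume size needed for the databases placed on one volume."""
    if not 0 <= free_space_pct < 100:
        raise ValueError("free_space_pct must be in the range [0, 100)")
    space = total_db_space_gb
    if not dedicated_restore_volume:
        # Assumed reading: room for offline maintenance of one database (110% of its size).
        space += 1.10 * single_db_size_gb
    if content_indexing:
        # Assumed reading: 10% of the database space for the content index.
        space += 0.10 * total_db_space_gb
    # Keep the requested percentage of the final volume free.
    return space / (1 - free_space_pct / 100)


# Hypothetical example: one 1000 GB database, no restore volume, indexing on, 20% free.
print(round(database_volume_space_gb(1000, 1000, 20)))
```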
Host IO and Throughput Performance Requirements
The Host IO and Throughput Performance Requirements table provides you with:
The Total Required Database IOPS is the amount of read and write host I/O the database
disk set must sustain during peak load (this does not factor in any RAID penalties). This
row also shows you the IOPS requirements for each server (based on the total number
of database copies), each DAG, and within the environment.
The Total Required Log IOPS is the amount of read and write host I/O that will occur
against the transaction log disk set. This row also shows you the IOPS requirements for
each server (based on the total number of database copies), each DAG, and within the
environment.
The Database Read I/O Percentage defines the percentage of database required IOPS
that are read I/Os. This information is required to accurately design the storage
subsystem I/O requirements.
The amount of throughput required for Background Database Maintenance operations.
Special Notes
The Special Notes table will provide you with additional information about your design:
When to use GPT disks (when a Volume size is greater than 2TB).
Activation Scenarios
If you are deploying a highly available and/or site resilient architecture, then this section will
break down the failure scenarios. The section is broken up into two scenarios:
1. Scenario 1 - Deploying a DAG architecture within a single datacenter or deploying it in a
site resilient Active/Passive user distribution model.
2. Scenario 2 - Deploying a site resilient DAG architecture with an Active/Active user
distribution model.
Important: For the purposes of this calculator, the term "primary datacenter" refers to the
datacenter that is preferred for hosting the active copies for a given set of databases, while the
term "secondary datacenter" refers to the disaster recovery datacenter that is used for
datacenter activation and cross-site database failover events.
Single Datacenter and Active/Passive Environments
The DAG Member Layout table identifies the number of Active Mailbox servers (those that are
hosting active mailboxes within the primary datacenter), the Disaster Recovery Mailbox Servers
(those that host passive database copies in the second datacenter), and any Lagged Copy
Mailbox servers you may be deploying.
There are two tables that provide data around the Active Database configuration, one for the
primary datacenter, which outlines the single or double server events, and one for the
secondary datacenter, which outlines the activation of that datacenter when the primary
datacenter is lost. Both tables provide you with:
The Number of Active Databases (Normal Run Time) value defines the number of active
databases hosted on each server when there are no server outages. Unlike Exchange
2007, Exchange is no longer bound by an active/passive high availability model. Instead,
each server within a DAG can host active mailbox database copies. The calculator
distributes the number of unique databases across the primary datacenter servers
within the DAG, ensuring an equal distribution of mailbox database copies are activated
on each server. This row exposes the total number of mailboxes that are accessible on
each server as a result of the activated database copies. In addition, this row highlights
the total number of databases deployed within each datacenter, in the event that cross-site
database *overs are allowed to occur.
The Number of Active Databases (After First Server Failure) value defines the number of
active databases hosted on each server when there is a single server outage. As a result
of the single server outage, the database copies that were activated on the failed server
are equally redistributed across all remaining server nodes. This row exposes the total
number of mailboxes that are accessible on each server as a result of the activated
database copies. In addition, this row highlights the total number of databases deployed
within each datacenter, in the event that cross-site database *overs are allowed to
occur.
The Number of Active Databases (After Double Server Failure) value is populated when
you have at least 3 HA mailbox copies and at least 4 mailbox servers within your design.
It defines the number of active databases hosted on each server when there are two
server outages. As a result of the double server outage, the database copies that were
activated on the failed servers are equally redistributed across all remaining server
nodes. This row exposes the total number of mailboxes that are accessible on each
server as a result of the activated database copies. In addition, this row highlights the
total number of databases deployed within each datacenter, in the event that cross-site
database *overs are allowed to occur.
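The equal redistribution described in these rows can be illustrated with a short Python sketch. It simply spreads the unique databases as evenly as possible over the servers that remain; it ignores which servers actually hold a copy of each database, so treat it as a simplification with hypothetical names rather than the calculator's logic.

```python
from typing import List


def distribute_active(databases: int, servers: int) -> List[int]:
    """Spread active databases as evenly as possible across the given servers."""
    if servers <= 0:
        raise ValueError("at least one server is required")
    base, extra = divmod(databases, servers)
    # The first 'extra' servers carry one additional database.
    return [base + (1 if i < extra else 0) for i in range(servers)]


def active_after_failures(databases: int, servers: int, failed: int) -> List[int]:
    """Active databases per surviving server after 'failed' servers are lost."""
    return distribute_active(databases, servers - failed)


# Hypothetical DAG: 24 unique databases on 4 primary datacenter servers.
print(distribute_active(24, 4))         # [6, 6, 6, 6]
print(active_after_failures(24, 4, 1))  # [8, 8, 8]
print(active_after_failures(24, 4, 2))  # [12, 12]
```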
Active/Active Environments
This section breaks out the architecture into two perspectives, the layout of Datacenter 1 and
the layout of Datacenter 2 with respect to the DAG architecture. Recall that there are two
Active/Active models:
1. Active/Active (Single DAG) - This model stretches a DAG across the two datacenters and
has active mailboxes located in each datacenter. A corresponding passive copy is
located in the alternate datacenter. This scenario does have a single point of failure
(potentially), the WAN connection. Loss of the WAN connection will result in the
mailbox servers in one of the datacenters going into a failed state from a failover cluster
perspective (due to loss of quorum):
---DC1---          ---DC2---
DAG1               DAG1
Active Copies      Passive Copies
Passive Copies     Active Copies
2. Active/Active (Multiple DAGs) - This model leverages multiple DAGs to remove single
points of failure (e.g., the WAN). In this model, there are at least two DAGs, with each
DAG having its active copies in the alternate datacenter:
---DC1---                        ---DC2---
DAG-A (Active/Passive Copies)    DAG-A (Passive Copies)
DAG-B (Passive Copies)           DAG-B (Active/Passive Copies)
The DAG Member Layout table identifies the number of Active Mailbox servers (those that are
hosting active mailboxes within the primary datacenter), the Disaster Recovery Mailbox Servers
(those that host passive database copies in the second datacenter), and any Lagged Copy
Mailbox servers you may be deploying.
There are two tables that provide data around the Active Database configuration, one for the
primary datacenter, which outlines the single or double server events, and one for the
secondary datacenter, which outlines the activation of that datacenter when the primary
datacenter is lost. Both tables provide you with:
The Number of Active Databases (Normal Run Time) value defines the number of active
databases hosted on each server when there are no server outages. Unlike Exchange
2007, Exchange is no longer bound by an active/passive high availability model. Instead,
each server within a DAG can host active mailbox database copies. The calculator
distributes the number of unique databases across the primary datacenter servers
within the DAG, ensuring an equal distribution of mailbox database copies are activated
on each server. This row exposes the total number of mailboxes that are accessible on
each server as a result of the activated database copies. In addition, this row highlights
the total number of databases deployed within each datacenter, in the event that cross-site
database *overs are allowed to occur.
The Number of Active Databases (After First Server Failure) value defines the number of
active databases hosted on each server when there is a single server outage. As a result
of the single server outage, the database copies that were activated on the failed server
are equally redistributed across all remaining server nodes. This row exposes the total
number of mailboxes that are accessible on each server as a result of the activated
database copies. In addition, this row highlights the total number of databases deployed
within each datacenter, in the event that cross-site database *overs are allowed to
occur.
The Number of Active Databases (After Double Server Failure) value is populated when
you have at least 3 HA mailbox copies and at least 4 mailbox servers within your design.
It defines the number of active databases hosted on each server when there are two
server outages. As a result of the double server outage, the database copies that were
activated on the failed servers are equally redistributed across all remaining server
nodes. This row exposes the total number of mailboxes that are accessible on each
server as a result of the activated database copies. In addition, this row highlights the
total number of databases deployed within each datacenter, in the event that cross-site
database *overs are allowed to occur.
Distribution
The calculator includes a new worksheet, Distribution. Within the Distribution worksheet, you
will find the layout we recommend based on the database copy layout principles.
The Distribution worksheet includes several options to help you with designing and deploying
your database copies:
You can determine what the active copy distribution will be like as server or WAN
failures occur within your environment.
You can choose the location for your file share witness (primary, secondary or tertiary
datacenter).
You can export a set of CSV and PowerShell scripts that perform the following actions:
o Diskpart.ps1 (uses Servers.csv) - Formats the physical disks, and mounts them as
mount points under an anchor directory on each server.
o CreateDAG.ps1 (uses DAGInfo.csv) – Creates the DAG and sets the DAG associated
properties like Witness and Auto-Reseed settings.
o CreateMBDatabases.ps1 (uses MailboxDatabases.csv) - Creates the mailbox database
copies with activation preference value of 1.
o CreateMBDatabaseCopies.ps1 (uses MailboxDatabaseCopies.csv) - Creates the
mailbox database copies with the appropriate activation preference values across the
server infrastructure.
Important: The database copy layout the tool provides assumes that each server and its
associated database copies are isolated from each other server/copies. It is important to take
into account failure domain aspects when planning your database copy layout architecture so
that you can avoid having multiple copy failures for the same database.
Volume Requirements
The Volume Requirements section is really a continuation of the Storage Requirements section.
It outlines what we believe is the appropriate Volume design based on the input factors and the
analysis performed in the previous sections.
Note: The term Volume utilized in the calculator refers only to the representation of the disk that
is exposed to the host operating system. It does not define the disk configuration.
The Volume Design highlights the Volume architecture chosen for this server solution. The
architecture is derived from the backup type, backup frequency, and high availability
architecture that were chosen in the Storage Requirements section.
There are four types of Volume architecture that can be leveraged within Exchange 2013 and
later:
Multiple DBs / Volume
1 Volume / Database
2 Volumes / Database
2 Volumes / Backup Set
Multiple DBs / Volume
A Multiple DBs / Volume architecture enables you to support multiple databases (mixtures of
active and passive copies of different databases) on the same JBOD disk, thereby leveraging
larger disks in terms of capacity and IOPS as efficiently as possible. Typically the number of
copies you deploy on a single disk matches the number of copies you have for a database (for
example, if you are deploying 4 copies of each database, the recommendation is to deploy 4
copies on a JBOD disk). Like with the 1 Volume / Database architecture, the database and its
search index and transaction logs are all deployed on the volume.
1 Volume / Database
A single Volume per Database architecture means that both the database and its corresponding
log files are placed on the same Volume. In order to deploy a Volume architecture that only
utilizes a single Volume per database, you must have a Database Availability Group that has 2 or
more copies and not be utilizing a hardware based VSS solution.
Some of the benefits of this strategy include:
Simplified storage administration. Fewer Volumes to manage.
Potentially reduce the number of backup jobs.
Flexibility to isolate the performance between Databases when not sharing spindles
between Volumes.
Some of the concerns with this strategy include:
Limits the ability to take hardware based VSS backup and restores (e.g., clone
snapshots). See Best Practices for Using Volume Shadow Copy Service with Exchange
Server 2003 for more VSS details.
2 Volumes / Database
With Exchange 2013 and later, in the maximum case of 100 Databases, the number of Volumes
you provision will depend upon your backup strategy. If your recovery time objective (RTO) is
very small, or if you use VSS clones for fast recovery, it may be best to place each Database on
its own transaction log Volume and database Volume. Because doing this will exceed the
number of available drive letters, volume mount points must be used.
Some of the benefits of this strategy include:
Enables hardware-based VSS at a database level, providing single database backup and
restore.
Flexibility to isolate the performance between databases when not sharing spindles
between Volumes.
Increased reliability. A capacity or corruption problem on a single Volume will only
impact one database. This is an important consideration when you are not leveraging
the built-in mailbox resiliency features.
Some of the concerns with this strategy include:
50 databases require 100 Volumes which could exceed some storage array maximums.
A separate Volume for each database causes more Volumes per server increasing the
administrative costs and complexity.
2 Volumes / Backup Set
A backup set is the number of databases that are fully backed up in a night. A solution that
performs a full backup on 1/7th of the databases nightly (i.e. using a weekly or bi-monthly full
backup with daily incrementals or differentials) can reduce complexity by placing all of the
databases to be backed up on the same log and database Volume. This can reduce the number
of Volumes on the server.
Some of the benefits of this strategy include:
Simplified storage administration. Fewer Volumes to manage.
Potentially reduce the number of backup jobs.
Some of the concerns with this strategy include:
Limits the ability to take hardware-based VSS backups and restores (e.g., clone
snapshots). See Best Practices for Using Volume Shadow Copy Service with Exchange
Server 2003 for more VSS details.
A capacity or corruption problem on a single Volume could impact more than one
Database.
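To illustrate the 2 Volumes / Backup Set model, the sketch below splits a hypothetical list of 14 databases into seven nightly backup sets and counts the Volumes that result. The round-robin assignment and the database names are assumptions made for illustration; the calculator may group databases differently.

```python
def backup_sets(databases: list[str], nights: int = 7) -> list[list[str]]:
    """Split databases round-robin into one backup set per night."""
    sets = [[] for _ in range(nights)]
    for index, name in enumerate(databases):
        sets[index % nights].append(name)
    return sets

dbs = [f"DB{n:02d}" for n in range(1, 15)]   # 14 hypothetical databases
schedule = backup_sets(dbs)

# Each non-empty backup set shares one database Volume and one log Volume.
volumes = 2 * sum(1 for s in schedule if s)
print(len(schedule[0]), volumes)   # 2 14
```

For the same 14 databases, the 2 Volumes / Database model would need 28 Volumes.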
Results Pane
Based on the above input factors the calculator will recommend the following architecture:
Volume Design
The Volume Design table highlights the recommended Volume architecture.
Volume Configuration
The Volume Configuration table highlights the number of databases that should be placed on a
single Volume. This is derived from the Volume Architecture model.
This section also documents how many Volumes will be required for the entire solution, broken
out by Database and Log sets, the number of restore Volumes per server, and the number of
spare volumes (or disks) that should be deployed to support Auto-Reseed.
Database and Log Configuration
The Database and Log Configuration table outlines the number of databases (or copies) per
server, the number of mailboxes per database, the size of each database, and the transaction
log size required for each database.
Database and Log Volume Design
The Database and Log Volume Design table outlines the physical Volume layout and follows the
recommended number of databases per Volume based on the Volume Architecture model. It also
documents the Volume size required to support the layout (this is where we factor in the
additional capacity for content indexing, the Volume Free Space Percentage, and whether you
are using a Restore Volume), as well as the transaction log Volume size.
Important: The DB and Log Volume Design table identifies databases by a unique number.
However, database copies are distributed across the servers, so these numbers hold no
significance and are used solely as an example to show a server's Volume layout.
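As a rough illustration of how content indexing capacity and the Volume Free Space Percentage can inflate the database Volume size, consider the sketch below. The formula and all input values are assumptions for illustration only; the calculator's actual computation (including any Restore Volume adjustment) is not reproduced here.

```python
def database_volume_size_gb(db_size_gb: float,
                            dbs_per_volume: int,
                            index_overhead: float,
                            free_space_pct: float) -> float:
    """Estimate the Volume size needed for the databases placed on it.

    index_overhead: content index size as a fraction of database size
                    (hypothetical input).
    free_space_pct: fraction of the Volume that must remain free
                    (hypothetical input).
    """
    used_gb = db_size_gb * dbs_per_volume * (1 + index_overhead)
    return used_gb / (1 - free_space_pct)

size = database_volume_size_gb(1000, 1, 0.20, 0.05)
print(round(size, 1))   # 1263.2
```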
Backup Requirements
The Backup Requirements section is really a continuation of the Role Requirements section. It
outlines what we believe is the appropriate backup design based on the input factors and the
analysis performed in the previous sections.
Backup Configuration
The Backup Configuration table outlines the number of databases that will be placed within a
single Volume and the type of backup methodology and frequency in which the backups will
occur.
Backup Frequency Configuration
The Backup Frequency Configuration section outlines how you should perform the backups for
each server, utilizing either a daily full backup or a weekly or bi-monthly full backup
frequency.
Log Replication Requirements
The Log Replication Requirements section is another continuation of the Role Requirements
section. It outlines what we believe is the throughput required to replicate the transaction logs
to each target database copy in the secondary datacenter.
Peak Log and Content Index Replication Throughput Requirements
The Peak Log and Content Index Replication Throughput Requirements table provides you with:
The Peak Log & Content Index Throughput Required / Database is the total throughput
required for a single log stream and content index. This value is based on the peak log
generation hour.
The Peak Log & Content Index Throughput Required between Datacenters / DAG is the
total throughput required to replicate the transaction logs and content index to all
database copies (lagged and non-lagged) that exist within the alternate datacenter for
the database availability group.
The Peak Log & Content Index Throughput Required Between Datacenters /
Environment is the total throughput required to replicate the transaction logs and
content index to all database copies (lagged and non-lagged) that exist within the
alternate datacenter for all database availability groups.
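The relationship between the three peak values can be sketched as follows. This is a simplified model under stated assumptions, not the calculator's formula: it relies on the 1 MB size of Exchange transaction log files, treats content index traffic as an optional multiplier (`index_factor`, hypothetical), ignores compression, and assumes every database in a DAG has the same number of copies in the alternate datacenter.

```python
LOG_FILE_MB = 1  # Exchange transaction log files are 1 MB each

def per_database_mbps(peak_logs_per_hour: int, index_factor: float = 0.0) -> float:
    """Throughput for one log stream during the peak log generation hour."""
    megabits = peak_logs_per_hour * LOG_FILE_MB * 8 * (1 + index_factor)
    return megabits / 3600

def per_dag_mbps(db_mbps: float, databases: int, remote_copies: int) -> float:
    """Throughput to every copy of every database in the alternate datacenter."""
    return db_mbps * databases * remote_copies

def environment_mbps(dag_values: list[float]) -> float:
    """Sum across all database availability groups."""
    return sum(dag_values)

db = per_database_mbps(3600)          # 3,600 logs in the peak hour
dag = per_dag_mbps(db, databases=10, remote_copies=2)
print(db, dag, environment_mbps([dag, dag]))   # 8.0 160.0 320.0
```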
RPO Log and Content Index Replication Throughput Requirements
In terms of log replication, RPO describes how far behind you can afford to get in log
shipping. The lower the RPO (a value of 0 or 1 essentially means you want to lose only the open
log file), the higher the bandwidth you need, because you cannot fall behind in log replication.
The higher the RPO (approaching 24), the less bandwidth you need, because you expect to be
behind (by up to x hours) in log replication and to catch up at some point in the day.
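One simple way to picture this trade-off is to take an hourly log generation profile and look at the highest average generation rate over any window as long as the RPO. The sketch below does that. It is a heuristic for illustration, not the calculator's formula, and the hourly profile is hypothetical.

```python
def rpo_rate_mb_per_hour(hourly_log_mb: list[float], rpo_hours: int) -> float:
    """Highest average log generation rate over any window of rpo_hours.

    The day is treated as cyclic, and an RPO of 0 is treated like 1
    (both require keeping up with the peak hour).
    """
    n = len(hourly_log_mb)
    window = min(max(1, rpo_hours), n)
    best = 0.0
    for start in range(n):
        total = sum(hourly_log_mb[(start + k) % n] for k in range(window))
        best = max(best, total / window)
    return best

profile = [100.0] * 24            # hypothetical quiet hours, MB of logs per hour
profile[9:17] = [900.0] * 8       # hypothetical busy period, 09:00 to 17:00
profile[10] = 1200.0              # hypothetical peak hour

print(rpo_rate_mb_per_hour(profile, 1))              # 1200.0
print(rpo_rate_mb_per_hour(profile, 8))              # 937.5
print(round(rpo_rate_mb_per_hour(profile, 24), 1))   # 379.2
```

With an RPO of 1 the link must keep up with the peak hour; with an RPO of 24 it only needs to keep up with the daily average.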
The RPO Log and Content Index Replication Throughput Requirements table provides you with:
The RPO Log & Content Index Throughput Required / Database is the required
throughput necessary to replicate the transaction logs and content index based on the
RPO to the mailbox servers that are located within the secondary datacenter per
database.
The RPO Log & Content Index Throughput Required Between Datacenters / DAG is the
RPO total throughput required to replicate the transaction logs and content index to all
database copies (lagged and non-lagged) that exist within the alternate datacenter for
the database availability group.
The RPO Log & Content Index Throughput Required Between Datacenters / Environment
is the RPO total throughput required to replicate the transaction logs and content index
to all database copies (lagged and non-lagged) that exist within the alternate datacenter
for all database availability groups.
Chosen Network Link Suitability
The Chosen Network Link Suitability table indicates whether the chosen network link has
sufficient capacity to sustain the peak replication throughput requirements and/or the RPO
replication throughput requirements. If the network link cannot sustain the log replication
traffic, you will need to either upgrade the network link to the recommended network link
throughput or adjust the design appropriately.
Recommended Network Link
The Recommended Network Link table recommends an appropriate network link if the chosen
network link does not have sufficient capacity to sustain log replication for the solution for
both the peak and RPO throughput requirements.
Note: The Network Link recommendations do not take into account database seeding or any
other data that may also utilize the link.
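The suitability check and the link recommendation can be sketched as a comparison against a list of candidate links. The candidate list and the `max_utilization` parameter below are assumptions for illustration; the calculator's own list of link types may differ. As the note above says, seeding and other traffic are not accounted for.

```python
# Nominal line rates in Mbps, ordered from slowest to fastest.
# The choice of candidates is an assumption made for this sketch.
LINKS = [
    ("T1", 1.544),
    ("T3", 44.736),
    ("OC-3", 155.52),
    ("Gigabit Ethernet", 1000.0),
]

def link_is_suitable(link_mbps: float, required_mbps: float,
                     max_utilization: float = 1.0) -> bool:
    """True when the link can carry the required replication throughput."""
    return required_mbps <= link_mbps * max_utilization

def recommend_link(required_mbps: float, max_utilization: float = 1.0):
    """Return the name of the slowest candidate link that is sufficient."""
    for name, rate in LINKS:
        if link_is_suitable(rate, required_mbps, max_utilization):
            return name
    return None   # no candidate is fast enough; the design must change

print(recommend_link(8.0))     # T3
print(recommend_link(160.0))   # Gigabit Ethernet
```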
Storage Design
The Storage Design worksheet is designed to take the data collected from the Input worksheet
and Storage Requirements worksheet and help you determine the number of physical disks
needed to support the databases, transaction logs, and Restore Volume configurations.
Storage Design Input Factors
In order to determine the physical disk requirements, you must enter in some basic information
about your storage solution.
RAID Parity Configuration
For the Database/Log RAID Parity Configuration table you need to select the type of RAID
building block your storage solution utilizes. For example, some storage vendors build the
underlying storage in sets of data+parity (d+p) groups. A RAID-5 3+1 configuration means that 3
disks will be used for capacity and 1 disk will be used for parity, even though parity is
distributed across all the disks. So if your capacity requirement calls for 15 disks, you would
need to deploy five 3+1 groups (20 physical disks in total) to build that RAID-5 array.
RAID-1/0 supports 1d+1p, 2d+2p, and 4d+4p groupings
RAID-5 supports 3d+1p through 20d+1p groupings (though storage solutions could
support more than that).
RAID-6 supports 6d+2p groupings.
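The data+parity group arithmetic from the RAID-5 3+1 example can be written as a short helper. This is a minimal sketch; the function name is hypothetical.

```python
import math

def raid_groups(capacity_disks: int, data_per_group: int,
                parity_per_group: int) -> tuple[int, int]:
    """Return (number of d+p groups, total physical disks)."""
    groups = math.ceil(capacity_disks / data_per_group)
    total_disks = groups * (data_per_group + parity_per_group)
    return groups, total_disks

print(raid_groups(15, 3, 1))   # (5, 20)  RAID-5 3d+1p
print(raid_groups(15, 6, 2))   # (3, 24)  RAID-6 6d+2p
```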
Database/Log RAID Rebuild Overhead
When a disk is lost, it must be replaced and rebuilt. During the rebuild, the performance of the
RAID group is degraded, which in turn can affect user actions. Therefore, to ensure that RAID
rebuilds do not affect the overall performance of the mailbox server, Microsoft recommends
provisioning sufficient overhead in the performance calculations when designing for RAID
parity. Most RAID-1/0 implementations will suffer a 25% performance penalty during a rebuild.
Most RAID-5 and RAID-6 implementations will suffer a 50% performance penalty during a
rebuild.
The calculator defaults to the following Microsoft recommendations, which are adjustable:
For RAID-1/0 implementations, ensure that you factor in an additional 35% performance
overhead.
For RAID-5/RAID-6 implementations, ensure that you factor in an additional 100%
performance overhead.
In addition, you should consult with your storage vendor to determine the appropriate RAID
rebuild penalty.
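One way to relate a rebuild penalty to the overhead you provision is to ask how much extra performance is needed so that a degraded array still meets the original load. The derivation below is an assumption offered for intuition; the source does not state how the default values were chosen.

```python
def required_overhead(rebuild_penalty: float) -> float:
    """Extra performance (as a fraction) needed to absorb a rebuild penalty.

    If an array runs at (1 - penalty) of its normal performance during a
    rebuild, it must be sized at 1 / (1 - penalty) of the required load.
    """
    return 1 / (1 - rebuild_penalty) - 1

print(round(required_overhead(0.25), 3))   # 0.333
print(required_overhead(0.50))             # 1.0
```

Under this reasoning, a 25% penalty calls for roughly 33% overhead, which the 35% default covers, and a 50% penalty calls for exactly 100%.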
Database RAID Configuration
By default, for RAID storage solutions, the calculator will recommend either RAID-1/0 or RAID-5
by evaluating capacity and I/O factors and determining which configuration utilizes the fewest
disks while satisfying the requirements. If you would like to override this and force the
calculator to utilize a particular RAID configuration for your databases (e.g., RAID-0 or
RAID-6), select "Yes" to this option and then select the appropriate RAID configuration in the
cell labeled "Desired RAID Configuration." Note that while you can potentially override the
database RAID configuration, you cannot do so for the log RAID configuration; that will always
be RAID-1/0.
Note: The calculator prevents the use of RAID-5 or RAID-6 with 5.2K, 5.4K, 5.9K and 7.2K disk
types, due to performance implications.
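The idea of choosing whichever RAID level needs the fewest disks while meeting both capacity and I/O requirements can be sketched as follows. This is not the calculator's algorithm. The write penalties (2 for RAID-1/0, 4 for RAID-5) are commonly quoted rules of thumb, the workload numbers are hypothetical, and rebuild overhead is left out for brevity.

```python
import math

# Back-end I/Os generated per host write (rule-of-thumb values).
WRITE_PENALTY = {"RAID-1/0": 2, "RAID-5": 4}

def round_up(value: int, multiple: int) -> int:
    """Round value up to the next whole multiple."""
    return math.ceil(value / multiple) * multiple

def disks_needed(raid: str, capacity_gb: float, read_iops: float,
                 write_iops: float, disk_gb: float, disk_iops: float,
                 data_per_group: int = 3) -> int:
    """Disks required to satisfy both capacity and I/O for one RAID level."""
    data_disks = math.ceil(capacity_gb / disk_gb)
    if raid == "RAID-1/0":
        group_size = 2                                  # 1d+1p mirrored pairs
        capacity_disks = data_disks * 2
    elif raid == "RAID-5":
        group_size = data_per_group + 1                 # e.g. 3d+1p groups
        capacity_disks = math.ceil(data_disks / data_per_group) * group_size
    else:
        raise ValueError(f"unsupported RAID level: {raid}")
    backend_iops = read_iops + WRITE_PENALTY[raid] * write_iops
    io_disks = round_up(math.ceil(backend_iops / disk_iops), group_size)
    return max(capacity_disks, io_disks)

def recommend(**workload) -> str:
    """Pick the RAID level that needs the fewest disks."""
    return min(WRITE_PENALTY, key=lambda raid: disks_needed(raid, **workload))

busy = dict(capacity_gb=6000, read_iops=600, write_iops=400,
            disk_gb=1000, disk_iops=150)
light = dict(capacity_gb=6000, read_iops=60, write_iops=40,
             disk_gb=1000, disk_iops=150)
print(recommend(**busy), recommend(**light))   # RAID-1/0 RAID-5
```

In the I/O-heavy case RAID-1/0 needs 12 disks against 16 for RAID-5; in the capacity-bound case RAID-5 needs 8 against 12.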
Restore Volume RAID Configuration
You can select the type of parity you will be utilizing and the RAID configuration you will be
deploying for your Restore Volume.
Results Pane
The Storage Design Results section outputs the recommended configuration for the solution.
The recommendations cover implementing the solution on RAID storage, JBOD storage, or a
combination of both.
RAID Storage Architecture
The RAID Storage Architecture Table outlines which servers (primary datacenter servers,
secondary datacenter servers, or lagged copy servers) should be deployed on RAID storage.
The RAID Storage Architecture / Server table recommends the optimum RAID configuration and
number of disks for each Volume (database, log and restore Volume) for each mailbox server
ensuring that performance and capacity requirements are met within the design.
JBOD Storage Architecture
The JBOD Storage Architecture Table outlines which servers (primary datacenter servers,
secondary datacenter servers, or lagged copy servers) could be deployed on JBOD storage.
The JBOD Storage Architecture / Server table recommends the optimum JBOD configuration
and number of disks for each Volume (database, log and restore Volume) for each mailbox
server ensuring that performance and capacity requirements are met within the design.
Total Disks Required
By default, the calculator determines the storage architecture that minimizes the total number
of disks required to support the design while still minimizing single points of failure,
utilizing RAID and/or JBOD based on the decisions found in the "RAID Storage Architecture" and
"JBOD Storage Architecture" tables. However, you can change the storage architecture to be
built entirely on RAID or entirely on JBOD (if the design supports JBOD as a possible solution)
by selecting the appropriate value in the "Storage Architecture will be Deployed:" drop-down.
Keep in mind that certain scenarios (e.g., a single database copy in a datacenter) may result in
a single point of failure.
The Storage Configuration table outputs the total number of disks required for each mailbox
server that requires RAID or JBOD storage, and identifies the total number of disks requiring
RAID or JBOD storage in each datacenter.
Conclusion
Hopefully you will find this calculator invaluable in helping to determine your Exchange server
role requirements. If you have any questions or suggestions, please email
strgcalc@microsoft.com.
Download