Microsoft SQL Server Scalability Project White Paper 1 – Basic

1
Microsoft SQL Server 2000 Scalability
Project – Server Consolidation
Author: Man Xiong
Microsoft Corporation
March 2002
Abstract: In this paper we focus on a common scenario where a single system hosts
many databases with a relatively small number of users per database. The effects of the
parameters mentioned above are studied on an actual application, called the "PACE."
PACE is a financial application from Microsoft® bCentral™ that is running on a
combination of servers: Microsoft® SQL Server™ 2000 Enterprise Edition, Microsoft
Windows® 2000 Datacenter™ Server, and Dell PowerEdge 8450 servers.
The information contained in this document represents the current view of Microsoft
Corporation on the issues discussed as of the date of publication. Because Microsoft
must respond to changing market conditions, it should not be interpreted to be a
commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of
any information presented after the date of publication.
This White Paper is for informational purposes only. MICROSOFT MAKES NO
WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS
DOCUMENT.
Complying with all applicable copyright laws is the responsibility of the user. Without
limiting the rights under copyright, no part of this document may be reproduced, stored
in or introduced into a retrieval system, or transmitted in any form or by any means
(electronic, mechanical, photocopying, recording, or otherwise), or for any purpose,
without the express written permission of Microsoft Corporation.
Microsoft may have patents, patent applications, trademarks, copyrights, or other
intellectual property rights covering subject matter in this document. Except as
expressly provided in any written license agreement from Microsoft, the furnishing of
this document does not give you any license to these patents, trademarks, copyrights,
or other intellectual property.
Unless otherwise noted, the example companies, organizations, products, domain
names, e-mail addresses, logos, people, places and events depicted herein are fictitious,
and no association with any real company, organization, product, domain name, email
address, logo, person, place or event is intended or should be inferred.
© 2002 Microsoft Corporation. All rights reserved.
Microsoft, bCentral, Windows, are either registered trademarks or trademarks of
Microsoft Corporation in the United States and/or other countries.
The names of actual companies and products mentioned herein may be the trademarks
of their respective owners.
i
Table of Contents
Executive Summary ........................................................................................... 1
Introduction ...................................................................................................... 2
PACE Application................................................................................................ 2
Use Multiple Instances....................................................................................... 3
Memory Configuration for Multiple Instances .................................................... 6
Processor Affinity .............................................................................................. 7
Disk Layout ........................................................................................................ 7
Recovery Models ................................................................................................ 9
Conclusion ......................................................................................................... 9
Appendix A: A Dell Solution for Scalable Enterprise Computing ......................... 9
Appendix B: Hardware/Software Configuration ............................................... 10
Appendix C: PACE Workload Simulation Configuration..................................... 11
Appendix D: Test Configuration ....................................................................... 11
Appendix E: Disk Configuration ........................................................................ 12
Appendix F: Optimal System Configuration for 500 databases on each of 8
instances ......................................................................................................... 13
1
Executive Summary
This white paper is a joint effort by Microsoft and Dell to demonstrate the scalability of
Microsoft SQL Server™ 2000 and Dell hardware. SQL Server 2000 running on a Dell
enterprise eight-way server can support thousands of databases, and their users, while
providing the performance necessary to allow for centralization. SQL Server 2000
maximizes return on investments in symmetric multiprocessing (SMP) systems, enabling
users to add processors, memory, and disks to build large centrally managed enterprise
servers.
More companies are outsourcing database services to Application Service Providers
(ASP). Internal information technology (IT) organizations are consolidating their
database services as justified by total cost of ownership (TCO) and manageability
benefits. This paper focuses on a common ASP scenario, where a single system hosts
many databases with a relatively small number of users, per database. This scenario is
also applicable to companies looking to consolidate databases from across an enterprise
of servers onto a centralized server. Therefore, the goal of this paper is to demonstrate
an approach to successfully scaling increased workloads on a single server using multiple
instances. An application from Microsoft® bCentral™ is used to study the effects of
different configurations on workload performance as measured by transactions per
minute (TPM).
The following list provides you with the benefits for using multiple instances on a single
server:

Ability to support larger workloads on a single server.

Flexibility to separate databases based on meeting the requirements of different
Service Level Agreements (SLA).

Ability to separate databases with different performance requirements.

Ability to separate databases with different backup and recovery requirements.

Ability to separate databases with different security requirements.

Ability to separate database based on change control, operational and maintenance
requirements.
In this case study, the results show that:

Using multiple instances versus a single instance to support a large number of
databases increased the workload supported by a single server by a factor of eight.

Setting processor affinity increased the workload supported by 80 percent over the
default setting, when used with multiple instances.

Separating transaction logs from data files increases the workload by 10 percent.
This provides additional support for the best practice of not placing logs and data on
the same device.

Best results are achieved when a server is dedicated to SQL Server.
This paper provides general guidelines for understanding the criteria involved in
successfully configuring multiple instances for optimal throughput.
2
Introduction
As more and more customers move to server consolidation, we expect ASP and
corporate IT to deploy multiple instances of SQL Server 2000 for very large numbers of
databases.
The ability to separate databases allows an ASP or corporate IT provider more flexibility
in providing different service levels to customers without requiring separate machines.
Some of the challenges of this approach include determining when to use multiple
instances and how to configure them for optimal performance.
Optimal configuration is defined by studying the impacts of several parameters,
including multiple instances, memory configuration, CPU affinity, disk layout and
recovery models on TPM.
In this paper we focus on a common scenario where a single system hosts many
databases with a relatively small number of users per database. The effects of the
parameters mentioned above are studied on an actual application, called the "PACE."
PACE is a financial application from Microsoft bCentral that is running on a combination
of servers: Microsoft SQL Server 2000 Enterprise Edition, Microsoft Windows® 2000
Datacenter™ Server, and Dell PowerEdge 8450 servers.
PACE Application
The PACE application is an accounting and financial management product deployed as
one of the service offerings available to customers of Microsoft bCentral. Designed to
help small businesses to complete everyday tasks more efficiently, PACE includes
financial, banking, payroll, sales, and purchasing capabilities as well as quick and easy
reporting.
To offer every customer entity-secure accounting controls and reliable multiuser access,
the application has a large number of small financial databases on a single server, one
database per customer. This also allows a more granular control over security, backup
and recovery, change control, and maintenance operations. The application has more
than 200 stored procedures per database to support the Web service.
This non-traditional design brings new challenges for system management and
performance tuning for SQL Server. The biggest challenge is the amount of memory
required to support a large number of stored procedures, multiplied by the number of
databases. For SQL Server, virtual memory space is required so that an execution plan
can be compiled for each procedure on each database, with that plan being kept in the
procedure cache. For the PACE application, the number of the execution plans cached for
500 databases is 200*500 (or 100,000 cache entries). The greater the number of PACE
databases, the greater the required size of the server procedure cache. When there are
more execution plans than the procedure cache can keep in memory, execution
recompilations occur, which can reduce query-processing throughput. Traditional
methods, such as parameterization of the stored procedures, do not address the
problem.
Effective tuning requires specific configurations to increase effective memory space for
the procedure cache. Other challenges include how to efficiently use CPUs for the high
degree of concurrent activity due to the number of databases and users, the optimal
disk layout, and the optimal recovery model.
3
Use Multiple Instances
Multiple instances can scale up the number of databases and
workload per server.
When the number of databases and resultant workload reach a certain level, it is a good
practice to group databases in multiple SQL Server instances as it relieves the pressure
on memory. Good performance is accomplished by enabling the use of more memory for
the server procedure cache for each instance; and provides better operational and
security isolation.
Our tests demonstrate that using multiple instances allows for scale up of the total
number of databases and resultant workload hosted on the system while maintaining
good throughput per database.
Figure 1 shows the performance degradation on a single instance when the number of
PACE databases increases from 500 to 4000. It also shows that running 500 databases
per each of the 8 instances gives 8 times the number of transactions per minute as
running 500 databases on one single instance. Running 500 databases, per each of 8
instances, with a heavy workload comes close to utilizing the total CPU capacity at this
hardware configuration so further increasing number of total databases by running 500
databases on each of 16 instances can not increase the total throughput (however the
system still demonstrates an acceptable throughput).
Figure 2 shows that the workload throughput per database decreases when the number
of databases (also the number of client connections) increases on a single SQL Server.
When the number of databases is kept at 500 per instance, throughput per database
stays the same on a single instance and on 8 instances, but decreases when increased
to 16 instances due to CPU constraints.
Figure 1 Total workload throughput on the system versus total number of PACE
databases on the system.
3
4
Figure 2 Workload throughput on each database versus the number of PACE databases
on the system.
Why we need multiple instances for this scenario
Performance degradation for many thousands of PACE databases on a single instance.
SQL Server 2000 can access up to 2 gigabyte (GB) of virtual memory (or 3 GB if the
/3GB switch is set in boot.ini) for the procedure cache. When the number of databases
on a single instance increases from 500 to 1000, there is not enough virtual memory for
the procedure cache to hold all of the increased number of query execution plans in
memory. Some execution plans in the procedure cache are dropped in order to free
memory to make room for the plans of other stored procedures; but doing this requires
the dropped stored procedure plans to be recompiled when needed. Note that the
resulting frequent recompilations can hurt workload performance.
Using multiple instances relieves the pressure on memory
As shown in Figure 3, with greater than 4 GB physical memory, running the databases
on multiple instances makes more memory available for procedure cache (each instance
has its own virtual address space and procedure cache).
When the number of PACE databases increases to many thousands for a single instance,
the sheer number of database objects consumes so much memory space available to
procedure cache that the performance will degrade. The procedure cache will be
overwhelmed and recompilation will occur. Therefore, we recommend multiple instances.
5
Figure 3 Effective memory for procedure cache in different configurations
Same challenge for other similar designs
The key determinant of whether multiple instances benefit performance is the total
amount of memory space needed for all the execution plans. The total amount of
memory space needed is determined by the average size of execution plans, number of
stored procedures per database, and the number of databases. The same challenge can
occur when using a smaller number of databases if there are more procedures in each
database. Additionally, the same challenge can also occur with very complicated
procedures on a smaller number of databases, with a smaller number of stored
procedures, per database.
5
6
Memory Configuration for Multiple
Instances
To achieve great performance for multiple instances, we only needed to specify a
reasonable minimum server memory without additional tuning on memory configuration.
By reserving 1 GB minimum server memory for each instance, and keeping the
maximum server memory open, we observed a 25 percent performance gain over using
the default dynamic memory allocation. The performance is as good as using optimal
static memory allocation without the high cost of constant recalibration. A customer
using this method should be aware of the possible impact on memory allocation, for
other applications on the same server. Other applications competing for memory impact
dynamic memory allocation. For this reason, Microsoft recommends dedicating the
system to SQL Server.
Another advantage of dedicating the system to a SQL server is that it allows for the
configuration to support differing workloads on each instance without special testing and
tuning to determine the optimum memory configuration. The practice of dedicating the
system to a SQL Server reduces the need for outages to reconfigure memory to achieve
performance goals if workloads are varied.
Simple memory configuration gives a performance boost
The use of a minimum floor for memory for each instance achieved the same
performance as a perfect static allocation. The burden of testing to determine the ideal
static allocation for each instance can be reduced.
Tests have been run on 8 and 16 instance configurations using both static and dynamic
memory allocation settings to determine if one method provides better performance.
Because every database has the same workload and every instance has the same
number of databases, the workload is even among the instances.
However, when running 16 instances, the use of static memory provides 25 percent
better performance than using dynamic allocation.
For the 16-instance configuration, reserving 1 GB minimum of server memory for each
instance, and keeping the maximum server memory open, we observed the same
performance as using optimal static memory allocation.
Why this memory configuration is desired
When multiple instances of SQL Server are running on the same computer, each
instance independently uses the standard algorithm for dynamic memory management.
The amount of memory allocated to each specific SQL Server instance is driven by the
relative workload of each instance. This is designed to ensure that the instances with
higher workloads acquire more memory while instances processing lighter workloads
acquire less memory.
When running 16 instances, there is no longer enough physical memory to satisfy the
targets of all of the instances. The instances of SQL Server begin to compete for the
limited memory available, and it takes much longer to reach equilibrium. In this
scenario, the use of static memory allocation to ensure initial allocation of the optimal
size of memory to instances provides better performance.
7
Note that every real world scenarios differs — they require experimentation to find the
right size of memory to allocate to each instance. Also, when the workload level, on the
different instances changes, additional experiments are required to determine the new
optimal memory allocation among the instances — this ongoing experimentation is not
practical in most real applications.
The more realistic way to allocate memory among multiple instances is to combine the
static memory allocation and dynamic allocation. Reserving reasonable, and minimum
server memory for each instance can reduce the overhead to achieve equilibrium.
Keeping the maximum server memory open allows instances to adjust memory
allocation dynamically, based on their workload.
Processor Affinity
Our tests demonstrate that manually allocating processors to specific instances of SQL
Server with affinity mask can give a performance boost of up to 80 percent, assuming
the workload is consistent, which is typical for ASP workloads. The best results were
achieved when an instance of SQL Server did not have to share processors with other
server application processes.
Performance gain with CPU affinity
Tests were run on 500 databases, on each, of 8 instances with optimal memory
configuration. The workload for each instance is identical. Allocating each of the 8
instances to one of the 8 CPUs gives an 80 percent throughput improvement compared
to the default processor affinity setting.
Why using CPU affinity can improve performance
By default, each thread of an instance of SQL Server is scheduled to the next available
processor. The CPU affinity mask setting can be used to restrict an instance to only a
subset of CPUs, and also ensures that each thread always uses the same processor
between interrupts. This reduces the swapping of the same thread among multiple
processors, and increases the cache hit ratio on the second-level cache. However, CPU
affinity setting needs to be used carefully because workloads on different CPUs cannot
be balanced dynamically if the workloads on each instance are not even.
When running multiple instances on the same server with multiple processors, assigning
processors to specific instances by setting CPU affinity can reduce the number of active
threads per processor and also reduce context switches, thus better utilizing the secondlevel cache.
Disk Layout
For recoverability, logs should never be placed on the same device as data. In addition,
separate physical disks for log files improves performance.
In this test scenario, separating log files from data files on physical disks was found to
give 10 percent performance gain over placing the logs and data files on the same
(larger) volume.
7
8
Tests were run by using 500 databases on each of 8 instances with optimal memory and
CPU configurations with two different disk layouts as shown in Figure 4:

Scenario 1:
Data files, log files (including tempdb log), and tempdb files of each instance on the
same disk array, as described in Disk Configuration2 of Appendix E.

Scenario2:
Data files, log files (including tempdb log), and tempdb files of each instances on
three separate disk arrays, as described in Disk Configuration3 of Appendix E.
Separating log files from data files on physical disk gives a 10 percent performance gain.
Figure 4 Disk layouts
Why separating log files on physical disks improves performance
It is a common practice to put log files on different physical disks from data files to
isolate sequential disk I/O on log files from random disk I/O on data file. This practice is
still valid with hundreds of log files in this test scenario. The benefit of more spindles in
the larger volume outweighs the benefit of separating random data I/O from sequential
log I/O.
9
Recovery Models
Our tests demonstrate that running the workload under the full recovery model can
achieve 90 percent performance of the bulk logged recovery model. In the given
scenario, it makes sense to use the full recovery model to provide the most flexibility in
recovery while maintaining good performance.
Full recovery model provides good performance
To determine the impact of different recovery models on the workload performance,
tests were run on 500 databases, on each of 8 instances, with optimal memory, disk
configuration, and CPU affinity with two recovery models: full and bulk-logged. Running
the workload using the bulk logged model results in 10 percent better performance than
the full recovery model.
Additional considerations
Microsoft recommends that all production online transaction processing (OLTP) systems
make use of the full recovery model, and the data protection it provides. Bulk-logged
recovery model can be used temporarily when doing large operations such as index
creation or bulk data loading. Performance of the large operations improves, but at the
cost of increased risk of data loss. For more information, refer to SQL Server Books
Online.
Conclusion
With the combination of Microsoft® SQL Server™ 2000, Microsoft® Windows™ 2000
Datacenter Server, and the Dell PowerEdge 8450 servers, excellent and scalable
performance is observed with many thousands of databases across multiple instances of
SQL Server on a single server. The optimal configuration is defined by studying the
impact of several parameters, including multiple instances, memory configuration, CPU
affinity, disk layout, and recovery models.
In general, to successfully scale workloads on a single server for a large number of
databases, and provide data protection:

Use multiple instances to scale up the workload supported by a single server.

Set processor affinity to increase performance for multiple instances.

Configure memory per instance using a minimum floor and leave maximum server
memory open.

Separate logs from data on physical disk.
Appendix A: A Dell Solution for Scalable
Enterprise Computing
The Dell PowerEdge 8450 is an ideal solution for Scalable Enterprise Computing (SEC)
environments because it is can deliver high levels of scalable performance running
solutions on Microsoft Windows 2000 Datacenter Server and Windows 2000 Advanced
Server. The PowerEdge 8450 is designed to support enterprise applications and
consolidate server resources in datacenter environments.
9
10
PowerEdge 8450
This provides up to 8 Intel Pentium III Xeon processors at 700 MHz and 900 MHz (1-MB
and 2-MB cache available) for ultimate scalability.

Hot-swap drives, power supplies, cooling fans and PCI slots help improve reliability
and performance.

Four-peer PCI buses and 10 64-bit PCI slots provide outstanding I/O bandwidth.

Premier enterprise services for consulting, deployment, and support.

Windows 2000 Datacenter implementation plan and services.

Dell Open Manage server management software for ease of use.
Dell|EMC FC4700
High-availability midrange storage platform presenting exceptional
price/performance/protection value for use in direct attach or large SAN environments.

Fully redundant storage for high data availability

Accelerates server deployments through storage consolidation and multiple-host
connectivity

Up to 8.7TB of fully protected raw storage capacity per storage subsystem

Nondisruptive Upgrade (NDU) support - online upgrades of storage software and
microcode

Can be redeployed as a network attached IP4700

SnapView software provides point-in-time backup and recovery and facilitates
increased data availability

MirrorView software provides highly available data storage across geographically
separate locations

Advanced enterprise features with multiprocessor storage processors, and four
connections per storage array
Appendix B: Hardware/Software
Configuration
The tests were conducted using the following environment:
Server
1 Dell PowerEdge 8450
8 Intel® Pentium® III Xeon™ processors at 700 MHz
32 GB of RAM
8 Qlogic QLA2200 PCI Host Bus Adapters
4 Dell|EMC2 FC4700 Disk Processor Enclosures, each with 10, 32.9-GB, 10,000-RPM
disks and 512-KB write-read cache
12 Dell|EMC2 FC4700 Disk Array Enclosuers, each with 10, 32.9-GB, 10,000-RPM
disks
Total Disk Space = 5 TB – (160) 32.9 GB, 10,000-RPM disks
11
Operating system
Microsoft Windows 2000 Datacenter Server, Build 5.00.2195, Service Pack 1
Database server
Microsoft SQL Server 2000 Enterprise Edition, Build 2000.80.194.0 SP1
Appendix C: PACE Workload Simulation
Configuration
1. Always use one connection per database.
2. Set very short think time (18ms) for each client connection (to eliminate very heavy
stress level) — this scales the single user connection to simulate the activity of 500
users with a think time of 10 seconds.
3. Use similar workloads among databases.
4. Start each database with a standard size of 30MB.
Appendix D: Test Configuration
1. The /3GB switch, in boot.ini, is used to change the size of virtual memory of
processes.
2. /PAE is used in boot.ini to allow OS to use more than 4 GB memory.
3. The maximum option in boot.ini is set to the appropriate size for the test purpose.
4. AWE is enabled in conjunction with the /PAE boot.ini setting for AWE testing in single
instance tests.
5.
AWE is disabled for all the instances for multi-instance testing.
6. Min server memory option and max server memory option for SQL Server are set to
the same size for the static memory allocation.
7. Min server memory option for SQL Server is set to 1 GB, and max server memory
option is set to default for the recommended combination of static and dynamic
memory allocation in the multi-instance scenario.
8. Fiber mode is always used.
9. Database recovery models are set to full for most of the tests, and bulk-logged for
the specific recovery model comparison tests.
11
12
Appendix E: Disk Configuration
Disk configuration 1 for single instance:
LUNs
Number of
physical disks
RAID
configuration
Capacity
(GB)
LUN0_00
16
10
256
For the data files of all the
PACE databases.
LUN0_01
16
10
256
For the data files of
tempdb.
LUN0_02
6
10
96
For the log files of all PACE
databases and tempdb.
Comment
LUN 00, 01 and 02 are connected to the server through two host bus adapter (HBA)s.
Disk configuration 2 for each of the 8 instances (all the data files and log files
on the same LUN for each instance):
LUNs
LUN0_00
Number of
physical disks
RAID
configuration
Capacity
(GB)
16
10
256
Comment
For all the files for one
instance.
Each instance has one LUN and each LUN is connected to the server through one HBA.
Disk configuration 3 for each of the 8 instances (all the data files and log files
on the same LUN for each instance):
LUNs
Number of
physical disks
RAID
configuration
Capacity
(GB)
Comment
LUN0_00
8
10
128
PACE databases data files
LUN0_01
4
10
128
Tempdb data file and/or
backup files.
LUN0_02
4
10
64
PACE databases log files
and Tempdb log files.
Each instance has a group of three LUNs and each group is connected to the server via
one HBA.
13
Disk configuration 4 for every two of the 16 instances (all the data files and log
files on the same LUN for each instance):
LUNs
Number of
physical disks
RAID
configuration
Capacity
(GB)
LUN0_00
8
10
128
For the data files of all
PACE databases for 2
instances.
LUN0_01
4
10
128
For the data file of
Tempdb for 2 instances.
LUN0_02
4
10
64
For the log files of all PACE
databases and Tempdb for
2 instances.
Comment
Every two instances share a group of three LUNs, and each group is connected to the
server through one HBA.
Appendix F: Optimal System Configuration
for 500 databases on each of 8 instances
Parameter
Setting
Boot.ini
/PAE
sp_configure 'min server memory'
1024
sp_configure 'affinity mask'
0×01 for instance 1 0×02 for instance 2
0×04 for instance 3 …
Sp_configure 'max worker thread'
1000
Recovery model
Full
Disk layout
Disk configuration 3
13