Buying Database Hardware


Adam Backman – President

White Star Software, LLC.

About the speaker

 President – White Star Software

One of the oldest and most respected consulting and training companies in the Progress OpenEdge sector

 Vice President – DBAppraise

Managed database services backed by experienced Progress OpenEdge professionals, not rookies off the bench

 Author – Progress Software’s Expert Series

 Over 25 years of Progress OpenEdge experience

− Technical support

− Training

− Consulting (Database and System configuration, management and tuning)

No need to buy hardware – Progress Pacific will take care of it!

Agenda

 Understanding system resources

 Picking the right vendor

 Where to spend your money

− CPU fast vs. many

− Memory – can you ever have too much?

− Disk – where all the data starts

− Network and other parts of the system

 Conclusion

Understanding system resources

 Supported architectures

 Understand your options

 Performance tradeoffs

Main types of architectures supported by OpenEdge

 Database engine

− Database with no portion of the application

 Host-based system

− Database, clients and background all on one system

 Pure client/server

− Database on one machine and clients on other machines

 Part of an n-tier architecture

− Database and background on Machine A

− AppServers on Machine B

− Clients on individual machines

Understand your options

 Single large system vs. 2 or more smaller machines

 Virtualization

 Single platform or multi-platform

 Cloud vendors

 SAN vs. Direct attached storage

 Network considerations

Single large machine vs. 2 or more smaller machines

 Single large machine

− Pros

 Highest potential performance by eliminating network layer

 Easier to manage as everything is in one place

− Cons

 A single machine will have limited scalability

 Usually two mid-range systems are more cost effective than a single high-end system

 Potential license cost issues (CPU-Based pricing)

Single large machine vs. 2 or more smaller machines (cont.)

 Multi-machine

− Pros

 Flexibility – ability to repurpose machines

 Scalability – ability to add additional machines to solution

 Recoverability – ability to use AppServer machine as the database engine

− Cons

− Cost – duplication of items, power, maintenance

 Adding network layer can hurt performance

 Management – more machines to manage

 Maintenance – more things to break

Purchase guidance

 Databases tend to use disk extensively

− Spend on disk subsystem

− Allow for a minimum of 10% of the database size for database buffers (-B memory)

− Do not forget other memory allocations

 OS buffers can be reduced to 10% or less of total memory

 Applications are memory and CPU intensive

− Generally better to buy more cores than fewer, faster cores – but not always: some apps have major single-threaded operations

− Memory can greatly reduce I/O via -B, -Bp, -Bt, -mmax, …

 Examine your use cases for the machine and buy with both primary use and most likely alternative uses in mind
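The 10% rule of thumb above can be sketched numerically. -B is specified in database blocks, so the block size matters; the database size and 8 KB block size here are assumptions:

```shell
# Back-of-envelope -B sizing: 10% of the database, expressed in blocks.
db_size_gb=100      # assumed database size
block_size=8192     # assumed 8 KB database block size
target_pct=10       # the "minimum of 10%" rule of thumb

db_size_bytes=$((db_size_gb * 1024 * 1024 * 1024))
b_buffers=$((db_size_bytes * target_pct / 100 / block_size))

echo "-B $b_buffers"   # roughly 1.3 million 8 KB buffers (~10 GB of RAM)
```

Remember this is a floor, not a target, and that the OS, other memory allocations, and -Bp/-Bt buffers still need room.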

Purchase guidance

 Most people overspend on CPU

 You can have all the CPUs in the world, but they will do you no good unless you can get data to them efficiently

 People should focus on the performance “food chain”

− Network

− Disk

− Memory

− CPU

 Slower resources should be addressed before faster resources

Virtualization

 Everyone is doing it, but why?

− Ability to build new environments

− Ability to recover quickly (part of a DR solution)

− Reduction in common resource use per server

 Power

 Cooling

 Floor/rack space

− Potential for better resource utilization (soaking up unused CPU)

 Why not?

− Complexity

− Cost (VMware is not free :-)

− More applications affected by an outage

Options: N-tier option

 Database engine

− Fast Disk

− Moderate memory (over 10% of DB + OS and extras)

− Relatively little CPU

 AppServer machine

− Internal disk – set up well, but nothing extravagant

− Higher memory usage

− CPU intensive

 Client machine

− Web/Mobile

− Desktops

− Citrix/Windows terminal server

Cloud: Make it someone else’s problem

Cloud

 Watch for variable performance

− Measure throughput (Disk and memory)

− Measure compute capacity

− Measure at different days/times

 Performance guarantee from vendor

 IOPS vs. perception (take real measurements)

 Amazon HPC (high-performance computing) instances
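As a crude sketch of "measure throughput" (a serious test would use a dedicated benchmark tool such as fio, with direct I/O and a mixed read/write pattern), even dd can expose cloud variability if run repeatedly:

```shell
# Write 64 MB in 8 KB blocks and force it to storage with fdatasync,
# so the page cache cannot hide the result. Repeat at different
# days/times and compare, per the advice above.
testfile=./io_probe.tmp
start=$(date +%s)
dd if=/dev/zero of="$testfile" bs=8192 count=8192 conv=fdatasync 2>/dev/null
elapsed=$(( $(date +%s) - start ))
size=$(wc -c < "$testfile")
echo "wrote $size bytes in about ${elapsed}s"
rm -f "$testfile"
```

The point is the comparison across runs, not any single number: a shared cloud volume that swings widely between samples is exactly the variable performance this slide warns about.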

Why is disk important?

 CPU capacity doubles every 18 months

 Network bandwidth doubling every 12 months

 Memory is 37,000+ times faster than disk

 Disk (per-disk I/O rate) fairly static (150–200 IOPS)

 Storage will generally cost more than servers and this is particularly true for database servers

Buy better storage

 Many disks

− 150 IOPS per disk

− Look at your buffer hit rate and total request load

− Don’t forget temporary file I/O which can account for a significant percentage of your total I/O load

 Larger cache

− Some systems require you to expand cache when you expand your storage but most don’t

− Adding cache is akin to adding database buffers to a database

 SSD – save money by buying fewer devices

− SSDs are a real solution now, and prices are competitive, though not cheap compared to conventional storage on a per-GB basis
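The ~150 IOPS-per-spindle figure above gives a quick way to estimate disk counts. The peak load here is an assumption, and the sketch crudely applies the RAID 10 write penalty to all I/O; measure your real read/write mix first:

```shell
# Hypothetical sizing: measure your own peak request load (including
# the temp-file I/O the slide warns about) before buying anything.
peak_iops=3000          # assumed peak I/O requests per second
per_disk_iops=150       # per-spindle figure from this slide
write_penalty=2         # RAID 10 mirrors every write

# Ceiling division: you cannot buy a fraction of a disk.
disks=$(( (peak_iops * write_penalty + per_disk_iops - 1) / per_disk_iops ))
echo "about $disks disks"   # 40 spindles for this assumed load
```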

Do better disk configuration

 Still no RAID 5, no RAID S, no RAID 6, no RAID 7

 RAID 10 is still king for database storage – there are plenty of statistics to back this up

 Large stripe widths

− Performance improves with stripe width up to 2 MB

 Use best portion of rotating disk (rotating rust)

− Using the outer edge of the disk provides the best performance – as much as 15% better than the inner portion

 Even usage across all disks

− Eliminate disk variance

− Think of ALL sources of I/O (DB, BI, AI, Temp files, OS, …)
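On Linux, one way to get a wide, even stripe across many disks is LVM. This is a sketch only – the device names, volume names, and sizes are hypothetical – but `-I 2048` gives the 2 MB stripe size suggested above:

```shell
# Hypothetical devices: four dedicated spindles for the database.
pvcreate /dev/sdb /dev/sdc /dev/sdd /dev/sde
vgcreate dbvg /dev/sdb /dev/sdc /dev/sdd /dev/sde

# -i 4 stripes across all four disks (even usage, no hot spindle);
# -I 2048 sets a 2048 KiB (2 MiB) stripe size.
lvcreate -n dblv -i 4 -I 2048 -L 500G dbvg
mkfs.xfs /dev/dbvg/dblv
```

Striping at the volume-manager level like this spreads DB, BI, and AI I/O evenly, but mirroring (the RAID 10 part) would still need to be provided underneath, by the array or by md/LVM RAID.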

Storage

 Direct attached

− Less expensive in most cases

− Less complex – Single machine tuning OS and Array

− High performance – Disks dedicated

 SAN – generalized business storage

 NAS – file optimized storage

 SAN – Purpose-built high performance

Why SAN twice? SANs differ enormously, and you need to buy for your needs, not for the vendor's marketing

Direct-attached storage

 Pros

− Not shared with other hosts (isolation is bliss)

− Easier problem resolution

− Massive controller throughput for little money

− Cheaper to maintain

 Cons

− Not shared with other hosts (no cost sharing)

SAN: Generalized business storage

 Pros

− Best option in virtualized environment

− Share one powerful storage system with many hosts

− One stop storage system for all hosts

 Cons

− High initial cost

− Single point of failure unless array mirroring/clustering is in place

− Not optimized to individual tasks

− Complex

SAN: Purpose-built

 Pros:

− Excellent performance

− Additional control at array level

− Massively scalable

− Ability to dedicate resources to hosts

− Reliable (fault tolerant)

 Cons

− Single point of failure unless array mirroring is in place

− Cost

− Complexity

SAN monitoring

 More difficult as there are many moving parts

 Multiple hosts need to be monitored

 SAN needs to be monitored

 Monitoring data needs to be synchronized

 Workloads need to be balanced across hosts

NAS: file optimized storage

 Pros

− Sharable across hosts

− Generally cheaper than SAN

− Good service for application files

 Cons

− File optimized not block optimized

− Not database optimized

− Not client temporary file optimized

Storage network

 Should be isolated

− Physically

− Separate VLAN if physical isolation is not possible

 Use large MTU size (ALL must be the same)

− Host

− Guest

− Switch

− Array
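A sketch of applying a consistent jumbo MTU on a Linux host (the interface and host names are hypothetical; switches and arrays are configured through their own management interfaces):

```shell
# Every hop on the storage network must agree on the MTU, per the
# list above: host, guest, switch, and array.
ip link set dev eth1 mtu 9000   # eth1: hypothetical storage-network NIC

# Verify end to end: 9000 bytes minus 28 bytes of IP/ICMP headers
# leaves 8972, and -M do forbids fragmentation, so this ping fails
# if any hop along the path has a smaller MTU.
ping -c 3 -M do -s 8972 storage-array   # placeholder host name
```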

Network options

 Simple

− Put a single quad-port card in the server and bond the ports for performance

 Moderate

− Multiple cards bonded into two networks: one for data and the other for client traffic

 Complex

− Multiple machines

− Multiple networks (VLANs)

− Dedicated networks for DB, replication, client traffic, AppServer

Network

 Try to use your network efficiently

− -Mm 8192 to increase throughput

− Remember to move to jumbo frames (client, server, switches, …)

 Move invasive processes to separate network

− Backup

− Replication

− System synchronizations
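A sketch of the -Mm tuning above, with a hypothetical database name; note that historically remote clients had to start with the same -Mm value as the server:

```shell
# Raise the client/server TCP message size from the default (1024 on
# older releases) to 8192 bytes, so fewer messages carry the same data.
proserve salesdb -S 20000 -Mm 8192

# Remote clients traditionally needed a matching value:
# pro -db salesdb -S 20000 -H dbhost -Mm 8192
```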

Picking the right vendor – The lesser of two evils

Picking the right vendor

 Better support nearly always beats a better upfront price

 Look at quality of “local” support infrastructure

− Response time (SLA)

− In country

− In the correct language

 Always comparison shop even if you “know” what you want

− This keeps vendors honest

− Choosing historic rivals helps drive down price

 Simplify to enhance support

− Bundle Linux support under hardware contract

− Single vendor simplicity

Paying for support

 Buy all support with the initial purchase

 Allows easier (capital) write-off

 Years 4+ of support can cost as much as the initial price if purchased later

Picking the wrong solution

 NetApp for database storage: performance will be suboptimal

 NUMA Architecture – Good vendors make bad solutions

− All CPUs allocated to a Progress domain must come from the same book/shelf/node

− All Memory must meet the same criteria as CPU

 Using client/server for reporting

− Kill the network access whenever possible

− Use AppServer for complex OLTP

Where to spend your money

 Disks

 Storage

 SAN

 SSD

 Really, look at storage first, then concern yourself with other trivial issues such as memory and CPU

 This is the problem more than 9 times out of 10

Questions, Comments, …

Thank you for your time
