Building Tungsten Clusters with PostgreSQL Hot Standby

advertisement
Building Tungsten Clusters
with PostgreSQL Hot Standby
and Streaming Replication
Linas Virbalas and Alex Alexander
Continuent, Inc.
© Continuent 2010
Agenda
/ Introductions
/ PG 9 Hot Standby and Streaming Replication
/ What is Tungsten?
/ Tungsten + PostgreSQL Hot Standby
and Streaming Replication
/ Demo!
/ Questions and Comments
© Continuent 2010
Introductions
© Continuent 2010
About Continuent
/ Our Business: Continuous Data Availability
/ Our Solution
• Continuent Tungsten (Master/Slave Database Replication)
/ Our Value:
• Ensure data are available when and where you need them
• TCO less than 20% of comparable solutions
/ Our Technical Expertise
•
•
•
•
Database replication
Database cluster management
Application connectivity
Software-as-a-Service (SaaS)
© Continuent 2010
PostgresSQL 9: Hot
Standby and
Log Streaming
© Continuent 2010
PostgreSQL 8.4 Warm Standby
Master
Standby
PostgreSQL
PostgreSQL
Continuous
recovery
rsync to standby
WAL
Files
Archived
WAL
Files
WAL
Files
pg_xlogs
Directory
Archive
Directory
pg_xlogs
Directory
pg_standby
© Continuent 2010
Limitations of Warm Standby
1. Utilization -- Cannot open the standby
•
•
•
To bring up the standby for queries you must end recovery
Standby hardware is idle
Difficult to track state of recovery since you cannot query log
position
2. Data Loss -- Warm standby transfers only full WAL
files
•
•
•
© Continuent 2010
Can bound loss using archive_timeout
Low values create large numbers of WAL files; complicate pointin-time recovery
Workarounds using DRBD, etc. are complex
Introducing Hot Standby
/ Allows users to connect to standby in read-only mode
• Allowed: SELECT, SET, LOAD, COMMIT/ROLLBACK
• Disallowed: INSERT, UPDATE, DELETE, CREATE, 2PC,
SELECT … FOR SHARE/UPDATE, nextval(), LISTEN, LOCK,
• No admin commands: ANALYZE, VACUUM, REINDEX, GRANT
/ Can come out of recovery while queries are running
/ Thanks to Simon Riggs for this description
© Continuent 2010
Introducing Log Streaming
Master
Standby
PostgreSQL
PostgreSQL
WAL
Sender
Recovery
WAL
Receiver
Continuous replication to
standby
Archiving
Archived
WAL
Files
Archive
Directory
© Continuent 2010
Configuration and Usage
/ Log streaming layers on top of existing warm standby
log shipping
/ Multiple standby servers allowed
/ Failure of one standby does not affect others
/ Management is not simple - must coordinate
provisioning & WAL shipping to set up/restart
© Continuent 2010
What is Tungsten?
© Continuent 2010
What Is Tungsten?
/ Tungsten implements master/slave clusters to:
•
•
•
•
Protect data
Maintain high availability
Improve resource utilization
Raise performance
/ Install and set up in a few minutes
/ Integrated backup/restore and data integrity checks
/ Efficient failover operations
/ Distributed, rule-driven management
/ No/minimal application changes
/ Highly pluggable
/ No specialized hardware requirements
© Continuent 2010
What’s Inside Tungsten?
/ Replication - Making copies
• Tungsten Replicator -- Database-neutral, platform independent
master/slave replication
/ Connectivity -- Finding databases
• Tungsten Connector -- Fast MySQL/PostgreSQL client to JDBC
proxying
• Tungsten SQL Router --JDBC wrapper for high-performance and
transparent failover, load-balancing, and partitioning (no proxy
required)
/ Management -- Administering the database
• Tungsten Manager -- Distributed administration with autonomic,
rule-based configuration and no single point of failure
• Tungsten Monitor -- Track resource status and
© Continuent 2010
Tungsten Clustering In Action
Application Server
Application Server
SQL Router/Connector
SQL Router/Connector
Manager
Manager
Replicator
Replicator
Monitor
Monitor
Manager
Manager
Master DB
Master Host
© Continuent 2010
Slave DB
Management
Client
Management
Client
Slave Host
Distributed Rule-Based Management
Business
Rules
Admin Client
Manager
Group
Communications
Manager
Local Services
(Coordinator)
Broadcast commands
and monitoring data
Admin Client
Local Services
Manager
Admin Client
Local Services
© Continuent 2010
Multiple Routes to Databases
Java App Server
Tungsten SQL Router
PHP Application
libpq.a
PostgreSQL JDBC Driver
Admin &
Monitoring
Virtual IP
Address
Tungsten Cluster
© Continuent 2010
Tungsten
Connector
Admin &
Monitoring
Tungsten
+
PostgreSQL Hot Standby
and
Streaming Replication
© Continuent 2010
Moving Tungsten to PostgreSQL
/ Problem: We can’t read PostgreSQL logs (yet)
Tungsten solution is to manage:
/ Warm Standby + WAL Shipping
(for PostgreSQL <9)
• Good basic availability/fast failover
• Slaves open up for reads only after failover
• No load balancing/scaling
/ Hot Standby + Streaming Replication
(for PostgreSQL >=9)
• Slaves opened up for reads = Tungsten scaling facilities work
• Add Streaming Replication = minimal delay in replicating data
• A fully fledged clustering solution
© Continuent 2010
What is Tungsten’s Added Value To
PostgreSQL?
/ 15 minute cluster installation
/ Single commands to:
•
•
•
•
•
View cluster status
Provision a new standby
Confirm liveness of replication
Switch servers safely for maintenance
Failover a dead server to most current replica
/ Automatic discovery of new database replicas
/ Automatic failover when databases fail
/ Simple procedures for provisioning
/ Transparent application routing
/ Easy scaling
© Continuent 2010
Streaming Replication Setup (By Hand)
/ Configure master postgresql.conf and reboot
archive_mode = on
max_wal_senders = 10
recovery_connections = on
archive_command =‘rsync -cz $1 ${STANDBY}:${PGHOME}/archive/$2
%p %f'
archive_timeout = 60
/ Set up standby recovery.conf
standby_mode = 'on’
primary_conninfo = 'host=${MASTER} port=5432 user=postgres’
trigger_file = '/tmp/pgsql_stop_recovery'
/ Provision standby
psql# select pg_switch_xlog();
psql# select pg_xlogfile_name(pg_start_backup('base_backup'));
rsync –azv --delete --exclude=*pg_xlog*
--exclude=postgresql.conf ${PGHOME}/
$STANDBY:$PGHOME/archive
psql# select pg_xlogfile_name(pg_stop_backup());
/ Start standby, recovery starts
© Continuent 2010
Manual Failover (By Hand)
/ Standby’s postgresql.conf should be prepared from
the start to act as a master when needed:
archive_mode = on
max_wal_senders = 10
/ Touch /tmp/pgsql_stop_recovery on a standby to
snap out of recovery mode
/ Wait for recovery to finish
• ERROR: recovery is in progress …
LOG: database system is ready to accept connections
/ Reroute applications to the new master
And to return to original cluster’s state:
/ Fix the failed master. Bring it online as a standby
/ Switch master with standby roles
© Continuent 2010
Gotchas (1/2)
/ Wait for standby to finish initial recovery before
routing applications to it
/ “psql: FATAL: the database system is starting up” –
bad check for that. E.g.: try under French locale
/ After a switch/failover, need to wait for the new
master to come up from recovery too
/ rsync of master data folder might return a non-zero
exit code: 24 - “Partial transfer due to vanished
source files”
/ Exclude pg_xlog, pg_log and postgresql.conf from
rsync!
/ Careful: queries might get canceled on a standby!
(E.g. DROP TABLE on a master doesn’t wait for
SELECT queries to finish on a standby)
© Continuent 2010
Captchas (2/2)
/ Need to know current progress?
• Use pg_current_xlog_location() on a master
(“ERROR: recovery is in progress" if used on a standby),
pg_last_xlog_receive_location() and
pg_last_xlog_replay_location() on a standby.
/ Using pg_standby? Ensure it is available in the path for
the user you’re running
• If it’s for the root, ‘sudo which pg_standby’ is not enough!
/ Running Debian? Check for PostgreSQL status before
starting it (it fails, if server is already running):
• sudo /etc/init.d/postgresql-8.4 start
Starting PostgreSQL 8.4:
pg_ctl: another server might be running; trying to start server anyway
pg_ctl: could not start server
/ Etc.
© Continuent 2010
Ultimately
/ Hot Standby & Streaming Replication is awesome,
but
/ Set-up/management is harder than it looks.
/ Monitoring is critical.
/ Transparent application rerouting is essential.
© Continuent 2010
How Tungsten covers all this?
Tungsten Manager
Replicator JMX Interface
Monitor
Replication State Model
Open Script
pg_dump/
Backup
Plugin
Pg-wal Scripts
postgresql.conf
recovery.conf
pg_standby
rsync
© Continuent 2010
pg_restore
Plug-In
DBMS
Storage
Plugin
DBMS
Checker
Plugin
DEMO
or
A Summary in Action
© Continuent 2010
Questions?
© Continuent 2010
Contact Information
HQ and Americas
EMEA and APAC
560 S. Winchester Blvd., Suite 500
San Jose, CA 95128
Tel (866) 998-3642
Fax (408) 668-1009
Lars Sonckin kaari 16
02600 Espoo, Finland
Tel +358 50 517 9059
Fax +358 9 863 0060
e-mail: alex.alexander@continuent.com, linas.virbalas@continuent.com
Continuent Web Site:
http://www.continuent.com
© Continuent 2010
Download