Building Tungsten Clusters with PostgreSQL Hot Standby

Download Report

Transcript Building Tungsten Clusters with PostgreSQL Hot Standby

Building Tungsten Clusters
with PostgreSQL Hot Standby
and Streaming Replication
Linas Virbalas and Alex Alexander
Continuent, Inc.
© Continuent 2010
Agenda
/ Introductions
/ PG 9 Hot Standby and Streaming Replication
/ What is Tungsten?
/ Tungsten + PostgreSQL Hot Standby
and Streaming Replication
/ Demo!
/ Questions and Comments
© Continuent 2010
Introductions
© Continuent 2010
About Continuent
/ Our Business: Continuous Data Availability
/ Our Solution
• Continuent Tungsten (Master/Slave Database Replication)
/ Our Value:
• Ensure data are available when and where you need them
• TCO less than 20% of comparable solutions
/ Our Technical Expertise
•
•
•
•
Database replication
Database cluster management
Application connectivity
Software-as-a-Service (SaaS)
© Continuent 2010
PostgresSQL 9: Hot
Standby and
Log Streaming
© Continuent 2010
PostgreSQL 8.4 Warm Standby
Master
Standby
PostgreSQL
PostgreSQL
Continuous
recovery
rsync to standby
WAL
Files
Archived
WAL
Files
WAL
Files
pg_xlogs
Directory
Archive
Directory
pg_xlogs
Directory
pg_standby
© Continuent 2010
Limitations of Warm Standby
1. Utilization -- Cannot open the standby
•
•
•
To bring up the standby for queries you must end recovery
Standby hardware is idle
Difficult to track state of recovery since you cannot query log
position
2. Data Loss -- Warm standby transfers only full WAL
files
•
•
•
© Continuent 2010
Can bound loss using archive_timeout
Low values create large numbers of WAL files; complicate pointin-time recovery
Workarounds using DRBD, etc. are complex
Introducing Hot Standby
/ Allows users to connect to standby in read-only mode
• Allowed: SELECT, SET, LOAD, COMMIT/ROLLBACK
• Disallowed: INSERT, UPDATE, DELETE, CREATE, 2PC,
SELECT … FOR SHARE/UPDATE, nextval(), LISTEN, LOCK,
• No admin commands: ANALYZE, VACUUM, REINDEX, GRANT
/ Can come out of recovery while queries are running
/ Thanks to Simon Riggs for this description
© Continuent 2010
Introducing Log Streaming
Master
Standby
PostgreSQL
PostgreSQL
WAL
Sender
Recovery
WAL
Receiver
Continuous replication to
standby
Archiving
Archived
WAL
Files
Archive
Directory
© Continuent 2010
Configuration and Usage
/ Log streaming layers on top of existing warm standby
log shipping
/ Multiple standby servers allowed
/ Failure of one standby does not affect others
/ Management is not simple - must coordinate
provisioning & WAL shipping to set up/restart
© Continuent 2010
What is Tungsten?
© Continuent 2010
What Is Tungsten?
/ Tungsten implements master/slave clusters to:
•
•
•
•
Protect data
Maintain high availability
Improve resource utilization
Raise performance
/ Install and set up in a few minutes
/ Integrated backup/restore and data integrity checks
/ Efficient failover operations
/ Distributed, rule-driven management
/ No/minimal application changes
/ Highly pluggable
/ No specialized hardware requirements
© Continuent 2010
What’s Inside Tungsten?
/ Replication - Making copies
• Tungsten Replicator -- Database-neutral, platform independent
master/slave replication
/ Connectivity -- Finding databases
• Tungsten Connector -- Fast MySQL/PostgreSQL client to JDBC
proxying
• Tungsten SQL Router --JDBC wrapper for high-performance and
transparent failover, load-balancing, and partitioning (no proxy
required)
/ Management -- Administering the database
• Tungsten Manager -- Distributed administration with autonomic,
rule-based configuration and no single point of failure
• Tungsten Monitor -- Track resource status and
© Continuent 2010
Tungsten Clustering In Action
Application Server
Application Server
SQL Router/Connector
SQL Router/Connector
Manager
Manager
Replicator
Replicator
Monitor
Monitor
Manager
Manager
Master DB
Master Host
© Continuent 2010
Slave DB
Management
Client
Management
Client
Slave Host
Distributed Rule-Based Management
Business
Rules
Admin Client
Manager
Group
Communications
Manager
Local Services
(Coordinator)
Broadcast commands
and monitoring data
Admin Client
Local Services
Manager
Admin Client
Local Services
© Continuent 2010
Multiple Routes to Databases
Java App Server
Tungsten SQL Router
PHP Application
libpq.a
PostgreSQL JDBC Driver
Admin &
Monitoring
Virtual IP
Address
Tungsten Cluster
© Continuent 2010
Tungsten
Connector
Admin &
Monitoring
Tungsten
+
PostgreSQL Hot Standby
and
Streaming Replication
© Continuent 2010
Moving Tungsten to PostgreSQL
/ Problem: We can’t read PostgreSQL logs (yet)
Tungsten solution is to manage:
/ Warm Standby + WAL Shipping
(for PostgreSQL <9)
• Good basic availability/fast failover
• Slaves open up for reads only after failover
• No load balancing/scaling
/ Hot Standby + Streaming Replication
(for PostgreSQL >=9)
• Slaves opened up for reads = Tungsten scaling facilities work
• Add Streaming Replication = minimal delay in replicating data
• A fully fledged clustering solution
© Continuent 2010
What is Tungsten’s Added Value To
PostgreSQL?
/ 15 minute cluster installation
/ Single commands to:
•
•
•
•
•
View cluster status
Provision a new standby
Confirm liveness of replication
Switch servers safely for maintenance
Failover a dead server to most current replica
/ Automatic discovery of new database replicas
/ Automatic failover when databases fail
/ Simple procedures for provisioning
/ Transparent application routing
/ Easy scaling
© Continuent 2010
Streaming Replication Setup (By Hand)
/ Configure master postgresql.conf and reboot
archive_mode = on
max_wal_senders = 10
recovery_connections = on
archive_command =‘rsync -cz $1 ${STANDBY}:${PGHOME}/archive/$2
%p %f'
archive_timeout = 60
/ Set up standby recovery.conf
standby_mode = 'on’
primary_conninfo = 'host=${MASTER} port=5432 user=postgres’
trigger_file = '/tmp/pgsql_stop_recovery'
/ Provision standby
psql# select pg_switch_xlog();
psql# select pg_xlogfile_name(pg_start_backup('base_backup'));
rsync –azv --delete --exclude=*pg_xlog*
--exclude=postgresql.conf ${PGHOME}/
$STANDBY:$PGHOME/archive
psql# select pg_xlogfile_name(pg_stop_backup());
/ Start standby, recovery starts
© Continuent 2010
Manual Failover (By Hand)
/ Standby’s postgresql.conf should be prepared from
the start to act as a master when needed:
archive_mode = on
max_wal_senders = 10
/ Touch /tmp/pgsql_stop_recovery on a standby to
snap out of recovery mode
/ Wait for recovery to finish
• ERROR: recovery is in progress …
LOG: database system is ready to accept connections
/ Reroute applications to the new master
And to return to original cluster’s state:
/ Fix the failed master. Bring it online as a standby
/ Switch master with standby roles
© Continuent 2010
Gotchas (1/2)
/ Wait for standby to finish initial recovery before
routing applications to it
/ “psql: FATAL: the database system is starting up” –
bad check for that. E.g.: try under French locale
/ After a switch/failover, need to wait for the new
master to come up from recovery too
/ rsync of master data folder might return a non-zero
exit code: 24 - “Partial transfer due to vanished
source files”
/ Exclude pg_xlog, pg_log and postgresql.conf from
rsync!
/ Careful: queries might get canceled on a standby!
(E.g. DROP TABLE on a master doesn’t wait for
SELECT queries to finish on a standby)
© Continuent 2010
Captchas (2/2)
/ Need to know current progress?
• Use pg_current_xlog_location() on a master
(“ERROR: recovery is in progress" if used on a standby),
pg_last_xlog_receive_location() and
pg_last_xlog_replay_location() on a standby.
/ Using pg_standby? Ensure it is available in the path for
the user you’re running
• If it’s for the root, ‘sudo which pg_standby’ is not enough!
/ Running Debian? Check for PostgreSQL status before
starting it (it fails, if server is already running):
• sudo /etc/init.d/postgresql-8.4 start
Starting PostgreSQL 8.4:
pg_ctl: another server might be running; trying to start server anyway
pg_ctl: could not start server
/ Etc.
© Continuent 2010
Ultimately
/ Hot Standby & Streaming Replication is awesome,
but
/ Set-up/management is harder than it looks.
/ Monitoring is critical.
/ Transparent application rerouting is essential.
© Continuent 2010
How Tungsten covers all this?
Tungsten Manager
Replicator JMX Interface
Monitor
Replication State Model
Open Script
pg_dump/
Backup
Plugin
Pg-wal Scripts
postgresql.conf
recovery.conf
pg_standby
rsync
© Continuent 2010
pg_restore
Plug-In
DBMS
Storage
Plugin
DBMS
Checker
Plugin
DEMO
or
A Summary in Action
© Continuent 2010
Questions?
© Continuent 2010
Contact Information
HQ and Americas
EMEA and APAC
560 S. Winchester Blvd., Suite 500
San Jose, CA 95128
Tel (866) 998-3642
Fax (408) 668-1009
Lars Sonckin kaari 16
02600 Espoo, Finland
Tel +358 50 517 9059
Fax +358 9 863 0060
e-mail: [email protected], [email protected]
Continuent Web Site:
http://www.continuent.com
© Continuent 2010