How to relicate data from RDS PostgreSQL to on premise instance with logical replication
In this article, I'll explain how to replicate all tables from one database on a RDS Instance (AWS) to another database on an on premise PostgreSQL instance ,a PostgreSQL you manage yourself, with built-in logical replication from PostgreSQL. Please note it will only replicate data, it won't replicate sequences nor DDL commands.
🙌 I'll ask to do your own research regarding the settings I advised you to set and how to set under the Optional settings sections. While other settings are mandatory in order to get it working, those aren't mandatory but settings you should pay attention to. The way I recommend you to set those settings is the result of several researchs mainly on the PostgreSQL official documentation, postgresqlco.nf and my own experiences. It doesn't mean it is the best way, I only wish to give you a starting point.
Notes
- I've done this migration from a Postgres 10.6 on RDS to a Postgres 12 on premise
- The on premise server is a Debian 10 (buster)
- ⚠️ The built-in logical replication doesn't replicate sequences, you won't able to promote the on premise servers to master. If you need it, take a look at Pglogical's github page
- ⚠️ The built-in logical replication doesn't replicate DDL commands such as
CREATE TABLE
,ALTER TABLE
. If you need it, take a look at Pglogical's github page
Tips - check your logs 👋
Check your logs ! They are your best friend and they provide a lot of useful informations. Wether the PostgreSQL isn't working or when some settings are wrongly set.
RDS
You can find them in the AWS console, to find them:
- Go to
RDS
>databases
>your_database
>logs & events
⚠️ By default the latest logs are on the last page, since it is filtered by name ASC and not DESC.
On premise
To find your log folder
-- psql connected as your admin user on any database
-- Show the directory where logs are stored
postgres=> SHOW log_directory;
-- If the value was relative,
-- it is relative to the data directory
-- to find it:
postgres=> SHOW data_directory;
💡 If the value is relative like
log
and not absolute like/var/log/postgresql
. It is relative to the data directory:{{ data_directory }}/log
For PostgreSQL 12 on **Debian 10 **you can also find the logs at /var/log/postgresql/postgresql-12-main.log
Requirements
In order to follow this tutorial you'll need:
- A RDS database hosted by AWS
- PostgreSQL version > 10
- A user with
rds_superuser
permission granted to him
- An on premise PostgreSQL
- PostgreSQL version > 10
- A user with the
SUPERUSER
flag granted to him
RDS PostgreSQL (publisher)
We will setup first the RDS PostgreSQL as the publisher. We wish to transfer data from RDS to the on premise.
Parameter group
We need to ajust some settings in its parameter group in order to enable the logical replication.
Mandatory settings
- Set
rds.logical_replication
to 1. This will enable WAL logs at logical level.
Optional settings
those settings depends on how much tables you wish to replicate and your RDS instance in some cases, the default values are good, in anothers you should adjust them.
👋 Tip: Check your logs ! PostgreSQL will tell you if a parameter is wrongly set in the logs. In a case where I wished to replicate 9 databases between 2 hosts, I did set max_wal_senders to 15 and in RDS logs I found that PostgreSQL automatically rose it to 20 as it "wasn't enough / incompatible". See postgresqlco.nf for more informations regarding your variables
Specifies the maximum number of concurrent connections from standby servers or streaming base backup clients (i.e., the maximum number of simultaneously running WAL sender processes).
- Set it to twice as many replication process you expect, if you expect more than 5 replications otherwise left it to the default value.
- < 5 standby servers: Leave the default value 10
- Example: 9 databases: Set it to 18
Specifies the maximum number of replication slots (see streaming-replication-slots) that the server can support.
- Set it to twice as many replication process you expect, if you expect more than 5 replications otherwise left it to the default value.
- < 5 standby servers: Leave the default value 10
- Example: 9 databases: Set it to 18
Sets the maximum number of background processes that the system can support.
- Set it to the number of cores you have, if you have more than 8.
- <= 8 vCPU: Leave the default value 8
- Example: 32 vCPU: Set it to 32
Finally
- Reboot the database to take effects
Create the replication user
We'll create the role being ussed for the replication it self.
- Create a user, let's call it rds_replication_user
- GRANT
rds_replication
to it - For each database you wish to replicate, GRANT its owner, the role which have access to the data to the replication_user to have the desired rights on the database, all tables and all schemas.
-- Psql connected as the admin user (has rds_superuser) on any database
-- Create a user, let's call it *rds_replication_user*
postgres=> CREATE ROLE rds_replication_user WITH LOGIN ENCRYPTED PASSWORD 'xxxx';
-- Grant permissions to replicate
postgres=> GRANT rds_replication TO rds_replication_user;
-- Grant database's owner to rds_replication_user to have the desired rights on the database, all tables and all schemas. Present and future.
postgres=> GRANT mydb_owner TO rds_replication_user;
Create publication
Connect to the database you wish to replicate with our rds_replication_user. The publication is what allow the tables to be streamed outside the PostgreSQL.
-- Psql connected as rds_replication_user on the database we wish to replicate
mydb=> CREATE PUBLICATION mydb_publication FOR ALL TABLES;
Let's check everything is in order, this query should show us our recently create publication:
-- Psql connected as rds_replication_user on the database we wish to replicate
-- List all publications
mydb=> \dRp+
Now RDS is ready to stream our data through logical replication. 🚀
On premise PostgreSQL (subscriber)
We will setup ou on premise PostgreSQL as the subscriber. It is this PostgreSQL instance that will receive the data from RDS (the publisher).
postgresql.conf (For a fast initialization)
those settings depends on how much tables you wish to replicate and your RDS instance in some cases, the default values are good, in anothers you should adjust them.
Mandatory settings
It determines how much information is written to the WAL.We need to have enough informations in the WAL for the replication.
- Set it to
logical
.
Specifies the TCP/IP address(es) on which the server is to listen for connections from client applications.
- If you have
'localhost'
:- Set it to
'localhost,rds_host'
- Set it to
- If you have
'*'
:- Nothing to do 👍
Optional settings
👋 Tip: Check your logs ! Again PostgreSQL will tell you if a parameter is wrongly set in the logs.
⚠️ Please note: Those settings are tailored to initialize quickly your subscriber, if you wish to use the replicated server as a read replica for example, please read FINE TUNING AFTER INITIALIZATION at the end of the article.
Sets the maximum number of background processes that the system can support.
- Set it to the number of cores you have, if you have more than 8.
- <= 8 vCPU: Leave the default value 8
- Example: 32 vCPU: Set it to 32
max_logical_replication_workers
Specifies maximum number of logical replication workers. This includes both apply workers and table synchronization workers. Depends on the number of subscribers you have being replicated, consider setting it to (vCPU / 2) + 2
- Set it to (vCPU / 2) + 2 if you have more than 4 vCPU:
- Default: 4, good for 4 vCPU or less
- Example: 8 vCPU = 6
max_sync_workers_per_subscription
This parameter controls the amount of parallelism of the initial data copy during the subscription initialization or when new tables are added.
- Set it to vCPU / 2 (Must be lower than
max_logical_replication_workers
) if you have more than 4 vCPU- Default: 2, good for 4 vCPU or less
- Example: 8 vCPU = 4
Sets the maximum number of workers that the system can support for parallel operations.
- Set it to (vCPU / 2) - 2 if you have 16 vCPU or less
- Default value: 8 if have 16 or less cores (vCPU)
- Example: 33 vCPU = 16
Here's an example with a PostgreSQL with 56 vCPU and 64Go RAM:
# /etc/postgresql/12/main/postgresql.conf
# postgresql listen to RDS
listen_addresses = 'localhost,rds_host'
# We wish to use Logical replication
wal_level = logical
max_worker_processes=56
max_parallel_workers=26 # the number of cores
max_logical_replication_workers=30 # depends than number of subscribers (or replicas)
max_sync_workers_per_subscription=28 # depends on number of tables being replicated
pg_hba.conf
We need to authorize the RDS database to connect from its host with the premise_replication_user
, If it isn't already allowed. Here's a list of different entries from most restrictive to most permissive.
💡 Unlike physical replication, you don't need a line with
replication
as host. Only access to the database as a regular user.
# Only allow connection to the database_a from RDS for the role premise_replication_user with password
host database_a premise_replication_user rds_host md5
#### OR
# Allow connection from RDS for the role premise_replication_user with password
host all premise_replication_user rds_host md5
#### OR
# Allow remote connection with premise_replication_user role from everywhere with password.
host all premise_replication_user 0.0.0.0/0 md5
#### OR
# Allow remote connection with any role from everywhere with password.
host all all 0.0.0.0/0 md5
Restart postgresql for changes to take effect !
systemctl restart postgresql.service
Create the user
-- Psql connected as admin user on any databases
-- Doesn't have to match the user from RDS
-- TODO: NOT SURE REPLICATION is mandatory
postgres=> CREATE ROLE premise_replication_user WITH REPLICATION LOGIN ENCRYPTED PASSWORD 'xxxx';
-- Create a user to the future owner of the database
postgres=> CREATE ROLE mydb_owner WITH LOGIN ENCRYPTED PASSWORD 'xxxx'
-- Grant mydb_owner to premise_replication_user to have the desired rights on the database, all tables and all schemas. Present and future.
postgres=> GRANT mydb_owner TO premise_replication_user;
Create the database
-- Psql connected as admin user on any databases
postgres=> CREATE DATABASE mydb WITH OWNER mydb_owner;
Export the database schema from RDS to the on premise
Dump the RDS mydb's schema to our newly created mydb on our PostgreSQL on premise.
#!/bin/bash
# -s only the schema, not the data
# --no-owner & --no-privileges to not dump RDS's database owner and role permissions. Do not use it if you plan on using the very same roles between the two instances.
PGPASSWORD=rds_replication_user_password pg_dump -h rds_host -d mydb -U rds_replication_user --no-owner --no-privileges -s | PGPASSWORD=premise_mydb_owner_password psql -U premise_mydb_owner -h premise_host -d mydb
Create the subscription and start the replication
-- Psql connected as replication_user on mydb
-- Might have to wait a little, do not worry.
mydb=> CREATE SUBSCRIPTION mydb_subscription CONNECTION 'dbname=mydb host=rds_host user=rds_replication_user password=rds_replication_user_password' PUBLICATION mydb_publication;
Let's check everything is working as expected:
-- Psql connected as premise_replication_user on mydb
-- List of subscription on the current database
mydb=> \dRs+
On the subscriber, the on premise, you can even see the database size growing
-- Psql connected as any user on any database
postgres=> SELECT datname, pg_size_pretty(pg_database_size(datname)) FROM pg_database WHERE datname = 'mydb' ORDER BY datname DESC;
Or on the publisher, the RDS, see the replication in action and the lag
-- Psql connected as any user on any database
postgres=> SELECT pid, application_name, client_addr, state, sync_state, pg_wal_lsn_diff(sent_lsn, write_lsn) AS write_lag, pg_wal_lsn_diff(sent_lsn, flush_lsn) AS flush_lag, pg_wal_lsn_diff(sent_lsn, replay_lsn) AS replay_lag FROM pg_stat_replication;
It's all done 🎉 ! Your database has started to replicating from RDS to your on premise PostgreSQL.
Monitoring
Here are several commands, that can help you monitor your replications
On the publisher, the RDS
-- On any database - See current replication tasks
postgres=> SELECT * FROM pg_stat_replication;
-- See the different kind of lags for each subscription connected to it. 0 means it has been fully replicated, no lag
postgres=> SELECT pid, application_name, client_addr, state, sync_state, pg_wal_lsn_diff(sent_lsn, write_lsn) AS write_lag, pg_wal_lsn_diff(sent_lsn, flush_lsn) AS flush_lag, pg_wal_lsn_diff(sent_lsn, replay_lsn) AS replay_lag FROM pg_stat_replication;
-- See all replications
postgres=> SELECT * FROM pg_replication_slots;
-- List all publications for the current Database. (Will show even if there is no subscribers)
mydb=> \dRp+
On the subscriber, the on premise
-- On any database - See current subscription tasks and their status and details (no data regarding the lag)
postgres=> SELECT * FROM pg_stat_subscription;
-- See databases' size
postgres=> SELECT datname, pg_size_pretty(pg_database_size(datname)) FROM pg_database WHERE datname = 'mydb' ORDER BY datname DESC;
-- nearly same as above
postgres=> \l+ mydb
-- List of subscription on the current database
mydb=> \dRs+
(Optional) The initialization is finished, I wish to use this database as a read replica
⚠️ Please note: The built-in logical replication doesn't replicate DDL commands such as CREATE TABLE
, ALTER TABLE
. Any migration on the master won't be reflected on this replica.
Back to the on premise (subscriber) postgresql.conf
👋 Tip: Check your logs ! Again PostgreSQL will tell you if a parameter is wrongly set in the logs.
- No need to change it 👍
max_logical_replication_workers
- Should be greater than your number of subscriptions and even:
- Default: 4, good if you have less than 4 subscriptions
- Example: 8 subscriptions = 10
max_sync_workers_per_subscription
- Set it to max_logical_replication_workers / 2 rounded to lowest even number
- Default: 2, good for
max_logical_replication_workers=4
- Example:
max_logical_replication_workers=10
= 4
- Default: 2, good for
- Set it to
max_worker_processes - max_logical_replication_workers
.- Default value: 8, good if
max_worker_processes - max_logical_replication_workers
<= 8 - Example:
max_worker_processes - max_logical_replication_workers =
- Default value: 8, good if
Here's the same example with a PostgreSQL with 56 vCPU and 64Go RAM and 6 subscriptions:
# /etc/postgresql/12/main/postgresql.conf
# [NO CHANGES] postgresql listen to RDS
listen_addresses = 'localhost,rds_host'
# [NO CHANGES] We wish to use Logical replication
wal_level = logical
# [NO CHANGES] Still the number of cores
max_worker_processes=56
# All workers left by max_logical_replication_workers
max_parallel_workers=48
# depends than number of subscribers (or replicas)
max_logical_replication_workers=8
# depends on number of tables being replicated
max_sync_workers_per_subscription=4
Restart postgresql
systemctl restart postgresql.service
Resources
Here's a list of articles and documentations that helped me write this article
- Setting up PostgreSQL Write Ahead Logs for Logical Replication
- How to setup a logical replication on postgresql 10
- How to configure replication from PostgreSQL (rds) to vanilla PostgreSQL
- AWS - Logical replication for PostgreSQL on Amazon RDS
- How To Set Up Logical Replication with PostgreSQL 10 on Ubuntu 18.04
- How to Optimize PostgreSQL Logical Replication
- Using postgresql logical replication maintain always date readwrite test server
- Quick Setup