February 14, 2022

How to relicate data from RDS PostgreSQL to on premise instance with logical replication

In this article, I'll explain how to replicate all tables from one database on a RDS Instance (AWS) to another database on an on premise PostgreSQL instance ,a PostgreSQL you manage yourself, with built-in logical replication from PostgreSQL. Please note it will only replicate data, it won't replicate sequences nor DDL commands.

🙌 I'll ask to do your own research regarding the settings I advised you to set and how to set under the Optional settings sections. While other settings are mandatory in order to get it working, those aren't mandatory but settings you should pay attention to. The way I recommend you to set those settings is the result of several researchs mainly on the PostgreSQL official documentation, postgresqlco.nf and my own experiences. It doesn't mean it is the best way, I only wish to give you a starting point.

Notes

I've done this migration from a Postgres 10.6 on RDS to a Postgres 12 on premise
The on premise server is a Debian 10 (buster)
⚠️ The built-in logical replication doesn't replicate sequences, you won't able to promote the on premise servers to master. If you need it, take a look at Pglogical's github page
⚠️ The built-in logical replication doesn't replicate DDL commands such as CREATE TABLE, ALTER TABLE. If you need it, take a look at Pglogical's github page

Tips - check your logs 👋

Check your logs ! They are your best friend and they provide a lot of useful informations. Wether the PostgreSQL isn't working or when some settings are wrongly set.

RDS

You can find them in the AWS console, to find them:

Go to RDS > databases > your_database > logs & events

⚠️ By default the latest logs are on the last page, since it is filtered by name ASC and not DESC.

On premise

To find your log folder

-- psql connected as your admin user on any database

-- Show the directory where logs are stored
postgres=> SHOW log_directory;

-- If the value was relative,
-- it is relative to the data directory
-- to find it:
postgres=> SHOW data_directory;

💡 If the value is relative like log and not absolute like /var/log/postgresql. It is relative to the data directory: {{ data_directory }}/log

For PostgreSQL 12 on **Debian 10 **you can also find the logs at /var/log/postgresql/postgresql-12-main.log

Requirements

In order to follow this tutorial you'll need:

A RDS database hosted by AWS
- PostgreSQL version > 10
- A user with rds_superuser permission granted to him
An on premise PostgreSQL
- PostgreSQL version > 10
- A user with the SUPERUSER flag granted to him

RDS PostgreSQL (publisher)

We will setup first the RDS PostgreSQL as the publisher. We wish to transfer data from RDS to the on premise.

Parameter group

We need to ajust some settings in its parameter group in order to enable the logical replication.

Mandatory settings

Set rds.logical_replication to 1. This will enable WAL logs at logical level.

Optional settings

those settings depends on how much tables you wish to replicate and your RDS instance in some cases, the default values are good, in anothers you should adjust them.

👋 Tip: Check your logs ! PostgreSQL will tell you if a parameter is wrongly set in the logs. In a case where I wished to replicate 9 databases between 2 hosts, I did set max_wal_senders to 15 and in RDS logs I found that PostgreSQL automatically rose it to 20 as it "wasn't enough / incompatible". See postgresqlco.nf for more informations regarding your variables

max_wal_senders:

Specifies the maximum number of concurrent connections from standby servers or streaming base backup clients (i.e., the maximum number of simultaneously running WAL sender processes).

Set it to twice as many replication process you expect, if you expect more than 5 replications otherwise left it to the default value.
- < 5 standby servers: Leave the default value 10
- Example: 9 databases: Set it to 18

max_replication_slots:

Specifies the maximum number of replication slots (see streaming-replication-slots) that the server can support.

Set it to twice as many replication process you expect, if you expect more than 5 replications otherwise left it to the default value.
- < 5 standby servers: Leave the default value 10
- Example: 9 databases: Set it to 18

max_worker_processes:

Sets the maximum number of background processes that the system can support.

Set it to the number of cores you have, if you have more than 8.
- <= 8 vCPU: Leave the default value 8
- Example: 32 vCPU: Set it to 32

Finally

Reboot the database to take effects

Create the replication user

We'll create the role being ussed for the replication it self.

Create a user, let's call it rds_replication_user
GRANT rds_replication to it
For each database you wish to replicate, GRANT its owner, the role which have access to the data to the replication_user to have the desired rights on the database, all tables and all schemas.

-- Psql connected as the admin user (has rds_superuser) on any database

-- Create a user, let's call it *rds_replication_user*
postgres=> CREATE ROLE rds_replication_user WITH LOGIN ENCRYPTED PASSWORD 'xxxx';

-- Grant permissions to replicate
postgres=> GRANT rds_replication TO rds_replication_user;

-- Grant database's owner to rds_replication_user to have the desired rights on the database, all tables and all schemas. Present and future.
postgres=> GRANT mydb_owner TO rds_replication_user;

Create publication

Connect to the database you wish to replicate with our rds_replication_user. The publication is what allow the tables to be streamed outside the PostgreSQL.

-- Psql connected as rds_replication_user on the database we wish to replicate

mydb=> CREATE PUBLICATION mydb_publication FOR ALL TABLES;

Let's check everything is in order, this query should show us our recently create publication:

-- Psql connected as rds_replication_user on the database we wish to replicate

-- List all publications
mydb=> \dRp+

Now RDS is ready to stream our data through logical replication. 🚀

On premise PostgreSQL (subscriber)

We will setup ou on premise PostgreSQL as the subscriber. It is this PostgreSQL instance that will receive the data from RDS (the publisher).

postgresql.conf (For a fast initialization)

those settings depends on how much tables you wish to replicate and your RDS instance in some cases, the default values are good, in anothers you should adjust them.

Mandatory settings

wal_level:

It determines how much information is written to the WAL.We need to have enough informations in the WAL for the replication.

Set it to logical.

listen_addresses:

Specifies the TCP/IP address(es) on which the server is to listen for connections from client applications.

If you have 'localhost' :
- Set it to 'localhost,rds_host'
If you have '*' :
- Nothing to do 👍

Optional settings

👋 Tip: Check your logs ! Again PostgreSQL will tell you if a parameter is wrongly set in the logs.

⚠️ Please note: Those settings are tailored to initialize quickly your subscriber, if you wish to use the replicated server as a read replica for example, please read FINE TUNING AFTER INITIALIZATION at the end of the article.

max_worker_processes:

Sets the maximum number of background processes that the system can support.

Set it to the number of cores you have, if you have more than 8.
- <= 8 vCPU: Leave the default value 8
- Example: 32 vCPU: Set it to 32

max_logical_replication_workers

Specifies maximum number of logical replication workers. This includes both apply workers and table synchronization workers. Depends on the number of subscribers you have being replicated, consider setting it to (vCPU / 2) + 2

Set it to (vCPU / 2) + 2 if you have more than 4 vCPU:
- Default: 4, good for 4 vCPU or less
- Example: 8 vCPU = 6

max_sync_workers_per_subscription

This parameter controls the amount of parallelism of the initial data copy during the subscription initialization or when new tables are added.

Set it to vCPU / 2 (Must be lower than max_logical_replication_workers) if you have more than 4 vCPU
- Default: 2, good for 4 vCPU or less
- Example: 8 vCPU = 4

max_parallel_workers

Sets the maximum number of workers that the system can support for parallel operations.

Set it to (vCPU / 2) - 2 if you have 16 vCPU or less
- Default value: 8 if have 16 or less cores (vCPU)
- Example: 33 vCPU = 16

Here's an example with a PostgreSQL with 56 vCPU and 64Go RAM:

# /etc/postgresql/12/main/postgresql.conf

# postgresql listen to RDS
listen_addresses = 'localhost,rds_host'

# We wish to use Logical replication
wal_level = logical

max_worker_processes=56

max_parallel_workers=26 #  the number of cores

max_logical_replication_workers=30 # depends than number of subscribers (or replicas)

max_sync_workers_per_subscription=28 # depends on number of tables being replicated

pg_hba.conf

We need to authorize the RDS database to connect from its host with the premise_replication_user, If it isn't already allowed. Here's a list of different entries from most restrictive to most permissive.

💡 Unlike physical replication, you don't need a line with replication as host. Only access to the database as a regular user.

# Only allow connection to the database_a from RDS for the role premise_replication_user with password
host    database_a        premise_replication_user  rds_host      md5

#### OR

# Allow connection from RDS for the role premise_replication_user with password
host    all               premise_replication_user  rds_host      md5

#### OR

# Allow remote connection with premise_replication_user role from everywhere with password.
host    all               premise_replication_user  0.0.0.0/0     md5

#### OR

# Allow remote connection with any role from everywhere with password.
host    all               all                       0.0.0.0/0     md5

Restart postgresql for changes to take effect !

systemctl restart postgresql.service

Create the user

-- Psql connected as admin user on any databases

-- Doesn't have to match the user from RDS
-- TODO: NOT SURE REPLICATION is mandatory
postgres=> CREATE ROLE premise_replication_user WITH REPLICATION LOGIN ENCRYPTED PASSWORD 'xxxx';

-- Create a user to the future owner of the database
postgres=> CREATE ROLE mydb_owner WITH LOGIN ENCRYPTED PASSWORD 'xxxx'

-- Grant mydb_owner to premise_replication_user to have the desired rights on the database, all tables and all schemas. Present and future.
postgres=> GRANT mydb_owner TO premise_replication_user;

Create the database

-- Psql connected as admin user on any databases

postgres=> CREATE DATABASE mydb WITH OWNER mydb_owner;

Export the database schema from RDS to the on premise

Dump the RDS mydb's schema to our newly created mydb on our PostgreSQL on premise.

#!/bin/bash

# -s only the schema, not the data
# --no-owner & --no-privileges to not dump RDS's database owner and role permissions. Do not use it if you plan on using the very same roles between the two instances.
PGPASSWORD=rds_replication_user_password pg_dump -h rds_host -d mydb -U rds_replication_user --no-owner --no-privileges -s | PGPASSWORD=premise_mydb_owner_password psql -U premise_mydb_owner -h premise_host -d mydb

Create the subscription and start the replication

-- Psql connected as replication_user on mydb

-- Might have to wait a little, do not worry.
mydb=> CREATE SUBSCRIPTION mydb_subscription CONNECTION 'dbname=mydb host=rds_host user=rds_replication_user password=rds_replication_user_password' PUBLICATION mydb_publication;

Let's check everything is working as expected:

-- Psql connected as premise_replication_user on mydb

-- List of subscription on the current database
mydb=> \dRs+

On the subscriber, the on premise, you can even see the database size growing

-- Psql connected as any user on any database

postgres=> SELECT datname, pg_size_pretty(pg_database_size(datname)) FROM pg_database WHERE datname = 'mydb' ORDER BY datname DESC;

Or on the publisher, the RDS, see the replication in action and the lag

-- Psql connected as any user on any database

postgres=> SELECT   pid, application_name, client_addr, state, sync_state, pg_wal_lsn_diff(sent_lsn, write_lsn) AS write_lag, pg_wal_lsn_diff(sent_lsn, flush_lsn) AS flush_lag, pg_wal_lsn_diff(sent_lsn, replay_lsn) AS replay_lag FROM pg_stat_replication;

It's all done 🎉 ! Your database has started to replicating from RDS to your on premise PostgreSQL.

Monitoring

Here are several commands, that can help you monitor your replications

On the publisher, the RDS

-- On any database - See current replication tasks
postgres=> SELECT * FROM pg_stat_replication;

-- See the different kind of lags for each subscription connected to it. 0 means it has been fully replicated, no lag
postgres=> SELECT   pid, application_name, client_addr, state, sync_state, pg_wal_lsn_diff(sent_lsn, write_lsn) AS write_lag, pg_wal_lsn_diff(sent_lsn, flush_lsn) AS flush_lag, pg_wal_lsn_diff(sent_lsn, replay_lsn) AS replay_lag FROM pg_stat_replication;

-- See all replications
postgres=> SELECT * FROM pg_replication_slots;

-- List all publications for the current Database. (Will show even if there is no subscribers)
mydb=> \dRp+

On the subscriber, the on premise

-- On any database - See current subscription tasks and their status and details (no data regarding the lag)
postgres=> SELECT * FROM pg_stat_subscription;

-- See databases' size
postgres=> SELECT datname, pg_size_pretty(pg_database_size(datname)) FROM pg_database WHERE datname = 'mydb' ORDER BY datname DESC;

-- nearly same as above
postgres=> \l+ mydb

-- List of subscription on the current database
mydb=> \dRs+

(Optional) The initialization is finished, I wish to use this database as a read replica

⚠️ Please note: The built-in logical replication doesn't replicate DDL commands such as CREATE TABLE, ALTER TABLE. Any migration on the master won't be reflected on this replica.

Back to the on premise (subscriber) postgresql.conf

👋 Tip: Check your logs ! Again PostgreSQL will tell you if a parameter is wrongly set in the logs.

max_worker_processes:

No need to change it 👍

max_logical_replication_workers

Should be greater than your number of subscriptions and even:
- Default: 4, good if you have less than 4 subscriptions
- Example: 8 subscriptions = 10

max_sync_workers_per_subscription

Set it to max_logical_replication_workers / 2 rounded to lowest even number
- Default: 2, good for max_logical_replication_workers=4
- Example: max_logical_replication_workers=10 = 4

max_parallel_workers

Set it to max_worker_processes - max_logical_replication_workers.
- Default value: 8, good if max_worker_processes - max_logical_replication_workers <= 8
- Example: max_worker_processes - max_logical_replication_workers =

Here's the same example with a PostgreSQL with 56 vCPU and 64Go RAM and 6 subscriptions:

# /etc/postgresql/12/main/postgresql.conf

# [NO CHANGES] postgresql listen to RDS
listen_addresses = 'localhost,rds_host'

# [NO CHANGES] We wish to use Logical replication
wal_level = logical

# [NO CHANGES] Still the number of cores
max_worker_processes=56

# All workers left by max_logical_replication_workers
max_parallel_workers=48

# depends than number of subscribers (or replicas)
max_logical_replication_workers=8

# depends on number of tables being replicated
max_sync_workers_per_subscription=4

Restart postgresql

systemctl restart postgresql.service

Resources

Here's a list of articles and documentations that helped me write this article

← Top →