February 10, 2022

How to setup PostgreSQL physical streaming replication

In this article I'll explain how to setup a PostgreSQL read replica using physical streaming replication asynchronous. It is as of today one of the easiest way to setup a read replica, in addition, it is easy to maintain over time.

🙌 Do your own research regarding how I advise you to set the settings under the Optional settings sections. While other settings are mandatory in order to get it working, those aren't mandatory but you should pay attention to them. The way I recommend you to set those settings is the result of several researchs mainly [PostgreSQL official documentation]((https://www.postgresql.org/), postgresqlco.nf and my own experience.** **It doesn't mean it is the best way, I only wish to give you a starting point.

Notes

The two servers are on Debian Buster 10.
We're using PostgreSQL 11.7

Tips - check your logs 👋

Check your logs ! They are your best friend and they provide a lot of useful informations. Wether the PostgreSQL isn't working or when some settings are wrongly set.

To find your log folder

-- psql connected as your admin user on any database

-- Show the directory where logs are stored
postgres=> SHOW log_directory;

-- If the value was relative,
-- it is relative to the data directory
-- to find it:
postgres=> SHOW data_directory;

💡 If the value is relative like log and not absolute like /var/log/postgresql. It is relative to the data directory: {{ data_directory }}/log

Terms used in this article

Master: We'll refer to the master as the PostgreSQL which already has the data you wish to replicate.
Standby: We'll refer to the standby as the empty PostgreSQL to which you wish to replicate data.

Requirements

Two seperate debian 10 servers running with PostgreSQL installed.
- ⚠️ they need both to have the same PostgreSQL version.
- This article is meant for **Debian Buster 10, **the file locations, bash commands and steps might defer on other OS.
A SUPERUSER role on the master.

Configuration - postgresql.conf

The master is the PostgreSQL which has all your current roles, databases, etc. It is the only instance which can run **write queries. **The read replica will be called standby. In this section we'll define the settings to apply to both **master **and standby.

On debian 10, PostgreSQL 11 you'll find this file here: /etc/postgresql/11/main/postgresql.conf

🤔 Why apply master and standby settings to both of them ? Because in the event the master goes down and become unreachable, you can easily promote the standby to master. Also the settings specific to standby or master are ignored when the instance isn't acting as. There is no inconvenience by doing so

Mandatory Settings

Those settings are mandatory to enable physical streaming replication between our two servers.

wal_level:

It determines how much information is written to the WAL.We need to have enough informations in the WAL for the replication.

Set it to replica. Or a higher value like logical

listen_addresses:

Specifies the TCP/IP address(es) on which the server is to listen for connections from client applications. You must ensure each server can communicate with the other.

If you have 'localhost' :
- On master: 'localhost,standby_host'
- On standby: 'localhost,master_host'
If you have '*' :
- Nothing to do 👍

hot_standby:

Specifies whether or not you can connect and run queries during recovery**. **Setting to on allow read queries on the standby (if off reject any queries)

Set it to on

Optional settings

Those settings varies depending on your needs, your use cases and your servers. In some cases the default values are perfect, but in other we should adjust them.

wal_log_hints

Writes full pages to WAL when first modified after a checkpoint, even for a non-critical modifications. It is required for pg_rewind capability when standby goes out of sync with master

Set it to on

wal_keep_segments:

This sets only the minimum number of segments retained in pg_wal. As we will be using replication slots (more about it later) we shouldn't in theory need to set it, but it is still advised to set a non-zero value for the initialization.

set it to 32 (about 16min of "safety")

max_wal_senders:

Specifies the maximum number of concurrent connections from standby servers or streaming base backup clients (i.e., the maximum number of simultaneously running WAL sender processes).

Set it to twice as many replication process you expect, if you expect more than 5 replications otherwise left it to the default value.
- < 5 standby servers: Leave the default value 10
- Example: 9 databases: Set it to 18

max_replication_slots:

Specifies the maximum number of replication slots (see streaming-replication-slots) that the server can support.

Set it to twice as many replication process you expect, if you expect more than 5 replications otherwise left it to the default value.
- < 5 standby servers: Leave the default value 10
- Example: 9 databases: Set it to 18

max_worker_processes:

Sets the maximum number of background processes that the system can support.

Set it to the number of cores you have, if you have more than 8.
- <= 8 vCPU: Leave the default value 8
- Example: 32 vCPU: Set it to 32

min_wal_size:

As long as WAL disk usage stays below this setting, old WAL files are always recycled for future use at a checkpoint, rather than removed. This setting has an impact on the health of your WAL

I recommend you to use pgtune for this setting

max_wal_size:

Maximum size to let the WAL grow to between automatic WAL checkpoints. This is a soft limit. This setting has an impact on the health of your WAL

I recommend you to use pgtune for this setting

max_standby_streaming_delay:

When Hot Standby is active, this parameter determines how long the standby server should wait before canceling standby queries that conflict with about-to-be-applied WAL entries.

A low delay: Your long queries on the standby will be canceled if the data is changed on the master
A high delay: Your long queries on the standby won't be canceled but your standby might not have up to date data.
If you prefer a replica as much up to date possible: Leave it to default 30000 (30 sec)
If you wish to run long queries on the replica: Set it to your longest query for example 30min

Apply changes

Restart the PostgreSQL instance to apply the changes we've made. (also if you did wrongly set a variable, you'll be able to debug it before going further)

👋 Tip: Check your logs after the restart ! PostgreSQL will tell you if a parameter is wrongly set in the logs. It might even advise you another value even if it is working.

systemctl restart postgresql.service

Configure the replication user

In this section we will configure the user that will be used by the standby to replicate data from master.

Create user

We need to create a role with the REPLICATION and LOGIN flags in order to replicate data with.

-- Psql connected as admin user on any databases on master

postgres=> CREATE ROLE replication_role WITH REPLICATION LOGIN ENCRYPTED PASSWORD 'xxxx';

Permissions: pg_hba.conf

Now we must tell PostgreSQL the role replication_role can connect to the PostgreSQL to replicate data. For this we'll add a line in the pg_hba.conf file with replication as value for the database.

On debian 10, PostgreSQL 11 you'll find this file here: /etc/postgresql/11/main/pg_hba.conf

👋 Note: using all as database value, doesn't cover replication, from postgres.org: "The value replication specifies that the record matches if a physical replication connection is requested (note that replication connections do not specify any particular database)."

[...]

# (On master) Allow replication connection only for replication_role from server_b host with password
host    replication       replication_role    standby_host      md5
# (On standby)
host    replication       replication_role    master_host      md5

#### OR

# (only IPv4) Allow replication with replication_role role from everywhere with password.
host    replication       replication_role    0.0.0.0/0      md5
# (only IPv6) Allow replication with replication_role role from everywhere with password.
host    replication       replication_role    ::/0      md5

Apply changes

Restart the PostgreSQL instance to apply the changes we've made.

systemctl restart postgresql.service

Setup replication slot

From the PostgreSQL documention "Replication slots provide an automated way to ensure that the master does not remove WAL segments until they have been received by all standbys, and that the master does not remove rows which could cause a recovery conflict even when the standby is disconnected."

Create the replication slot

This step isn't necessary as it can be automatically created by the pg_basebackup that we will use later. But this is an important element and I prefer to emphasize on it. We will call our replication slot replication_slot_1.

-- Psql connected as admin user on any database

postgres=> SELECT pg_create_physical_replication_slot('replication_slot_1');

-- You can find your recently created replication slot with this command
-- Notice the 'f' value in the column active
postgres=> SELECT * FROM pg_replication_slots;

Prepare standby to replicate data

In this section we will finish the standby setup and start the replication.

⚠️ Depending on the size of your database, the pg_basebackup may take a while, for example: It took me 2h ~ 3h to replicate an instance with 500Go worth of databases.

Find your data directory

On the standby server, we need to know where the PostgreSQL's data are stored. To find this run this query on your standby PostgreSQL:

-- Psql connected on the standy on any database

postgres=> SHOW data_directory;

Stop the standby PostgreSQL

Before begining the replication we need to stop PostgreSQL on the standby.

systemctl stop postgresql.service

Backup the old data directory

the command that we will use to initiate the replication will dump data in the PostgreSQL data_directory but it needs to be empty, so we will backup the current one. For the data_directory was /var/lib/postgresql/11/main

mv /var/lib/postgresql/11/main /var/lib/postgresql/11/main.bak

Clone the master to the standby

⚠️ Depending on the size of your database, the pg_basebackup may take a while, for example: It took me 2h ~ 3h to replicate an instance with 500Go worth of databases.

This is the part where we'll clone all master's data to the standby. For this we will use pg_basebackup, click here for its documentation. Here's a explanation of the arguments:

PGPASSWORD: the password to connect to the master
-h: the host of the master
-U: the user to connect with
-D: the data_directory path
-R: Write a minimal recovery.conf in the output directory
-X: includes the required write-ahead log files (WAL files) in the backup. Set it to stream
--checkpoint: Sets checkpoint mode. Set it to fast
--slot: It causes the WAL streaming to use the specified replication slot. Set it to replication_slot_1
-v: Enables verbose mode.
-P: Enables progress reporting.

# Clone the master data into the standby data directory
PGPASSWORD=replication_user_password pg_basebackup -h master_host -U replication_user -D /var/lib/postgresql/11/main -R -X stream --checkpoint fast --slot replication_slot_1 -v -P

Check the permissions !

Be sure that the user postgres has the right permissions to its data directory.

chown -R postgres:postgres /var/lib/postgresql/11/main

Edit the recovery.conf file

The recovery.conf file is the file that will tell PostgreSQL to be in recovery mode. The settings in it, will tell PostgreSQL to act as a standby and how he can replicate data from the master.

If you did use the -R argument with pg_basebackup, then you already have a minimal one. If not create it.

It is placed at {{ data_directory }}/recovery.conf , for me it is /var/lib/postgresql/11/main/recovery.conf.

For more information you can check the PostgreSQL's documentation - 26.2.4. Setting Up a Standby Server

# /var/lib/postgresql/11/main/recovery.conf

# [AUTO GENERATED]
# Act as a standby
standby_mode = 'on'

# [AUTO GENERATED]
# How to connect to the master
primary_conninfo = 'host=master_host port=5432 user=replication_user password=replication_user_password'

# [AUTO GENERATED]
# The replication slot on the master
primary_slot_name = 'replication_slot_1'

# The trigger is what enables us to quickly turn a standby to master
# {{ data_directory }}/failover.trigger
trigger_file = '/var/lib/postgresql/11/main/failover.trigger'

Start the standby

Now everything is ready, start the standby and we're all done ! 🚀

systemctl start postgresql.service

Monitoring

Here are several commands, that can help you monitor your replications.

💡 Must be executed on the master PostgreSQL

-- See the different kind of lags for each subscription connected to it. 0 means it has been fully replicated, no lag
postgres=> SELECT pid, application_name, client_addr, state, sync_state, pg_wal_lsn_diff(sent_lsn, write_lsn) AS write_lag, pg_wal_lsn_diff(sent_lsn, flush_lsn) AS flush_lag, pg_wal_lsn_diff(sent_lsn, replay_lsn) AS replay_lag FROM pg_stat_replication;

-- See all replications
-- Notice the 't' value in the active column, it is active now !
postgres=> SELECT * FROM pg_replication_slots;

(Bonus) Promote standby to master

In the event your master becomes unreachable and you wish to promote your standby as your new master simply do this on your stanby:

# On the standby server

# touch {{ data_directory }}/failover.trigger
touch /var/lib/postgresql/11/main/failover.trigger

Wait a few seconds and that's it !

After creating the failover.trigger file, PostgreSQL will in a matter of seconds become a master, allow write queries and stop replicating data from the former master.

The failover.trigger file will be automatically deleted
The recovery.conf file will be renamed recovery.done

Ressources

Here's a list of articles and documentations that helped me write this article

← Top