The {[project]} User Guide demonstrates how to quickly and easily setup {[project]} for your {[postgres]} database. Step-by-step instructions lead the user through all the important features of the fastest, most reliable {[postgres]} backup and restore solution.

pgBackRest can use passwordless SSH to enable communication between the hosts. It is also possible to use TLS, see Setup TLS.

Setup pgBackRest Server mkdir -p -m 770 /etc/pgbackrest/cert && cp {[pgbackrest-repo-path]}/doc/{[fake-cert-path-relative]}/ca.crt /etc/pgbackrest/cert/ca.crt && openssl genrsa -out /etc/pgbackrest/cert/server.key 2048 2>&1 && chmod 600 /etc/pgbackrest/cert/server.key && openssl req -new -sha256 -nodes -out /etc/pgbackrest/cert/server.csr -key /etc/pgbackrest/cert/server.key -subj "/CN={[setup-tls-host]}" 2>&1 && openssl x509 -req -in /etc/pgbackrest/cert/server.csr -CA /etc/pgbackrest/cert/ca.crt -CAkey {[pgbackrest-repo-path]}/doc/{[fake-cert-path-relative]}/ca.key -CAcreateserial -out /etc/pgbackrest/cert/server.crt -days 9 2>&1 && openssl genrsa -out /etc/pgbackrest/cert/client.key 2048 2>&1 && chmod 600 /etc/pgbackrest/cert/client.key && openssl req -new -sha256 -nodes -out /etc/pgbackrest/cert/client.csr -key /etc/pgbackrest/cert/client.key -subj "/CN=pgbackrest-client" 2>&1 && openssl x509 -req -in /etc/pgbackrest/cert/client.csr -CA /etc/pgbackrest/cert/ca.crt -CAkey {[pgbackrest-repo-path]}/doc/{[fake-cert-path-relative]}/ca.key -CAcreateserial -out /etc/pgbackrest/cert/client.crt -days 9 2>&1 && chown -R {[setup-tls-user]} /etc/pgbackrest/cert echo '[Unit]' | tee /etc/systemd/system/pgbackrest.service && echo 'Description=pgBackRest Server' | tee -a /etc/systemd/system/pgbackrest.service && echo 'After=network.target' | tee -a /etc/systemd/system/pgbackrest.service && echo 'StartLimitIntervalSec=0' | tee -a /etc/systemd/system/pgbackrest.service && echo '' | tee -a /etc/systemd/system/pgbackrest.service && echo '[Service]' | tee -a /etc/systemd/system/pgbackrest.service && echo 'Type=simple' | tee -a /etc/systemd/system/pgbackrest.service && echo 'Restart=always' | tee -a /etc/systemd/system/pgbackrest.service && echo 'RestartSec=1' | tee -a /etc/systemd/system/pgbackrest.service && echo 'User={[setup-tls-user]}' | tee -a /etc/systemd/system/pgbackrest.service && echo 'ExecStart=/usr/bin/pgbackrest server' | tee -a /etc/systemd/system/pgbackrest.service && echo 'ExecStartPost=/bin/sleep 3' | tee -a /etc/systemd/system/pgbackrest.service && echo 'ExecStartPost=/bin/bash -c "[ ! -z $MAINPID ]"' | tee -a /etc/systemd/system/pgbackrest.service && echo 'ExecReload=/bin/kill -HUP $MAINPID' | tee -a /etc/systemd/system/pgbackrest.service && echo '' | tee -a /etc/systemd/system/pgbackrest.service && echo '[Install]' | tee -a /etc/systemd/system/pgbackrest.service && echo 'WantedBy=multi-user.target' | tee -a /etc/systemd/system/pgbackrest.service cat /etc/systemd/system/pgbackrest.service systemctl enable pgbackrest 2>&1 systemctl start pgbackrest Create <host>{[setup-ssh-host]}</host> host key pair mkdir -m 750 -p {[setup-ssh-user-home-path]}/.ssh ssh-keygen -f {[setup-ssh-user-home-path]}/.ssh/id_rsa -t rsa -b 4096 -N ""

Exchange keys between {[host-repo1]} and {[setup-ssh-host]}.

Copy <host>{[setup-ssh-host]}</host> public key to <host>{[host-repo1]}</host> (echo -n 'no-agent-forwarding,no-X11-forwarding,no-port-forwarding,' && echo -n 'command="{[br-bin]} ${SSH_ORIGINAL_COMMAND#* }" ' && sudo ssh root@{[setup-ssh-host]} cat {[setup-ssh-user-home-path]}/.ssh/id_rsa.pub) | sudo -u pgbackrest tee -a {[br-home-path]}/.ssh/authorized_keys Copy <host>{[host-repo1]}</host> public key to <host>{[setup-ssh-host]}</host> (echo -n 'no-agent-forwarding,no-X11-forwarding,no-port-forwarding,' && echo -n 'command="{[br-bin]} ${SSH_ORIGINAL_COMMAND#* }" ' && sudo ssh root@{[host-repo1]} cat {[br-home-path]}/.ssh/id_rsa.pub) | sudo -u {[setup-ssh-user]} tee -a {[setup-ssh-user-home-path]}/.ssh/authorized_keys

Test that connections can be made from {[host-repo1]} to {[setup-ssh-host]} and vice versa.

Test connection from <host>{[host-repo1]}</host> to <host>{[setup-ssh-host]}</host> ssh {[setup-ssh-user]}@{[setup-ssh-host]} -o StrictHostKeyChecking=no Test connection from <host>{[setup-ssh-host]}</host> to <host>{[host-repo1]}</host> ssh pgbackrest@{[host-repo1]} -o StrictHostKeyChecking=no

pgBackRest needs to be installed from a package or installed manually as shown here.

Install dependencies apt-get install postgresql-client libxml2 -y 2>&1 yum install postgresql-libs -y 2>&1 Copy <backrest/> binary from build host scp {[host-build]}:{[build-br-path]}/src/pgbackrest /usr/bin 2>&1 chmod 755 /usr/bin/pgbackrest

pgBackRest requires log and configuration directories and a configuration file.

Create <backrest/> configuration file and directories mkdir -p -m 770 /var/log/pgbackrest chown {[br-install-user]}:{[br-install-group]} /var/log/pgbackrest mkdir -p {[backrest-config-path]} mkdir -p {[backrest-config-include-path]} touch {[backrest-config-demo]} chmod 640 {[backrest-config-demo]} chown {[br-install-user]}:{[br-install-group]} {[backrest-config-demo]}
Install <backrest/> from package dpkg -i {[pgbackrest-repo-path]}/{[package]} 2>&1 apt-get -y install -f -y 2>&1 apt-get install pgbackrest apt-get update apt-get install pgbackrest -y 2>&1 yum -y install {[pgbackrest-repo-path]}/{[package]} -y 2>&1 yum install pgbackrest yum install pgbackrest -y 2>&1 Update permissions on configuration file and directories chown {[br-install-user]}:{[br-install-group]} /var/log/pgbackrest chown {[br-install-user]}:{[br-install-group]} {[backrest-config-demo]} Create the <backrest/> repository mkdir -p {[backrest-repo-path]} chmod 750 {[backrest-repo-path]} chown {[br-install-user]}:{[br-install-group]} {[backrest-repo-path]} Update permissions on the <backrest/> repository chown {[br-install-user]}:{[br-install-group]} {[backrest-repo-path]}

pgBackRest supports locating repositories in Azure-compatible object stores. The container used to store the repository must be created in advance &mdash; pgBackRest will not do it automatically. The repository can be located in the container root (/) but it's usually best to place it in a subpath so object store logs or other data can also be stored in the container without conflicts.

Configure <proper>Azure</proper> azure /{[azure-repo]} {[azure-account]} {[azure-key-type]} {[azure-key]} {[azure-container]} 4 4 Create the container echo "{[host-azure-ip]} pgbackrest.blob.core.windows.net" | tee -a /etc/hosts {[project-exe]} --repo={[azure-setup-repo-id]} repo-create

Shared access signatures may be used by setting the repo{[azure-setup-repo-id]}-azure-key-type option to sas and the repo{[azure-setup-repo-id]}-azure-key option to the shared access signature token.
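As a minimal sketch (repository 1 assumed; the token value is a placeholder), the relevant entries in the configuration file might look like:

[global]
repo1-azure-key-type=sas
repo1-azure-key=<shared-access-signature-token>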

pgBackRest supports locating repositories in GCS-compatible object stores. The bucket used to store the repository must be created in advance &mdash; pgBackRest will not do it automatically. The repository can be located in the bucket root (/) but it's usually best to place it in a subpath so object store logs or other data can also be stored in the bucket without conflicts.

Configure <proper>GCS</proper> gcs /{[gcs-repo]} {[gcs-key-type]} {[gcs-key]} {[gcs-bucket]} 4

When running in GCE set repo{[gcs-setup-repo-id]}-gcs-key-type=auto to automatically authenticate using the instance service account.
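A minimal sketch of that setting in the configuration file, assuming repository 1:

[global]
repo1-gcs-key-type=auto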

pgBackRest supports locating repositories in S3-compatible object stores. The bucket used to store the repository must be created in advance &mdash; pgBackRest will not do it automatically. The repository can be located in the bucket root (/) but it's usually best to place it in a subpath so object store logs or other data can also be stored in the bucket without conflicts.

Configure <proper>S3</proper> s3 /{[s3-repo]} {[s3-key]} {[s3-key-secret]} {[s3-bucket]} {[s3-endpoint]} {[s3-region]} 4 4 Create the bucket echo "{[host-s3-ip]} {[s3-bucket]}.{[s3-endpoint]} {[s3-endpoint]}" | tee -a /etc/hosts {[project-exe]} --repo={[s3-setup-repo-id]} repo-create The region and endpoint will need to be configured to where the bucket is located. The values given here are for the {[s3-region]} region.
Introduction

This user guide is intended to be followed sequentially from beginning to end &mdash; each section depends on the last. For example, the Restore section relies on setup that is performed in the Quick Start section. Once pgBackRest is up and running then skipping around is possible but following the user guide in order is recommended the first time through.

Although the examples in this guide are targeted at {[user-guide-os]} and PostgreSQL {[pg-version]}, it should be fairly easy to apply the examples to any Unix distribution and version. The only OS-specific commands are those to create, start, stop, and drop PostgreSQL clusters. The pgBackRest commands will be the same on any Unix system though the location of the executable may vary. While pgBackRest strives to operate consistently across versions of PostgreSQL, there are subtle differences between versions of PostgreSQL that may show up in this guide when illustrating certain examples, e.g. path/file names and settings.

Configuration information and documentation for PostgreSQL can be found in the PostgreSQL Manual.

A somewhat novel approach is taken to documentation in this user guide. Each command is run on a virtual machine when the documentation is built from the XML source. This means you can have a high confidence that the commands work correctly in the order presented. Output is captured and displayed below the command when appropriate. If the output is not included it is because it was deemed not relevant or was considered a distraction from the narrative.

All commands are intended to be run as an unprivileged user that has sudo privileges for both the root and postgres users. It's also possible to run the commands directly as their respective users without modification and in that case the sudo commands can be stripped off.

Concepts

The following concepts are defined as they are relevant to pgBackRest, PostgreSQL, and this user guide.

Backup

A backup is a consistent copy of a database cluster that can be restored to recover from a hardware failure, to perform Point-In-Time Recovery, or to bring up a new standby.

Full Backup: pgBackRest copies the entire contents of the database cluster to the backup. The first backup of the database cluster is always a Full Backup. pgBackRest is always able to restore a full backup directly. The full backup does not depend on any files outside of the full backup for consistency.

Differential Backup: pgBackRest copies only those database cluster files that have changed since the last full backup. pgBackRest restores a differential backup by copying all of the files in the chosen differential backup and the appropriate unchanged files from the previous full backup. The advantage of a differential backup is that it requires less disk space than a full backup, however, the differential backup and the full backup must both be valid to restore the differential backup.

Incremental Backup: pgBackRest copies only those database cluster files that have changed since the last backup (which can be another incremental backup, a differential backup, or a full backup). Since an incremental backup only includes those files changed since the prior backup, they are generally much smaller than full or differential backups. As with the differential backup, the incremental backup depends on other backups to be valid in order to restore the incremental backup. Since the incremental backup includes only those files changed since the last backup, all prior incremental backups back to the prior differential, the prior differential backup, and the prior full backup must all be valid to perform a restore of the incremental backup. If no differential backup exists then all prior incremental backups back to the prior full backup, which must exist, and the full backup itself must be valid to restore the incremental backup.

Restore

A restore is the act of copying a backup to a system where it will be started as a live database cluster. A restore requires the backup files and one or more WAL segments in order to work correctly.

Write Ahead Log (WAL)

WAL is the mechanism that PostgreSQL uses to ensure that no committed changes are lost. Transactions are written sequentially to the WAL and a transaction is considered to be committed when those writes are flushed to disk. Afterwards, a background process writes the changes into the main database cluster files (also known as the heap). In the event of a crash, the WAL is replayed to make the database consistent.

WAL is conceptually infinite but in practice is broken up into individual 16MB files called segments. WAL segments follow the naming convention 0000000100000A1E000000FE where the first 8 hexadecimal digits represent the timeline and the next 16 digits are the logical sequence number (LSN).

Encryption

Encryption is the process of converting data into a format that is unrecognizable unless the appropriate password (also referred to as passphrase) is provided.

pgBackRest will encrypt the repository based on a user-provided password, thereby preventing unauthorized access to data stored within the repository.

Upgrading {[project]}
Upgrading {[project]} from v1 to v2

Upgrading from v1 to v2 is fairly straightforward. The repository format has not changed and all non-deprecated options from v1 are accepted, so for most installations it is simply a matter of installing the new version.

However, there are a few caveats:

- The deprecated thread-max option is no longer valid. Use process-max instead.
- The deprecated archive-max-mb option is no longer valid. This has been replaced with the archive-push-queue-max option which has different semantics.
- The default for the backup-user option has changed from backrest to pgbackrest.
- In v2.02 the default location of the configuration file has changed from /etc/pgbackrest.conf to /etc/pgbackrest/pgbackrest.conf. If /etc/pgbackrest/pgbackrest.conf does not exist, the /etc/pgbackrest.conf file will be loaded instead, if it exists.

Many option names have changed to improve consistency although the old names from v1 are still accepted. In general, db-* options have been renamed to pg-* and backup-*/retention-* options have been renamed to repo-* when appropriate.

PostgreSQL and repository options must be indexed when using the new names introduced in v2, e.g. pg1-host, pg1-path, repo1-path, repo1-type, etc.
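As a minimal sketch (values follow the variables used in this guide and are illustrative only), a v2-style configuration using the renamed, indexed options might look like:

[demo]
pg1-path={[pg-path]}

[global]
repo1-path={[backrest-repo-path]}
repo1-retention-full=2

In v1 the equivalent options would have been db-path, repo-path, and retention-full.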

Build

{[user-guide-os]} packages for pgBackRest are available at apt.postgresql.org. If they are not provided for your distribution/version it is easy to download the source and install manually.

{[user-guide-os]} packages for pgBackRest are available from Crunchy Data or yum.postgresql.org, but it is also easy to download the source and install manually.

When building from source it is best to use a build host rather than building on production. Many of the tools required for the build should generally not be installed in production. pgBackRest consists of a single executable so it is easy to copy to a new host once it is built.

Download version <id>{[version]}</id> of <backrest/> to <path>{[build-path]}</path> path mkdir -p {[build-path]} wget -q -O - {[github-url-release]}/{[version]}.tar.gz | tar zx -C {[build-path]} mkdir -p {[build-br-path]} cp -r {[pgbackrest-repo-path]}/src {[build-br-path]} chown -R {[host-build-user]} {[build-br-path]} Install build dependencies apt-get update apt-get install make gcc libpq-dev libssl-dev libxml2-dev pkg-config liblz4-dev libzstd-dev libbz2-dev libz-dev libyaml-dev -y 2>&1 yum install make gcc postgresql{[pg-version-nodot]}-devel openssl-devel libxml2-devel lz4-devel libzstd-devel bzip2-devel libyaml-devel -y 2>&1 Configure and compile <backrest/> cd {[build-br-path]}/src && ./configure && make -j 4
Installation

A new host named {[host-pg1]} is created to contain the demo cluster and run examples.

{[host-pg1]} postgres postgres

pgBackRest should now be properly installed but it is best to check. If any dependencies were missed then you will get an error when running pgBackRest from the command line.

Make sure the installation worked {[project-exe]}
Quick Start

The Quick Start section will cover basic configuration of pgBackRest and PostgreSQL and introduce the backup, restore, and info commands.

Setup Demo Cluster

Creating the demo cluster is optional but is strongly recommended, especially for new users, since the example commands in the user guide reference the demo cluster; the examples assume the demo cluster is running on the default port (i.e. 5432). The cluster will not be started until a later section because there is still some configuration to do.

Create the demo cluster {[pg-bin-path]}/initdb -D {[pg-path]} -k -A peer {[pg-cluster-create]} cat /root/postgresql.common.conf >> {[postgres-config-demo]}

By default {[user-guide-os]} includes the day of the week in the log filename. This makes the user guide a bit more complicated so the log_filename is set to a constant.

Set <pg-option>log_filename</pg-option> 'postgresql.log'
Configure Cluster Stanza

The name 'demo' describes the purpose of this cluster accurately so it will also make a good stanza name.

pgBackRest needs to know where the base data directory for the PostgreSQL cluster is located. The path can be requested from PostgreSQL directly but in a recovery scenario the PostgreSQL process will not be available. During backups the value supplied to pgBackRest will be compared against the path that PostgreSQL is running on and they must be equal or the backup will return an error. Make sure that pg-path is exactly equal to data_directory in postgresql.conf.

By default {[user-guide-os]} stores clusters in {[pg-path-default]} so it is easy to determine the correct path for the data directory.

When creating the {[backrest-config-demo]} file, the database owner (usually postgres) must be granted read privileges.

Configure the <postgres/> cluster data directory {[pg-path]} off n
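A minimal sketch of the resulting stanza section in {[backrest-config-demo]} (the path follows the variables used in this guide):

[demo]
pg1-path={[pg-path]}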

pgBackRest configuration files follow the Windows INI convention. Sections are denoted by text in brackets and key/value pairs are contained in each section. Lines beginning with # are ignored and can be used as comments.

There are multiple ways the configuration files can be loaded:

- config and config-include-path are default: the default config file will be loaded, if it exists, and *.conf files in the default config include path will be appended, if they exist.
- config option is specified: only the specified config file will be loaded and is expected to exist.
- config-include-path is specified: *.conf files in the config include path will be loaded and the path is required to exist. The default config file will be loaded if it exists. If it is desirable to load only the files in the specified config include path, then the --no-config option can also be passed.
- config and config-include-path are specified: using the user-specified values, the config file will be loaded and *.conf files in the config include path will be appended. The files are expected to exist.
- config-path is specified: this setting will override the base path for the default location of the config file and/or the base path of the default config-include-path setting unless the config and/or config-include-path option is explicitly set.

The files are concatenated as if they were one big file; order doesn't matter, but there is precedence based on sections. The precedence (highest to lowest) is:

1. [stanza:command]
2. [stanza]
3. [global:command]
4. [global]

--config, --config-include-path and --config-path are command-line only options.
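For illustration only, a file combining these section types might look like this (the values mirror options configured elsewhere in this guide):

[global]
repo1-path={[backrest-repo-path]}

[global:archive-push]
compress-level=3

[demo]
pg1-path={[pg-path]}

[demo:backup]
start-fast=y

Here start-fast=y in [demo:backup] would take precedence over a start-fast setting in [global] or [global:backup] when the backup command runs for the demo stanza.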

pgBackRest can also be configured using environment variables as described in the command reference.

Configure <br-option>log-path</br-option> using the environment bash -c ' export PGBACKREST_LOG_PATH=/path/set/by/env && {[project-exe]} --log-level-console=error help backup log-path' current\: \/path\/set\/by\/env
Create the Repository

For this demonstration the repository will be stored on the same host as the PostgreSQL server. This is the simplest configuration and is useful in cases where traditional backup software is employed to backup the database host.

{[host-pg1]} postgres postgres

The repository path must be configured so pgBackRest knows where to find it.

Configure the <backrest/> repository path {[backrest-repo-path]}

Multiple repositories may also be configured. See Multiple Repositories for details.

Azure-Compatible Object Store Support 1 {[host-pg1]} postgres postgres:postgres y
GCS-Compatible Object Store Support 1 {[host-pg1]} postgres postgres:postgres
S3-Compatible Object Store Support 1 {[host-pg1]} postgres postgres:postgres y
Configure Archiving

Backing up a running PostgreSQL cluster requires WAL archiving to be enabled. Note that at least one WAL segment will be created during the backup process even if no explicit writes are made to the cluster.

Configure archive settings '{[project-exe]} {[dash]}-stanza={[postgres-cluster-demo]} archive-push %p' on {[wal-level]} 3

%p is how PostgreSQL specifies the location of the WAL segment to be archived. Setting wal_level to at least {[wal-level]} and increasing max_wal_senders is a good idea even if there are currently no replicas as this will allow them to be added later without restarting the primary cluster.
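The resulting settings are roughly equivalent to the following postgresql.conf entries, shown here only as a sketch (the guide configures them automatically):

archive_mode = on
archive_command = '{[project-exe]} --stanza={[postgres-cluster-demo]} archive-push %p'
wal_level = {[wal-level]}
max_wal_senders = 3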

The cluster must be restarted after making these changes and before performing a backup.

Restart the {[postgres-cluster-demo]} cluster {[pg-cluster-restart]} {[pg-cluster-wait]} psql -c " create or replace function create_test_table(prefix int, scale int, data bool) returns void as \$\$ declare index int; begin for index in 1 .. scale loop execute 'create table test_' || prefix || '_' || index || ' (id int)'; if data then execute 'insert into test_' || prefix || '_' || index || ' values (' || (prefix * index) || ')'; end if; end loop; end \$\$ LANGUAGE plpgsql;"

If archiving a WAL segment is expected to take more than 60 seconds (the default) to reach the repository, then the archive-timeout option should be increased. Note that this option is not the same as the PostgreSQL archive_timeout option, which is used to force a WAL segment switch and is useful for databases where there are long periods of inactivity. For more information on the archive_timeout option, see Write Ahead Log.

The archive-push command can be configured with its own options. For example, a lower compression level may be set to speed archiving without affecting the compression used for backups.

Config <cmd>archive-push</cmd> to use a lower compression level 3

This configuration technique can be used for any command and can even target a specific stanza, e.g. demo:archive-push.
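For illustration, a stanza-specific override in the configuration file might look like:

[demo:archive-push]
compress-level=3

This section would apply only to the archive-push command for the demo stanza, while other stanzas and commands continue to use the global settings.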

Configure Retention

pgBackRest expires backups based on retention options.

Configure retention to 2 full backups 2

More information about retention can be found in the Retention section.

Configure Repository Encryption

The repository will be configured with a cipher type and key to demonstrate encryption. Encryption is always performed client-side even if the repository type (e.g. S3 or other object store) supports encryption.

It is important to use a long, random passphrase for the cipher key. A good way to generate one is to run: openssl rand -base64 48.

Configure <backrest/> repository encryption {[backrest-repo-cipher-type]} {[backrest-repo-cipher-pass]}
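A minimal sketch of the encryption settings in {[backrest-config-demo]}, assuming repository 1 and a passphrase generated as described above:

[global]
repo1-cipher-type={[backrest-repo-cipher-type]}
repo1-cipher-pass={[backrest-repo-cipher-pass]}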

Once the repository has been configured and the stanza created and checked, the repository encryption settings cannot be changed.

Create the Stanza

The stanza-create command must be run to initialize the stanza. It is recommended that the check command be run after stanza-create to ensure archiving and backups are properly configured.

Create the stanza and check the configuration {[project-exe]} {[dash]}-stanza={[postgres-cluster-demo]} {[dash]}-log-level-console=info stanza-create completed successfully
Check the Configuration Check the configuration {[project-exe]} {[dash]}-stanza={[postgres-cluster-demo]} {[dash]}-log-level-console=info check successfully archived to Example of an invalid configuration {[project-exe]} {[dash]}-stanza={[postgres-cluster-demo]} --archive-timeout=.1 check could not find WAL segment|did not reach the archive
Perform a Backup

By default pgBackRest will wait for the next regularly scheduled checkpoint before starting a backup. Depending on the checkpoint_timeout and checkpoint_segments settings in PostgreSQL it may be quite some time before a checkpoint completes and the backup can begin. Generally, it is best to set start-fast=y so that the backup starts immediately. This forces a checkpoint, but since backups are usually run once a day an additional checkpoint should not have a noticeable impact on performance. However, on very busy clusters it may be best to pass {[dash]}-start-fast on the command-line as needed.

Configure backup fast start y

To perform a backup of the PostgreSQL cluster run pgBackRest with the backup command.

Backup the {[postgres-cluster-demo]} cluster {[project-exe]} {[dash]}-stanza={[postgres-cluster-demo]} --log-level-console=info backup no prior backup exists|full backup size {[cmd-backup-last]}

By default pgBackRest will attempt to perform an incremental backup. However, an incremental backup must be based on a full backup and since no full backup existed pgBackRest ran a full backup instead.

The type option can be used to specify a full or differential backup.

Differential backup of the {[postgres-cluster-demo]} cluster {[project-exe]} {[dash]}-stanza={[postgres-cluster-demo]} {[dash]}-type=diff --log-level-console=info backup diff backup size

This time there was no warning because a full backup already existed. While incremental backups can be based on a full or differential backup, differential backups must be based on a full backup. A full backup can be performed by running the backup command with {[dash]}-type=full.

During an online backup pgBackRest waits for WAL segments that are required for backup consistency to be archived. This wait time is governed by the archive-timeout option which defaults to 60 seconds. If archiving an individual segment is known to take longer, then this option should be increased.

Schedule a Backup

Backups can be scheduled with utilities such as cron.

In the following example, two cron jobs are configured to run; full backups are scheduled for 6:30 AM every Sunday with differential backups scheduled for 6:30 AM Monday through Saturday. If this crontab is installed for the first time mid-week, then pgBackRest will run a full backup the first time the differential job is executed, followed the next day by a differential backup.

#m h dom mon dow command
30 06 * * 0 pgbackrest --type=full --stanza=demo backup
30 06 * * 1-6 pgbackrest --type=diff --stanza=demo backup

Once backups are scheduled it's important to configure retention so backups are expired on a regular schedule, see Retention.

Backup Information

Use the info command to get information about backups.

Get info for the {[postgres-cluster-demo]} cluster {[project-exe]} info (full|incr|diff) backup
Restore a Backup

Backups can protect you from a number of disaster scenarios, the most common of which are hardware failure and data corruption. The easiest way to simulate data corruption is to remove an important cluster file.

Stop the {[postgres-cluster-demo]} cluster and delete the <file>pg_control</file> file {[pg-cluster-stop]} rm {[pg-path]}/global/pg_control

Starting the cluster without this important file will result in an error.

Attempt to start the corrupted {[postgres-cluster-demo]} cluster {[pg-cluster-start]} could not find the database system {[pg-cluster-start]} {[pg-cluster-check]} Failed to start PostgreSQL

To restore a backup of the PostgreSQL cluster run pgBackRest with the restore command. The cluster needs to be stopped (in this case it is already stopped) and all files must be removed from the data directory.

Remove old files from {[postgres-cluster-demo]} cluster find {[pg-path]} -mindepth 1 -delete Restore the {[postgres-cluster-demo]} cluster and start <postgres/> {[project-exe]} {[dash]}-stanza={[postgres-cluster-demo]} restore {[pg-cluster-start]} {[pg-cluster-wait]}

This time the cluster started successfully since the restore replaced the missing pg_control file.

More information about the restore command can be found in the Restore section.

Monitoring

Monitoring is an important part of any production system. There are many tools available and pgBackRest can be monitored on any of them with a little work.

pgBackRest can output information about the repository in JSON format, which includes a list of all backups for each stanza and WAL archive info.

In <postgres/>

The PostgreSQL COPY command allows pgBackRest info to be loaded into a table. The following example wraps that logic in a function that can be used to perform real-time queries.

Load <backrest/> info function for <postgres/> mkdir -p {[pg-home-path]}/pgbackrest/doc/example cp -r {[pgbackrest-repo-path]}/doc/example/* {[pg-home-path]}/pgbackrest/doc/example cat {[pg-home-path]}/pgbackrest/doc/example/pgsql-pgbackrest-info.sql psql -f {[pg-home-path]}/pgbackrest/doc/example/pgsql-pgbackrest-info.sql

Now the monitor.pgbackrest_info() function can be used to determine the last successful backup time and archived WAL for a stanza.

Query last successful backup time and archived WAL cat {[pg-home-path]}/pgbackrest/doc/example/pgsql-pgbackrest-query.sql psql -f {[pg-home-path]}/pgbackrest/doc/example/pgsql-pgbackrest-query.sql
Using <proper>jq</proper>

jq is a command-line utility that can easily extract data from JSON.

Install <proper>jq</proper> utility apt-get install jq -y 2>&1

Now jq can be used to query the last successful backup time for a stanza.

Query last successful backup time pgbackrest --output=json --stanza=demo info | jq '.[0] | .backup[-1] | .timestamp.stop'

Or the last archived WAL.

Query last archived WAL pgbackrest --output=json --stanza=demo info | jq '.[0] | .archive[-1] | .max' This syntax requires jq v1.5. jq may round large numbers such as system identifiers. Test your queries carefully.
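Other fields can be extracted the same way. As a sketch (the field name assumes the layout of the info command's JSON output), the label of the most recent backup could be queried with:

pgbackrest --output=json --stanza=demo info | jq -r '.[0] | .backup[-1] | .label'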
Backup
File Bundling

Bundling files together in the repository saves time during the backup and some space in the repository. This is especially pronounced when the repository is stored on an object store such as S3. Per-file creation time on object stores is higher and very small files might cost as much to store as larger files.

The file bundling feature is enabled with the repo-bundle option.

Configure <br-option>repo1-bundle</br-option> y

A full backup without file bundling will have 1000+ files in the backup path, but with bundling the total number of files is greatly reduced. An additional benefit is that zero-length files are not stored (except in the manifest), whereas in a normal backup each zero-length file is stored individually.

Perform a full backup {[project-exe]} {[dash]}-stanza={[postgres-cluster-demo]} --type=full backup Check file total find {[backrest-repo-path]}/backup/demo/latest/ -type f | wc -l

The repo-bundle-size and repo-bundle-limit options can be used for tuning, though the defaults should be optimal in most cases.
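For illustration only (the values here are hypothetical, not recommendations), the bundling options could be set in the configuration file like this:

[global]
repo1-bundle=y
repo1-bundle-size=100MiB
repo1-bundle-limit=2MiB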

While file bundling is generally more efficient, the downside is that it is more difficult to manually retrieve files from the repository. It may not be ideal for deduplicated storage since each full backup will arrange files in the bundles differently. Lastly, file bundles cannot be resumed, so be careful not to set repo-bundle-size too high.

Backup Annotations

Users can attach informative key/value pairs to the backup with the annotation option. The option may be used multiple times to attach multiple annotations.

Perform a full backup with annotations {[project-exe]} {[dash]}-stanza={[postgres-cluster-demo]} {[dash]}-annotation=source="demo backup" {[dash]}-annotation=key=value {[dash]}-type=full backup

Annotations are output by the info command text output when a backup is specified with --set and always appear in the JSON output.

Get info for the {[postgres-cluster-demo]} cluster {[cmd-backup-last]} {[project-exe]} {[dash]}-stanza={[postgres-cluster-demo]} {[dash]}-set={[backup-annotate-last]} info annotation

Annotations included with the backup command can be added, modified, or removed afterwards using the annotate command.

Change backup annotations {[project-exe]} {[dash]}-stanza={[postgres-cluster-demo]} {[dash]}-set={[backup-annotate-last]} {[dash]}-annotation=key= {[dash]}-annotation=new_key=new_value annotate {[project-exe]} {[dash]}-stanza={[postgres-cluster-demo]} {[dash]}-set={[backup-annotate-last]} info annotation
Retention

Generally it is best to retain as many backups as possible to provide a greater window for Point-in-Time Recovery, but practical concerns such as disk space must also be considered. Retention options remove older backups once they are no longer needed.

Full Backup Retention

The repo1-retention-full-type determines how the option repo1-retention-full is interpreted; either as the count of full backups to be retained or how many days to retain full backups. New backups must be completed before expiration will occur &mdash; that means if repo1-retention-full-type=count and repo1-retention-full=2 then there will be three full backups stored before the oldest one is expired, or if repo1-retention-full-type=time and repo1-retention-full=20 then there must be one full backup that is at least 20 days old before expiration can occur.
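As a sketch, the time-based variant described above would be configured like this, whereas the guide itself uses count-based retention below:

[global]
repo1-retention-full-type=time
repo1-retention-full=20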

Configure <br-option>repo1-retention-full</br-option> 2

repo1-retention-full is set to 2 but currently there is only one full backup, so the next full backup to run will not expire any full backups.

Perform a full backup {[project-exe]} {[dash]}-stanza={[postgres-cluster-demo]} --type=full --log-level-console=detail backup archive retention on backup {[backup-full-first]}|remove archive {[cmd-backup-last]}

Archive is expired because WAL segments were generated before the oldest backup. These are not useful for recovery &mdash; only WAL segments generated after a backup can be used to recover that backup.

Perform a full backup {[project-exe]} {[dash]}-stanza={[postgres-cluster-demo]} --type=full --log-level-console=info backup expire full backup set {[backup-full-first]}|archive retention on backup {[backup-full-second]}|remove archive

The {[backup-full-first]} full backup is expired and archive retention is based on the {[backup-full-second]} which is now the oldest full backup.

Differential Backup Retention

Set repo1-retention-diff to the number of differential backups required. Differentials only rely on the prior full backup so it is possible to create a rolling set of differentials for the last day or more. This allows quick restores to recent points-in-time but reduces overall space consumption.

Configure <br-option>repo1-retention-diff</br-option> 1

repo1-retention-diff is set to 1 so two differential backups will need to be performed before one is expired. An incremental backup is added to demonstrate incremental expiration. Incremental backups cannot be expired independently &mdash; they are always expired with their related full or differential backup.

Perform differential and incremental backups {[project-exe]} {[dash]}-stanza={[postgres-cluster-demo]} --type=diff backup {[cmd-backup-last]} {[project-exe]} {[dash]}-stanza={[postgres-cluster-demo]} --type=incr backup

Now performing a differential backup will expire the previous differential and incremental backups leaving only one differential backup.

Perform a differential backup {[project-exe]} {[dash]}-stanza={[postgres-cluster-demo]} --type=diff --log-level-console=info backup expire diff backup set {[backup-diff-second]}
Archive Retention

Although pgBackRest automatically removes archived WAL segments when expiring backups (the default expires WAL for full backups based on the repo1-retention-full option), it may be useful to expire archive more aggressively to save disk space. Note that full backups are treated as differential backups for the purpose of differential archive retention.

Expiring archive will never remove WAL segments that are required to make a backup consistent. However, since Point-in-Time-Recovery (PITR) only works on a continuous WAL stream, care should be taken when aggressively expiring archive outside of the normal backup expiration process. To determine what will be expired without actually expiring anything, the dry-run option can be provided on the command line with the expire command.
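For example, a dry run of the aggressive archive expiration performed below might look like this (a sketch; nothing is actually removed):

{[project-exe]} {[dash]}-stanza={[postgres-cluster-demo]} --repo1-retention-archive-type=diff --repo1-retention-archive=1 --dry-run expire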

Configure <br-option>repo1-retention-diff</br-option> 2 Perform differential backup {[cmd-backup-last]} psql -c " select pg_create_restore_point('generate WAL'); select {[pg-switch-wal]}(); select pg_create_restore_point('generate WAL'); select {[pg-switch-wal]}();" {[project-exe]} {[dash]}-stanza={[postgres-cluster-demo]} --type=diff --log-level-console=info backup new backup label {[cmd-backup-last]} Expire archive {[project-exe]} {[dash]}-stanza={[postgres-cluster-demo]} --log-level-console=detail --repo1-retention-archive-type=diff --repo1-retention-archive=1 expire archive retention on backup {[backup-diff-first]}|remove archive

The {[backup-diff-first]} differential backup has archived WAL segments that must be retained to make the older backups consistent even though they cannot be played any further forward with PITR. WAL segments generated after {[backup-diff-first]} but before {[backup-diff-second]} are removed. WAL segments generated after the new backup {[backup-diff-second]} remain and can be used for PITR.

Since full backups are considered differential backups for the purpose of differential archive retention, if a full backup is now performed with the same settings, only the archive for that full backup is retained for PITR.

Restore

The following sections introduce additional restore command features.

File Ownership

If a restore is run as a non-root user (the typical scenario) then all files restored will belong to the user/group executing pgBackRest. If existing files are not owned by the executing user/group then an error will result if the ownership cannot be updated to the executing user/group. In that case the file ownership will need to be updated by a privileged user before the restore can be retried.

If a restore is run as the root user then pgBackRest will attempt to recreate the ownership recorded in the manifest when the backup was made. Only user/group names are stored in the manifest so the same names must exist on the restore host for this to work. If the user/group name cannot be found locally then the user/group of the data directory will be used and finally root if the data directory user/group cannot be mapped to a name.

Delta Option

Restore a Backup in Quick Start required the database cluster directory to be cleaned before the restore could be performed. The delta option allows pgBackRest to automatically determine which files in the database cluster directory can be preserved and which ones need to be restored from the backup &mdash; it also removes files not present in the backup manifest so it will dispose of divergent changes. This is accomplished by calculating a SHA-1 cryptographic hash for each file in the database cluster directory. If the SHA-1 hash does not match the hash stored in the backup then that file will be restored. This operation is very efficient when combined with the process-max option. Since the PostgreSQL server is shut down during the restore, a larger number of processes can be used than might be desirable during a backup when the server is running.

Stop the {[postgres-cluster-demo]} cluster, perform delta restore {[pg-cluster-stop]} {[project-exe]} {[dash]}-stanza={[postgres-cluster-demo]} {[dash]}-delta --log-level-console=detail restore demo\/PG_VERSION - exists and matches backup|remove invalid files|rename global\/pg_control Restart <postgres/> {[pg-cluster-start]} {[pg-cluster-wait]}
Restore Selected Databases

There may be cases where it is desirable to selectively restore specific databases from a cluster backup. This could be done for performance reasons or to move selected databases to a machine that does not have enough space to restore the entire cluster backup.

To demonstrate this feature two databases are created: test1 and test2.

Create two test databases psql -c "create database test1;" psql -c "create database test2;"

Each test database will be seeded with tables and data to demonstrate that recovery works with selective restore.

Create a test table in each database psql -c "create table test1_table (id int); insert into test1_table (id) values (1);" test1 psql -c "create table test2_table (id int); insert into test2_table (id) values (2);" test2

A fresh backup is run so pgBackRest is aware of the new databases.

Perform a backup {[project-exe]} {[dash]}-stanza={[postgres-cluster-demo]} --type=incr backup

One of the main reasons to use selective restore is to save space. The size of the test1 database is shown here so it can be compared with the disk utilization after a selective restore.

Show space used by test1 database psql -Atc "select oid from pg_database where datname = 'test1'" du -sh {[pg-path]}/base/{[database-test1-oid]}

If the database to restore is not known, use the info command set option to discover databases that are part of the backup set.

Show database list for backup {[cmd-backup-last]} {[project-exe]} {[dash]}-stanza={[postgres-cluster-demo]} {[dash]}-set={[backup-last-incr]} info database list

Stop the cluster and restore only the test2 database. Built-in databases (template0, template1, and postgres) are always restored.

Recovery may error unless --type=immediate is specified. This is because after consistency is reached PostgreSQL will flag zeroed pages as errors even for a full-page write. For PostgreSQL &ge; 13 the ignore_invalid_pages setting may be used to ignore invalid pages. In this case it is important to check the logs after recovery to ensure that no invalid pages were reported in the selected databases. Restore from last backup including only the test2 database {[pg-cluster-stop]} {[project-exe]} {[dash]}-stanza={[postgres-cluster-demo]} {[dash]}-delta {[dash]}-db-include=test2 {[dash]}-type=immediate {[dash]}-target-action=promote restore {[pg-cluster-start]} {[pg-cluster-wait]}

Once recovery is complete the test2 database will contain all previously created tables and data.

Demonstrate that the test2 database was recovered psql -c "select * from test2_table;" test2

The test1 database, despite successful recovery, is not accessible. This is because the entire database was restored as sparse, zeroed files. PostgreSQL can successfully apply WAL on the zeroed files but the database as a whole will not be valid because key files contain no data. This is purposeful to prevent the database from being accidentally used when it might contain partial data that was applied during WAL replay.

Attempting to connect to the test1 database will produce an error psql -c "select * from test1_table;" test1 relation mapping file.*contains invalid data

Since the test1 database is restored with sparse, zeroed files it will only require as much space as the amount of WAL that is written during recovery. While the amount of WAL generated during a backup and applied during recovery can be significant it will generally be a small fraction of the total database size, especially for large databases where this feature is most likely to be useful.

It is clear that the test1 database uses far less disk space during the selective restore than it would have if the entire database had been restored.

Show space used by test1 database after recovery du -sh {[pg-path]}/base/{[database-test1-oid]}

At this point the only action that can be taken on the invalid test1 database is drop database. pgBackRest does not automatically drop the database since this cannot be done until recovery is complete and the cluster is accessible.

Drop the test1 database psql -c "drop database test1;"

Now that the invalid test1 database has been dropped only the test2 and built-in databases remain.

List remaining databases psql -c "select oid, datname from pg_database order by oid;" test2
Point-in-Time Recovery

Restore a Backup in Quick Start performed default recovery, which is to play all the way to the end of the WAL stream. In the case of a hardware failure this is usually the best choice but for data corruption scenarios (whether machine or human in origin) Point-in-Time Recovery (PITR) is often more appropriate.

Point-in-Time Recovery (PITR) allows the WAL to be played from the last backup to a specified lsn, time, transaction id, or recovery point. For common recovery scenarios time-based recovery is arguably the most useful. A typical recovery scenario is to restore a table that was accidentally dropped or data that was accidentally deleted. Recovering a dropped table is more dramatic so that's the example given here but deleted data would be recovered in exactly the same way.
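The examples below use time-based recovery. For reference, the other target types look like this (a sketch only; the target values are hypothetical):

{[project-exe]} {[dash]}-stanza={[postgres-cluster-demo]} {[dash]}-delta {[dash]}-type=lsn "{[dash]}-target=0/1C000028" {[dash]}-target-action=promote restore
{[project-exe]} {[dash]}-stanza={[postgres-cluster-demo]} {[dash]}-delta {[dash]}-type=xid "{[dash]}-target=1000" {[dash]}-target-action=promote restore
{[project-exe]} {[dash]}-stanza={[postgres-cluster-demo]} {[dash]}-delta {[dash]}-type=name "{[dash]}-target=my_restore_point" {[dash]}-target-action=promote restore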

Backup the {[postgres-cluster-demo]} cluster and create a table with very important data {[project-exe]} {[dash]}-stanza={[postgres-cluster-demo]} --type=diff backup psql -c "begin; create table important_table (message text); insert into important_table values ('{[test-table-data]}'); commit; select * from important_table;" {[test-table-data]}

It is important to represent the time as reckoned by PostgreSQL and to include timezone offsets. This reduces the possibility of unintended timezone conversions and an unexpected recovery result.

Get the time from <postgres/> sleep 1 psql -Atc "select current_timestamp" sleep 1

Now that the time has been recorded the table is dropped. In practice finding the exact time that the table was dropped is a lot harder than in this example. It may not be possible to find the exact time, but some forensic work should be able to get you close.

Drop the important table psql -c "begin; drop table important_table; commit; select * from important_table;" does not exist

Now the restore can be performed with time-based recovery to bring back the missing table.

Stop <postgres/>, restore the {[postgres-cluster-demo]} cluster to <id>{[time-recovery-timestamp]}</id>, and display <file>{[pg-recovery-file-demo]}</file> {[pg-cluster-stop]} {[project-exe]} {[dash]}-stanza={[postgres-cluster-demo]} {[dash]}-delta {[dash]}-type=time "{[dash]}-target={[time-recovery-timestamp]}" --target-action=promote restore rm {[postgres-log-demo]} cat {[pg-recovery-path-demo]} recovery_target_time

pgBackRest has automatically generated the recovery settings in {[pg-recovery-file-demo]} so PostgreSQL can be started immediately. %f is how PostgreSQL specifies the WAL segment it needs and %p is the location where it should be copied. Once PostgreSQL has finished recovery the table will exist again and can be queried.

Start <postgres/> and check that the important table exists {[pg-cluster-start]} {[pg-cluster-wait]} psql -c "select * from important_table" {[test-table-data]}

The log also contains valuable information. It will indicate the time and transaction where the recovery stopped and also give the time of the last transaction to be applied.

Examine the <postgres/> log output cat {[postgres-log-demo]} recovery stopping before|last completed transaction|starting point-in-time recovery

This example was rigged to give the correct result. If a backup after the required time is chosen then <backrest/> will not be able to recover the lost table. <postgres/> can only play forward, not backward. To demonstrate this the important table must be dropped (again).

Drop the important table (again) psql -c "begin; drop table important_table; commit; select * from important_table;" does not exist

Now take a new backup and attempt recovery from the new backup by specifying the {[dash]}-set option. The info command can be used to find the new backup label.

Perform a backup and get backup info {[project-exe]} {[dash]}-stanza={[postgres-cluster-demo]} {[dash]}-type=incr backup {[cmd-backup-last]} {[project-exe]} info {[backup-last]} Attempt recovery from the specified backup {[pg-cluster-stop]} {[project-exe]} {[dash]}-stanza={[postgres-cluster-demo]} {[dash]}-delta {[dash]}-set={[backup-last]} {[dash]}-type=time "{[dash]}-target={[time-recovery-timestamp]}" {[dash]}-target-action=promote restore rm {[postgres-log-demo]} {[pg-cluster-start]} {[pg-cluster-wait]} psql -c "select * from important_table" does not exist

Looking at the log output it's not obvious that recovery failed to restore the table. The key is to look for the presence of the recovery stopping before... and last completed transaction... log messages. If they are not present then the recovery to the specified point-in-time was not successful.

Examine the <postgres/> log output to discover the recovery was not successful cat {[postgres-log-demo]} starting point-in-time recovery|consistent recovery state reached

The default behavior for time-based restore, if the {[dash]}-set option is not specified, is to attempt to discover an earlier backup to play forward from. If a backup set cannot be found, then restore will default to the latest backup which, as shown earlier, may not give the desired result.

Stop <postgres/>, restore from auto-selected backup, and start <postgres/> {[pg-cluster-stop]} {[project-exe]} {[dash]}-stanza={[postgres-cluster-demo]} {[dash]}-delta {[dash]}-type=time "{[dash]}-target={[time-recovery-timestamp]}" {[dash]}-target-action=promote restore rm {[postgres-log-demo]} {[pg-cluster-start]} {[pg-cluster-wait]} psql -c "select * from important_table" {[test-table-data]}

Now the log output will contain the expected recovery stopping before... and last completed transaction... messages showing that the recovery was successful.

Examine the <postgres/> log output for log messages indicating success cat {[postgres-log-demo]} recovery stopping before|last completed transaction|starting point-in-time recovery
Multiple Repositories

Multiple repositories may be configured as demonstrated in S3 Support. A potential benefit is the ability to have a local repository for fast restores and a remote repository for redundancy.

Some commands, e.g. stanza-create/stanza-upgrade, will automatically work with all configured repositories while others, e.g. stanza-delete, will require a repository to be specified using the repo option. See the command reference for details on which commands require the repository to be specified.

Note that the repo option is not required when only repo1 is configured, in order to maintain backward compatibility. However, the repo option is required when a single repository is configured as anything other than repo1, e.g. repo2. This prevents commands from breaking if a new repository is added later.

The archive-push command will always push WAL to the archive in all configured repositories but backups will need to be scheduled individually for each repository. In many cases this is desirable since backup types and retention will vary by repository. Likewise, restores must specify a repository. It is generally better to specify a repository for restores that has low latency/cost even if that means more recovery time. Only restore testing can determine which repository will be most efficient.
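As an illustrative sketch, backups might be scheduled separately per repository while WAL is archived everywhere automatically, and a restore might explicitly pick the low-latency repository (stanza name and repo numbers are examples):

    # Backups are scheduled separately for each repository
    pgbackrest --stanza=demo --repo=1 --type=diff backup
    pgbackrest --stanza=demo --repo=2 --type=full backup

    # Restore from the repository with the lowest latency/cost
    pgbackrest --stanza=demo --repo=1 --delta restore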

Azure-Compatible Object Store Support 2 {[host-pg1]} postgres postgres:postgres y

Commands are run exactly as if the repository were stored on a local disk.

Create the stanza {[project-exe]} {[dash]}-stanza={[postgres-cluster-demo]} {[dash]}-log-level-console=info stanza-create completed successfully

File creation time in object stores is relatively slow so commands benefit by increasing process-max to parallelize file creation.

Backup the {[postgres-cluster-demo]} cluster {[project-exe]} {[dash]}-stanza={[postgres-cluster-demo]} --repo=2 --log-level-console=info backup no prior backup exists|full backup size
S3-Compatible Object Store Support 3 {[host-pg1]} postgres postgres:postgres y

A role should be created to run <backrest/> and the bucket permissions should be set as restrictively as possible. If the role is associated with an instance in AWS then <backrest/> will automatically retrieve temporary credentials when repo3-s3-key-type=auto, which means that keys do not need to be explicitly set in {[backrest-config-demo]}.
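A minimal sketch of the repository definition for this scenario in the <backrest/> configuration file (bucket, region, endpoint, and path values are illustrative; no keys are set because repo3-s3-key-type=auto retrieves temporary credentials from the instance role):

    [global]
    repo3-type=s3
    repo3-path=/demo-repo
    repo3-s3-bucket=demo-bucket
    repo3-s3-region=us-east-1
    repo3-s3-endpoint=s3.us-east-1.amazonaws.com
    repo3-s3-key-type=auto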

This sample Amazon S3 policy will restrict all reads and writes to the bucket and repository path.

{ "Version": "2012-10-17", "Statement": [ { "Effect": "Allow", "Action": [ "s3:ListBucket" ], "Resource": [ "arn:aws:s3:::{[s3-bucket]}" ], "Condition": { "StringEquals": { "s3:prefix": [ "", "{[s3-repo]}" ], "s3:delimiter": [ "/" ] } } }, { "Effect": "Allow", "Action": [ "s3:ListBucket" ], "Resource": [ "arn:aws:s3:::{[s3-bucket]}" ], "Condition": { "StringLike": { "s3:prefix": [ "{[s3-repo]}/*" ] } } }, { "Effect": "Allow", "Action": [ "s3:PutObject", "s3:GetObject", "s3:DeleteObject" ], "Resource": [ "arn:aws:s3:::{[s3-bucket]}/{[s3-repo]}/*" ] } ] }

Commands are run exactly as if the repository were stored on a local disk.

Create the stanza {[project-exe]} {[dash]}-stanza={[postgres-cluster-demo]} {[dash]}-log-level-console=info stanza-create completed successfully

File creation time in object stores is relatively slow so commands benefit by increasing process-max to parallelize file creation.

Backup the {[postgres-cluster-demo]} cluster {[project-exe]} {[dash]}-stanza={[postgres-cluster-demo]} --repo=3 --log-level-console=info backup no prior backup exists|full backup size
GCS-Compatible Object Store Support 4 {[host-pg1]} postgres postgres:postgres

Commands are run exactly as if the repository were stored on a local disk.

File creation time in object stores is relatively slow so commands benefit by increasing process-max to parallelize file creation.

Delete a Stanza Stop <postgres/> cluster to be removed {[pg-cluster-stop]} Stop <backrest/> for the stanza {[project-exe]} {[dash]}-stanza={[postgres-cluster-demo]} {[dash]}-log-level-console=info stop completed successfully Delete the stanza from one repository {[project-exe]} {[dash]}-stanza={[postgres-cluster-demo]} --repo=1 {[dash]}-log-level-console=info stanza-delete completed successfully {[pg-cluster-start]}
Dedicated Repository Host

The configuration described in Quickstart is suitable for simple installations but for enterprise configurations it is more typical to have a dedicated repository host where the backups and WAL archive files are stored. This separates the backups and WAL archive from the database server so database host failures have less impact. It is still a good idea to employ traditional backup software to backup the repository host.

On <postgres/> hosts, pg1-path is required to be the path of the local <postgres/> cluster and no pg1-host should be configured. When configuring a repository host, the pgbackrest configuration file must have the pg-host option configured to connect to the primary and standby (if any) hosts. The repository host has the only pgbackrest configuration that should be aware of more than one <postgres/> host. Order does not matter, e.g. pg1-path/pg1-host, pg2-path/pg2-host can be primary or standby.
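A minimal sketch of the repository host configuration, assuming the demo stanza and the illustrative host names and paths used in this guide:

    [demo]
    pg1-host=pg-primary
    pg1-host-user=postgres
    pg1-path=/var/lib/postgresql/14/demo

    [global]
    repo1-path=/var/lib/pgbackrest
    repo1-retention-full=2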

Installation

A new host named repository is created to store the cluster backups.

The <backrest/> version installed on the repository host must exactly match the version installed on the <postgres/> host.

The {[br-user]} user is created to own the repository. Any user can own the repository but it is best not to use postgres (if it exists) to avoid confusion.

Create <user>{[br-user]}</user> user adduser --disabled-password --gecos "" {[br-user]} groupadd {[br-group]} adduser -g{[br-group]} -n {[br-user]} {[host-repo1]} {[br-user]} {[br-group]} {[host-repo1]} {[br-user]} {[br-group]}
Setup Passwordless SSH Create <host>{[host-repo1]}</host> host key pair mkdir -m 750 {[br-home-path]}/.ssh ssh-keygen -f {[br-home-path]}/.ssh/id_rsa -t rsa -b 4096 -N "" {[host-pg1]} postgres {[pg-home-path]} ssh has been configured to only allow <backrest/> to be run via passwordless ssh. This enhances security in the event that one of the service accounts is hijacked.
Configuration

<backrest/> can use TLS with client certificates to enable communication between the hosts. It is also possible to use SSH, see Setup SSH.

<backrest/> expects client/server certificates to be generated in the same way as <postgres/>. See Secure TCP/IP Connections with TLS for detailed instructions on generating certificates.

Configure the <backrest/> repository path {[backrest-repo-path]}

The repository host must be configured with the {[host-pg1]} host/user and database path. The primary will be configured as pg1 to allow a standby to be added later.

Configure <br-option>pg1-host</br-option>/<br-option>pg1-host-user</br-option> and <br-option>pg1-path</br-option> {[pg-path]} {[host-pg1]} tls /etc/pgbackrest/cert/ca.crt /etc/pgbackrest/cert/client.crt /etc/pgbackrest/cert/client.key pgbackrest-client=demo * /etc/pgbackrest/cert/ca.crt /etc/pgbackrest/cert/server.crt /etc/pgbackrest/cert/server.key y 2 off n

The database host must be configured with the repository host/user. The default for the repo1-host-user option is pgbackrest. If the postgres user does restores on the repository host it is best not to also allow the postgres user to perform backups. However, the postgres user can read the repository directly if it is in the same group as the pgbackrest user.
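The corresponding sketch for the database host (again with illustrative names and paths) sets only the local pg1-path and reaches the repository through repo1-host:

    [demo]
    pg1-path=/var/lib/postgresql/14/demo

    [global]
    repo1-host=repository
    repo1-host-user=pgbackrest
    log-level-file=detail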

Configure <br-option>repo1-host</br-option>/<br-option>repo1-host-user</br-option> {[pg-path]} {[host-repo1]} tls /etc/pgbackrest/cert/ca.crt /etc/pgbackrest/cert/client.crt /etc/pgbackrest/cert/client.key pgbackrest-client=demo * /etc/pgbackrest/cert/ca.crt /etc/pgbackrest/cert/server.crt /etc/pgbackrest/cert/server.key detail off n

Details on archiving configuration may be found in the Configure Archiving section.

Commands are run the same as on a single host configuration except that some commands such as backup and expire are run from the repository host instead of the database host.

Configure Azure-compatible object store if required.

1 {[host-repo1]} {[br-user]} {[br-user]}:{[br-group]} n

Configure GCS-compatible object store if required.

1 {[host-repo1]} {[br-user]} {[br-user]}:{[br-group]}

Configure S3-compatible object store if required.

1 {[host-repo1]} {[br-user]} {[br-user]}:{[br-group]} n
Setup TLS Server

The TLS server must be configured and started on each host.

{[host-repo1]} {[br-user]} {[br-group]} {[host-pg1]} postgres postgres
Create and Check Stanza

Create the stanza in the new repository.

Create the stanza {[project-exe]} {[dash]}-stanza={[postgres-cluster-demo]} stanza-create

Check that the configuration is correct on both the database and repository hosts. More information about the check command can be found in Check the Configuration.

Check the configuration {[project-exe]} {[dash]}-stanza={[postgres-cluster-demo]} check Check the configuration {[project-exe]} {[dash]}-stanza={[postgres-cluster-demo]} check
Perform a Backup

To perform a backup of the <postgres/> cluster, run <backrest/> with the backup command on the repository host.

Backup the {[postgres-cluster-demo]} cluster {[project-exe]} {[dash]}-stanza={[postgres-cluster-demo]} backup

Since a new repository was created on the repository host the warning about the incremental backup changing to a full backup was emitted.

Restore a Backup

To perform a restore of the <postgres/> cluster, run <backrest/> with the restore command on the database host.

Stop the {[postgres-cluster-demo]} cluster, restore, and restart <postgres/> {[pg-cluster-stop]} {[project-exe]} {[dash]}-stanza={[postgres-cluster-demo]} {[dash]}-delta restore {[pg-cluster-start]} {[pg-cluster-wait]}
Parallel Backup / Restore

<backrest/> offers parallel processing to improve performance of compression and transfer. The number of processes to be used for this feature is set using the --process-max option.

It is usually best not to use more than 25% of the available CPUs for the backup command. Backups do not need to run quickly as long as they are performed regularly, and the backup process should impact database performance as little as possible.

The restore command can and should use all available CPUs because during a restore the cluster is shut down and there is generally no other important work being done on the host. If the host contains multiple clusters then that should be considered when setting restore parallelism.
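As an illustrative sketch, process-max can also be set per command so backups stay conservative while restores use the whole host (the values below assume roughly an 8-CPU machine and are examples only):

    [global:backup]
    process-max=2

    [global:restore]
    process-max=8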

Perform a backup with single process {[project-exe]} {[dash]}-stanza={[postgres-cluster-demo]} {[dash]}-type=full backup Configure <backrest/> to use multiple <cmd>backup</cmd> processes 3 Perform a backup with multiple processes {[project-exe]} {[dash]}-stanza={[postgres-cluster-demo]} {[dash]}-type=full backup Get backup info for the {[postgres-cluster-demo]} cluster {[project-exe]} info timestamp start/stop

The performance of the last backup should be improved by using multiple processes. For very small backups the difference may not be very apparent, but as the size of the database increases so will the time savings.

Starting and Stopping

Sometimes it is useful to prevent <backrest/> from running on a system. For example, when failing over from a primary to a standby it's best to prevent <backrest/> from running on the old primary in case <postgres/> gets restarted or can't be completely killed. This will also prevent <backrest/> from running via cron.

Stop the <backrest/> services {[project-exe]} stop

New <backrest/> processes will no longer run.

Attempt a backup {[project-exe]} {[dash]}-stanza={[postgres-cluster-demo]} backup \: stop file exists for all stanzas

Specify the --force option to terminate any <backrest/> processes that are currently running. If <backrest/> is already stopped then stopping again will generate a warning.

Stop the <backrest/> services again {[project-exe]} stop

Start <backrest/> processes again with the start command.

Start the <backrest/> services {[project-exe]} start

It is also possible to stop <backrest/> for a single stanza.

Stop <backrest/> services for the <id>demo</id> stanza {[project-exe]} {[dash]}-stanza={[postgres-cluster-demo]} stop

New <backrest/> processes for the specified stanza will no longer run.

Attempt a backup {[project-exe]} {[dash]}-stanza={[postgres-cluster-demo]} backup \: stop file exists for stanza demo

The stanza must also be specified when starting <backrest/> processes for a single stanza.

Start the <backrest/> services for the <id>demo</id> stanza {[project-exe]} {[dash]}-stanza={[postgres-cluster-demo]} start
Replication

Replication allows multiple copies of a cluster (called standbys) to be created from a single primary. The standbys are useful for balancing reads and to provide redundancy in case the primary host fails.

Installation

A new host named {[host-pg2]} is created to run the standby.

{[host-pg2]} postgres postgres
Setup Passwordless SSH {[host-pg2]} postgres {[pg-home-path]}
Hot Standby

A hot standby performs replication using the WAL archive and allows read-only queries.

The <backrest/> configuration on the standby is very similar to {[host-pg1]} except that the standby recovery type will be used to keep the cluster in recovery mode when the end of the WAL stream has been reached.

Configure <backrest/> on the standby {[pg-path]} {[host-repo1]} tls /etc/pgbackrest/cert/ca.crt /etc/pgbackrest/cert/client.crt /etc/pgbackrest/cert/client.key pgbackrest-client=demo * /etc/pgbackrest/cert/ca.crt /etc/pgbackrest/cert/server.crt /etc/pgbackrest/cert/server.key detail off n {[host-pg2]} postgres postgres

The demo cluster must be created (even though it will be overwritten on restore) in order to create the configuration files.

Create demo cluster {[pg-cluster-create]}

Create the path where <postgres/> will be restored.

Create <postgres/> path mkdir -p -m 700 {[pg-path]}

Now the standby can be created with the restore command.

If the cluster is intended to be promoted without becoming the new primary (e.g. for reporting or testing), use --archive-mode=off or set archive_mode=off in postgresql.conf to disable archiving. If archiving is not disabled then the repository may be polluted with WAL that can make restores more difficult. Restore the {[postgres-cluster-demo]} standby cluster {[project-exe]} {[dash]}-stanza={[postgres-cluster-demo]} {[dash]}-delta --type=standby restore {[project-exe]} {[dash]}-stanza={[postgres-cluster-demo]} --type=standby restore cat {[pg-recovery-path-demo]} cat /root/postgresql.common.conf >> {[postgres-config-demo]}

The hot_standby setting must be enabled before starting <postgres/> to allow read-only connections on {[host-pg2]}. Otherwise, connection attempts will be refused. The rest of the configuration is in case the standby is promoted to a primary.
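A minimal sketch of the relevant postgresql.conf settings on the standby (values mirror this guide's defaults; only hot_standby is strictly required for read-only connections):

    hot_standby = on
    wal_level = replica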

Configure <postgres/> on '{[project-exe]} {[dash]}-stanza={[postgres-cluster-demo]} archive-push %p' on {[wal-level]} 3 'postgresql.log' Start <postgres/> rm {[postgres-log-demo]} {[pg-cluster-start]} {[pg-cluster-wait]}

The <postgres/> log gives valuable information about the recovery. Note especially that the cluster has entered standby mode and is ready to accept read-only connections.

Examine the <postgres/> log output for log messages indicating success cat {[postgres-log-demo]} entering standby mode|database system is ready to accept read only connections

An easy way to test that replication is properly configured is to create a table on {[host-pg1]}.

Create a new table on the primary psql -c " begin; create table replicated_table (message text); insert into replicated_table values ('{[test-table-data]}'); commit; select * from replicated_table"; {[test-table-data]}

And then query the same table on {[host-pg2]}.

Query new table on the standby psql -c "select * from replicated_table;" does not exist

So, what went wrong? Since <postgres/> is pulling WAL segments from the archive to perform replication, changes won't be seen on the standby until the WAL segment that contains those changes is pushed from {[host-pg1]}.

This can be done manually by calling {[pg-switch-wal]}() which pushes the current WAL segment to the archive (a new WAL segment is created to contain further changes).

Call <code>{[pg-switch-wal]}()</code> psql -c "select *, current_timestamp from {[pg-switch-wal]}()";

Now after a short delay the table will appear on {[host-pg2]}.

Now the new table exists on the standby (may require a few retries) psql -c " select *, current_timestamp from replicated_table" {[test-table-data]}

Check the standby configuration for access to the repository.

Check the configuration {[project-exe]} {[dash]}-stanza={[postgres-cluster-demo]} {[dash]}-log-level-console=info check because this is a standby
Streaming Replication

Instead of relying solely on the WAL archive, streaming replication makes a direct connection to the primary and applies changes as soon as they are made on the primary. This results in much less lag between the primary and standby.

Streaming replication requires a user with the replication privilege.

Create replication user psql -c " create user replicator password 'jw8s0F4' replication";

The pg_hba.conf file must be updated to allow the standby to connect as the replication user. Be sure to replace the IP address below with the actual IP address of your {[host-pg2]}. A reload will be required after modifying the pg_hba.conf file.

Create <file>pg_hba.conf</file> entry for replication user sh -c 'echo "host replication replicator {[host-pg2-ip]}/32 md5" >> {[postgres-hba-demo]}' {[pg-cluster-reload]}

The standby needs to know how to contact the primary so the primary_conninfo setting will be configured in <backrest/>.

Set <pg-option>primary_conninfo</pg-option> primary_conninfo=host={[host-pg1-ip]} port=5432 user=replicator

It is possible to configure a password in the primary_conninfo setting but using a .pgpass file is more flexible and secure.

Configure the replication password in the <file>.pgpass</file> file. sh -c 'echo "{[host-pg1-ip]}:*:replication:replicator:jw8s0F4" >> {[postgres-pgpass]}' chmod 600 {[postgres-pgpass]}

Now the standby can be created with the restore command.

Stop <postgres/> and restore the {[postgres-cluster-demo]} standby cluster {[pg-cluster-stop]} {[project-exe]} {[dash]}-stanza={[postgres-cluster-demo]} {[dash]}-delta --type=standby restore cat {[pg-recovery-path-demo]} The primary_conninfo setting has been written into the {[pg-recovery-file-demo]} file because it was configured as a recovery-option in {[project-exe]}.conf. The {[dash]}-type=preserve option can be used with the restore to leave the existing {[pg-recovery-file-demo]} file in place if that behavior is preferred.

By default {[user-guide-os]} stores the postgresql.conf file in the data directory. That means the change made to postgresql.conf was overwritten by the last restore and the hot_standby setting must be enabled again. Other solutions to this problem are to store the postgresql.conf file elsewhere or to enable the hot_standby setting on the {[host-pg1]} host where it will be ignored.

Enable <pg-option>hot_standby</pg-option> on Start <postgres/> rm {[postgres-log-demo]} {[pg-cluster-start]} {[pg-cluster-wait]}

The <postgres/> log will confirm that streaming replication has started.

Examine the <postgres/> log output for log messages indicating success cat {[postgres-log-demo]} started streaming WAL from primary

Now when a table is created on {[host-pg1]} it will appear on {[host-pg2]} quickly and without the need to call {[pg-switch-wal]}().

Create a new table on the primary psql -c " begin; create table stream_table (message text); insert into stream_table values ('{[test-table-data]}'); commit; select *, current_timestamp from stream_table"; {[test-table-data]} Query table on the standby psql -c " select *, current_timestamp from stream_table" {[test-table-data]}
Asynchronous Archiving

Asynchronous archiving is enabled with the archive-async option. This option enables asynchronous operation for both the archive-push and archive-get commands.

A spool path is required. The commands will store transient data here but each command works quite a bit differently so spool path usage is described in detail in each section.

Create the spool directory mkdir -p -m 750 {[spool-path]} chown postgres:postgres {[spool-path]} Create the spool directory mkdir -p -m 750 {[spool-path]} chown postgres:postgres {[spool-path]}

The spool path must be configured and asynchronous archiving enabled. Asynchronous archiving automatically confers some benefit by reducing the number of connections made to remote storage, but setting process-max can drastically improve performance by parallelizing operations. Be sure not to set process-max so high that it affects normal database operations.

Configure the spool path and asynchronous archiving {[spool-path]} y 2 2 Configure the spool path and asynchronous archiving {[spool-path]} y 2 2 process-max is configured using command sections so that the option is not used by backup and restore. This also allows different values for archive-push and archive-get.
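A sketch of the resulting configuration with illustrative values, showing how the command sections keep process-max scoped to the archive commands only:

    [global]
    archive-async=y
    spool-path=/var/spool/pgbackrest

    [global:archive-push]
    process-max=2

    [global:archive-get]
    process-max=2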

For demonstration purposes streaming replication will be broken to force <postgres/> to get WAL using the restore_command.

Break streaming replication by changing the replication password psql -c "alter user replicator password 'bogus'" Restart standby to break connection {[pg-cluster-restart]}
Archive Push

The asynchronous archive-push command offloads WAL archiving to a separate process (or processes) to improve throughput. It works by looking ahead to see which WAL segments are ready to be archived beyond the request that <postgres/> is currently making via the archive_command. WAL segments are transferred to the archive directly from the pg_xlog/pg_wal directory and success is only returned by the archive_command when the WAL segment has been safely stored in the archive.
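The archive_command itself does not change when asynchronous archiving is enabled; it remains the standard form (demo stanza shown):

    archive_command = 'pgbackrest --stanza=demo archive-push %p'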

The spool path holds the current status of WAL archiving. Status files written into the spool directory are typically zero length and should consume a minimal amount of space (a few MB at most) and very little IO. All the information in this directory can be recreated so it is not necessary to preserve the spool directory if the cluster is moved to new hardware.

In the original implementation of asynchronous archiving, WAL segments were copied to the spool directory before compression and transfer. The new implementation copies WAL directly from the pg_xlog directory. If asynchronous archiving was utilized in v1.12 or prior, read the v1.13 release notes carefully before upgrading.

The [stanza]-archive-push-async.log file can be used to monitor the activity of the asynchronous process. A good way to test this is to quickly push a number of WAL segments.

Test parallel asynchronous archiving rm -f /var/log/pgbackrest/demo-archive-push-async.log psql -c " select pg_create_restore_point('test async push'); select {[pg-switch-wal]}(); select pg_create_restore_point('test async push'); select {[pg-switch-wal]}(); select pg_create_restore_point('test async push'); select {[pg-switch-wal]}(); select pg_create_restore_point('test async push'); select {[pg-switch-wal]}(); select pg_create_restore_point('test async push'); select {[pg-switch-wal]}();" {[project-exe]} {[dash]}-stanza={[postgres-cluster-demo]} {[dash]}-log-level-console=info check WAL segment

Now the log file will contain parallel, asynchronous activity.

Check results in the log cat /var/log/pgbackrest/demo-archive-push-async.log WAL file\(s\) to archive|pushed WAL file \'0000000
Archive Get

The asynchronous archive-get command maintains a local queue of WAL to improve throughput. If a WAL segment is not found in the queue it is fetched from the repository along with enough consecutive WAL to fill the queue. The maximum size of the queue is defined by archive-get-queue-max. Whenever the queue is less than half full more WAL will be fetched to fill it.

Asynchronous operation is most useful in environments that generate a lot of WAL or have a high latency connection to the repository storage (e.g., S3 or other object stores). In the case of a high latency connection it may be a good idea to increase process-max.
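An illustrative way to size the local WAL queue and archive-get parallelism for a high latency repository (the values are examples only):

    [global]
    archive-get-queue-max=1GiB

    [global:archive-get]
    process-max=4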

The [stanza]-archive-get-async.log file can be used to monitor the activity of the asynchronous process.

Check results in the log sleep 5 cat /var/log/pgbackrest/demo-archive-get-async.log found [0-F]{24} in the .* archive
Fix streaming replication by changing the replication password psql -c "alter user replicator password 'jw8s0F4'"
Backup from a Standby

<backrest/> can perform backups on a standby instead of the primary. Standby backups require the {[host-pg2]} host to be configured and the backup-standby option enabled. If more than one standby is configured then the first running standby found will be used for the backup.

Configure <br-option>pg2-host</br-option>/<br-option>pg2-host-user</br-option> and <br-option>pg2-path</br-option> {[pg-path]} tls {[host-pg2]} /etc/pgbackrest/cert/ca.crt /etc/pgbackrest/cert/client.crt /etc/pgbackrest/cert/client.key y

Both the primary and standby databases are required to perform the backup, though the vast majority of the files will be copied from the standby to reduce load on the primary. The database hosts can be configured in any order. <backrest/> will automatically determine which is the primary and which is the standby.
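A sketch of the additions to the repository host configuration that enable backup from the standby (host names and paths are the illustrative ones used in this guide):

    [demo]
    pg1-host=pg-primary
    pg1-path=/var/lib/postgresql/14/demo
    pg2-host=pg-standby
    pg2-path=/var/lib/postgresql/14/demo

    [global]
    backup-standby=y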

Backup the {[postgres-cluster-demo]} cluster from <host>pg2</host> {[project-exe]} {[dash]}-stanza={[postgres-cluster-demo]} --log-level-console=detail backup backup file {[host-pg1]}|replay on the standby

This incremental backup shows that most of the files are copied from the {[host-pg2]} host and only a few are copied from the {[host-pg1]} host.

<backrest/> creates a standby backup that is identical to a backup performed on the primary. It does this by starting/stopping the backup on the {[host-pg1]} host, copying only files that are replicated from the {[host-pg2]} host, then copying the remaining few files from the {[host-pg1]} host. This means that logs and statistics from the primary database will be included in the backup.

Stress Testing
Configuration Configure {[host-repo1]} for stress testing 8 lz4 1 1 y Create the {[host-pg1]} spool directory mkdir -p -m 750 {[spool-path]} chown postgres:postgres {[spool-path]} Configure {[host-pg1]} for stress testing 8 y lz4 1 {[spool-path]} y 4 4 Create the {[host-pg2]} spool directory mkdir -p -m 750 {[spool-path]} chown postgres:postgres {[spool-path]} Configure {[host-pg2]} for stress testing 8 y lz4 1 {[spool-path]} y 4 4
Create Tables and Load Data
Break Streaming Replication

Break streaming replication to force the standby to replicate from the archive during data load.

Break streaming replication by changing the replication password psql -c "alter user replicator password 'bogus'" Restart standby to break connection {[pg-cluster-restart]}
Create Tables Create tables bash -c 'for i in {1..{[stress-scale-table]}}; do psql -c "select create_test_table(${i?}, 1000, true)"; done'
Load Data Load data {[pg-bin-path]}/pgbench -n -i -s {[stress-scale-data]} 2>&1
Fix Streaming Replication

Fix streaming replication so backups will work. Note that streaming replication will not start again until all WAL in the archive has been exhausted.

Fix streaming replication by changing the replication password psql -c "alter user replicator password 'jw8s0F4'"
Testing
Full Backup Full backup pgbackrest --stanza=demo --type=full --log-level-console=info --log-level-file=detail backup 2>&1
Diff Backup with Delta Diff backup pgbackrest --stanza=demo --type=diff --delta --log-level-console=info --log-level-file=detail backup 2>&1
Restore with Delta Stop <postgres/> {[pg-cluster-stop]} Restore pgbackrest --stanza=demo --type=standby --delta --log-level-console=info --log-level-file=detail restore 2>&1
Restore Remove data rm -rf {[pg-path]} Restore pgbackrest --stanza=demo --type=standby --log-level-console=info --log-level-file=detail restore 2>&1 Start <postgres/> {[pg-cluster-start]} Check cluster psql -c "select count(*) from pg_class"
Upgrading <postgres/>

The following instructions are not meant to be a comprehensive guide for upgrading <postgres/>; rather, they outline the general process for upgrading a primary and standby with the intent of demonstrating the steps required to reconfigure <backrest/>. It is recommended that a backup be taken prior to upgrading.

Stop old cluster {[pg-cluster-stop]}

Stop the old cluster on the standby since it will be restored from the newly upgraded cluster.

Stop old cluster {[pg-cluster-stop]}

Create the new cluster and perform upgrade.

Create new cluster and perform the upgrade {[pg-bin-upgrade-path]}/initdb -D {[pg-path-upgrade]} -k -A peer {[pg-cluster-create-upgrade]} sh -c 'cd /var/lib/postgresql && /usr/lib/postgresql/{[pg-version-upgrade]}/bin/pg_upgrade {[dash]}-old-bindir=/usr/lib/postgresql/{[pg-version]}/bin {[dash]}-new-bindir=/usr/lib/postgresql/{[pg-version-upgrade]}/bin {[dash]}-old-datadir={[pg-path]} {[dash]}-new-datadir={[pg-path-upgrade]} {[dash]}-old-options=" -c config_file={[postgres-config-demo]}" {[dash]}-new-options=" -c config_file={[postgres-config-demo-upgrade]}"' Upgrade Complete sh -c 'cd /var/lib/pgsql && /usr/pgsql-{[pg-version-upgrade]}/bin/pg_upgrade {[dash]}-old-bindir=/usr/pgsql-{[pg-version]}/bin {[dash]}-new-bindir=/usr/pgsql-{[pg-version-upgrade]}/bin {[dash]}-old-datadir={[pg-path]} {[dash]}-new-datadir={[pg-path-upgrade]} {[dash]}-old-options=" -c config_file={[postgres-config-demo]}" {[dash]}-new-options=" -c config_file={[postgres-config-demo-upgrade]}"' Upgrade Complete cat /root/postgresql.common.conf >> {[postgres-config-demo-upgrade]}

Configure the new cluster settings and port.

Configure <postgres/> '{[project-exe]} {[dash]}-stanza={[postgres-cluster-demo]} archive-push %p' on {[wal-level]} 3 'postgresql.log'

Update the configuration on all systems to point to the new cluster.

Upgrade the <br-option>pg1-path</br-option> {[pg-path-upgrade]} Upgrade the <br-option>pg-path</br-option> {[pg-path-upgrade]} Upgrade <br-option>pg1-path</br-option> and <br-option>pg2-path</br-option>, disable backup from standby {[pg-path-upgrade]} {[pg-path-upgrade]} n Copy hba configuration cp {[postgres-hba-demo]} {[postgres-hba-demo-upgrade]}

Before starting the new cluster, the stanza-upgrade command must be run.

Upgrade the stanza {[project-exe]} {[dash]}-stanza={[postgres-cluster-demo]} {[dash]}-no-online {[dash]}-log-level-console=info stanza-upgrade completed successfully

Start the new cluster and confirm it is successfully installed.

Start new cluster {[pg-cluster-start-upgrade]}

Test configuration using the check command.

Check configuration {[pg-cluster-check-upgrade]} {[project-exe]} {[dash]}-stanza={[postgres-cluster-demo]} check

Remove the old cluster.

Remove old cluster pg_dropcluster {[pg-version]} {[postgres-cluster-demo]} rm -rf {[pg-path]}

Install the new binaries on the standby and create the cluster.

Remove old cluster and create the new cluster pg_dropcluster {[pg-version]} {[postgres-cluster-demo]} rm -rf {[pg-path]} mkdir -p -m 700 {[pg-bin-upgrade-path]} {[pg-cluster-create-upgrade]}

Run the check on the repository host. The warning regarding the standby being down is expected since the standby cluster is down. Running this command demonstrates that the repository server is aware of the standby and is configured properly for the primary server.

Check configuration {[project-exe]} {[dash]}-stanza={[postgres-cluster-demo]} check

Run a full backup on the new cluster and then restore the standby from the backup. The backup type will automatically be changed to full if incr or diff is requested.

Run a full backup {[project-exe]} {[dash]}-stanza={[postgres-cluster-demo]} {[dash]}-type=full backup Restore the {[postgres-cluster-demo]} standby cluster {[project-exe]} {[dash]}-stanza={[postgres-cluster-demo]} {[dash]}-delta --type=standby restore {[project-exe]} {[dash]}-stanza={[postgres-cluster-demo]} --type=standby restore Configure <postgres/> on Start <postgres/> and check the <backrest/> configuration {[pg-cluster-start-upgrade]} {[pg-cluster-wait]} {[project-exe]} {[dash]}-stanza={[postgres-cluster-demo]} check

Backup from standby can be enabled now that the standby is restored.

Reenable backup from standby y