<description>The {[project]} User Guide demonstrates how to quickly and easily set up {[project]} for your {[postgres]} database. Step-by-step instructions lead the user through all the important features of the fastest, most reliable {[postgres]} backup and restore solution.</description>
<p>This user guide is intended to be followed sequentially from beginning to end &mdash; each section depends on the last. For example, the <linksection="/backup">Backup</link> section relies on setup that is performed in the <linksection="/quickstart">Quick Start</link> section. Once <backrest/> is up and running then skipping around is possible but following the user guide in order is recommended the first time through.</p>
<p>Although the examples are targeted at {[user-guide-os]} and <postgres/> {[pg-version]}, it should be fairly easy to apply this guide to any Unix distribution and <postgres/> version. Note that only 64-bit distributions are currently supported due to 64-bit operations in the Perl code. The only OS-specific commands are those to create, start, stop, and drop <postgres/> clusters. The <backrest/> commands will be the same on any Unix system though the locations to install Perl libraries and executables may vary.
Configuration information and documentation for PostgreSQL can be found in the <postgres/><linkurl='http://www.postgresql.org/docs/{[pg-version]}/static/index.html'>Manual</link>.</p>
<p>A somewhat novel approach is taken to documentation in this user guide. Each command is run on a virtual machine when the documentation is built from the XML source. This means you can have high confidence that the commands work correctly in the order presented. Output is captured and displayed below the command when appropriate. If the output is not included it is because it was deemed not relevant or was considered a distraction from the narrative.</p>
<p>All commands are intended to be run as an unprivileged user that has sudo privileges for both the <user>root</user> and <user>postgres</user> users. It's also possible to run the commands directly as their respective users without modification and in that case the <cmd>sudo</cmd> commands can be stripped off.</p>
<p>A backup is a consistent copy of a database cluster that can be restored to recover from a hardware failure, to perform Point-In-Time Recovery, or to bring up a new standby.</p>
<p><b>Full Backup</b>: <backrest/> copies the entire contents of the database cluster to the backup server. The first backup of the database cluster is always a Full Backup. <backrest/> is always able to restore a full backup directly. The full backup does not depend on any files outside of the full backup for consistency.</p>
<p><b>Differential Backup</b>: <backrest/> copies only those database cluster files that have changed since the last full backup. <backrest/> restores a differential backup by copying all of the files in the chosen differential backup and the appropriate unchanged files from the previous full backup. The advantage of a differential backup is that it requires less disk space than a full backup. However, the differential backup and the full backup must both be valid to restore the differential backup.</p>
<p><b>Incremental Backup</b>: <backrest/> copies only those database cluster files that have changed since the last backup (which can be another incremental backup, a differential backup, or a full backup). Because an incremental backup only includes those files changed since the prior backup, it is generally much smaller than a full or differential backup. As with the differential backup, the incremental backup depends on other backups being valid in order to be restored. Since the incremental backup includes only those files changed since the last backup, all prior incremental backups back to the prior differential, the prior differential backup, and the prior full backup must all be valid to perform a restore of the incremental backup. If no differential backup exists then all prior incremental backups back to the prior full backup, which must exist, and the full backup itself must be valid to restore the incremental backup.</p>
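<p>The dependency rules for the three backup types can be sketched in a few lines of Python. This is only an illustration of the chain described here, not how <backrest/> stores or resolves backups: the <id>Backup</id> class, the <id>restore_chain</id> function, and the labels are all hypothetical.</p>

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Backup:
    label: str            # hypothetical label, not a real backup label
    kind: str             # "full", "diff", or "incr"
    prior: Optional[str]  # label of the backup this one is based on


def restore_chain(backups: Dict[str, Backup], label: str) -> List[str]:
    """Return every backup that must be valid to restore `label`, oldest first."""
    chain = []
    current: Optional[str] = label
    while current is not None:
        backup = backups[current]
        chain.append(backup.label)
        current = backup.prior  # None once the full backup is reached
    return list(reversed(chain))


backups = {b.label: b for b in [
    Backup("full1", "full", None),
    Backup("diff1", "diff", "full1"),
    Backup("incr1", "incr", "diff1"),
    Backup("incr2", "incr", "incr1"),
]}

chain = restore_chain(backups, "incr2")
```

<p>In this sketch the second incremental depends on four backups in total, while the differential depends only on itself and the full backup.</p>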
<p>A restore is the act of copying a backup to a system where it will be started as a live database cluster. A restore requires the backup files and one or more WAL segments in order to work correctly.</p>
<p>WAL is the mechanism that <postgres/> uses to ensure that no committed changes are lost. Transactions are written sequentially to the WAL and a transaction is considered to be committed when those writes are flushed to disk. Afterwards, a background process writes the changes into the main database cluster files (also known as the heap). In the event of a crash, the WAL is replayed to make the database consistent.</p>
<p>WAL is conceptually infinite but in practice is broken up into individual 16MB files called segments. WAL segments follow the naming convention <id>0000000100000A1E000000FE</id> where the first 8 hexadecimal digits represent the timeline and the next 16 digits are the log sequence number (LSN).</p>
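<p>As a small illustration of the naming convention, the following Python sketch splits a segment name into the timeline and the remaining 16 digits. The function name is hypothetical and the check only covers the 24-digit form shown above.</p>

```python
import re

# A segment name as described above: 24 uppercase hexadecimal digits.
SEGMENT_RE = re.compile(r"[0-9A-F]{24}")


def split_segment_name(name: str):
    """Split a WAL segment name into (timeline, remaining 16 hex digits)."""
    if not SEGMENT_RE.fullmatch(name):
        raise ValueError(f"not a WAL segment name: {name!r}")
    timeline = int(name[:8], 16)  # first 8 digits, parsed as hexadecimal
    sequence = name[8:]           # next 16 digits, kept as text
    return timeline, sequence


timeline, sequence = split_segment_name("0000000100000A1E000000FE")
```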
<pkeyword="default"><backrest/> is written in Perl which is included with {[user-guide-os]} by default. The <id>DBD::Pg</id> module must also be installed.</p>
<pkeyword="co6"><backrest/> is written in Perl which is not included with {[user-guide-os]} by default, however all required modules are available as standard packages.</p>
<pkeyword="default">{[user-guide-os]} packages for <backrest/> are available, but if they are not provided on your distribution/version it is easy to download the source and install manually.</p>
<pkeyword="co6">{[user-guide-os]} packages for <backrest/> are available from <linkurl="{[crunchy-url-base]}">Crunchy Data</link> or <linkurl="http://yum.postgresql.org">yum.postgresql.org</link>, but it is also easy to download the source and install manually.</p>
<p>If <backrest/> has been installed before it's best to be sure that no prior copies of it are still installed. Depending on how old the version of pgBackRest is it may have been installed in a few different locations. The following commands will remove all prior versions of pgBackRest.</p>
<p><backrest/> should now be properly installed but it is best to check. If any dependencies were missed then you will get an error when running <backrest/> from the command line.</p>
<p>The Quick Start section will cover basic configuration of <backrest/> and <postgres/> and introduce the <cmd>backup</cmd>, <cmd>restore</cmd>, and <cmd>info</cmd> commands.</p>
<p>Creating the demo cluster is optional but is strongly recommended, especially for new users, since the example commands in the user guide reference the demo cluster; the examples assume the demo cluster is running on the default port (i.e. 5432). The cluster will not be started until a later section because there is still some configuration to do.</p>
<p>By default <postgres/> will only accept local connections. The examples in this guide will require connections from other servers so <pg-option>listen_addresses</pg-option> is configured to listen on all interfaces. This may not be appropriate for secure installations.</p>
<p>For demonstration purposes the <pg-option>log_line_prefix</pg-option> setting will be minimally configured. This keeps the log output as brief as possible to better illustrate important information.</p>
<pkeyword="co6">By default {[user-guide-os]} includes the day of the week in the log filename. This makes automating the user guide a bit more complicated so the <pg-option>log_filename</pg-option> is set to a constant.</p>
<p><backrest/> needs to know where the base data directory for the <postgres/> cluster is located. The path can be requested from <postgres/> directly but in a recovery scenario the <postgres/> process will not be available. During backups the value supplied to <backrest/> will be compared against the path that <postgres/> is running on and they must be equal or the backup will return an error. Make sure that <br-option>db-path</br-option> is exactly equal to <pg-option>data_directory</pg-option> in <file>postgresql.conf</file>.</p>
<p>By default {[user-guide-os]} stores clusters in <path>{[db-path-default]}</path> so it is easy to determine the correct path for the data directory.</p>
<p><backrest/> configuration files follow the Windows INI convention. Sections are denoted by text in brackets and key/value pairs are contained in each section. Lines beginning with <id>#</id> are ignored and can be used as comments.</p>
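<p>The layout can be illustrated with Python's standard <id>configparser</id> module, which reads the same INI convention. This is a sketch only: <id>configparser</id> is a stand-in and not the parser <backrest/> uses, and the stanza name <id>demo</id> and both paths are placeholders.</p>

```python
import configparser

# Hypothetical configuration: the section names and paths are placeholders.
SAMPLE = """\
# Lines beginning with '#' are ignored
[global]
repo-path=/path/to/repo

[demo]
db-path=/path/to/demo/cluster
"""

parser = configparser.ConfigParser(interpolation=None)
parser.read_string(SAMPLE)

sections = parser.sections()          # section names in file order
db_path = parser["demo"]["db-path"]   # key/value lookup within a section
```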
<p>For this demonstration the repository will be stored on the same host as the <postgres/> server. This is the simplest configuration and is useful in cases where traditional backup software is employed to back up the database host.</p>
<p>Backing up a running <postgres/> cluster requires WAL archiving to be enabled. Note that <i>at least</i> one WAL segment will be created during the backup process even if no explicit writes are made to the cluster.</p>
<p>The <pg-option>wal_level</pg-option> setting must be set to <pg-setting>archive</pg-setting> at a minimum but <pg-setting>hot_standby</pg-setting> and <pg-setting>logical</pg-setting> also work fine for backups. Setting <pg-option>wal_level</pg-option> to <pg-setting>hot_standby</pg-setting> and increasing <pg-option>max_wal_senders</pg-option> is a good idea even if you do not currently run a hot standby as this will allow them to be added later without restarting the master cluster.</p>
<p>When archiving a WAL segment is expected to take more than 60 seconds (the default) then the <br-option>archive-timeout</br-option> option should be increased.</p>
<p>The <cmd>stanza-create</cmd> command must be run on the host where the repository is located to initialize the stanza. It is recommended that the <cmd>check</cmd> command be run after <cmd>stanza-create</cmd> to ensure archiving and backups are properly configured.</p>
<p>By default <backrest/> will attempt to perform an incremental backup. However, an incremental backup must be based on a full backup and since no full backup existed <backrest/> ran a full backup instead.</p>
<p>This time there was no warning because a full backup already existed. While incremental backups can be based on a full <i>or</i> differential backup, differential backups must be based on a full backup. A full backup can be performed by running the <cmd>backup</cmd> command with <br-setting>{[dash]}-type=full</br-setting>.</p>
<p>Backups can be scheduled with utilities such as cron.</p>
<p>In the following example, two cron jobs are configured to run; full backups are scheduled for 6:30 AM every Sunday with differential backups scheduled for 6:30 AM Monday through Saturday. If this crontab is installed for the first time mid-week, then pgBackRest will run a full backup the first time the differential job is executed, followed the next day by a differential backup.</p>
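<p>The effect of this schedule, including the mid-week case, can be modeled with a short Python sketch. It does not reproduce the cron entries themselves; the function name is hypothetical and the fallback to a full backup simply mirrors the behavior described above.</p>

```python
from datetime import date


def scheduled_backup_type(day: date, full_backup_exists: bool) -> str:
    """Return the backup type that would result from the job run on `day`."""
    # date.weekday() numbers Monday as 0 and Sunday as 6.
    if day.weekday() == 6:
        return "full"
    # The Monday-to-Saturday job requests a differential, which needs a full
    # backup to be based on; without one a full backup is run instead.
    return "diff" if full_backup_exists else "full"


sunday = date(2016, 1, 3)
wednesday = date(2016, 1, 6)
thursday = date(2016, 1, 7)

first_run = scheduled_backup_type(wednesday, full_backup_exists=False)
next_run = scheduled_backup_type(thursday, full_backup_exists=True)
```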
<p>Once backups are scheduled it's important to configure retention so backups are expired on a regular schedule, see <linksection="/retention">Retention</link>.</p>
<p>Each stanza has a separate section and it is possible to limit output to a single stanza with the <br-option>--stanza</br-option> option. The stanza '<id>status</id>' gives a brief indication of the stanza's health. If this is '<id>ok</id>' then <backrest/> is functioning normally. The '<id>wal archive min/max</id>' shows the minimum and maximum WAL currently stored in the archive. Note that there may be gaps due to archive retention policies or other reasons.</p>
<p>The backups are displayed oldest to newest. The oldest backup will <i>always</i> be a full backup (indicated by an <id>F</id> at the end of the label) but the newest backup can be full, differential (ends with <id>D</id>), or incremental (ends with <id>I</id>).</p>
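<p>Because the backup type is encoded in the final character of the label, it can be read back with a simple lookup. The Python sketch below assumes only what is stated above about the last character; the function name and the example labels are hypothetical.</p>

```python
BACKUP_TYPES = {"F": "full", "D": "differential", "I": "incremental"}


def backup_type_from_label(label: str) -> str:
    """Map the last character of a backup label to the backup type."""
    try:
        return BACKUP_TYPES[label[-1]]
    except (KeyError, IndexError):
        raise ValueError(f"unrecognized backup label: {label!r}") from None


# Placeholder labels: only the final character matters in this sketch.
kinds = [backup_type_from_label(label) for label in ("exampleF", "exampleD", "exampleI")]
```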
<p>The '<id>timestamp start/stop</id>' defines the time period when the backup ran. The '<id>timestamp stop</id>' can be used to determine the backup to use when performing Point-In-Time Recovery. More information about Point-In-Time Recovery can be found in the <linksection="/pitr">Point-In-Time Recovery</link> section.</p>
<p>The '<id>wal start/stop</id>' defines the WAL range that is required to make the database consistent when restoring. The <cmd>backup</cmd> command will ensure that this WAL range is in the archive before completing.</p>
<p>The '<id>database size</id>' is the full uncompressed size of the database while '<id>backup size</id>' is the amount of data actually backed up (these will be the same for full backups). The '<id>repository size</id>' includes all the files from this backup and any referenced backups that are required to restore the database while '<id>repository backup size</id>' includes only the files in this backup (these will also be the same for full backups). Repository sizes reflect compressed file sizes if compression is enabled in <backrest/> or the filesystem.</p>
<p>The '<id>backup reference list</id>' contains the additional backups that are required to restore this backup.</p>
<p>Backups can protect you from a number of disaster scenarios, the most common of which are hardware failure and data corruption. The easiest way to simulate data corruption is to remove an important <postgres/> cluster file.</p>
<p>To restore a backup of the <postgres/> cluster run <backrest/> with the <cmd>restore</cmd> command. The cluster needs to be stopped (in this case it is already stopped) and all files must be removed from the <postgres/> data directory.</p>
<p>By default <backrest/> will wait for the next regularly scheduled checkpoint before starting a backup. Depending on the <pg-option>checkpoint_timeout</pg-option> and <pg-option>checkpoint_segments</pg-option> settings in <postgres/> it may be quite some time before a checkpoint completes and the backup can begin.</p>
<p>When <br-setting>{[dash]}-start-fast</br-setting> is passed on the command-line or <br-setting>start-fast=y</br-setting> is set in <file>{[backrest-config-demo]}</file> an immediate checkpoint is requested and the backup will start more quickly. This is convenient for testing and for ad-hoc backups. For instance, if a backup is being taken at the beginning of a release window it makes no sense to wait for a checkpoint. Since regularly scheduled backups generally only happen once per day it is unlikely that enabling the <br-option>start-fast</br-option> in <file>{[backrest-config-demo]}</file> will negatively affect performance, however for high-volume transactional systems you may want to pass <br-setting>{[dash]}-start-fast</br-setting> on the command-line instead. Alternately, it is possible to override the setting in the configuration file by passing <br-setting>{[dash]}-no-start-fast</br-setting> on the command-line.</p>
<p>Sometimes <backrest/> will exit unexpectedly and the backup in progress on the <postgres/> cluster will not be properly stopped. <backrest/> exits as quickly as possible when an error occurs so that the cause can be reported accurately and is not masked by another problem that might happen during a more extensive cleanup.</p>
<p>Even when the permissions are fixed <backrest/> will still be unable to perform a backup because the <postgres/> cluster is stuck in backup mode.</p>
<p>Enabling the <br-option>stop-auto</br-option> option allows <backrest/> to stop the current backup if it detects that no other <backrest/> backup process is running.</p>
<exe-highlight>cluster is already in backup mode|backup begins after the requested immediate checkpoint completes</exe-highlight>
</execute>
</execute-list>
<p>Although useful this feature may not be appropriate when another third-party backup solution is being used to take online backups as <backrest/> will not recognize that the other software is running and may terminate a backup started by that software. However, it would be unusual to run more than one third-party backup solution at the same time so this is not likely to be a problem.</p>
<p>Note that <id>pg_dump</id> and <id>pg_basebackup</id> do not take online backups so are not affected. It is safe to run them in conjunction with <backrest/>.</p>
<p>During an online backup, <backrest/> waits for WAL segments that are required to make the backup consistent to be archived. This wait time is governed by the <br-option>archive-timeout</br-option> option which defaults to 60 seconds. If archiving an individual segment is known to take longer, then this option should be increased.</p>
<p>Generally it is best to retain as many backups as possible to provide a greater window for <linksection="/pitr">Point-in-Time Recovery</link>, but practical concerns such as disk space must also be considered. Retention options remove older backups once they are no longer needed.</p>
<p>Set <br-option>retention-full</br-option> to the number of full backups required. New backups must be completed before expiration will occur &mdash; that means if <br-setting>retention-full=2</br-setting> then there will be three full backups stored before the oldest one is expired.</p>
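<p>The ordering matters here: the new backup completes first and only then is the oldest one expired. A minimal Python sketch of that rule follows; the function name and labels are hypothetical and the model ignores differential and incremental backups.</p>

```python
from typing import List, Tuple


def expire_full(full_backups: List[str], retention_full: int) -> Tuple[List[str], List[str]]:
    """Split completed full backups (oldest first) into (kept, expired)."""
    excess = max(len(full_backups) - retention_full, 0)
    return full_backups[excess:], full_backups[:excess]


stored: List[str] = []
history = []
for label in ["full1", "full2", "full3"]:
    stored.append(label)                # the new backup completes first
    peak = len(stored)                  # backups on disk before expiration
    stored, expired = expire_full(stored, retention_full=2)
    history.append((peak, list(stored), expired))
```

<p>In the final step of this sketch three full backups are briefly present before the oldest is expired and two remain.</p>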
<p>The backup configuration sets <br-setting>retention-full=2</br-setting> but currently there is only one full backup, so the next full backup to run will not expire any full backups.</p>
<p>Archive <i>is</i> expired because WAL segments were generated before the oldest backup. These are not useful for recovery &mdash; only WAL segments generated after a backup can be used to recover that backup.</p>
<p>The <id>{[backup-full-first]}</id> full backup is expired and archive retention is based on the <id>{[backup-full-second]}</id> which is now the oldest full backup.</p>
<p>Set <br-option>retention-diff</br-option> to the number of differential backups required. Differentials only rely on the prior full backup so it is possible to create a <quote>rolling</quote> set of differentials for the last day or more. This allows quick restores to recent points-in-time but reduces overall space consumption.</p>
<p>The backup configuration sets <br-setting>retention-diff=1</br-setting>, so two differentials will need to be performed before one is expired. An incremental backup is added to demonstrate incremental expiration. Incremental backups cannot be expired independently &mdash; they are always expired with their related full or differential backup.</p>
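<p>The way incrementals follow their parent differential can be sketched in Python as below. This is a simplified model under stated assumptions: backups are plain tuples, the labels and the function name are hypothetical, and expiration of full backups is not modeled.</p>

```python
from typing import List, Optional, Tuple

# Each backup is (label, kind, label of the backup it is based on), oldest first.
BackupRow = Tuple[str, str, Optional[str]]


def expire_diff(backups: List[BackupRow], retention_diff: int) -> List[str]:
    """Return labels expired by differential retention, oldest first."""
    diffs = [label for label, kind, _ in backups if kind == "diff"]
    expired = set(diffs[:max(len(diffs) - retention_diff, 0)])
    # Incrementals are never expired on their own: they go with their parent.
    # Backups are ordered oldest first, so a parent is always seen first.
    for label, kind, prior in backups:
        if kind == "incr" and prior in expired:
            expired.add(label)
    return [label for label, _, _ in backups if label in expired]


backups: List[BackupRow] = [
    ("full1", "full", None),
    ("diff1", "diff", "full1"),
    ("incr1", "incr", "diff1"),
]
before = expire_diff(backups, retention_diff=1)

backups.append(("diff2", "diff", "full1"))
after = expire_diff(backups, retention_diff=1)
```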
<p>Although <backrest/> automatically removes archived WAL segments when expiring backups (the default expires WAL for full backups based on the <br-option>retention-full</br-option> option), it may be useful to expire archive more aggressively to save disk space. Note that full backups are treated as differential backups for the purpose of differential archive retention.</p>
<p>Expiring archive will never remove WAL segments that are required to make a backup consistent. However, since Point-in-Time Recovery (PITR) only works on a continuous WAL stream, care should be taken when aggressively expiring archive outside of the normal backup expiration process.</p>
<exe-highlight>archive retention on backup {[backup-diff-first]}|remove archive</exe-highlight>
</execute>
</execute-list>
<p>The <id>{[backup-diff-first]}</id> differential backup has archived WAL segments that must be retained to make the older backups consistent even though they cannot be played any further forward with PITR. WAL segments generated after <id>{[backup-diff-first]}</id> but before <id>{[backup-diff-second]}</id> are removed. WAL segments generated after the new backup <id>{[backup-diff-second]}</id> remain and can be used for PITR.</p>
<p>Since full backups are considered differential backups for the purpose of differential archive retention, if a full backup is now performed with the same settings, only the archive for that full backup is retained for PITR.</p>
<p><linksection="/quickstart/perform-restore">Restore a Backup</link> in <linksection="/quickstart">Quick Start</link> required the database cluster directory to be cleaned before the <cmd>restore</cmd> could be performed. The <br-option>delta</br-option> option allows <backrest/> to automatically determine which files in the database cluster directory can be preserved and which ones need to be restored from the backup &mdash; it also <i>removes</i> files not present in the backup manifest so it will dispose of divergent changes. This is accomplished by calculating a <linkurl="https://en.wikipedia.org/wiki/SHA-1">SHA-1</link> cryptographic hash for each file in the database cluster directory. If the <id>SHA-1</id> hash does not match the hash stored in the backup then that file will be restored. This operation is very efficient when combined with the <br-option>process-max</br-option> option. Since the <postgres/> server is shut down during the restore, a larger number of processes can be used than might be desirable during a backup when the <postgres/> server is running.</p>
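<p>The comparison step can be illustrated with a short Python sketch that hashes each file in a directory and checks it against a manifest of expected SHA-1 values. This is not the <backrest/> implementation: the manifest is modeled as a plain dictionary, and the function names and file names are hypothetical.</p>

```python
import hashlib
import os
import tempfile
from typing import Dict, List, Tuple


def sha1_of(path: str) -> str:
    """Return the SHA-1 hex digest of a file, read in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def plan_delta(data_dir: str, manifest: Dict[str, str]) -> Tuple[List[str], List[str]]:
    """Return (files to restore, files to remove) as sorted relative paths."""
    present = set()
    for root, _dirs, files in os.walk(data_dir):
        for name in files:
            present.add(os.path.relpath(os.path.join(root, name), data_dir))
    # Restore anything missing or whose hash differs from the manifest.
    restore = sorted(
        path for path, expected in manifest.items()
        if path not in present or sha1_of(os.path.join(data_dir, path)) != expected
    )
    # Remove anything on disk that the manifest does not list.
    remove = sorted(present - set(manifest))
    return restore, remove


with tempfile.TemporaryDirectory() as data_dir:
    for name, content in [("same", b"a"), ("changed", b"b"), ("extra", b"c")]:
        with open(os.path.join(data_dir, name), "wb") as handle:
            handle.write(content)
    manifest = {
        "same": hashlib.sha1(b"a").hexdigest(),
        "changed": hashlib.sha1(b"B").hexdigest(),
        "missing": hashlib.sha1(b"d").hexdigest(),
    }
    restore, remove = plan_delta(data_dir, manifest)
```

<p>Only the changed and missing files are scheduled for restore in this sketch, which is why a delta restore can avoid copying files that are already correct.</p>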
<p>There may be cases where it is desirable to selectively restore specific databases from a cluster backup. This could be done for performance reasons or to move selected databases to a machine that does not have enough space to restore the entire cluster backup.</p>
<p>To demonstrate this feature two databases are created: test1 and test2. A fresh backup is run so <backrest/> is aware of the new databases.</p>
<execute-listhost="{[host-db-master]}">
<title>Create two test databases and perform a backup</title>
<p>Each test database will be seeded with tables and data to demonstrate that recovery works with selective restore.</p>
<execute-listhost="{[host-db-master]}">
<title>Create a test table in each database</title>
<executeoutput="y"filter="n">
<exe-cmd>
psql -c "create table test1_table (id int);
insert into test1_table (id) values (1);" test1
</exe-cmd>
</execute>
<executeoutput="y"filter="n">
<exe-cmd>
psql -c "create table test2_table (id int);
insert into test2_table (id) values (2);" test2
</exe-cmd>
</execute>
</execute-list>
<p>One of the main reasons to use selective restore is to save space. The size of the test1 database is shown here so it can be compared with the disk utilization after a selective restore.</p>
<execute-listhost="{[host-db-master]}">
<title>Show space used by test1 database</title>
<executeoutput="y"filter="n">
<exe-cmd>
du -sh {[db-path]}/base/16384
</exe-cmd>
</execute>
</execute-list>
<p>Stop the cluster and restore only the test2 database. Built-in databases (<id>template0</id>, <id>template1</id>, and <id>postgres</id>) are always restored.</p>
<execute-listhost="{[host-db-master]}">
<title>Restore from last backup including only the test2 database</title>
<p>Once recovery is complete the test2 database will contain all previously created tables and data.</p>
<execute-listhost="{[host-db-master]}">
<title>Demonstrate that the test2 database was recovered</title>
<executeoutput="y"filter="n">
<exe-cmd>
psql -c "select * from test2_table;" test2
</exe-cmd>
</execute>
</execute-list>
<p>The test1 database, despite successful recovery, is not accessible. This is because the entire database was restored as sparse, zeroed files. <postgres/> can successfully apply WAL on the zeroed files but the database as a whole will not be valid because key files contain no data. This is purposeful to prevent the database from being accidentally used when it might contain partial data that was applied during WAL replay.</p>
<execute-listhost="{[host-db-master]}">
<title>Attempting to connect to the test1 database will produce an error</title>
<p>Since the test1 database is restored with sparse, zeroed files it will only require as much space as the amount of WAL that is written during recovery. While the amount of WAL generated during a backup and applied during recovery can be significant it will generally be a small fraction of the total database size, especially for large databases where this feature is most likely to be useful.</p>
<p>It is clear that the test1 database uses far less disk space during the selective restore than it would have if the entire database had been restored.</p>
<execute-listhost="{[host-db-master]}">
<title>Show space used by test1 database after recovery</title>
<executeoutput="y"filter="n">
<exe-cmd>
du -sh {[db-path]}/base/16384
</exe-cmd>
</execute>
</execute-list>
<p>At this point the only action that can be taken on the invalid test1 database is <id>drop database</id>. <backrest/> does not automatically drop the database since this cannot be done until recovery is complete and the cluster is accessible.</p>
<execute-listhost="{[host-db-master]}">
<title>Drop the test1 database</title>
<executeoutput="y"filter="n">
<exe-cmd>
psql -c "drop database test1;"
</exe-cmd>
</execute>
</execute-list>
<p>Now that the invalid test1 database has been dropped only the test2 and built-in databases remain.</p>
<execute-listhost="{[host-db-master]}">
<title>List remaining databases</title>
<executeoutput="y"filter="n">
<exe-cmd>
psql -c "select oid, datname from pg_database order by oid;"
</exe-cmd>
</execute>
</execute-list>
<p><linksection="/quickstart/perform-restore">Restore a Backup</link> in <linksection="/quickstart">Quick Start</link> performed default recovery, which is to play all the way to the end of the WAL stream. In the case of a hardware failure this is usually the best choice but for data corruption scenarios (whether machine or human in origin) Point-in-Time Recovery (PITR) is often more appropriate.</p>
<p>Point-in-Time Recovery (PITR) allows the WAL to be played from the last backup to a specified time, transaction id, or recovery point. For common recovery scenarios time-based recovery is arguably the most useful. A typical recovery scenario is to restore a table that was accidentally dropped or data that was accidentally deleted. Recovering a dropped table is more dramatic so that's the example given here but deleted data would be recovered in exactly the same way.</p>
<p>It is important to represent the time as reckoned by <postgres/> and to include timezone offsets. This reduces the possibility of unintended timezone conversions and an unexpected recovery result.</p>
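<p>To show what an explicit offset looks like, the following Python sketch formats a timestamp only when it carries a timezone and rejects one that does not. It is an illustration of the formatting concern only: in practice the time should come from <postgres/> itself, and the function name and example value are hypothetical.</p>

```python
from datetime import datetime, timedelta, timezone


def recovery_target(moment: datetime) -> str:
    """Format a timestamp with its UTC offset, rejecting naive values."""
    if moment.utcoffset() is None:
        raise ValueError("timestamp must carry a timezone offset")
    return moment.isoformat(sep=" ")


# Example value with an explicit offset of five hours behind UTC.
example = datetime(2016, 1, 1, 12, 0, 0, tzinfo=timezone(timedelta(hours=-5)))
target = recovery_target(example)
```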
<p>Now that the time has been recorded the table is dropped. In practice finding the exact time that the table was dropped is a lot harder than in this example. It may not be possible to find the exact time, but some forensic work should be able to get you close.</p>
<execute-listhost="{[host-db-master]}">
<title>Drop the important table</title>
<executeoutput="y"err-expect="1">
<exe-cmd>psql -c "begin;
drop table important_table;
commit;
select * from important_table;"</exe-cmd>
<exe-highlight>does not exist</exe-highlight>
</execute>
</execute-list>
<p>Now the restore can be performed with time-based recovery to bring back the missing table.</p>
<execute-listhost="{[host-db-master]}">
<title>Stop <postgres/>, restore the {[postgres-cluster-demo]} cluster to <id>{[time-recovery-timestamp]}</id>, and display <file>recovery.conf</file></title>
<p>The <file>recovery.conf</file> file has been automatically generated by <backrest/> so <postgres/> can be started immediately. Once <postgres/> has finished recovery the table will exist again and can be queried.</p>
<execute-listhost="{[host-db-master]}">
<title>Start <postgres/> and check that the important table exists</title>
<p>The <postgres/> log also contains valuable information. It will indicate the time and transaction where the recovery stopped and also give the time of the last transaction to be applied.</p>
<p>This example was rigged to give the correct result. If a backup after the required time is chosen then <postgres/> will not be able to recover the lost table. <postgres/> can only play forward, not backward. To demonstrate this the important table must be dropped (again).</p>
<execute-listhost="{[host-db-master]}">
<title>Drop the important table (again)</title>
<executeoutput="y"err-expect="1">
<exe-cmd>psql -c "begin;
drop table important_table;
commit;
select * from important_table;"</exe-cmd>
<exe-highlight>does not exist</exe-highlight>
</execute>
</execute-list>
<p>Now take a new backup and attempt recovery from the new backup.</p>
<execute-listhost="{[host-db-master]}">
<title>Perform a backup then attempt recovery from that backup</title>
<exe-cmd>psql -c "select * from important_table"</exe-cmd>
<exe-highlight>does not exist</exe-highlight>
</execute>
</execute-list>
<p>Looking at the log output it's not obvious that recovery failed to restore the table. The key is to look for the presence of the <quote>recovery stopping before...</quote> and <quote>last completed transaction...</quote> log messages. If they are not present then the recovery to the specified point-in-time was not successful.</p>
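<p>This check is easy to automate. The Python sketch below searches log text for the two messages quoted above; the function name is hypothetical and the sample lines are placeholders rather than real <postgres/> log output.</p>

```python
MARKERS = ("recovery stopping before", "last completed transaction")


def pitr_reached_target(log_text: str) -> bool:
    """Return True only if both expected recovery messages are present."""
    return all(marker in log_text for marker in MARKERS)


# Placeholder log text: everything after each quoted phrase is elided.
successful = "recovery stopping before ...\nlast completed transaction ...\n"
unsuccessful = "some other log line\n"

results = (pitr_reached_target(successful), pitr_reached_target(unsuccessful))
```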
<execute-listhost="{[host-db-master]}">
<title>Examine the <postgres/> log output to discover the recovery was not successful</title>
<p>Using an earlier backup will allow <postgres/> to play forward to the correct time. The <cmd>info</cmd> command can be used to find the next to last backup.</p>
<execute-listhost="{[host-db-master]}">
<title>Get backup info for the {[postgres-cluster-demo]} cluster</title>
<executefilter="n"output="y">
<exe-cmd>{[project-exe]} info</exe-cmd>
<exe-highlight>{[backup-last]}</exe-highlight>
</execute>
</execute-list>
<p>The default behavior for restore is to use the last backup but an earlier backup can be specified with the <br-option>{[dash]}-set</br-option> option.</p>
<p>Now the log output will contain the expected <quote>recovery stopping before...</quote> and <quote>last completed transaction...</quote> messages showing that the recovery was successful.</p>
<execute-listhost="{[host-db-master]}">
<title>Examine the <postgres/> log output for log messages indicating success</title>
<p>File creation time in <proper>S3</proper> is relatively slow so commands benefit by increasing <br-option>process-max</br-option> to parallelize file creation.</p>
<execute-listhost="{[host-db-master]}">
<title>Backup the {[postgres-cluster-demo]} cluster</title>
<p>The configuration described in <linksection="/quickstart">Quick Start</link> is suitable for simple installations but for enterprise configurations it is more typical to have a dedicated <host>backup</host> host. This separates the backups and WAL archive from the database server so <host>database</host> host failures have less impact. It is still a good idea to employ traditional backup software to back up the <host>backup</host> host.</p>
<p>For this example a new host named <host>backup</host> has been created to store the cluster backups. Follow the instructions in <linksection="/installation">Installation</link> to install <backrest/>, <linksection="/quickstart/create-repository">Create the Repository</link> to create the <backrest/> repository and <linksection="/quickstart/create-stanza">Create the Stanza</link> to create the stanza. The <host>backup</host> host must also be configured with the <host>db-master</host> host/user and database path. The master database will be configured as <id>db1</id> to allow a standby to be added later.</p>
<p>The database host must be configured with the backup host/user. The default for the <br-option>backup-user</br-option> option is <id>backrest</id>. If the <id>postgres</id> user does restores on the backup host it is best not to also allow the <id>postgres</id> user to perform backups. However, the <id>postgres</id> user can read the repository directly if it is in the same group as the <id>backrest</id> user.</p>
<p>The repository directory will also be removed from the database host. It will not be used anymore so leaving it around may be confusing later on.</p>
<p>Commands are run the same as on a single host configuration except that some commands such as <cmd>backup</cmd> and <cmd>expire</cmd> are run from the <host>backup</host> host instead of the <host>database</host> host.</p>
<p>Check that the configuration is correct on both the <host>database</host> and <host>backup</host> hosts. More information about the <cmd>check</cmd> command can be found in <linksection="/quickstart/check-configuration">Check the Configuration</link>.</p>
<p>Since a new repository was created on the <host>backup</host> host the warning about the incremental backup changing to a full backup was emitted.</p>
<p>The <br-option>archive-async</br-option> option offloads WAL archiving to a separate process (or processes) to improve throughput. It works by <quote>looking ahead</quote> to see which WAL segments are ready to be archived beyond the request that <postgres/> is currently making via the <code>archive_command</code>. WAL segments are transferred to the archive directly from the <path>pg_xlog</path> directory and success is only returned by the <code>archive_command</code> when the WAL segment has been safely stored in the archive.</p>
<p>The spool directory is created to hold the current status of WAL archiving. Status files written into the spool directory are typically zero length and should consume a minimal amount of space (a few MB at most) and very little IO. All the information in this directory can be recreated so it is not necessary to preserve the spool directory if the cluster is moved to new hardware.</p>
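<p>As a rough way to confirm that the spool directory stays small, the sketch below counts the files in a directory and how many of them are zero length. The <code>spool_summary</code> helper and the default path <path>/var/spool/pgbackrest</path> are assumptions made for illustration; substitute the path configured for your installation.</p>

```shell
#!/bin/sh
# Summarize a spool directory: total files and how many are zero length.
# spool_summary is a hypothetical helper, not part of the backup tool.
spool_summary() {
    dir=$1
    total=$(find "$dir" -type f | wc -l | tr -d ' ')
    empty=$(find "$dir" -type f -size 0 | wc -l | tr -d ' ')
    echo "files=$total zero_length=$empty"
}

# Assumed spool location; override with SPOOL_DIR if yours differs.
spool_dir=${SPOOL_DIR:-/var/spool/pgbackrest}
if [ -d "$spool_dir" ]; then
    spool_summary "$spool_dir"
    # Report the total space consumed by the directory.
    du -sh "$spool_dir"
fi
```

<p>If the reported size grows well beyond a few MB, that may be worth investigating before assuming archiving is healthy.</p>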
<p><b>NOTE:</b> In the original implementation of asynchronous archiving, WAL segments were copied to the spool directory before compression and transfer. The new implementation copies WAL directly from the <path>pg_xlog</path> directory. If asynchronous archiving was utilized in <id>v1.12</id> or prior, read the <id>v1.13</id> release notes carefully before upgrading.</p>
<p>The spool path must be configured and asynchronous archiving enabled. Asynchronous archiving automatically confers some benefit by reducing the number of SSH connections made to the backup server, but setting <br-option>process-max</br-option> can drastically improve performance. Be sure not to set <br-option>process-max</br-option> so high that it affects normal database operations.</p>
<p>The <file>archive-async.log</file> file can be used to monitor the activity of the asynchronous process. A good way to test this is to quickly push a number of WAL segments.</p>
<p><backrest/> offers parallel processing to improve performance of compression and transfer. The number of processes to be used for this feature is set using the <br-option>--process-max</br-option> option.</p>
<execute-listhost="{[host-backup]}">
<title>Check the number of CPUs</title>
<executeuser="root"output="y">
<exe-cmd>lscpu</exe-cmd>
<exe-highlight>^CPU\(s\)\:</exe-highlight>
</execute>
</execute-list>
<p>It is usually best not to use more than 25% of the available CPUs for the <cmd>backup</cmd> command. Backups don't have to run that fast as long as they are performed regularly, and the backup process should not impact database performance if at all possible.</p>
<p>The restore command can and should use all available CPUs because during a restore the <postgres/> cluster is shut down and there is generally no other important work being done on the host. If the host contains multiple clusters then that should be considered when setting restore parallelism.</p>
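<p>The sizing guidance above can be sketched as a small calculation. This is only an illustration: the <code>backup_process_max</code> helper is hypothetical, and it assumes <cmd>nproc</cmd> is available to report the CPU count. It applies the 25% guideline for backups, rounds down, and never goes below one process.</p>

```shell
#!/bin/sh
# Suggest a process-max value for backups: at most 25% of the CPUs,
# rounded down, but never fewer than one process.
# backup_process_max is a hypothetical helper used for illustration.
backup_process_max() {
    cpus=$1
    procs=$(( cpus / 4 ))
    if [ "$procs" -lt 1 ]; then
        procs=1
    fi
    echo "$procs"
}

# nproc (GNU coreutils) reports the number of available CPUs.
cpu_total=$(nproc)
echo "backup:  --process-max=$(backup_process_max "$cpu_total")"
# A restore on a host with a single cluster can use every CPU.
echo "restore: --process-max=$cpu_total"
```

<p>On a host that runs several clusters, the restore value would need to be reduced accordingly.</p>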
<execute-listhost="{[host-backup]}">
<title>Perform a backup with single process</title>
<p>The performance of the last backup should be improved by using multiple processes. For very small backups the difference may not be very apparent, but as the size of the database increases so will time savings.</p>
<p>Sometimes it is useful to prevent <backrest/> from running on a system. For example, when failing over from a master to a standby it's best to prevent <backrest/> from running on the old master in case <postgres/> gets restarted or can't be completely killed. This will also prevent <backrest/> from being run by <id>cron</id>.</p>
<p>Specify the <br-option>--force</br-option> option to terminate any <backrest/> processes that are currently running. If <backrest/> is already stopped then stopping again will generate a warning.</p>
<execute-listhost="{[host-db-master]}">
<title>Stop the <backrest/> services again</title>
<executeoutput="y"filter="n">
<exe-cmd>{[project-exe]} stop</exe-cmd>
</execute>
</execute-list>
<p>Start <backrest/> processes again with the <cmd>start</cmd> command.</p>
<execute-listhost="{[host-db-master]}">
<title>Start the <backrest/> services</title>
<execute>
<exe-cmd>{[project-exe]} start</exe-cmd>
</execute>
</execute-list>
<p>It is also possible to stop <backrest/> for a single stanza.</p>
<execute-listhost="{[host-db-master]}">
<title>Stop <backrest/> services for the <id>demo</id> stanza</title>
<p>Replication allows multiple copies of a <postgres/> cluster (called standbys) to be created from a single master. The standbys are useful for balancing reads and to provide redundancy in case the master host fails.</p>
<p>A new host named <host>db-standby</host> will be created to run the standby. Follow the instructions in <linksection="/installation">Installation</link> to install <backrest/> and <linksection="/quickstart/setup-demo-cluster">Setup Demo Cluster</link> to set up the demo cluster.</p>
<p><backrest/> configuration is very similar to <host>db-master</host> except that the <pg-option>standby_mode</pg-option> setting will be enabled to keep the cluster in recovery mode when the end of the WAL stream has been reached.</p>
<p>Note that the <pg-setting>standby_mode</pg-setting> setting has been written into the <file>recovery.conf</file> file. Configuring recovery settings in <backrest/> means that the <file>recovery.conf</file> file does not need to be stored elsewhere since it will be properly recreated with each restore. The <br-setting>--type=preserve</br-setting> option can be used with the <cmd>restore</cmd> command to leave the existing <file>recovery.conf</file> file in place if that behavior is preferred.</p>
<p>The <pg-setting>hot_standby</pg-setting> setting must be enabled before starting <postgres/> to allow read-only connections on <host>db-standby</host>. Otherwise, connection attempts will be refused.</p>
<p>The <postgres/> log gives valuable information about the recovery. Note especially that the cluster has entered standby mode and is ready to accept read-only connections.</p>
<execute-listhost="{[host-db-standby]}">
<title>Examine the <postgres/> log output for log messages indicating success</title>
<executeoutput="y">
<exe-cmd>cat {[postgres-log-demo]}</exe-cmd>
<exe-highlight>entering standby mode|database system is ready to accept read only connections</exe-highlight>
</execute>
</execute-list>
<p>An easy way to test that replication is properly configured is to create a table on <host>db-master</host>.</p>
<execute-listhost="{[host-db-master]}">
<title>Create a new table on the master</title>
<executeoutput="y">
<exe-cmd>
psql -c "
begin;
create table replicated_table (message text);
insert into replicated_table values ('{[test-table-data]}');
<p>So, what went wrong? Since <postgres/> is pulling WAL segments from the archive to perform replication, changes won't be seen on the standby until the WAL segment that contains those changes is pushed from <host>db-master</host>.</p>
<p>This can be done manually by calling <code>pg_switch_xlog()</code> which pushes the current WAL segment to the archive (a new WAL segment is created to contain further changes).</p>
<p>Instead of relying solely on the WAL archive, streaming replication makes a direct connection to the master and applies changes as soon as they are made on the master. This results in much less lag between the master and standby.</p>
<p>Streaming replication requires a user with the replication privilege.</p>
<execute-listhost="{[host-db-master]}">
<title>Create replication user</title>
<executeoutput="y"filter="n">
<exe-cmd>
psql -c "
create user replicator password 'jw8s0F4' replication";
<p>The <file>pg_hba.conf</file> file on <host>db-master</host> must be updated to allow the standby to connect as the replication user. Be sure to replace the IP address below with the actual IP address of your <host>db-standby</host>, since that is the host making the connection. A reload will be required after modifying the <file>pg_hba.conf</file> file.</p>
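<p>For orientation, the sketch below prints the general shape of such a <file>pg_hba.conf</file> entry. The <code>hba_replication_entry</code> helper is hypothetical, the address <id>192.0.2.10</id> is a placeholder from the documentation address range, and the <id>md5</id> authentication method is an assumption. Adjust all of these to match your environment.</p>

```shell
#!/bin/sh
# Print a pg_hba.conf line allowing one host to connect for replication.
# Field order: connection type, database, user, client address, auth method.
# hba_replication_entry is a hypothetical helper used for illustration.
hba_replication_entry() {
    user=$1
    client_ip=$2
    printf 'host    replication    %s    %s/32    md5\n' "$user" "$client_ip"
}

# Placeholder address; use the address the standby connects from.
hba_replication_entry replicator 192.0.2.10
```

<p>The printed line would be appended to <file>pg_hba.conf</file> on the master, followed by a reload.</p>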
<!-- <p>The <pg-option>max_wal_senders</pg-option> setting must be increased (the default is 0) to allow standbys to connect to the master. It will be set to 3 to allow more standbys to be created later. <postgres/> must be restarted for this setting to take effect.</p>
<p>It is possible to configure a password in the <pg-option>primary_conninfo</pg-option> setting but using a <file>.pgpass</file> file is more flexible and secure.</p>
<execute-listhost="{[host-db-standby]}">
<title>Configure the replication password in the <file>.pgpass</file> file.</title>
<pkeyword="co6">By default {[user-guide-os]} stores the <file>postgresql.conf</file> file in the <postgres/> data directory. That means the change made to <file>postgresql.conf</file> was overwritten by the last restore and the <pg-option>hot_standby</pg-option> setting must be enabled again. Other solutions to this problem are to store the <file>postgresql.conf</file> file elsewhere or to enable the <pg-option>hot_standby</pg-option> setting on the <host>db-master</host> host where it will be ignored.</p>
<exe-highlight>started streaming WAL from primary</exe-highlight>
</execute>
</execute-list>
<p>Now when a table is created on <host>db-master</host> it will appear on <host>db-standby</host> quickly and without the need to call <code>pg_switch_xlog()</code>.</p>
<execute-listhost="{[host-db-master]}">
<title>Create a new table on the master</title>
<executeoutput="y">
<exe-cmd>
psql -c "
begin;
create table stream_table (message text);
insert into stream_table values ('{[test-table-data]}');
<p><backrest/> can perform backups on a standby instead of the master. Standby backups require the <host>db-standby</host> host to be configured and the <br-option>backup-standby</br-option> option enabled.</p>
<p>Both the master and standby databases are required to perform the backup, though the vast majority of the files will be copied from the standby to reduce load on the master. The database hosts can be configured in any order. <backrest/> will automatically determine which is the master and which is the standby.</p>
<execute-listhost="{[host-backup]}">
<title>Backup the {[postgres-cluster-demo]} cluster from <host>db-standby</host></title>
<exe-highlight>backup file db-master|replay on the standby</exe-highlight>
</execute>
</execute-list>
<p>This incremental backup shows that most of the files are copied from the <host>db-standby</host> host and only a few are copied from the <host>db-master</host> host.</p>
<p><backrest/> creates a standby backup that is identical to a backup performed on the master. It does this by starting/stopping the backup on the <host>db-master</host> host, copying only files that are replicated from the <host>db-standby</host> host, then copying the remaining few files from the <host>db-master</host> host. This means that logs and statistics from the master database will be included in the backup.</p>
<p>The following instructions are not meant to be a comprehensive guide for upgrading <postgres/>. Rather, they outline the general process for upgrading a master and standby with the intent of demonstrating the steps required to reconfigure <backrest/>. It is recommended that a backup be taken prior to upgrading.</p>
<execute-listhost="{[host-db-master]}">
<title>Install new <postgres/> version</title>
<executeuser="root"err-suppress="y">
<exe-cmd>{[postgres-install-upgrade]}</exe-cmd>
<exe-cmd-extra>-y</exe-cmd-extra>
</execute>
</execute-list>
<p>Create the new cluster. If the <postgres/> install creates a default cluster, then remove it to avoid confusion.</p>
<execute-listhost="{[host-db-master]}">
<title>Drop default cluster and create the new demo cluster</title>
<p>Stop the old cluster on the standby since it will be restored from the newly upgraded cluster to ensure the database system id is identical on both the master and standby.</p>
<p>Start the new cluster and confirm it is successfully installed.</p>
<execute-listhost="{[host-db-master]}">
<title>Start new cluster</title>
<executeuser="root"output="y">
<exe-cmd>{[db-cluster-start-upgrade]}</exe-cmd>
</execute>
</execute-list>
<p>Test configuration using the <cmd>check</cmd> command. The warning on the <host>backup</host> host regarding the standby being down is expected and can be ignored.</p>
<p>Run a full backup on the new cluster and then restore the standby from the backup. The backup type will automatically be changed to <id>full</id> if <id>incr</id> or <id>diff</id> is requested.</p>