mirror of https://github.com/pgbackrest/pgbackrest.git synced 2024-12-12 10:04:14 +02:00
# pgBackRest <br/> Reliable PostgreSQL Backup & Restore
## Introduction
pgBackRest aims to be a reliable, easy-to-use backup and restore solution that can seamlessly scale up to the largest databases and workloads by utilizing algorithms that are optimized for database-specific requirements.

pgBackRest [v2.46](https://github.com/pgbackrest/pgbackrest/releases/tag/release/2.46) is the current stable release. Release notes are on the [Releases](http://www.pgbackrest.org/release.html) page.

Please find us on [GitHub](https://github.com/pgbackrest/pgbackrest) and give us a star if you like pgBackRest!
## Features
### Parallel Backup & Restore
Compression is usually the bottleneck during backup operations, so pgBackRest addresses it with parallel processing and more efficient compression algorithms such as lz4 and zstd.
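As a rough illustration of why parallelism helps when compression is the bottleneck, the sketch below compresses several in-memory "files" concurrently. It is written in Python and is not pgBackRest code: `compress_all` and `max_workers` are hypothetical names, and zlib stands in for lz4 and zstd because they are not part of the Python standard library.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor


def compress_all(files, max_workers=4):
    """Compress each payload in a pool of worker threads.

    `files` maps a file name to its bytes (a stand-in for real files).
    Hypothetical helper for illustration only.
    """
    names = list(files)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() preserves input order, so results line up with names.
        compressed = pool.map(zlib.compress, (files[n] for n in names))
        return dict(zip(names, compressed))
```

Raising `max_workers` in this sketch is loosely analogous to giving a backup more processes to work with.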
### Local or Remote Operation
A custom protocol allows pgBackRest to back up, restore, and archive locally or remotely via TLS/SSH with minimal configuration. An interface to query PostgreSQL is also provided via the protocol layer so that remote access to PostgreSQL is never required, which enhances security.
### Multiple Repositories
Multiple repositories allow, for example, a local repository with minimal retention for fast restores and a remote repository with a longer retention for redundancy and access across the enterprise.
### Full, Incremental, & Differential Backups
Full, differential, and incremental backups are supported. pgBackRest is not susceptible to the time resolution issues of rsync, making differential and incremental backups safe without the requirement to checksum each file.
### Backup Rotation & Archive Expiration
Retention policies can be set for full and differential backups to create coverage for any time frame. The WAL archive can be maintained for all backups or strictly for the most recent backups. In the latter case, WAL required to make older backups consistent will still be maintained in the archive.
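The idea of retaining a fixed number of full backups can be sketched as follows. This is a simplified Python model, not pgBackRest's expiration logic: the `expire` function, the `(label, type)` tuples, and the assumption that every differential or incremental backup depends on the most recent full backup before it are all illustrative.

```python
def expire(backups, retention_full):
    """Return (kept, expired) labels for backups in chronological order.

    Each backup is a (label, type) tuple where type is "full", "diff",
    or "incr". Simplifying assumption: a diff/incr backup depends on the
    most recent full backup that precedes it.
    """
    if retention_full < 1:
        raise ValueError("retention_full must be at least 1")

    full_positions = [i for i, (_, kind) in enumerate(backups) if kind == "full"]
    if len(full_positions) <= retention_full:
        return [label for label, _ in backups], []

    # Index of the oldest full backup that is still retained.
    cutoff = full_positions[-retention_full]
    expired = [label for label, _ in backups[:cutoff]]
    kept = [label for label, _ in backups[cutoff:]]
    return kept, expired
```

Expiring a full backup together with everything that depends on it keeps the remaining set restorable, which is the property a retention policy has to preserve.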
### Backup Integrity
Checksums are calculated for every file in the backup and rechecked during a restore or verify. After a backup finishes copying files, it waits until every WAL segment required to make the backup consistent reaches the repository.

Backups in the repository may be stored in the same format as a standard PostgreSQL cluster (including tablespaces). If compression is disabled and hard links are enabled it is possible to snapshot a backup in the repository and bring up a PostgreSQL cluster directly on the snapshot. This is advantageous for terabyte-scale databases that are time consuming to restore in the traditional way.

All operations utilize file and directory level fsync to ensure durability.
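The per-file checksum idea can be sketched as a manifest that is built at backup time and rechecked later. This is a minimal Python illustration under stated assumptions: `build_manifest` and `verify` are hypothetical helpers, files are modeled as in-memory bytes, and SHA-256 is chosen only for the example.

```python
import hashlib


def build_manifest(files):
    """Map each file name to the hex digest of its contents.

    SHA-256 is an assumption made for this example.
    """
    return {name: hashlib.sha256(data).hexdigest() for name, data in files.items()}


def verify(files, manifest):
    """Return the sorted names of files that are missing or whose checksum differs."""
    bad = []
    for name, expected in manifest.items():
        data = files.get(name)
        if data is None or hashlib.sha256(data).hexdigest() != expected:
            bad.append(name)
    return sorted(bad)
```

An empty result from `verify` means every file recorded in the manifest is present and unchanged.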
### Page Checksums
PostgreSQL has supported page-level checksums since 9.3. If page checksums are enabled pgBackRest will validate the checksums for every file that is copied during a backup. All page checksums are validated during a full backup and checksums in files that have changed are validated during differential and incremental backups.

Validation failures do not stop the backup process, but warnings with details of exactly which pages have failed validation are output to the console and file log.

This feature allows page-level corruption to be detected early, before backups that contain valid copies of the data have expired.
### Backup Resume
An interrupted backup can be resumed from the point where it was stopped. Files that were already copied are compared with the checksums in the manifest to ensure integrity. Since this operation can take place entirely on the repository host, it reduces load on the PostgreSQL host and saves time since checksum calculation is faster than compressing and retransmitting data.
### Streaming Compression & Checksums
Compression and checksum calculations are performed in stream while files are being copied to the repository, whether the repository is located locally or remotely.

If the repository is on a repository host, compression is performed on the PostgreSQL host and files are transmitted in a compressed format and simply stored on the repository host. When compression is disabled, a lower level of compression is still applied to data in transit to make efficient use of available bandwidth while keeping CPU cost to a minimum.
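Performing both calculations "in stream" means each chunk of a file is read once and fed to the checksum and the compressor in the same pass. The Python sketch below illustrates that pattern only: `stream_to_repo` and its `write` callback are hypothetical, and zlib and SHA-256 are stand-ins chosen for the example.

```python
import hashlib
import zlib


def stream_to_repo(chunks, write):
    """Checksum and compress an iterable of byte chunks in a single pass.

    `write` is a hypothetical callback that receives compressed pieces
    (for example, a socket send or a file write). Returns the hex digest
    of the uncompressed data.
    """
    digest = hashlib.sha256()
    compressor = zlib.compressobj()
    for chunk in chunks:
        digest.update(chunk)  # checksum the original bytes
        piece = compressor.compress(chunk)
        if piece:  # the compressor may buffer and emit nothing yet
            write(piece)
    write(compressor.flush())  # emit whatever is still buffered
    return digest.hexdigest()
```

Because nothing is buffered beyond one chunk and the compressor's internal state, memory use stays flat regardless of file size.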
### Delta Restore
The manifest contains checksums for every file in the backup, so a restore can use them to skip files that are already correct. On a delta restore, any files not present in the backup are first removed and then checksums are generated for the remaining files. Files that match the backup are left in place and the rest of the files are restored as usual. Parallel processing can lead to a dramatic reduction in restore times.
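The delta restore steps described above can be sketched in a few lines. This is a toy Python model rather than pgBackRest's implementation: directories are represented as dictionaries of path to bytes, `delta_restore` is a hypothetical name, and SHA-256 is an assumption made for the example.

```python
import hashlib


def delta_restore(target, backup):
    """Bring `target` in line with `backup`, copying only what differs.

    Both arguments map a relative path to file contents (bytes);
    `target` is modified in place. Returns (removed, restored) path lists.
    """
    # Stand-in for the backup manifest: path -> checksum.
    manifest = {p: hashlib.sha256(d).hexdigest() for p, d in backup.items()}

    # Step 1: remove files that are not part of the backup.
    removed = sorted(p for p in target if p not in manifest)
    for path in removed:
        del target[path]

    # Step 2: checksum what remains; restore only missing or changed files.
    restored = []
    for path, expected in sorted(manifest.items()):
        current = target.get(path)
        if current is None or hashlib.sha256(current).hexdigest() != expected:
            target[path] = backup[path]
            restored.append(path)
    return removed, restored
```

Files whose checksums already match are never copied, which is where the time saving comes from when most of the data directory is intact.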
### Parallel, Asynchronous WAL Push & Get
Dedicated commands are included for pushing WAL to the archive and getting WAL from the archive. Both commands support parallelism to accelerate processing and run asynchronously to provide the fastest possible response time to PostgreSQL.
WAL push automatically detects WAL segments that are pushed multiple times and de-duplicates when the segment is identical; otherwise an error is raised. Asynchronous WAL push allows transfer to be offloaded to another process which compresses WAL segments in parallel for maximum throughput. This can be a critical feature for databases with extremely high write volume.
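The duplicate-handling rule for WAL push can be modeled as follows. This is an illustrative Python sketch, not the actual archive-push code: the `WalArchive` class is hypothetical, the archive is an in-memory dictionary, and comparing SHA-256 digests is an assumption about how "identical" might be decided.

```python
import hashlib


class WalArchive:
    """Toy in-memory archive keyed by WAL segment name (hypothetical)."""

    def __init__(self):
        self._checksums = {}

    def push(self, segment_name, data):
        """Store a segment; return False if an identical copy already exists.

        Raises ValueError when the name exists with different content.
        """
        checksum = hashlib.sha256(data).hexdigest()
        existing = self._checksums.get(segment_name)
        if existing is None:
            self._checksums[segment_name] = checksum
            return True
        if existing == checksum:
            return False  # duplicate push of the same segment: nothing to do
        raise ValueError(
            f"WAL segment {segment_name} already exists with different content"
        )
```

Treating an identical re-push as success lets PostgreSQL safely retry an archive command, while a conflicting segment signals a misconfiguration that should not be silently overwritten.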
Asynchronous WAL get maintains a local queue of WAL segments that are decompressed and ready for replay. This reduces the time needed to provide WAL to PostgreSQL which maximizes replay speed. Higher-latency connections and storage (such as S3) benefit the most.

The push and get commands both ensure that the database and repository match by comparing PostgreSQL versions and system identifiers. This virtually eliminates the possibility of misconfiguring the WAL archive location.
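The match check can be pictured as a comparison of two small records before any WAL is transferred. The Python sketch below is only a model: `check_archive_match` and the dictionary keys `version` and `system_id` are hypothetical names, not pgBackRest's internal structures.

```python
def check_archive_match(db, repo):
    """Raise ValueError unless database and repository describe the same cluster.

    Both arguments are dicts with the hypothetical keys "version"
    (PostgreSQL version) and "system_id" (system identifier).
    """
    for key in ("version", "system_id"):
        if db[key] != repo[key]:
            raise ValueError(
                f"{key} mismatch: database has {db[key]!r}, "
                f"repository has {repo[key]!r}"
            )
```

Failing fast here prevents WAL from one cluster from being written into, or read from, another cluster's archive.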
### Tablespace & Link Support
Tablespaces are fully supported and on restore tablespaces can be remapped to any location. It is also possible to remap all tablespaces to one location with a single command which is useful for development restores.

File and directory links are supported for any file or directory in the PostgreSQL cluster. When restoring it is possible to restore all links to their original locations, remap some or all links, or restore some or all links as normal files or directories within the cluster directory.
### S3, Azure, and GCS Compatible Object Store Support
pgBackRest repositories can be located in S3, Azure, and GCS compatible object stores to allow for virtually unlimited capacity and retention.
### Encryption
pgBackRest can encrypt the repository to secure backups wherever they are stored.
### Compatibility with ten versions of PostgreSQL
pgBackRest includes support for ten versions of PostgreSQL, the five supported versions and the last five EOL versions. This allows ample time to upgrade to a supported version.
## Getting Started
pgBackRest strives to be easy to configure and operate:
- [User guides](http://www.pgbackrest.org/user-guide-index.html) for various operating systems and PostgreSQL versions.
- [Command reference](http://www.pgbackrest.org/command.html) for command-line operations.
- [Configuration reference](http://www.pgbackrest.org/configuration.html) for creating pgBackRest configurations.
Documentation for v1 can be found [here](http://www.pgbackrest.org/1). No further releases are planned for v1 because v2 is backward-compatible with v1 options and repositories.
## Contributions
Contributions to pgBackRest are always welcome! Please see our [Contributing Guidelines](https://github.com/pgbackrest/pgbackrest/blob/main/CONTRIBUTING.md) for details on how to contribute features, improvements or issues.
## Support
pgBackRest is completely free and open source under the [MIT](https://github.com/pgbackrest/pgbackrest/blob/main/LICENSE) license. You may use it for personal or commercial purposes without any restrictions whatsoever. Bug reports are taken very seriously and will be addressed as quickly as possible.
Creating a robust disaster recovery policy with proper replication and backup strategies can be a very complex and daunting task. You may find that you need help during the architecture phase and ongoing support to ensure that your enterprise continues running smoothly.
[Crunchy Data](http://www.crunchydata.com) provides packaged versions of pgBackRest for major operating systems and expert full life-cycle commercial support for pgBackRest and all things PostgreSQL. [Crunchy Data](http://www.crunchydata.com) is committed to providing open source solutions with no vendor lock-in, ensuring that cross-compatibility with the community version of pgBackRest is always strictly maintained.
Please visit [Crunchy Data](http://www.crunchydata.com) for more information.
## Recognition
Primary recognition goes to Stephen Frost for all his valuable advice and criticism during the development of pgBackRest.
[Crunchy Data](http://www.crunchydata.com) has contributed significant time and resources to pgBackRest and continues to actively support development. [Resonate](http://www.resonate.com) also contributed to the development of pgBackRest and allowed early (but well tested) versions to be installed as their primary PostgreSQL backup solution.
[Armchair](https://thenounproject.com/search/?q=lounge+chair&i=129971) graphic by [Sandor Szabo](https://thenounproject.com/sandorsz).