1
0
mirror of https://github.com/pgbackrest/pgbackrest.git synced 2024-12-14 10:13:05 +02:00
pgbackrest/README.md

106 lines
8.4 KiB
Markdown
Raw Normal View History

# pgBackRest <br/> Reliable PostgreSQL Backup & Restore
2014-03-06 03:51:03 +03:00
## Introduction
pgBackRest aims to be a simple, reliable backup and restore system that can seamlessly scale up to the largest databases and workloads.
v0.10: Backup and archiving are functional This version has been put into production at Resonate, so it does work, but there are a number of major caveats. * No restore functionality, but the backup directories are consistent Postgres data directories. You'll need to either uncompress the files or turn off compression in the backup. Uncompressed backups on a ZFS (or similar) filesystem are a good option because backups can be restored locally via a snapshot to create logical backups or do spot data recovery. * Archiving is single-threaded. This has not posed an issue on our multi-terabyte databases with heavy write volume. Recommend a large WAL volume or to use the async option with a large volume nearby. * Backups are multi-threaded, but the Net::OpenSSH library does not appear to be 100% threadsafe so it will very occasionally lock up on a thread. There is an overall process timeout that resolves this issue by killing the process. Yes, very ugly. * Checksums are lost on any resumed backup. Only the final backup will record checksum on multiple resumes. Checksums from previous backups are correctly recorded and a full backup will reset everything. * The backup.manifest is being written as Storable because Config::IniFile does not seem to handle large files well. Would definitely like to save these as human-readable text. * Absolutely no documentation (outside the code). Well, excepting these release notes. * Lots of other little things and not so little things. Much refactoring to follow.
2014-03-06 03:53:13 +03:00
Instead of relying on traditional backup tools like tar and rsync, pgBackRest implements all backup features internally and uses a custom protocol for communicating with remote systems. Removing reliance on tar and rsync allows for better solutions to database-specific backup challenges. The custom remote protocol allows for more flexibility and limits the types of connections that are required to perform a backup which increases security.
v1.06: Backup from Standby and Bug Fixes Bug Fixes: * Fixed an issue where a tablespace link that referenced another link would not produce an error, but instead skip the tablespace entirely. (Reported by Michael Vitale.) * Fixed an issue where options that should not allow multiple values could be specified multiple times in pgbackrest.conf without an error being raised. (Reported by Michael Vitale.) * Fixed an issue where the protocol-timeout option was not automatically increased when the db-timeout option was increased. (Reported by Todd Vernick.) Features: * Backup from a standby cluster. A connection to the primary cluster is still required to start/stop the backup and copy files that are not replicated, but the vast majority of files are copied from the standby in order to reduce load on the master. * More flexible configuration for databases. Master and standby can both be configured on the backup server and pgBackRest will automatically determine which is the master. This means no configuration changes for backup are required after failing over from a master to standby when a separate backup server is used. * Exclude directories during backup that are cleaned, recreated, or zeroed by PostgreSQL at startup. These include pgsql_tmp and pg_stat_tmp. The postgresql.auto.conf.tmp file is now excluded in addition to files that were already excluded: backup_label.old, postmaster.opts, postmaster.pid, recovery.conf, recovery.done. * Experimental support for non-exclusive backups in PostgreSQL 9.6 beta4. Changes to the control/catalog/WAL versions in subsequent betas may break compatibility but pgBackRest will be updated with each release to keep pace. Refactoring: * Simplify protocol creation and identifying which host is local/remote. * Removed all OP_* function constants that were used only for debugging, not in the protocol, and replaced with __PACKAGE__. * Improvements in Db module: separated out connect() function, allow executeSql() calls that do not return data, and improve error handling. * Improve error message for links that reference links in manifest build. * Added hints to error message when relative paths are detected in archive-push or archive-get. * Improve backup log messages to indicate which host the files are being copied from.
2016-08-25 17:49:09 +02:00
pgBackRest [v1.06](https://github.com/pgbackrest/pgbackrest/releases/tag/release/1.06) is the current stable release. Release notes are on the [Releases](http://www.pgbackrest.org/release.html) page.
## Features
2016-09-06 15:35:02 +02:00
### Multi-process Backup & Restore
2016-09-06 15:35:02 +02:00
Compression is usually the bottleneck during backup operations but, even with now ubiquitous multi-core servers, most database backup solutions are still single-process. pgBackRest solves the compression bottleneck with multi-processing.
Utilizing multiple cores for compression makes it possible to achieve 1TB/hr raw throughput even on a 1Gb/s link. More cores and a larger pipe lead to even higher throughput.
### Local or Remote Operation
A custom protocol allows pgBackRest to backup, restore, and archive locally or remotely via SSH with minimal configuration. An interface to query PostgreSQL is also provided via the protocol layer so that remote access to PostgreSQL is never required, which enhances security.
### Full, Incremental, & Differential Backups
Full, differential, and incremental backups are supported. pgBackRest is not susceptible to the time resolution issues of rsync, making differential and incremental backups completely safe.
### Backup Rotation & Archive Expiration
Retention polices can be set for full and differential backups to create coverage for any timeframe. WAL archive can be maintained for all backups or strictly for the most recent backups. In the latter case WAL required to make older backups consistent will be maintained in the archive.
### Backup Integrity
Checksums are calculated for every file in the backup and rechecked during a restore. After a backup finishes copying files, it waits until every WAL segment required to make the backup consistent reaches the repository.
Backups in the repository are stored in the same format as a standard PostgreSQL cluster (including tablespaces). If compression is disabled and hard links are enabled it is possible to snapshot a backup in the repository and bring up a PostgreSQL cluster directly on the snapshot. This is advantageous for terabyte-scale databases that are time consuming to restore in the traditional way.
All operations utilize file and directory level fsync to ensure durability.
### Backup Resume
An aborted backup can be resumed from the point where it was stopped. Files that were already copied are compared with the checksums in the manifest to ensure integrity. Since this operation can take place entirely on the backup server, it reduces load on the database server and saves time since checksum calculation is faster than compressing and retransmitting data.
### Streaming Compression & Checksums
Compression and checksum calculations are performed in stream while files are being copied to the repository, whether the repository is located locally or remotely.
If the repository is on a backup server, compression is performed on the database server and files are transmitted in a compressed format and simply stored on the backup server. When compression is disabled a lower level of compression is utilized to make efficient use of available bandwidth while keeping CPU cost to a minimum.
### Delta Restore
2016-09-06 15:35:02 +02:00
The manifest contains checksums for every file in the backup so that during a restore it is possible to use these checksums to speed processing enormously. On a delta restore any files not present in the backup are first removed and then checksums are taken for the remaining files. Files that match the backup are left in place and the rest of the files are restored as usual. Multi-processing can lead to a dramatic reduction in restore times.
### Advanced Archiving
Dedicated commands are included for both pushing WAL to the archive and retrieving WAL from the archive.
The push command automatically detects WAL segments that are pushed multiple times and de-duplicates when the segment is identical, otherwise an error is raised. The push and get commands both ensure that the database and repository match by comparing PostgreSQL versions and system identifiers. This precludes the possibility of misconfiguring the WAL archive location.
Asynchronous archiving allows compression and transfer to be offloaded to another process which maintains a continuous connection to the remote server, improving throughput significantly. This can be a critical feature for databases with extremely high write volume.
### Tablespace & Link Support
Tablespaces are fully supported and on restore tablespaces can be remapped to any location. It is also possible to remap all tablespaces to one location with a single command which is useful for development restores.
File and directory links are supported for any file or directory in the PostgreSQL cluster. When restoring it is possible to restore all links to their original locations, remap some or all links, or restore some or all links as normal files or directories within the cluster directory.
### Compatibility with PostgreSQL >= 8.3
pgBackRest includes support for versions down to 8.3, since older versions of PostgreSQL are still regularly utilized.
## Getting Started
pgBackRest strives to be easy to configure and operate:
- [User guide](http://www.pgbackrest.org/user-guide.html) for Debian & Ubuntu / PostgreSQL 9.4.
2015-10-28 11:19:33 +02:00
- [Command reference](http://www.pgbackrest.org/command.html) for command-line operations.
- [Configuration reference](http://www.pgbackrest.org/configuration.html) for creating pgBackRest configurations.
## Contributions
Contributions to pgBackRest are always welcome!
Code fixes or new features can be submitted via pull requests. Ideas for new features and improvements to existing functionality or documentation can be [submitted as issues](https://github.com/pgbackrest/pgbackrest/issues). You may want to check the [Feature Backlog](https://github.com/pgbackrest/pgbackrest/wiki#backlog) to see if your suggestion has already been submitted.
2016-04-16 17:11:29 +02:00
Bug reports should be [submitted as issues](https://github.com/pgbackrest/pgbackrest/issues). Please provide as much information as possible to aid in determining the cause of the problem.
You will always receive credit in the [release notes](http://www.pgbackrest.org/release.html) for your contributions.
## Support
2016-04-16 17:11:29 +02:00
pgBackRest is completely free and open source under the [MIT](https://github.com/pgbackrest/pgbackrest/blob/master/LICENSE) license. You may use it for personal or commercial purposes without any restrictions whatsoever. Bug reports are taken very seriously and will be addressed as quickly as possible.
Creating a robust disaster recovery policy with proper replication and backup strategies can be a very complex and daunting task. You may find that you need help during the architecture phase and ongoing support to ensure that your enterprise continues running smoothly.
[Crunchy Data](http://www.crunchydata.com) provides packaged versions of pgBackRest for major operating systems and expert full life-cycle commercial support for pgBackRest and all things PostgreSQL. [Crunchy Data](http://www.crunchydata.com) is committed to providing open source solutions with no vendor lock-in, ensuring that cross-compatibility with the community version of pgBackRest is always strictly maintained.
2016-04-16 17:11:29 +02:00
Please visit [Crunchy Data](http://www.crunchydata.com) for more information.
## Recognition
Primary recognition goes to Stephen Frost for all his valuable advice and criticism during the development of pgBackRest.
2016-01-14 05:48:35 +02:00
[Crunchy Data](http://www.crunchydata.com) has contributed significant time and resources to pgBackRest and continues to actively support development. [Resonate](http://www.resonate.com) also contributed to the development of pgBackRest and allowed early (but well tested) versions to be installed as their primary PostgreSQL backup solution.
[Armchair](https://thenounproject.com/search/?q=lounge+chair&i=129971) graphic by [Sandor Szabo](https://thenounproject.com/sandorsz).