pg_probackup fork of pg_arman by Postgres Professional
========================================
pg_probackup is a backup and recovery manager for PostgreSQL servers. It can
take full and differential backups and restore a cluster to a state defined by
a given recovery target. It is designed to perform periodic backups of an
existing PostgreSQL server and, combined with WAL archives, to provide a way to
recover the server after a failure of any kind. Its differential backup
facility reduces the amount of data that must be copied between two
consecutive backups.

Main features:
* incremental backup from WAL and PTRACK
* backup from replica
* multithreaded backup and restore
* autonomous backup without archive command (requires a replication slot)

Requirements:
* PostgreSQL >= 9.5
* gcc >= 4.4, clang >= 3.6, or XLC >= 12.1
* pthread
Download
--------
The latest version of this software can be found on the project website at
https://github.com/postgrespro/pg_probackup. The original project from which
pg_probackup was forked, pg_arman, can be found at
https://github.com/michaelpq/pg_arman.
Installation
------------
Compiling pg_probackup requires an installed PostgreSQL as well as a raw
PostgreSQL source tree. Pass the path to the source tree to make in the
top_srcdir variable:

    make USE_PGXS=1 top_srcdir=<path to PostgreSQL source tree>

In addition, pg_config must be in $PATH.
The current version of pg_probackup is compatible with PostgreSQL 9.5 and
later versions.
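
The requirements above can be checked before invoking make. The following is a minimal sketch, not part of pg_probackup: the helper names `pg_version_supported` and `build_pg_probackup` are hypothetical, and it assumes that `pg_config --version` prints a plain numeric version such as `PostgreSQL 9.6.2` (beta or development version strings are not handled).

```bash
# Succeed if a version string such as "9.6.2" or "10.1" is 9.5 or newer.
# Hypothetical helper; only plain numeric versions are handled.
pg_version_supported() {
    major=${1%%.*}
    case "$1" in
        *.*) rest=${1#*.}; minor=${rest%%.*} ;;
        *)   minor=0 ;;
    esac
    [ "$major" -gt 9 ] || { [ "$major" -eq 9 ] && [ "$minor" -ge 5 ]; }
}

# Build pg_probackup against the source tree given as the first argument.
build_pg_probackup() {
    command -v pg_config >/dev/null 2>&1 || {
        echo "pg_config not found in PATH" >&2
        return 1
    }
    # "pg_config --version" prints e.g. "PostgreSQL 9.6.2"
    version=$(pg_config --version | awk '{print $2}')
    pg_version_supported "$version" || {
        echo "PostgreSQL $version is older than 9.5" >&2
        return 1
    }
    make USE_PGXS=1 top_srcdir="$1"
}
```

With these functions loaded, `build_pg_probackup /path/to/postgresql/source` would run the same make command as above, but only after both checks pass.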
Platforms
---------
pg_probackup has been tested on Linux and Unix-based platforms.
Documentation
-------------
All the documentation can be found [here](doc/pg_probackup.md).
Regression tests
----------------
To run the tests you need Python 2.7, or Python 3.3 or higher. It is also a
good idea to create a virtual environment with `virtualenv`.
First, install the `testgres` Python module, which provides helper
functions for starting PostgreSQL clusters and running queries:
```
pip install testgres
```
To run the tests, execute:
```
python -m unittest tests
```
from the root directory of the project. To test against a specific PostgreSQL
build, specify the path to its pg_config executable by setting the PG_CONFIG
environment variable:
```
export PG_CONFIG=/path/to/pg_config
```
Block level incremental backup
------------------------------
The idea of block-level incremental backup is to back up only the blocks
changed since the last full backup. This gives two major benefits: backups
are taken faster and they are smaller.
The major question here is how to get the list of changed blocks. Since
each block contains an LSN, changed blocks could be found by a full scan
of all the blocks. But this approach consumes as much server I/O as a full
backup.
This is why we implemented alternative approaches to retrieve the
list of changed blocks.
1. Scan the WAL archive and extract changed blocks from it. The shortcoming
of this approach is that it requires a WAL archive.
2. Track a bitmap of changed blocks inside PostgreSQL (ptrack). This introduces
some overhead to PostgreSQL performance. In our experiments it appears to be
less than 3%.
Both approaches are implemented in pg_probackup. The second
approach requires a [patch for PostgreSQL 9.6.2](https://gist.github.com/alubennikova/9daacf35790eca1a09b63a1bca86d836) or a
[patch for PostgreSQL 10 (master)](https://gist.github.com/alubennikova/d24f61804525f0248fa71a1075158c21).
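
To make the LSN-based check mentioned earlier in this section concrete: an LSN is usually printed as two hexadecimal numbers separated by a slash, for example `0/16B3748`, and a block is a candidate for an incremental backup when its LSN is not older than the LSN at which the previous backup started. The sketch below only illustrates this comparison. The function names are hypothetical, the comparison rule is an assumption rather than a description of pg_probackup internals, and the code relies on 64-bit shell arithmetic.

```bash
# Convert an LSN in "hi/lo" hexadecimal notation to a single integer.
lsn_to_int() {
    hi=${1%%/*}
    lo=${1##*/}
    echo $(( (0x$hi << 32) | 0x$lo ))
}

# Succeed if a block with LSN $1 was modified at or after LSN $2
# (assumed here to be the start LSN of the previous backup).
block_changed() {
    [ "$(lsn_to_int "$1")" -ge "$(lsn_to_int "$2")" ]
}

# Example: the first block was modified after the previous backup started,
# the second one was not.
block_changed 0/16B3748 0/1000000 && echo "copy block"
block_changed 0/0FFFFFF 0/1000000 || echo "skip block"
```

Applying such a check to every block is exactly the full scan whose I/O cost the WAL-based and ptrack-based approaches are meant to avoid.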
Testing block level incremental backup
--------------------------------------
You need to apply the ptrack patch to [PostgreSQL 9.6.2](https://gist.github.com/alubennikova/9daacf35790eca1a09b63a1bca86d836)
or [PostgreSQL 10 (master)](https://gist.github.com/alubennikova/d24f61804525f0248fa71a1075158c21).
Alternatively, you can build and install the [PGPRO9_5 or PGPRO9_6 branch of PostgreSQL](https://github.com/postgrespro/postgrespro).
Note that the PGPRO branches currently contain an old version of ptrack.
### Retrieving changed blocks from WAL archive
Enable WAL archiving by adding the following lines to postgresql.conf:
```
wal_level = archive
archive_mode = on
archive_command = 'test ! -f /home/postgres/backup/wal/%f && cp %p /home/postgres/backup/wal/%f'
```
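
In the `archive_command` above, PostgreSQL replaces `%p` with the path of the WAL segment to archive and `%f` with its file name, and it treats a non-zero exit status as a failed attempt that will be retried later. As a rough sketch, the command behaves like the following shell function (the name `archive_wal` and its argument order are hypothetical, introduced only for illustration):

```bash
# archive_wal SRC NAME DIR: copy WAL segment SRC into DIR as NAME,
# refusing to overwrite a segment that has already been archived.
archive_wal() {
    test ! -f "$3/$2" && cp "$1" "$3/$2"
}
```

The first call for a given segment copies the file and succeeds. A second call for the same segment returns a non-zero status because the target already exists, which protects the archive from being silently overwritten.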
Example backup (assuming PostgreSQL is running):
```bash
# Initialize the pg_probackup backup directory
pg_probackup init -B /home/postgres/backup
# Take a full backup with 2 threads in verbose mode
pg_probackup backup -B /home/postgres/backup -D /home/postgres/pgdata -b full -v -j 2
# Show backup information
pg_probackup show -B /home/postgres/backup
# Now insert or update some data in your database,
# then start the incremental backup
pg_probackup backup -B /home/postgres/backup -D /home/postgres/pgdata -b page -v -j 2
# The increment should be much smaller than the full backup
pg_probackup show -B /home/postgres/backup
```
To restore after removing your pgdata directory, run:
```
pg_probackup restore -B /home/postgres/backup -D /home/postgres/pgdata -j 4 --verbose
```
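
For periodic backups, the backup commands in this section can be wrapped in a small script. The following sketch is one possible arrangement and not something shipped with pg_probackup: the weekly schedule, the function names, and the `BACKUP_DIR` variable are assumptions, while the `-B`, `-D`, `-b` and `-j` options are the ones used in the examples of this README.

```bash
# Choose the backup mode from the ISO day of week (1 = Monday ... 7 = Sunday):
# a full backup on Sunday, a page-level incremental backup otherwise.
backup_mode_for_day() {
    if [ "$1" -eq 7 ]; then
        echo full
    else
        echo page
    fi
}

# Run today's backup. Paths default to the ones used in this README.
run_periodic_backup() {
    backup_dir=${BACKUP_DIR:-/home/postgres/backup}
    pgdata=${PGDATA:-/home/postgres/pgdata}
    mode=$(backup_mode_for_day "$(date +%u)")
    pg_probackup backup -B "$backup_dir" -D "$pgdata" -b "$mode" -j 2
}
```

Calling `run_periodic_backup` from a daily scheduled job would then produce one full backup per week with page-level increments in between.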
### Retrieving changed blocks from ptrack
The advantage of this approach is that you don't have to keep a WAL archive. You need to enable ptrack in postgresql.conf (restart required):
```
ptrack_enable = on
```
Also, some WAL segments still need to be fetched in order to get a consistent backup. pg_probackup can fetch them through the streaming replication protocol. Thus, you also need to [enable streaming replication connection](https://wiki.postgresql.org/wiki/Streaming_Replication).
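
As a rough illustration of what enabling a streaming replication connection typically involves, the server must allow at least one WAL sender and pg_hba.conf must contain a `replication` entry for the connecting user. The sketch below appends such settings to a cluster's configuration files. The function name is hypothetical, and the values, the `postgres` user, and the local-only address are assumptions to adapt to your setup.

```bash
# Append minimal streaming replication settings to the configuration files
# of the cluster in directory $1. The values are illustrative assumptions.
enable_streaming_replication() {
    cat >> "$1/postgresql.conf" <<'EOF'
wal_level = archive        # or a higher level
max_wal_senders = 2        # at least one WAL sender must be allowed
EOF
    cat >> "$1/pg_hba.conf" <<'EOF'
host    replication    postgres    127.0.0.1/32    md5
EOF
}
```

After something like `enable_streaming_replication /home/postgres/pgdata`, the server has to be restarted for the new `wal_level` and `max_wal_senders` values to take effect.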
Example backup (assuming PostgreSQL is running):
```bash
# Initialize the pg_probackup backup directory
pg_probackup init -B /home/postgres/backup
# Take a full backup with 2 threads in verbose mode
pg_probackup backup -B /home/postgres/backup -D /home/postgres/pgdata -b full -v -j 2 --stream
# Show backup information
pg_probackup show -B /home/postgres/backup
# Now insert or update some data in your database,
# then start the incremental backup
pg_probackup backup -B /home/postgres/backup -D /home/postgres/pgdata -b ptrack -v -j 2 --stream
# The increment should be much smaller than the full backup
pg_probackup show -B /home/postgres/backup
```
To restore after removing your pgdata directory, run:
```
pg_probackup restore -B /home/postgres/backup -D /home/postgres/pgdata -j 4 --verbose --stream
```
License
-------
pg_probackup can be distributed under the PostgreSQL license. See the COPYRIGHT
file for more information. pg_arman is a fork of the existing project
pg_rman, initially created and maintained by NTT and Itagaki Takahiro.