Bug Fixes:
* Fix option warnings breaking async archive-get/archive-push. (Reviewed by Cynthia Shang. Reported by Lev Kokotov.)
* Fix memory leak in backup during archive copy. (Reviewed by Cynthia Shang. Reported by Christian ROUX, Efremov Egor.)
* Fix stack overflow in cipher passphrase generation. (Reviewed by Cynthia Shang. Reported by bsiara.)
* Fix repo-ls / on S3 repositories. (Reviewed by Cynthia Shang. Reported by Lesovsky Alexey.)
Features:
* Multiple repository support. (Contributed by Cynthia Shang, David Steele. Reviewed by Stefan Fercot, Stephen Frost.)
* GCS support for repository storage. (Reviewed by Cynthia Shang.)
* Add archive-header-check option. (Reviewed by Stephen Frost, Cynthia Shang. Suggested by Hans-Jürgen Schönig.)
Improvements:
* Include recreated system databases during selective restore. (Contributed by Stefan Fercot. Reviewed by Cynthia Shang.)
* Exclude content-length from S3 signed headers. (Reviewed by Cynthia Shang. Suggested by Brian P Bockelman.)
* Consolidate less commonly used repository storage options. (Reviewed by Cynthia Shang.)
* Allow custom config-path default with ./configure --with-configdir. (Contributed by Michael Schout. Reviewed by David Steele.)
* Log archive copy during backup. (Reviewed by Cynthia Shang, Stefan Fercot.)
Documentation Improvements:
* Update reference to include links to user guide examples. (Contributed by Cynthia Shang. Reviewed by David Steele.)
* Update selective restore documentation with caveats. (Reviewed by Cynthia Shang, Stefan Fercot.)
* Add compress-type clarification to archive-copy documentation. (Reviewed by Cynthia Shang, Stefan Fercot.)
* Add compress-level defaults per compress-type value. (Contributed by Cynthia Shang. Reviewed by David Steele.)
* Add note about required NFS settings being the same as PostgreSQL. (Contributed by Cynthia Shang. Reviewed by David Steele.)
hrnReplaceKey() was added to the TEST_ERROR*() macros in 58760486 but some calls to TEST_ERROR*() already used it. This led to the function being called twice on the same buffer which had no effect but valgrind definitely did not like.
Remove extraneous calls to make valgrind happy. Since this is test code there are no implications for production.
The command-example and command-example-list elements were removed from the documentation rendering some time ago so these tags were dead code. The tags, however, contained some examples and information that were pertinent to the command, so where possible, the information was included in the description of the command and/or the user-guide and links to the relevant user guide sections were added.
Note that some commands could not be updated with user guide references since doing so would cause a cyclical reference in the user guide. These commands have an internal comment to indicate this.
In addition, some clarifications were added (e.g. expire --set option) where information was lacking.
Enabled by default, this option checks the WAL header against the PostgreSQL version and system identifier to ensure that the WAL is being copied to the correct stanza. This is in addition to checking pg_control against the stanza and verifying that WAL is being copied from the same PostgreSQL data directory where pg_control is located.
Therefore, disabling this check is fairly safe but should only be done when required, e.g. if the WAL is encrypted.
3b8f0ef missed some cases that could cause archive-push to fail:
* Checking archive info.
* Checking to see if a WAL segment already exists.
These cases are now handled so archive-push can succeed on any valid repos.
This improvement reduces the number of errors thrown; these errors will now be reported as a status for the stanza or repo as appropriate. Invalid option configurations are still thrown but all other errors are caught, formatted and reported. This was necessary for multiple repositories so that the command can complete gathering information from each repository and report the results rather than immediately aborting when an error occurs.
Two new error codes were introduced:
6 = requested backup not found
99 = other, which is used to indicate an error has occurred that requires more details to be provided
A new stanza name of "[invalid]" was created for instances where a stanza was not specified and no stanza can be found.
If there is only one repository configured the error will move up to the stanza level with the standard error formatting of 'error (message)' where the message will be "other" and the details of the error will be listed on the next line(s):
stanza: stanza1
status: error (other)
[CryptoError] unable to load info file '/var/lib/pgbackrest/repo/backup/stanza1/backup.info' or '/var/lib/pgbackrest/repo/backup/stanza1/backup.info.copy':
CryptoError: cipher header invalid
HINT: is or was the repo encrypted?
FileMissingError: unable to open missing file '/var/lib/pgbackrest/repo/backup/stanza1/backup.info.copy' for read
HINT: backup.info cannot be opened and is required to perform a backup.
HINT: has a stanza-create been performed?
HINT: use option --stanza if encryption settings are different for the stanza than the global
cipher: aes-256-cbc
If a backup set is requested but is not found on any repo, a stanza-level status error of 'requested backup not found' is reported when there are no other errors:
pgbackrest info --stanza=demo --set=bogus
stanza: demo
status: error (requested backup not found)
cipher: mixed
repo1: aes-256-cbc
repo2: none
If there are multiple repositories configured and a single repo is in error but the other repos are ok or have a different error:
pgbackrest info --stanza=demo --set=20210322-171211F
stanza: demo
status: mixed
repo1: error
[CryptoError] unable to load info file '/var/lib/pgbackrest/repo/backup/stanza1/backup.info' or '/var/lib/pgbackrest/repo/backup/stanza1/backup.info.copy':
CryptoError: cipher header invalid
HINT: is or was the repo encrypted?
FileMissingError: unable to open missing file '/var/lib/pgbackrest/repo/backup/stanza1/backup.info.copy' for read
HINT: backup.info cannot be opened and is required to perform a backup.
HINT: has a stanza-create been performed?
HINT: use option --stanza if encryption settings are different for the stanza than the global
repo2: ok
cipher: mixed
repo1: aes-256-cbc
repo2: none
db (current)
wal archive min/max (12): 000000010000000000000001/000000010000000000000003
full backup: 20210322-171211F
timestamp start/stop: 2021-03-22 17:12:11 / 2021-03-22 17:12:28
wal start/stop: 000000010000000000000002 / 000000010000000000000002
database size: 23.4MB, database backup size: 23.4MB
repo2: backup set size: 2.8MB, backup size: 2.8MB
database list: postgres (13359)
Json output will include the repository information and any error information. If no stanzas are found, then [invalid] will be set as the name:
[
{
"archive":[],
"backup":[],
"cipher":"none",
"db":[],
"name":"[invalid]",
"repo":[
{
"cipher":"none",
"key":1,
"status":{
"code":99,
"message":"[PathOpenError] unable to list file info for path '/var/lib/pgbackrest/repo2/backup': [13] Permission denied"
}
}
],
"status":{
"code":99,
"lock":{"backup":{"held":false}},
"message":"other"
}
}
]
The content-length header was being signed since it was the only header that didn't need to be and it seemed simpler just to sign it as well. Also, the S3 documentation encourages signing as many headers as possible to avoid tampering.
However, some proxies munge this header causing authentication failure, so skip signing content-length.
Make protocol handlers have one function per command. This allows the logic of finding the handler to be in ProtocolServer, isolates each command to a function, and removes the need to test the "not found" condition for each handler.
S3 returns 200 for HEAD / which indicates it is a file but does not return the expected headers which causes an error.
Rather than fix this for S3, just automatically return / as not existing for any storage that does not support paths.
Also add some defensive checks to prevent this from generating a segfault if it happens again.
Some standard system databases (e.g. postgres) may be recreated by the user and have an OID that makes them look like user databases.
Identify the standard three system databases (template0, template1, postgres) and restore them non-zeroed no matter what OID they have.
Cipher type was inferred from the presence of cipherSubPass rather than being passed explicitly in order to maintain compatibility with Perl backupFile().
Now that Perl is gone it makes sense to pass it explicitly, as we do elsewhere.
This test was added to take the place of another test, which turned out not to be workable.
Even so, it adds coverages at little cost so it seems worth keeping.
Recovery may error unless --type=immediate is specified. This is because after consistency is reached PostgreSQL will flag zeroed pages as errors even for a full-page write.
For PostgreSQL ≥ 13 the ignore_invalid_pages setting may be used to ignore invalid pages. In this case it is important to check the logs after recovery to ensure that no invalid pages were reported in the selected databases.
It is best if the archive-push and backup commands have the same compress-type (e.g. lz4) when using archive-copy. Otherwise, the WAL segments will need to be recompressed with the compress-type used by the backup, which can be fairly expensive depending on how much WAL was generated during the backup.
When the FUNCTION_*_RESULT*() macros were renamed to FUNCTION_*_RETURN_*() in the core code the test harness macros were missed.
Update them to make the naming consistent.
There was already leakage here but when the compression transcoding was added it became a deluge.
There is some argument to be made that the filters should clean themselves up better but a temp mem context makes sense here anyway so do that.
The stanza-create, stanza-upgrade and stanza-delete were required to be run on the repository host. When there was only one repository allowed this was not a problem.
However, with the introduction of multiple repository support, this becomes more of a burden to the user, therefore the stanza-create, stanza-upgrade and stanza-delete commands have been improved to allow for them to be run remotely.
Moving to YAML allows the configuration data to be read by C programs.
Also go back to using YAML::XS since it is the only implementation that has proper boolean support.
Up to four repositories may be configured. A potential benefit is the ability to have a local repository for fast restores and a remote repository for redundancy.
Some commands, e.g. stanza-create/stanza-update, will automatically work with all configured repositories while others, e.g. stanza-delete, will require a repository to be specified using the repo option. See the command reference for details on which commands require the repository to be specified.
Note that the repo option is not required when only repo1 is configured in order to maintain backward compatibility. However, the repo option is required when a single repo is configured as, e.g. repo2. This is to prevent command breakage if a new repository is added later.
The archive-push command will always push WAL to the archive in all configured repositories but backups will need to be scheduled individually for each repository. In many cases this is desirable since backup types and retention will vary by repository. Likewise, restores must specify a repository. It is generally better to specify a repository for restores that has low latency/cost even if that means more recovery time. Only restore testing can determine which repository will be most efficient.
For single repository configurations there should be no change in behavior.
The HTML command reference was showing some options that were not valid because it did not properly understand the new role validity system. Also, the custom section for the new repo option was not being honored.
This is a bit messy because it leads to some duplicated code in help.c but there doesn't seem to be any way to fix that with the Perl data structures as they are.
This code is being migrated to C so it doesn't seem worth messing with it too much with the risk of breaking other things.
Some commands (repo-*, verify) still required the --repo option but it makes sense to give them the same treatment as backup and simply use the first repo when one is not specified.
This leaves stanza-delete as the only remaining command that requires --repo. This is by design to enhance safe usage.
The following options are renamed as specified:
repo1-azure-ca-file -> repo1-storage-ca-file
repo1-azure-ca-path -> repo1-storage-ca-path
repo1-azure-host -> repo1-storage-host
repo1-azure-port -> repo1-storage-port
repo1-azure-verify-tls -> repo1-storage-verify-tls
repo1-s3-ca-file -> repo1-storage-ca-file
repo1-s3-ca-path -> repo1-storage-ca-path
repo1-s3-host -> repo1-storage-host
repo1-s3-port -> repo1-storage-port
repo1-s3-verify-tls -> repo1-storage-verify-tls
The old option names (e.g. repo1-s3-port) will continue to work for repo1, but repo2, etc. will require the new names.
This allows the removal of the callback in the S3/Azure storage drivers that existed only to parse the size/time information.
The extra callback was required because not all callers of storage*ListInternal() want size/time info, so it was wasteful to add it to storage*ListInternal(). Now those callers can request type info only.
This wasn't exposed before because the remote protocol directly uses the storage driver, which bypasses the writeable checks.
However, the upcoming GCS driver explicitly requests write permissions so remote operations fail when a write is required.
It would be far better if the remote itself was marked as writeable but that will require much more work.