Bug Fixes:
* Fix remote timeout in delta restore. When performing a delta restore on a largely unchanged cluster, the remote could time out if no files were fetched from the repository within protocol-timeout. Add keep-alives to prevent remote timeout. (Reported by James Sewell, Jens Wilke.)
* Fix handling of repeated HTTP headers. When HTTP headers are repeated they should be considered equivalent to a single comma-separated header rather than generating an error, which was the prior behavior. (Reported by donicrosby.)
Improvements:
* JSON output from the info command is no longer pretty-printed. Monitoring systems can more easily ingest the JSON without linefeeds. External tools such as jq can be used to pretty-print if desired. (Contributed by Cynthia Shang.)
* The check command is implemented entirely in C. (Contributed by Cynthia Shang.)
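As a rough illustration of the info command JSON change above, the following Python sketch contrasts compact and pretty-printed output. The data is invented for the example and is not the actual structure produced by the info command.

```python
import json

# Hypothetical data; not the real output structure of the info command.
info = [{"name": "demo", "status": {"code": 0, "message": "ok"}}]

# Compact form: a single line with no linefeeds, easy for monitoring to ingest.
compact = json.dumps(info, separators=(",", ":"))

# A consumer that wants readable output can re-indent after parsing.
pretty = json.dumps(json.loads(compact), indent=4)

print(compact)
print(pretty)
```

Both forms parse to the same data, so nothing is lost by dropping the pretty-printing.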
Documentation Improvements:
* Document how to contribute to pgBackRest. (Contributed by Cynthia Shang.)
* Document maximum version for auto-stop option. (Contributed by Brad Nicholson.)
Test Suite Improvements:
* Fix container test path being used when --vm=none. (Suggested by Stephen Frost.)
* Fix mismatched timezone in expect test. (Suggested by Stephen Frost.)
* Don't autogenerate embedded libc code by default. (Suggested by Stephen Frost.)
When HTTP headers are repeated they should be considered equivalent to a single comma-separated header rather than generating an error, which was the prior behavior.
Reported by donicrosby.
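The header handling described above can be sketched in Python as follows. This is only an illustration, not the pgBackRest implementation: the function name, the lower-casing of header names, and the ", " separator are assumptions made for the example.

```python
def merge_headers(pairs):
    """Combine repeated headers into one comma-separated value per name."""
    merged = {}
    for name, value in pairs:
        key = name.lower()  # assumption: treat header names case-insensitively
        if key in merged:
            # A repeated header is appended instead of raising an error.
            merged[key] = merged[key] + ", " + value
        else:
            merged[key] = value
    return merged


headers = merge_headers([
    ("Cache-Control", "no-cache"),
    ("Content-Length", "0"),
    ("Cache-Control", "no-store"),
])
print(headers)
```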
We had some problems with newer versions of MinIO so had held off on updating. Those problems appear to have been resolved.
In addition, the --compat flag is no longer required. Prior versions of MinIO required all parts of a multi-part upload (except the last) to be of equal size. The --compat flag was introduced to restore the default S3 behavior. Now --compat is only required when ETag is being used for MD5 verification, which we don't do.
This documentation shows how to build a development environment on Ubuntu 19.04 and should work for other Debian-based distros.
Note that this document is not included in automated testing due to some unresolved issues with Docker in Docker on Travis CI. We'll address this in the future when we add contributing documentation to the website.
This is only needed when new code is added to the Perl C library, which is becoming rare as the migration progresses.
Also, the code will vary slightly based on the Perl version used for generation so for normal users it is just noise.
Suggested by Stephen Frost.
When performing a delta restore on a largely unchanged cluster, the remote could time out if no files were fetched from the repository within protocol-timeout.
Add keep-alives to prevent remote timeout.
Reported by James Sewell, Jens Wilke.
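The keep-alive idea can be sketched as below. This is a simplified Python model rather than the actual C code: the function, its parameters, and the rule of pinging once an idle interval has elapsed are all assumptions for illustration.

```python
def restore_files(files, fetch, send_keep_alive, clock, interval):
    """Restore files, pinging the remote whenever it has been idle too long.

    files: iterable of (name, needs_fetch) pairs
    fetch, send_keep_alive: callables that contact the remote
    clock: callable returning the current time in seconds
    interval: idle seconds after which a keep-alive is sent
    """
    last_contact = clock()
    for name, needs_fetch in files:
        if needs_fetch:
            fetch(name)
            last_contact = clock()
        elif clock() - last_contact >= interval:
            # Unchanged file: nothing to fetch, but keep the remote awake.
            send_keep_alive()
            last_contact = clock()
```

In this model, a long run of unchanged files still produces periodic traffic to the remote, so the remote never waits longer than the interval for a message.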
Note that building the manifest on each host has been temporarily removed.
This feature will likely be brought back as a non-default option (after the manifest code has been fully migrated to C) since it can be fairly expensive.
Features:
* PostgreSQL 12 support.
* Add info command set option for detailed text output. The additional details include databases that can be used for selective restore and a list of tablespaces and symlinks with their default destinations. (Contributed by Cynthia Shang. Suggested by Stephen Frost, ejberdecia.)
* Add standby restore type. This restore type automatically adds standby_mode=on to recovery.conf for PostgreSQL < 12 and creates standby.signal for PostgreSQL ≥ 12, creating a common interface between PostgreSQL versions. (Reviewed by Cynthia Shang.)
Improvements:
* The restore command is implemented entirely in C. (Reviewed by Cynthia Shang.)
Documentation Improvements:
* Document the relationship between db-timeout and protocol-timeout. (Contributed by Cynthia Shang. Suggested by James Chanco Jr.)
* Add documentation clarifications regarding standby repositories. (Contributed by Cynthia Shang.)
* Add FAQ for time-based Point-in-Time Recovery. (Contributed by Cynthia Shang.)
Recovery settings are now written into postgresql.auto.conf instead of recovery.conf. Existing recovery_target* settings will be commented out to help avoid conflicts.
A comment is added before recovery settings to identify them as written by pgBackRest since it is unclear how, in general, old settings will be removed.
recovery.signal and standby.signal are automatically created based on the recovery settings.
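A minimal Python sketch of the postgresql.auto.conf handling described above follows. It is not the pgBackRest code: the function name, the marker comment wording, and the quoting of values are assumptions for illustration.

```python
def update_auto_conf(existing_lines, recovery_settings):
    """Return new postgresql.auto.conf lines with recovery settings appended."""
    result = []
    for line in existing_lines:
        # Comment out old recovery targets so they cannot conflict.
        if line.lstrip().startswith("recovery_target"):
            result.append("# " + line)
        else:
            result.append(line)
    # Hypothetical marker text; the real comment wording is not shown here.
    result.append("# Recovery settings written by pgBackRest")
    for key, value in recovery_settings.items():
        result.append(f"{key} = '{value}'")
    return result


new_conf = update_auto_conf(
    ["max_connections = 100", "recovery_target_time = '2019-01-01'"],
    {"restore_command": "example"},
)
print("\n".join(new_conf))
```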
The additional details include databases that can be used for selective restore and a list of tablespaces and symlinks with their default destinations.
This information is not included in the JSON output because it requires reading the manifest, which is too I/O intensive to do for all manifests. We plan to include this information for JSON in a future release.
This restore type automatically adds standby_mode=on to recovery.conf.
This could be accomplished previously by setting --recovery-option=standby_mode=on but PostgreSQL 12 requires standby mode to be enabled by a special file named standby.signal.
The new restore type allows us to maintain a common interface between PostgreSQL versions.
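The version split behind the common interface can be sketched in Python as follows. The function name and the return shape (a mapping of file name to content) are hypothetical and chosen only for the example.

```python
def standby_actions(pg_version):
    """Describe how standby mode is enabled for a PostgreSQL major version.

    Returns a mapping of file name to the content written to it.
    """
    if pg_version < 12:
        # Older versions take the setting in recovery.conf.
        return {"recovery.conf": "standby_mode=on"}
    # PostgreSQL 12 and later use an empty signal file instead.
    return {"standby.signal": ""}


print(standby_actions(11))
print(standby_actions(12))
```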
We haven't had the time to complete this documentation and it has suffered bit rot.
This prevents us from building the docs on PostgreSQL >= 11 so just comment it all out until it can be updated.
For the most part this is a direct migration of the Perl code into C.
There is one important behavioral change with regard to how file permissions are handled. The Perl code tried to set ownership as it was in the manifest even when running as an unprivileged user. This usually just led to errors and frustration.
The C code works like this:
If a restore is run as a non-root user (the typical scenario) then all files restored will belong to the user/group executing pgBackRest. If existing files are not owned by the executing user/group then an error will result if the ownership cannot be updated to the executing user/group. In that case the file ownership will need to be updated by a privileged user before the restore can be retried.
If a restore is run as the root user then pgBackRest will attempt to recreate the ownership recorded in the manifest when the backup was made. Only user/group names are stored in the manifest so the same names must exist on the restore host for this to work. If the user/group name cannot be found locally then the user/group of the PostgreSQL data directory will be used and finally root if the data directory user/group cannot be mapped to a name.
Reviewed by Cynthia Shang.
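The ownership rules above can be summarized as a small decision function. This Python sketch only models the decision for user names (groups would follow the same pattern). All names and parameters are hypothetical, and the error raised when ownership cannot be updated is not modeled.

```python
def resolve_owner(is_root, exec_user, manifest_user, local_users, data_dir_user):
    """Choose the owning user name for a restored file.

    local_users: set of user names that exist on the restore host
    data_dir_user: name of the data directory owner, or None if its
                   id cannot be mapped to a name
    """
    if not is_root:
        # Unprivileged restore: everything belongs to the executing user.
        return exec_user
    if manifest_user in local_users:
        # Root restore: recreate the ownership recorded in the manifest.
        return manifest_user
    if data_dir_user is not None:
        # Name not found locally: fall back to the data directory owner.
        return data_dir_user
    return "root"


print(resolve_owner(True, "root", "pgdata", {"postgres"}, "postgres"))
```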
The backup manifest stores a complete list of all files, links, and paths in a backup along with metadata such as checksums, sizes, timestamps, etc. A list of databases is also included for selective restore.
The purpose of the manifest is to allow the restore command to confidently reconstruct the PostgreSQL data directory and ensure that nothing is missing or corrupt. It is also useful for reporting, e.g. size of backup, backup time, etc.
For now, migrate enough functionality to implement the restore command.
Reviewed by Cynthia Shang.
These features finally make the ls command practical.
Currently the JSON contains only name, type, and size. We may add more fields in the future, but these seem like the minimum needed to be useful.
Info files required three copies in memory to be loaded (the original string, an ini representation, and the final info object). Not only was this memory inefficient, but the Ini object does sequential scans when searching for keys, making large files very slow to load.
This has not been an issue since archive.info and backup.info are very small, but it becomes a big deal when loading manifests with hundreds of thousands of files.
Instead of holding copies of the data in memory, use a callback to deliver the ini data directly to the object when loading. Use a similar method for save to avoid having an intermediate copy. Save is a bit complex because sections/keys must be written in alphabetical order or older versions of pgBackRest will not calculate the correct checksum.
Also move the load retry logic to helper functions rather than embedding it in the Info object. This allows for more flexibility in loading and ensures that stack traces will be available when developing unit tests.
Reviewed by Cynthia Shang.
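The callback approach described above might look roughly like this in Python. It is a simplified model, not the C implementation: the function names, the basic `key=value` parsing, and the blank line between sections are assumptions for the example.

```python
def ini_load(lines, callback):
    """Parse ini text line by line, handing each key/value to callback."""
    section = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith("[") and line.endswith("]"):
            section = line[1:-1]
        else:
            key, value = line.split("=", 1)
            # Deliver directly to the consumer; no intermediate ini object.
            callback(section, key, value)


def ini_save(data, write):
    """Emit sections and keys in alphabetical order, one line at a time."""
    for index, section in enumerate(sorted(data)):
        if index > 0:
            write("")
        write(f"[{section}]")
        for key in sorted(data[section]):
            write(f"{key}={data[section][key]}")


loaded = {}
ini_load(
    ["[db]", "db-id=1", "", "[backup]", "x=2"],
    lambda section, key, value: loaded.setdefault(section, {}).update({key: value}),
)
saved = []
ini_save(loaded, saved.append)
print(saved)
```

Because `ini_save` hands each line to `write` as it is produced, the caller can stream the lines to a file without building the whole text first.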
286a106a updated the documentation to build pgBackRest as an unprivileged user, but the wget command was missed. This command is not actually run, just displayed, because the release is not yet available when the documentation is built.
Update the wget command to run as the local user.
Bug Fixes:
* Improve slow manifest build for very large quantities of tables/segments. (Reported by Jens Wilke.)
* Fix exclusions for special files. (Reported by CluelessTechnologist, Janis Puris, Rachid Broum.)
Improvements:
* The stanza-create/update/delete commands are implemented entirely in C. (Contributed by Cynthia Shang.)
* The start/stop commands are implemented entirely in C. (Contributed by Cynthia Shang.)
* Create log directories/files with 0750/0640 mode. (Suggested by Damiano Albani.)
Documentation Bug Fixes:
* Fix yum.p.o package being installed when custom package specified. (Reported by Joe Ayers, John Harvey.)
Documentation Improvements:
* Build pgBackRest as an unprivileged user. (Suggested by Laurenz Albe.)
The {[os-type-is-centos]} expression was missing parentheses, which meant "and" expressions built on it would always evaluate true if the os-type was centos6.
Reported by Joe Ayers, John Harvey.
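The precedence problem can be reproduced with a small Python stand-in. It assumes the expression is substituted textually into a larger condition and that "and" binds more tightly than "or". The expression bodies and the `custom_package` variable are invented for the example and are not the documentation's actual definitions.

```python
def expand(expression, definitions):
    """Textually replace each {[name]} reference with its definition."""
    for name, body in definitions.items():
        expression = expression.replace("{[" + name + "]}", body)
    return expression


# Hypothetical definitions, without and with the enclosing parentheses.
unparenthesized = {"os-type-is-centos": "os_type == 'centos6' or os_type == 'centos7'"}
parenthesized = {"os-type-is-centos": "(os_type == 'centos6' or os_type == 'centos7')"}

condition = "{[os-type-is-centos]} and not custom_package"
variables = {"os_type": "centos6", "custom_package": True}

# Without parentheses this reads: centos6 or (centos7 and not custom_package).
buggy = eval(expand(condition, unparenthesized), {}, dict(variables))
fixed = eval(expand(condition, parenthesized), {}, dict(variables))
print(buggy, fixed)
```

With the parentheses in place, the "and" clause applies to the whole OS test instead of only its last alternative.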
Prior to 2.16 the Perl manifest code would skip any file that began with a dot. This was not intentional but it allowed PostgreSQL socket files to be located in the data directory. The new C code in 2.16 did not have this unintentional exclusion so socket files in the data directory caused errors.
Worse, the file type error was being thrown before the exclusion check so there was really no way around the issue except to move the socket files out of the data directory.
Special file types (e.g. socket, pipe) will now be automatically skipped and a warning logged to notify the user of the exclusion. The warning can be suppressed with an explicit --exclude.
Reported by CluelessTechnologist, Janis Puris, Rachid Broum.
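A minimal Python sketch of the skip-and-warn behavior described above is shown below. The function, its parameters, the warning text, and the use of exact-name matching for exclusions are assumptions; the real --exclude matching rules are not modeled.

```python
import stat


def should_backup(name, mode, excludes, warn):
    """Return True when the file should be added to the manifest."""
    # Check exclusions first so an explicit exclude suppresses the warning.
    if name in excludes:
        return False
    if stat.S_ISREG(mode) or stat.S_ISDIR(mode) or stat.S_ISLNK(mode):
        return True
    # Sockets, pipes, and other special files are skipped with a warning.
    warn(f"excluded special file '{name}' from backup")
    return False


warnings = []
socket_mode = stat.S_IFSOCK | 0o777
print(should_backup(".s.PGSQL.5432", socket_mode, set(), warnings.append))
print(warnings)
```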
Putting the checksum at the beginning of the file made it impossible to stream the file out when saving. The entire file had to be held in memory while it was checksummed so the checksum could be written at the beginning.
Instead place the checksum at the end. This does not break the existing Perl or C code since the read is not order dependent.
There are no plans to improve the Perl code to take advantage of this change, but it will make the C implementation more efficient.
Reviewed by Cynthia Shang.
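To illustrate why a trailing checksum allows streaming, here is a small Python sketch. The `checksum=` line format and the use of SHA-1 are assumptions for the example and do not reflect the actual info file layout.

```python
import hashlib
import io


def save_streaming(lines, out):
    """Write lines to out as they are produced, then append the checksum."""
    digest = hashlib.sha1()
    for line in lines:
        data = (line + "\n").encode("utf-8")
        digest.update(data)  # hash incrementally; no full copy is held
        out.write(data)
    # The checksum is only known at the end, so it is written last.
    out.write(("checksum=" + digest.hexdigest() + "\n").encode("utf-8"))


buffer = io.BytesIO()
save_streaming(["a=1", "b=2"], buffer)
print(buffer.getvalue().decode("utf-8"))
```

If the checksum had to come first, every line would need to be buffered until the digest was complete.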
pgBackRest was being built by root in the documentation which is definitely not best practice.
Instead build as the unprivileged default container user. Sudo privileges are still required to install.
Suggested by Laurenz Albe.
storagePosixInfoList() processed each directory in a single memory context. If the directory contained hundreds of thousands of files, processing became very slow due to the number of allocations.
Instead, reset the memory context every thousand files to minimize the number of allocations active at once, improving both speed and memory consumption.
Reported by Jens Wilke.
The log directories/files were being created with a mix of modes depending on whether they were created in C or Perl. In particular, the C code was creating log files with the execute bit set for the user and group which was just odd.
Standardize on 0750/0640 for both code paths.
Suggested by Damiano Albani.
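For reference, the standardized modes can be sketched in Python on a POSIX system as follows. The helper names are hypothetical. The explicit chmod calls are a choice made in this example so that the process umask cannot change the result.

```python
import os
import stat


def create_log(log_dir, file_name):
    """Create the log directory as 0750 and the log file as 0640."""
    os.makedirs(log_dir, exist_ok=True)
    os.chmod(log_dir, 0o750)  # explicit chmod so the umask cannot alter it
    path = os.path.join(log_dir, file_name)
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o640)
    os.close(fd)
    os.chmod(path, 0o640)  # no execute bit for user or group
    return path


def mode_of(path):
    """Return only the permission bits of path."""
    return stat.S_IMODE(os.stat(path).st_mode)
```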
The Perl versions remain because they are still being used by the Perl stanza commands. Once the stanza commands are migrated they can be removed.
Contributed by Cynthia Shang.