For the most part this is a direct migration of the Perl code into C.
There is one important behavioral change with regard to how file permissions are handled. The Perl code tried to set ownership as it was in the manifest even when running as an unprivileged user. This usually just led to errors and frustration.
The C code works like this:
If a restore is run as a non-root user (the typical scenario) then all files restored will belong to the user/group executing pgBackRest. If existing files are not owned by the executing user/group then an error will result if the ownership cannot be updated to the executing user/group. In that case the file ownership will need to be updated by a privileged user before the restore can be retried.
If a restore is run as the root user then pgBackRest will attempt to recreate the ownership recorded in the manifest when the backup was made. Only user/group names are stored in the manifest so the same names must exist on the restore host for this to work. If the user/group name cannot be found locally then the user/group of the PostgreSQL data directory will be used and finally root if the data directory user/group cannot be mapped to a name.
Reviewed by Cynthia Shang.
286a106a updated the documentation to build pgBackRest as an unprivileged user, but the wget command was missed. This command is not actually run, just displayed, because the release is not yet available when the documentation is built.
Update the wget command to run as the local user.
The {[os-type-is-centos]} expression was missing parens which meant "and" expressions built on it would always evaluate true if the os-type was centos6.
Reported by Joe Ayers, John Harvey.
pgBackRest was being built by root in the documentation which is definitely not best practice.
Instead build as the unprivileged default container user. Sudo privileges are still required to install.
Suggested by Laurenz Albe.
Implement switch WAL and archive check in C but leave the rest in Perl for now.
The main idea was to have some real integration tests for the new database code so the rest of the migration can wait.
Reviewed by Cynthia Shang.
This direct interface to libpq allows simple queries to be run against PostgreSQL and supports timeouts.
Testing is performed using a shim that can use scripted responses to test all aspects of the client code. The shim will be very useful for testing backup scenarios on complex topologies.
Reviewed by Cynthia Shang.
Maintaining the storage layer/drivers in two languages is burdensome. Since the integration tests require the Perl storage layer/drivers we'll need them even after the core code is migrated to C. Create an interface layer so the Perl code can be removed and new storage drivers/features introduced without adding Perl equivalents.
The goal is to move the integration tests to C so this interface will eventually be removed. That being the case, the interface was designed for maximum compatibility to ease the transition. The result looks a bit hacky but we'll improve it as needed until it can be retired.
This variable needs to be replaced right before being used without being added to the cache since the host repo path will vary from system to system.
This is frankly a bit of a hack to get the documentation to build in the Debian packages for the upcoming release. We'll need to come up with something more flexible going forward.
The documentation was using wal_level=hot_standby which is a deprecated setting.
Also remove the reference to wal_level=archive since it is no longer supported and is not recommended for older versions.
Suggested by Patrick McLaughlin.
Allows listing repo paths/files from the command-line, to be used primarily for testing and debugging.
This command is internal-only so the interface may change at any time without notice.
The documentation was relying on a ScalityS3 container built for testing which wasn't very transparent. Instead, use the stock minio container and configure it in the documentation.
Also, install certificates and CA so that TLS verification can be enabled.
It would be better if the documentation could be generated on multiple operating systems all in one go, but the doc system currently does not allow vars to be changed once they are set.
The solution is to run the docs for each required OS and stitch the documentation together. It's not pretty but it works and the automation in release.pl should at least make it easy to use.
Use autoconf to provide a basic configure script. WITH_BACKTRACE is yet to be migrated to configure and the unit tests still use a custom Makefile.
Each C file must include "build.auto.conf" before all other includes and defines. This is enforced by test.pl for includes, but it won't detect incorrect define ordering.
Update packages to call configure and use standard flags to pass options.
IMPORTANT NOTE: The new TLS/SSL implementation forbids dots in S3 bucket names per RFC-2818. This security fix is required for compliant hostname verification.
Bug Fixes:
* Fix issues when a path option is / terminated. (Reported by Marc Cousin.)
* Fix issues when log-level-file=off is set for the archive-get command. (Reported by Brad Nicholson.)
* Fix C code to recognize host:port option format like Perl does. (Reported by Kyle Nevins.)
* Fix issues with remote/local command logging options.
Improvements:
* The archive-push command is implemented entirely in C.
* Increase process-max limit to 999. (Suggested by Rakshitha-BR.)
* Improve error message when an S3 bucket name contains dots.
Documentation Improvements:
* Clarify that S3-compatible object stores are supported. (Suggested by Magnus Hagander.)
The documentation mentioned Amazon S3 frequently but failed to mention that other S3-compatible object stores are also supported.
Tone down the specific mentions of Amazon S3 and replace them with "S3-compatible object store" when appropriate.
Suggested by Magnus Hagander.
This new implementation should behave exactly like the old Perl code with the exception of a few updated log messages.
Remove as much of the Perl code as possible without breaking other commands.
This prevented packages from being passed to the documentation unless they were in the /backrest directory on the host.
Also make the local path /pgbackrest instead of the deprecated /backrest.
Reported by Heath Lord.
Admonitions call out places where the user should take special care.
Support added for HTML, PDF, Markdown and help text renderers. XML files have been updated accordingly.
Contributed by Cynthia Shang.
This was caused by a new container version that was released around December 5th. The new version explicitly denies user logons by leaving /var/run/nologin in place after boot.
The solution is to enable the service that is responsible for removing this file on a successful boot.
This code was generated during testing and it seemed a good idea to keep it. It is only a partial solution since the primary also needs additional configuration to be able to fail back and forth.
By default the documentation builds pgBackRest from source, but the documentation is also a good way to smoke-test packages.
Allow a package file to be specified by passing --var=package=/path/to/package.ext. This works for Debian and CentOS 6 builds.
Keywords were extremely limited and prevented us from generating multi-version documentation and other improvements.
Replace keywords with an if statement that can evaluate a Perl expression with variable replacement.
Since keywords were used to generate cache keys, add a --key-var parameter to identify which variables should make up the key.
This allows the documentation to be built more quickly and offline during development when --pre is specified on the command line.
Each host gets a pre-built container with all the execute elements marked pre. As long as the pre elements do not change the container will not need to be rebuilt.
The feature should not be used for CI builds as it may hide errors in the documentation.
Add XmlDocument, XmlNode, and XmlNodeList objects as a thin interface layer on libxml2.
This interface is not intended to be comprehensive. Only a few libxml2 capabilities are exposed but more can be added as needed.
Documentation block syntax requires that at least one var be specified.
This limitation should be removed but for now add a comment to describe why a bogus var is defined.
Unsecured, passwordless SSH can be a scary thing. If an attacker gains access to one system they can easily hop to other systems.
Add documentation on how to use the command parameter in authorized_keys to limit ssh to running a single command, pgbackrest. There is more that could be done for security but this likely addresses most needs.
Also change references to "trusted ssh" to "passwordless ssh" since this seems more correct.
Suggested by Stephen Frost, Magnus Hagander.
Bug Fixes:
* Fix issue where relative links in $PGDATA could be stored in the backup with the wrong path. This issue did not affect absolute links and relative tablespace links were caught by other checks. (Reported by Cynthia Shang.)
* Remove incompletely implemented online option from the check command. Offline operation runs counter to the purpose of this command, which is to check if archiving and backups are working correctly. (Reported by Jason O'Donnell.)
* Fix issue where errors raised in C were not logged when called from Perl. pgBackRest properly terminated with the correct error code but lacked an error message to aid in debugging. (Reported by Douglas J Hunley.)
* Fix issue when a boolean option (e.g. delta) was specified more than once. (Reported by Yogesh Sharma.)
Features:
* Allow any option to be set in an environment variable. This includes options that previously could only be specified on the command line, e.g. stanza, and secret options that could not be specified on the command-line, e.g. repo1-s3-key-secret.
* Exclude temporary and unlogged relation (table/index) files from backup. Implemented using the same logic as the patches adding this feature to PostgreSQL, 8694cc96 and 920a5e50. Temporary relation exclusion is enabled in PostgreSQL ≥ 9.0. Unlogged relation exclusion is enabled in PostgreSQL ≥ 9.1, where the feature was introduced. (Contributed by Cynthia Shang.)
* Allow arbitrary directories and/or files to be excluded from a backup. Misuse of this feature can lead to inconsistent backups so read the --exclude documentation carefully before using. (Reviewed by Cynthia Shang.)
* Add log-subprocess option to allow file logging for local and remote subprocesses.
* PostgreSQL 11 Beta 3 support.
Improvements:
* Allow zero-size files in backup manifest to reference a prior manifest regardless of timestamp delta. (Contributed by Cynthia Shang.)
* Improve asynchronous archive-get/archive-push performance by directly checking status files. (Contributed by Stephen Frost.)
* Improve error message when a command is missing the stanza option. (Suggested by Sarah Conway.)
This includes PostgreSQL installation which had previously been included in the documentation. This way produces faster builds and there is no need for us to document PostgreSQL installation.
IMPORTANT NOTE: This release fixes a critical bug in the backup resume feature. All resumed backups prior to this release should be considered inconsistent. A backup will be resumed after a prior backup fails, unless resume=n has been specified. A resumed backup can be identified by checking the backup log for the message "aborted backup of same type exists, will be cleaned to remove invalid files and resumed". If the message exists, do not use this backup or any backup in the same set for a restore and check the restore logs to see if a resumed backup was restored. If so, there may be inconsistent data in the cluster.
Bug Fixes:
* Fix critical bug in resume that resulted in inconsistent backups. A regression in v0.82 removed the timestamp comparison when deciding which files from the aborted backup to keep on resume. See note above for more details. (Reported by David Youatt, Yogesh Sharma, Stephen Frost.)
* Fix error in selective restore when only one user database exists in the cluster. (Fixed by Cynthia Shang. Reported by Nj Baliyan.)
* Fix non-compliant ISO-8601 timestamp format in S3 authorization headers. AWS and some gateways were tolerant of space rather than zero-padded hours while others were not. (Fixed by Andrew Schwartz.)
Features:
* PostgreSQL 11 Beta 2 support.
Improvements:
* Improve the HTTP client to set content-length to 0 when not specified by the server. S3 (and gateways) always set content-length or transfer-encoding but HTTP 1.1 does not require it and proxies (e.g. HAProxy) may not include either. (Suggested by Adam K. Sumner.)
* Set search_path = 'pg_catalog' on PostgreSQL connections. (Suggested by Stephen Frost.)
* Build containers from scratch for more accurate testing.
* Allow environment load to be skipped.
* Allow bash wrapping to be skipped.
* Allow forcing a command to run as a user without sudo.
Configuration files are loaded from the directory specified by the --config-include-path option.
Add --config-path option for overriding the default base path of the --config and --config-include-path option.
Contributed by Cynthia Shang.
The Perl process was exiting directly when called but that interfered with proper locking for the forked async process. Now Perl returns results to the C process which handles all errors, including signals.
* Perform apt-get update to ensure packages are up to date before installing.
* Add -p to the repository mkdir so it won't fail if the directory already exists, handy for testing packages.
Features:
* The archive-push command is now partially coded in C which allows the PostgreSQL archive_command to run significantly faster when processing status messages from the asynchronous archive process. (Reviewed by Cynthia Shang.)
Improvements:
* Improve check command to verify that the backup manifest can be built. (Contributed by Cynthia Shang.)
* Improve performance of HTTPS client. Buffering now takes the pending bytes on the socket into account (when present) rather than relying entirely on select(). In some instances the final bytes would not be flushed until the connection was closed.
* Improve S3 delete performance. The constant S3_BATCH_MAX had been replaced with a hard-coded value of 2, probably during testing.
* Allow any non-command-line option to be reset to default on the command-line. This allows options in pgbackrest.conf to be reset to default which reduces the need to write new configuration files for specific needs.
* The C library is now required. This eliminates conditional loading and eases development of new library features.
* The pgbackrest executable is now a C binary instead of Perl. This allows certain time-critical commands (like async archive-push) to run more quickly.
* Rename db-* options to pg-* and backup-* options to repo-* to improve consistency. repo-* options are now indexed although currently only one is allowed.
It would be better if the hostnames were also pg1 and pg2 to illustrate that primaries and standbys can change hosts, but at this time the configuration ends up being confusing since pg1, pg2, etc. are also used in the option naming. So, for now leave the names as pg-primary and pg-standby to avoid confusion.
* More optimized container suite that greatly improves build time.
* Added static Debian packages for Devel::Cover to reduce build time.
* Add deprecated state for containers. Deprecated containers may only be used to build packages.
* Remove Debian 8 from CI because it does not provide additional coverage over Ubuntu 14.04 and Ubuntu 16.04.
Bug Fixes:
* Fixed the info command so the WAL archive min/max displayed is for the current database version. (Fixed by Cynthia Shang.)
* Fixed the backup command so the backup-standby option is reset (and the backup proceeds on the master) if the standby is not configured and/or reachable. (Fixed by Cynthia Shang.)
* Fixed config warnings raised from a remote process causing errors in the master process. (Fixed by Cynthia Shang.)
Features:
* Amazon S3 repository support. (Reviewed by Cynthia Shang.)
Refactoring:
* Refactor storage layer to allow for new repository filesystems using drivers. (Reviewed by Cynthia Shang.)
* Refactor IO layer to allow for new compression formats, checksum types, and other capabilities using filters. (Reviewed by Cynthia Shang.)
* Move modules in Protocol directory in subdirectories.
* Move backup modules into Backup directory.
Refactor storage layer to allow for new repository filesystems using drivers. (Reviewed by Cynthia Shang.)
Refactor IO layer to allow for new compression formats, checksum types, and other capabilities using filters. (Reviewed by Cynthia Shang.)
Making this dynamic in commit 5d2e792 broke doc builds from cache. The long-term solution is to create a special user for doc builds but that’s beyond the scope of this release.