Document maintainer options in a separate section with appropriate explanation and caveats.
Also make the pg-version-force option user visible now that maintainer caveats have been documented.
The reference documentation was still using a very old version of rendering from before the user guide was introduced. This was preserved in the initial C migration to reduce the diff between Perl and C for testing purposes. The old version used hard linefeeds to simulate paragraphs and reduce the amount of markup that needed to be used. In retrospect this was not a great idea.
Instead use more natural rendering that does not depend on using hard linefeeds between paragraphs.
For some reason the internal section id was included in the title. This was probably copied from another section title where it made more sense, e.g. including the option name after the title.
Also add release note missed in 1eb01622.
Bug Fixes:
* Fix issue restoring block incremental without a block list. (Reviewed by Stephen Frost, Burak Yurdakul. Reported by Burak Yurdakul.)
Features:
* Add --repo-storage-tag option to create object tags. (Reviewed by Stephen Frost, Stefan Fercot, Timothée Peignier.)
* Add known hosts checking for SFTP storage driver. (Contributed by Reid Thompson. Reviewed by Stephen Frost, David Steele.)
* Support for dual stack connections. (Reviewed by Stephen Frost.)
* Add backup size completed/total to info command JSON output. (Contributed by Stefan Fercot. Reviewed by David Steele.)
Improvements:
* Multi-stanza check command. (Reviewed by Stephen Frost.)
* Retry reads of pg_control until checksum is valid. (Reviewed by Stefan Fercot, Stephen Frost.)
* Optimize WAL segment check after successful backup. (Reviewed by Stephen Frost.)
* Improve GCS multi-part performance. (Reviewed by Reid Thompson.)
* Allow archive-get command to run when stanza is stopped. (Reviewed by Tom Swartz, David Christensen, Reid Thompson.)
* Accept leading tilde in paths for SFTP public/private keys. (Contributed by Reid Thompson. Reviewed by David Steele.)
* Reload GCS credentials before renewing authentication token. (Reviewed by Stephen Frost. Suggested by Daniel Farina.)
Documentation Bug Fixes:
* Fix configuration reference example for the tls-server-address option. (Fixed by Hartmut Goebel. Reviewed by David Steele.)
* Fix command reference example for the filter option.
Test Suite Improvements:
* Allow storage/sftp unit test to run without libssh2 installed. (Contributed by Reid Thompson. Reviewed by David Steele. Suggested by Wu Ning.)
This example was broken by 24f7252. Revert to (almost) the prior code to fix this example until something better can be committed. The something better is in progress but it adds new build requirements so it is too late to include it for the release.
Technically this breaks some other examples, but they are all internal and not visible in the user-facing documentation.
It is currently possible for a block map to be written without any accompanying blocks. This happens when a file timestamp is updated but the file has not changed. On restore, this caused problems when encryption was enabled, the block map was bundled after a file that had been stored without block incremental, and both files were included in a single bundle read. In this case the block map would not be decrypted and the encrypted data was passed to blockMapNewRead() with unpredictable results. In many cases built-in retries would rectify this problem as long as delta was enabled since block maps would move to the beginning of the bundle read and be decrypted properly. If enough files were affected, however, it could overwhelm the retries and throw an error. Subsequent delta restores would eventually be able to produce a valid result.
Fix this by moving block map decryption so it works correctly no matter where the block map is located in the read. This has the additional benefit of limiting how far the block map can read so it will error earlier if corrupt. Though in this case there was no repository corruption involved, it appeared that way to blockMapNewRead() since it was reading encrypted data.
Arguably block maps without blocks should not be written at all, but it would be better to consider that as a separate change. This pattern clearly exists in the wild and needs to be handled, plus the new implementation has other benefits.
By default require a known hosts match as part of the SFTP storage driver's authentication process, i.e. repo-sftp-host-key-check-type=strict. The match is expected to be found in the default list or in a list of known hosts files provided by the user. An exception is made if a fingerprint has been manually configured with repo-sftp-host-fingerprint or repo-sftp-host-key-check-type=accept-new can be used to automatically add new hosts.
Also allow host key verification to be skipped, as before, but require the user to explicitly set this (repo-sftp-host-key-check-type=none) rather than it being the default.
This option is intended to eventually create a comprehensive report about the state of the pgBackRest configuration based on the results of the check command.
Implement a detailed report of the configuration options in the environment and configuration files. This should be useful information when debugging configuration errors, since invalid options and configurations are automatically noted. While custom config locations will not be found automatically, it will at least be clear that the config is not in a standard location.
For now keep this option internal since there is a lot of work to be done, but commit it so that it can be used when needed and tested in various environments.
Note that for now when --report is specified, the check command is not being run at all. Only the config report is generated. This behavior will be improved in the future.
This new option allows tags to be added to objects in S3, GCS, and Azure repositories.
This was fairly straightforward for S3 and Azure, but GCS does not allow tags for a simple upload using the JSON interface. If tags are required then the resumable interface must be used even if the file falls below the limit that usually triggers a resumable upload (i.e. size < repo-storage-upload-chunk-size).
This option is structured so that tags must be specified per-repo rather than globally for all repos. This seems logical since the tag keys and values may vary by service, e.g. S3 vs GCS.
These storage tags are independent of backup annotations since they are likely to be used for different purposes, e.g. billing, while the backup annotations are primarily intended for monitoring.
The prior code would only connect to the first address provided by getaddrinfo().
Instead try each address in the list. If all connections fail then wait and try them all again until timeout.
Currently a round robin approach is used where each connection attempt must fail before the next connection is attempted. This works fine, for example, when an ipv6 address has no route to the host, but will work less well when a host answers but doesn't respond in a timely fashion.
We may consider a Happy Eyeballs approach in the future, but since pgBackRest is primarily a background process, it is not clear that slightly improved response time (in the case of failed connections) is worth the extra complexity.
The prior code did one list command against the storage for each WAL segment. This led to a lot of lists and was especially inefficient when the WAL (or the majority of it) was already present.
Optimize to keep the contents of a WAL directory and use them on a subsequent search. Leave the optimizations for a single WAL segment since other places still use that mode.
On certain file systems (e.g. ext4) pg_control may appear torn if there is a concurrent write while reading the file. To prevent an invalid read, retry until the checksum matches the control data.
Special handling is required for the pg-version-force feature since the offset of the checksum is not known. In this case, scan from the default position to the end of the data looking for a checksum match. This is a bit imprecise, but better than nothing, and the chance of a random collision in the control data seems very remote considering the ratio of data size (< 512 bytes) to checksum size (4 bytes).
This was discovered and a possible solution proposed for PostgreSQL in [1]. The proposed solution may work for backup, but pgBackRest needs to be able to read pg_control reliably outside of backup. So no matter what fix is adopted for PostgreSQL, pgBackRest need retries. Further adjustment may be required as the PostgreSQL fix evolves.
[1] https://www.postgresql.org/message-id/20221123014224.xisi44byq3cf5psi%40awork3.anarazel.de
The prior example (::*) was not valid and would result in the following error:
ERROR: [049]: unable to get address for '::*': [-2] Name or service not known
Correct values are either * for IPv4 or :: for IPv6. The IPv4 value is used as the example since only one example is allowed.
The release.xml file was getting pretty unwieldy so break release notes into separate files. Also break contributors into a separate file.
In theory most of release.xml could now be generated automatically but adding a new release does not represent a serious maintenance burden, so for the time being it does not seem worth it.