This was included primarily for testing but now that storageS3New() is not called directly in the tests the value needs to be modified after configuring storage anyway.
Since there is no need to make this parameter user-configurable just remove it.
Rather than calling storageS3New() directly, create the storage by loading a configuration and calling repoStorageGet(). This is a better end-to-end test and cuts down on a lot of redundant tests.
Add tests that include security tokens in error messages to ensure they are redacted.
Move sckSessionReadyRead()/Write() into the IoRead/IoWrite interfaces. This is a more logical place for them and the alternative would be to add them to the IoSession interface, which does not seem like a good idea.
This is mostly a refactor, but a big change is that the select() logic in fdRead.c has been replaced by ioReadReady(). This was duplicated code that was being used by our protocol but not TLS. Since we have not had any problems with requiring poll() in the field this seems like a good time to remove our dependence on select().
Also, IoFdWrite now requires a timeout so update where required, mostly in the tests.
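A minimal sketch of a poll()-based readiness check, to illustrate the kind of logic that replaces select(). The names fdReadReady() and fdReadTimeout() are hypothetical and this is not the actual ioReadReady() implementation. Unlike select(), poll() is not limited to descriptors below FD_SETSIZE.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <stdbool.h>
#include <unistd.h>

// Hypothetical helper: return true when read() on fd will not block (data is available
// or the descriptor is at EOF/error) within timeoutMs milliseconds
static bool
fdReadReady(int fd, int timeoutMs)
{
    assert(fd >= 0);

    struct pollfd pollInfo = {.fd = fd, .events = POLLIN};
    int result;

    // Retry when interrupted by a signal. For simplicity the full timeout is used again
    // on retry, so the total wait may exceed timeoutMs.
    do
    {
        result = poll(&pollInfo, 1, timeoutMs);
    }
    while (result == -1 && errno == EINTR);

    return result > 0;
}

// Hypothetical helper: read up to size bytes, returning -1 on timeout or read error
static ssize_t
fdReadTimeout(int fd, void *buffer, size_t size, int timeoutMs)
{
    if (!fdReadReady(fd, timeoutMs))
        return -1;

    return read(fd, buffer, size);
}
```

A write-side check would look the same with POLLOUT in place of POLLIN.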
These interfaces allow the HttpClient and HttpSession objects to work with protocols other than TLS, e.g. plain sockets. This is necessary to allow standard HTTP -- right now only HTTPS is allowed, i.e. HTTP over TLS.
For now only TlsClient and TlsSession have been converted to the new interfaces. SocketClient and SocketSession will also need to be converted but first sckSessionReadyRead() and sckSessionReadyWrite() need to be moved into the IoRead and IoWrite interfaces, since they are not a good fit for IoSession.
If storageReadPosixFreeResource() errors then it will be called again when the object is freed, which is not ideal since it might error in a different way and lose the original error.
Pretty much everywhere handle is used what is really meant is file descriptor (fd). This terminology got migrated over from Perl and is just not quite correct, or at least not as correct as fd.
There were also plenty of places fd was used so now all uses are consistent.
The Perl code was not updated but might be in a future commit.
This does not appear to have been used in quite some time and the tests are equally useless because they don't prove the correct port was passed to httpClientNew().
Before 9f2d647 TLS errors included additional details in at least some cases. After 9f2d647 a connection to an HTTP server threw `TLS error [1]` instead of `unable to negotiate TLS connection: [336031996] unknown protocol`.
Bring back the detailed messages to make debugging TLS errors easier. Since the error routine is now generic the `unable to negotiate TLS connection` context is not available so the error looks like `TLS error [1:336031996] unknown protocol`.
PostgreSQL may be using most of the available file descriptors when it executes the archive-get/archive-push commands (especially archive-get). This can lead to problems depending on how many file descriptors are needed for parallelism in the async process.
Proactively free file descriptors between 3 and 1023 to help ensure there are enough available for reasonable values of process-max, i.e. <= 300.
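The idea can be sketched as a simple loop over the descriptor range. The function name fdCloseRange() is hypothetical and the actual implementation may differ.

```c
#include <assert.h>
#include <unistd.h>

// Hypothetical helper: close every descriptor in [fdMin, fdMax]. Errors are deliberately
// ignored since most descriptors in the range will not be open (close() fails with EBADF).
static void
fdCloseRange(int fdMin, int fdMax)
{
    assert(fdMin >= 0 && fdMin <= fdMax);

    for (int fd = fdMin; fd <= fdMax; fd++)
        close(fd);
}
```

Under these assumptions a call such as fdCloseRange(3, 1023) early in the command would leave stdin, stdout, and stderr untouched while releasing anything inherited from PostgreSQL.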
This loop was using a lot of memory without freeing it at intervals.
Rewrite to use char arrays when possible to reduce memory that needs to be allocated and freed.
Zigzag encoding places the sign bit in the least significant bit so that -1 is encoded as 1, 1 as 2, etc. This moves as many bits as possible into the low order bits which is good for other types of encoding, e.g. base-128.
See https://en.wikipedia.org/wiki/Variable-length_quantity#Zigzag_encoding.
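A minimal sketch of the mapping for 64-bit integers. The function names are illustrative only and are not taken from the code base.

```c
#include <assert.h>
#include <stdint.h>

// Map a signed value to unsigned so that 0 -> 0, -1 -> 1, 1 -> 2, -2 -> 3, etc.
static uint64_t
zigzagEncode(int64_t value)
{
    // Shift left one bit, then flip all bits for negative values so the sign lands in bit 0
    return ((uint64_t)value << 1) ^ (value < 0 ? UINT64_MAX : 0);
}

// Reverse the mapping
static int64_t
zigzagDecode(uint64_t value)
{
    // The mask is all ones when the low (sign) bit is set, all zeros otherwise
    uint64_t mask = (uint64_t)0 - (value & 1);

    // The cast assumes two's complement representation, which holds on common platforms
    return (int64_t)((value >> 1) ^ mask);
}
```

Small magnitudes of either sign become small unsigned values, which is what lets a base-128 encoder store them in a single byte.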
It seems like overkill to encode this when other enums (e.g. StorageInfoLevel) are passed as integers.
Instead note that StorageType values should not be changed and remove the special encoding.
The fix for = characters in info files (039d314) added JSON validation but discarded the resulting Variant which means the JSON is being parsed twice. This nearly doubles the time to load a manifest since a lot of complex JSON is involved.
Time to load a million file manifest:
Before 039d314: 7.8s
039d314: 15.5s
This patch: 7.5s
To fix this regression return the Variant in the callback so the caller does not have to parse it again. The new code appears slightly more efficient overall, probably because there are fewer operations against Strings.
We use the Z suffix in many functions to indicate that we are expecting a zero-terminated string so make this function conform to the pattern.
As a bonus the new name is a bit shorter, which is a good quality in a commonly-used function.
The manifest uses the = character as the key/value separator so = characters in the key cause parsing errors and lead to an error or segfault.
Since the value must be valid JSON we can keep checking the value on the right side of the = and stop building the key when the value is valid. It's a bit hackish but it does seem to do the job without breaking the manifest format.
Unsurprisingly this makes parsing about 50% slower but it's still more than fast enough. Parsing 10 million key/values takes about 6.5s for the old code and 10s for the new code. Since the value is used as JSON downstream we can reclaim most of this time by just passing the JSON value rather than making the callback reparse it. We'll save that for another commit, though.
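The separator search described above can be sketched as follows. Both function names are hypothetical, and jsonValidSimple() is a deliberately simplified stand-in for a real JSON validator: it accepts only quoted strings, integers, true, false, and null.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

// Simplified stand-in for a JSON validator (assumption: scalars only, no objects or arrays)
static bool
jsonValidSimple(const char *json)
{
    if (strcmp(json, "true") == 0 || strcmp(json, "false") == 0 || strcmp(json, "null") == 0)
        return true;

    if (*json == '"')
    {
        for (json++; *json != '\0'; json++)
        {
            // Skip the character after a backslash so an escaped quote does not end the string
            if (*json == '\\' && json[1] != '\0')
                json++;
            // The closing quote must be the last character
            else if (*json == '"')
                return json[1] == '\0';
        }

        return false;
    }

    if (*json == '-')
        json++;

    if (!isdigit((unsigned char)*json))
        return false;

    while (isdigit((unsigned char)*json))
        json++;

    return *json == '\0';
}

// Return a pointer to the first = whose right side is a valid value, or NULL if none is
static const char *
keyValueSeparator(const char *line)
{
    assert(line != NULL);

    for (const char *equal = strchr(line, '='); equal != NULL; equal = strchr(equal + 1, '='))
    {
        if (jsonValidSimple(equal + 1))
            return equal;
    }

    return NULL;
}
```

For a line such as a=b="value" the first = is rejected because b="value" is not valid JSON, so the key becomes a=b.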
Since the command has completed it is counterproductive to throw an error but still warn to indicate that something unusual happened.
Also fix the related issue that the local processes were not being shut down when they completed, which meant that they might time out before being closed when pgbackrest terminated.
Use a test storage driver to allow manifestNewBuild() to be run against a test cluster at any scale without having to write files to disk.
Simplify the test by using the output of manifestNewBuild() to feed manifestSave() and manifestNewLoad().
Also add manifest size to the output.
Calculates the memory used by the context and all child contexts.
This is primarily useful for debugging but it is not conditional on DEBUG because it is useful for profile/performance tests.
A number of tests used invalid JSON values where an error was expected or the value would be ignored.
Update these tests to use valid JSON values so all values in the file can be validated even if they are not used.
Something like 3="string" would return an Int64 variant and ignore the invalid portion after the integer. Other JSON interface functions have this check but it was forgotten here.
There are no current issues because of this but we want to be able to validate arbitrary JSON strings and this function was not working correctly for that usage.
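The missing check amounts to verifying that nothing follows the parsed number. A minimal sketch using strtoll(), with a hypothetical function name; this is not the project's JSON code. Note that strtoll() also accepts leading whitespace and a + sign, which strict JSON does not, so a real validator needs more than this.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

// Hypothetical helper: parse text as a base-10 integer, failing if anything follows the digits
static bool
intParseStrict(const char *text, int64_t *result)
{
    assert(text != NULL && result != NULL);

    char *end;
    errno = 0;
    long long value = strtoll(text, &end, 10);

    // Fail when no digits were consumed, the value is out of range, or there are
    // trailing characters, e.g. the ="string" in 3="string"
    if (end == text || errno == ERANGE || *end != '\0')
        return false;

    *result = (int64_t)value;
    return true;
}
```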
This function is not used in the core code so remove it and update the test where it was used.
There may eventually be a need for a strLstNewP() function but it doesn't seem worth the code churn until there is an actual requirement.
The old constructor was left around to reduce code churn during the migration but it just makes the code harder to read and search.
Remove the old constructor and rename all remaining instances to lstNewP(), which by default has the same semantics.
Also update the policy in doc/RELEASE.md to get the latest versions at the beginning of the release cycle. The older policy was created when we were getting new versions right before the release.
Testing against static checksums is valuable but it can become burdensome when supporting multiple architectures.
Reduce the number of tests we are doing against static checksums when the architecture can cause the checksum to vary.
Little-endian architectures store the low-order bytes in the lowest memory location so this worked even in the case that size_t and int had different byte representations. Since buffer sizes are constrained there was no chance of the integer becoming negative and causing a problem that way.
On big-endian architectures this cast caused the low-order bytes to get loaded into the high-order bytes resulting in a huge buffer size that immediately triggered an assertion (and without the assertion would have certainly segfaulted).
Instead use a temporary int variable and cast that to size_t after the function call. This is the correct way to do it regardless of architecture.
This issue was detected while testing on the s390x architecture.
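The pattern can be sketched as below. lengthGet() is a hypothetical stand-in for any library function that reports a length through an int pointer. The problematic cast is shown only in a comment.

```c
#include <assert.h>
#include <stddef.h>

// Hypothetical stand-in for a library function that writes a length through an int pointer
static void
lengthGet(int *length)
{
    *length = 16;
}

static size_t
bufferSizeGet(void)
{
    // Incorrect version:
    //
    //     size_t size = 0;
    //     lengthGet((int *)&size);
    //
    // Only sizeof(int) bytes of size are written. Where size_t is wider than int those are
    // the low-order bytes on little-endian but the high-order bytes on big-endian.

    // Correct version: let the function write to a real int, then convert the value
    int length = 0;
    lengthGet(&length);

    assert(length >= 0);

    return (size_t)length;
}
```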
Bug Fixes:
* Fix restore --force acting like --force --delta. This caused restore to replace files based on timestamp and size rather than overwriting, which meant some files that should have been updated were left unchanged. Normal restore and restore --delta were not affected by this issue. (Reviewed by Cynthia Shang.)
Features:
* Azure support for repository storage. (Reviewed by Cynthia Shang, Don Seiler.)
* Add expire-auto option. This allows automatic expiration after a successful backup to be disabled. (Contributed by Stefan Fercot. Reviewed by Cynthia Shang, David Steele.)
Improvements:
* Asynchronous S3 multipart upload. (Reviewed by Stephen Frost.)
* Automatic retry for backup, restore, archive-get, and archive-push. (Reviewed by Cynthia Shang.)
* Disable query parallelism in PostgreSQL sessions used for backup control. (Reviewed by Stefan Fercot.)
* PostgreSQL 13 beta2 support. Changes to the control/catalog/WAL versions in subsequent betas may break compatibility but pgBackRest will be updated with each release to keep pace.
* Improve handling of invalid HTTP response status. (Reviewed by Cynthia Shang.)
* Improve error when pg1-path option missing for archive-get command. (Reviewed by Cynthia Shang.)
* Add hint when checksum delta is enabled after a timeline switch. (Reviewed by Matt Bunter, Cynthia Shang.)
* Use PostgreSQL instead of postmaster where appropriate. (Reviewed by Cynthia Shang.)
Documentation Bug Fixes:
* Fix incorrect example for repo-retention-full-type option. (Reported by Höseyin Sönmez.)
* Remove internal commands from HTML and man command references. (Reported by Cynthia Shang.)
Documentation Improvements:
* Update PostgreSQL versions used to build user guides. Also add version ranges to indicate that a user guide is accurate for a range of PostgreSQL versions even if it was built for a specific version. (Reviewed by Stephen Frost.)
* Update FAQ for expiring a specific backup set. (Contributed by Cynthia Shang. Reviewed by David Steele.)
* Update FAQ to clarify default PITR behavior. (Contributed by Cynthia Shang. Reviewed by David Steele.)
The postgresql.auto.conf file was being used instead of recovery.conf, but there were still instances in the text that used recovery.conf. Update to postgresql.auto.conf for PostgreSQL >= 10 and change wording where needed.