It is not clear why these were split out, but it probably had something to do with testing before storageList() could return NULL for an empty directory.
Also remove the tests that depended on a boolean return, which are no longer needed for coverage.
Previously, reading/writing JSON required parsing/rendering via a variant, which added many more memory allocations and loops.
Instead allow JSON to be read/written serially to improve performance and simplify the code. This also allows us to get rid of many String and Variant constants which are no longer required.
The goal is to be able to read/write very large (e.g. gigabyte manifest) JSON structures, which would not be practical with the current code.
Note that external JSON (GCS, S3, etc) is still handled using variants. Converting these will require more consideration about key ordering since it cannot be guaranteed as in our own formats.
This allows code to run after the return type has been generated in the case where it is an expression.
No new functionality here yet, but this will be used by a future commit that audits memory usage.
All fields should be alphabetical. Currently the read code is tolerant of this, but that will not always be the case.
Fields are always written alphabetically so this is just a test issue introduced by d8d41321.
This is not a very realistic case since archive start/stop are always written, but it appears in many other unit tests so it should also be tested here.
Packs support stronger typing than JSON and are more efficient. For the small result sets that we deal with efficiency is probably not very important, but this removes another place where we are using JSON instead of Pack.
Push checking for result struct (e.g. single row) down into PgClient since it has easy access to this information rather than needing to parse the result set to find out.
Refactor all code downstream that depends on PgClient results.
There have been some behavioral changes in libpq which require changes to the test.
Also update the instructions since it is now a bit easier to run against a real cluster.
There is no need to process the stats so a KeyValue is overkill.
Also remove the performance tests that check the stat totals since this is covered in the unit tests.
A missing field and a NULL field are not exactly the same so it seems best to test both.
Because of the way KeyValue objects work the error is the same, but that will not always be true.
The line number was one less than it should have been, which could cause some confusion.
Since this only affected ini files with JSON values, which are always written programmatically, there is almost zero chance this has ever been a problem in the field.
Previously the process id was skipped if it did not exist. Instead, throw an error and handle the errors in downstream code.
This was probably ignored at some point to provide backward-compatibility, but that is no longer required, if it ever was.
Sometimes we need to read a lock from another process. This was done two different ways and in the case of cmdStop() was definitely hacky.
Centralize the logic to make it easier to read the locks for another process. This will also make it easier to add new lock data.
When archive-mode-check is disabled and archive-push is running from multiple hosts, it is very likely that the file will already exist with the same checksum, so disable the warning.
However, if the checksums do not match, an error will still be thrown.
Using the path variable directly resulted in a path with (null) in it, which caused the remove to fail.
The pathFull variable already exists for this purpose so use it.
Determining the length of arrays that could be calculated at compile time was a bit piecemeal, with special macros used sometimes and with the math done directly other times.
This macro makes the task easier, uses less space, and automatically adjusts when the type changes.
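A minimal sketch of such a macro follows. The name ARRAY_LENGTH and the compressExtList example are assumptions made for illustration, not taken from the commit.

```c
#include <assert.h>
#include <stddef.h>

// Hypothetical macro name. The result is a compile-time constant that follows the array if elements
// are added or the element type changes. Only valid for true arrays, not pointers.
#define ARRAY_LENGTH(array)                                         (sizeof(array) / sizeof((array)[0]))

// Hypothetical list used only to demonstrate the macro
static const char *const compressExtList[] = {"", ".gz", ".bz2", ".lz4", ".zst"};

// Number of entries in the list, with no separately maintained constant
size_t
compressExtTotal(void)
{
    return ARRAY_LENGTH(compressExtList);
}

// Look up an extension by index with a bounds check based on the macro
const char *
compressExt(size_t typeIdx)
{
    assert(typeIdx < ARRAY_LENGTH(compressExtList));

    return compressExtList[typeIdx];
}
```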
Most of these looked like copy/paste from a prior required strCatFmt() call.
There is no issue here since strCatFmt() works the same in these cases, but using strCat()/strCatZ() is more efficient.
If a boolean option had an unresolved dependency then the value would be NULL, which meant the dependency would need to be checked in the code to avoid an error. For example, cfgOptionBool(cfgOptOnline) needed to be checked before it was safe to call cfgOptionBool(cfgOptArchiveCheck).
Allow a default for boolean options when they are unresolved to simplify the code. This makes using the options easier and less prone to error. Not all boolean options get a dependency default in this commit, but more may be added in the future.
In offline mode the pg_wal directory is copied, but that is not the same as archive-copy, which copies the exact set of WAL required from the archive.
This flag is purely for informational purposes so there is no live bug here, but the prior behavior was certainly misleading.
For PITR with --type=lsn, attempt to auto-select the appropriate backup set based on the --target LSN provided. Pick the most recent backup where backup-lsn-stop is less than or equal to the provided LSN.
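The selection rule can be sketched as follows. The BackupInfo structure and both function names are hypothetical, and the backup list is assumed to be ordered from oldest to newest. LSNs are converted to 64-bit values before comparison, since comparing the text form would give wrong results.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

// Hypothetical, simplified backup record
typedef struct BackupInfo
{
    const char *label;                                              // Backup label
    const char *lsnStop;                                            // backup-lsn-stop in %X/%X text form
} BackupInfo;

// Convert an LSN such as "0/1A000028" to a 64-bit value. Returns 0 on success, -1 on a parse error.
static int
lsnParse(const char *lsn, uint64_t *result)
{
    unsigned int high;
    unsigned int low;
    char extra;

    // Exactly two fields must match, anything trailing is an error
    if (sscanf(lsn, "%X/%X%c", &high, &low, &extra) != 2)
        return -1;

    *result = ((uint64_t)high << 32) | low;
    return 0;
}

// Return the most recent backup whose stop LSN is <= the target, or NULL if none qualifies
const BackupInfo *
backupSelectByLsn(const BackupInfo *list, size_t total, const char *targetLsn)
{
    uint64_t target;

    assert(list != NULL || total == 0);
    assert(targetLsn != NULL);

    if (lsnParse(targetLsn, &target) != 0)
        return NULL;

    // Walk from newest to oldest so the first match is the most recent
    for (size_t idx = total; idx > 0; idx--)
    {
        uint64_t stop;

        if (lsnParse(list[idx - 1].lsnStop, &stop) == 0 && stop <= target)
            return &list[idx - 1];
    }

    return NULL;
}
```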
The unit tests were ignoring stderr but nothing being output there was important. Now a test will fail if there is anything on stderr.
This makes it easier to work with -fsanitize, which outputs to stderr.
The manifest test module was setting a blank value here and causing a stack overflow because memcpy() is used instead of strcpy().
This was really just a test issue but add an assert just in case the same were to happen in production code.
Also update a bogus checksum in the integration tests to the correct length to avoid running afoul of the assert.
Found with -fsanitize=address.
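The problem and the added assert can be sketched as follows. The structure, constant, and function names are hypothetical. The point is that memcpy() always reads the full length, so a source shorter than expected (such as a blank string) is read past its end, while strcpy() would have stopped at the terminator.

```c
#include <assert.h>
#include <string.h>

// Length of a SHA1 checksum in hex (hypothetical constant name)
#define CHECKSUM_SHA1_SIZE_HEX                                      40

// Hypothetical, simplified file entry
typedef struct FileEntry
{
    char checksumSha1[CHECKSUM_SHA1_SIZE_HEX + 1];
} FileEntry;

// Store a checksum using a fixed-length copy
void
fileEntryChecksumSet(FileEntry *file, const char *checksum)
{
    assert(file != NULL);
    assert(checksum != NULL);
    // Guard the fixed-length copy below: a shorter source would be read past its end
    assert(strlen(checksum) == CHECKSUM_SHA1_SIZE_HEX);

    // Copy the checksum and its terminator
    memcpy(file->checksumSha1, checksum, CHECKSUM_SHA1_SIZE_HEX + 1);
}
```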
If a variable assigned with STRDEF() is referenced out of scope of the STRDEF() assignment then the value is undefined.
Luckily most of the instances are in tests but there is one in the core code. It is not clear if this is a live bug or not but it certainly needs to be fixed.
Found with -fsanitize=address.
If the value and multiplier were large enough then the return value could overflow unpredictably.
Check the value to make sure it will not overflow with the current multiplier.
It would be better to present an "out of range" error to the user rather than "is not valid" but it doesn't seem worth the effort since the error is extremely unlikely.
Found with -fsanitize=undefined.
It is possible that a file will be truncated to zero-length after the backup manifest has been built. We could build logic into backupFile() to handle this case but it is hard to test well because of the race condition, so tests would need to be written directly against backupFile() and backupJobResult(). It hardly seems worth all that effort for a condition that occurs rarely, if ever.
Instead just remove the manifest check and add tests to restore to make sure it handles bundled zero-length files correctly. Logging will show that the file was bundled so if it happens a lot (which seems very unlikely) then we can think about an alternate implementation.
This rule was added because there were not sufficient tests to demonstrate that the repo-hardlink option could be changed in a backup set.
Remove the restriction and add/update tests to show that it works.
This is necessary now because bundling requires that hardlinking be disabled. Rather than add code complexity, it seems better just to address this limitation.
Check for invalid path in repo-* commands. Perform path validation and throw an error when appropriate. Path may not contain '//'. Strip trailing '/' from path. Absolute path must fall under repo path.
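The rules above can be sketched as a single function. The name, signature, and buffer handling are assumptions for illustration, and the repo path is assumed to be absolute with no trailing '/'.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

// Validate a path passed to a repo-* command and copy the normalized form into result. Returns false
// when the path is invalid or does not fit in the result buffer. Hypothetical helper.
bool
repoPathNormalize(const char *repoPath, const char *path, char *result, size_t resultSize)
{
    assert(repoPath != NULL);
    assert(path != NULL);
    assert(result != NULL);

    // Path may not contain '//'
    if (strstr(path, "//") != NULL)
        return false;

    size_t length = strlen(path);

    // Strip trailing '/' (at most one is possible after the check above) but keep a lone "/"
    if (length > 1 && path[length - 1] == '/')
        length--;

    // Absolute path must be the repo path itself or fall under it
    if (length > 0 && path[0] == '/')
    {
        size_t repoLength = strlen(repoPath);

        if (length < repoLength || strncmp(path, repoPath, repoLength) != 0 ||
            (length > repoLength && path[repoLength] != '/'))
        {
            return false;
        }
    }

    // Copy the normalized path if it fits
    if (length >= resultSize)
        return false;

    memcpy(result, path, length);
    result[length] = '\0';

    return true;
}
```

Checking for a '/' immediately after the repo path prefix prevents a sibling directory such as /var/lib/pgbackrest2 from being accepted when the repo path is /var/lib/pgbackrest.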
IMDSv2 provides additional security to prevent instance metadata from being read by an attacker.
All AWS instances should provide IMDSv2 but still fall back to IMDSv1 if the IMDSv2 token request fails. This is in case there are any services outside AWS that are emulating IMDSv1 but have not implemented IMDSv2.
It seems best for these to be repo options so they can be configured per repo, rather than globally.
Also clarify usage for repo-bundle-size and repo-bundle-limit.
Since files are stored sequentially in a bundle, it is often possible to restore multiple files with a single read. Previously, each restored file required a separate read. Reducing the number of reads is particularly beneficial for object stores, but performance should benefit on any file system.
Currently if there is a gap then a new read is required. In the future we might set a limit for how large a gap we'll skip without starting a new read.
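To illustrate the idea, the sketch below counts how many reads are needed for a set of files in one bundle. The BundleFile structure and function name are hypothetical, and the list is assumed to be sorted by offset. A file that starts exactly where the previous one ended continues the current read, and any gap starts a new one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Hypothetical description of a file stored in a bundle
typedef struct BundleFile
{
    uint64_t offset;                                                // Offset of the file within the bundle
    uint64_t size;                                                  // Size of the file as stored in the bundle
} BundleFile;

// Count the reads required to restore the listed files from a single bundle
size_t
bundleReadTotal(const BundleFile *list, size_t total)
{
    assert(list != NULL || total == 0);

    size_t result = 0;
    uint64_t readEnd = 0;

    for (size_t idx = 0; idx < total; idx++)
    {
        // The first file always needs a read, as does any file not contiguous with the prior one
        if (result == 0 || list[idx].offset != readEnd)
            result++;

        // Track where the current read ends
        readEnd = list[idx].offset + list[idx].size;
    }

    return result;
}
```

A gap limit, as mentioned above, would replace the exact-match comparison with a check that the gap is below some threshold.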
Improve the stop command, when the force and stanza options are specified, to terminate only processes holding lock files for the given stanza. Previously, all processes holding lock files were terminated regardless of stanza.
For very large backups only getting an update per percent may not be often enough.
Add hundredths to the percent complete logging to provide more timely information.
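A minimal sketch of formatting the percentage with hundredths using integer math only. The function name and signature are assumptions, and sizes are assumed small enough that multiplying by 10000 does not overflow.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

// Format percent complete with two decimal places, e.g. "33.33%". Returns the snprintf() result.
int
percentCompleteFormat(uint64_t sizeDone, uint64_t sizeTotal, char *buffer, size_t bufferSize)
{
    assert(sizeTotal > 0);
    assert(sizeDone <= sizeTotal);
    // The multiplication below must not overflow
    assert(sizeDone <= UINT64_MAX / 10000);
    assert(buffer != NULL);

    // Work in hundredths of a percent, e.g. 4567 means 45.67%
    unsigned int hundredths = (unsigned int)(sizeDone * 10000 / sizeTotal);

    return snprintf(buffer, bufferSize, "%u.%02u%%", hundredths / 100, hundredths % 100);
}
```

With whole percents, a 1TiB backup would log progress only about every 10GiB. With hundredths, the value changes about every 100MiB.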
Checking percentage and size in every test can cause quite a bit of churn when changes are made.
Follow the example of the backup tests and replace percentage and size after the first few tests to reduce churn.
These tests were written before the restore command was fully migrated to C so many of them have become redundant.
In the cases where they still provide coverage, add tests to synthetic restores to replace them. In general, these higher level tests provide better coverage than poking at the restoreFile() function directly.
IMPORTANT NOTE: Repository size reported by the info command is now entirely based on what pgBackRest has written to storage. Previously, in certain cases, pgBackRest could detect if additional compression was being applied by the storage but this is no longer supported.
Bug Fixes:
* Retry errors in S3 batch file delete. (Reviewed by Reid Thompson. Reported by Alex Richman.)
* Allow case-insensitive matching of HTTP connection header values. (Reviewed by Reid Thompson. Reported by Rémi Vidier.)
Features:
* Add support for AWS S3 server-side encryption using KMS. (Contributed by Christoph Berg. Reviewed by David Steele, Tharindu Amila.)
* Add archive-missing-retry option. (Reviewed by Stefan Fercot.)
* Add backup type filter to info command. (Contributed by Stefan Fercot. Reviewed by David Steele.)
Improvements:
* Retry on page validation failure during backup. (Reviewed by Stephen Frost, David Christensen.)
* Handle TLS servers that do not close connections gracefully. (Reviewed by Rémi Vidier, David Christensen, Stephen Frost.)
* Add backup LSNs to info command output. (Contributed by Stefan Fercot. Reviewed by David Steele.)
* Automatically strip trailing slashes for repo-ls paths. (Contributed by David Christensen. Reviewed by David Steele.)
* Do not retry fatal errors. (Reviewed by Reid Thompson.)
* Remove support for PostgreSQL 8.3/8.4. (Reviewed by Reid Thompson, Stefan Fercot.)
* Remove logic that tried to determine additional file system compression. (Reviewed by Reid Thompson, Stefan Fercot.)
Documentation Bug Fixes:
* Move repo options in TLS documentation to the global section. (Reported by Anton Kurochkin.)
* Remove unused backup-standby option from stanza commands. (Reported by Stefan Fercot.)
* Fix typos in help and release notes. (Fixed by Daniel Gustafsson. Reviewed by David Steele.)
Documentation Improvements:
* Add aliveness check to systemd service configuration. (Suggested by Yogesh Sharma.)
* Add FAQ explaining WAL archive suffix. (Contributed by Stefan Fercot. Reviewed by David Steele.)
* Note that replication slots are not restored. (Contributed by Reid Thompson. Reviewed by David Steele, Stefan Fercot. Suggested by Christophe Courtois.)