The prior timeouts were a bit aggressive and were causing timeouts in the Azure tests. There have also been occasional timeouts in other storage drivers.
The performance of CI environments is pretty variable so increased timeouts should make the tests more stable.
Double spaces have fallen out of favor in recent years because they no longer contribute to readability.
We have been using single spaces and editing related paragraphs for some time, but now it seems best to update the remaining instances to avoid churn in unrelated commits and to make it clearer what spacing contributors should use.
Ubuntu 18.04 will be EOL before the next release, so update to the oldest available Debian version.
Also fix one incorrect return value type, a test cast, and adjust some test timeouts.
Bug Fixes:
* Skip writing recovery.signal by default for restores of offline backups. (Reviewed by Stefan Fercot. Reported by Marcel Borger.)
Features:
* Block incremental backup (BETA). (Reviewed by John Morris, Stephen Frost, Stefan Fercot.)
Improvements:
* Keep only one all-default group index. (Reviewed by Stefan Fercot.)
Documentation Improvements:
* Add explicit instructions for upgrading between 2.x versions. (Contributed by Christophe Courtois. Reviewed by David Steele.)
* Remove references to SSH made obsolete when TLS was introduced.
This allows options to be marked as beta, which will require that the --beta option be supplied to prevent accidental usage of a beta feature.
The online and command-line documentation also show warnings when options are beta.
xxHash is significantly faster than SHA-1 so this helps reduce the overhead of the feature.
A variable number of bytes are used from the xxHash depending on the block size with a minimum of six bytes for the smallest block size. This keeps the maps smaller while still providing enough bits to detect block changes.
Raw encryption was already being used for block incremental. This commit adds raw compression to block incremental where possible (see da918587).
Raw compression/encryption is also added to bundling for a backup set when block incremental is enabled on the full backup. This prevents a break in backward compatibility since block incremental is not backward compatible.
Running valgrind and backtrace together has been causing tests to timeout in CI, mostly likely due to limited resources. This has not been a problem in normal development environments.
Since it is still important to run backtraces for debugging, split the u22 test that was doing all this work to run coverage and backtrace together and valgrind-only as a separate test. As a bonus these tests run faster separately and since they run in parallel the total execution time is faster.
The primary goal of the block incremental backup is to save space in the repository by only storing changed parts of a file rather than the entire file. This implementation is focused on restore performance more than saving space in the repository, though there may be substantial savings depending on the workload.
The repo-block option enables the feature (when repo-bundle is already enabled). The block size is determined based on the file size and age. Very old or very small files will not use block incremental.
The libbacktrace feature has not been working since the move to meson because libbacktrace detection was not added to the meson build. Add libbacktrace to meson and improve the feature so that it can be compiled into release builds.
The prior implementation fetched line numbers with each stack trace push. Not only was this slow but it missed any functions that were not being tracked on our stack.
Instead just examine the backtrace when an error happens and merge it with the info we have on our stack. If the backtrace is not available then the output remains as before.
Also remove --backtrace from test.pl since the library is now auto-detected.
Leave this library out of the production build for now to give it a little time to shake out in testing.
When this code was migrated to C the unit tests were not included because there were more important priorities at the time.
This also requires some adjustments to coverage because of the new code location.
Calculate a checksum of the data stored in the repository when a file is transformed (e.g. compressed). This allows resume and verify to operate without needing to decompress/decrypt the data.
This can also be used to verify more complex formats such as block incremental and allow backups from the repository without needing to decompress the data to verify the checksum.
Add some basic encrypted tests to maintain coverage. These will be expanded in a future commit.
Our new policy is to support ten versions of PostgreSQL, the five supported releases and the last five EOL releases. As of PostgreSQL 15, that means 9.0/9.1/9.2 are no longer supported by pgBackRest.
Remove all logic associated with 9.0/9.1/9.2 and update the tests.
Document the new support policy.
Update InfoPg to read/write control versions for the history in backup.info, since we can no longer rely on the mappings being available. In theory this could have been an issue after removing 8.3/8.4 if anybody was using a version that old.
The option to specify the path to psql was shown in the command-line help as --psql-bin but the option was actually named --pgsql-bin.
Rename to match the help so they are consistent.
The reference list was previously built at load time from whichever references existed in the file list. This was sufficient since the list was for informational purposes only.
The block incremental feature will require a reference list that contains all prior backups, even those that are not explicitly referenced from the manifest. Therefore it makes sense to build and persist a manifest list rather than building it at load time.
This list can still be used for informational purposes, though it needs to be sorted since the list it sill built for older manifest versions and may not be in sorted order.
Add strLstFindIdx() to find references in the list.
This appears to have been an oversight in 34d6495. Storing the reference is not really correct since the file is not stored in a prior backup. It also uses more space.
There is no real harm in storing the reference, since it is always ignored on restore, but the code is simpler if the zero-length files can be dealt with during the manifest and don't need additional handling later on. This is also an important part of some upcoming optimizations.
All unit and performance tests are now built by the C harness.
Remove all unit/performance test build code from Perl.
Remove code from C harness that is no longer used. This code was included so the C harness could be run separately, but that is no longer needed with this full integration.
The C test harness is used for unit tests from the Perl harness where possible. Currently, unit tests can be run in the C harness when --no-coverage is specified and --profile is not specified.
C harness tests work on meson 0.45.
The C harness runs with valgrind by default. Valgrind can be disabled with --no-valgrind.
Also rebuild containers to add meson and update the documentation so that meson builds will work (even though we don't do them yet).
Both have newer gcc and OpenSSL 3.
Fedora 36 runs horribly slow with valgrind enabled so run the valgrind tests on Ubuntu 22.04. Fedora 36 has a newer gcc so it is still worth testing on.
Meson is a new build system that offers simpler syntax and superior performance to autoconf/make. In addition, Windows is supported natively.
The Meson build appears complete, but currently is used only for auto-generation of code and the host build of pgbackrest. Some container upgrades will be required before Meson can be used for container builds.
Also patch the Debian package to force autoconf/make rather than Meson.
Stopping the cluster has started consistently running out of memory on PostgreSQL 9.1. This seems to have happened after pulling in new packages at some point so it might be build related.
Stopping the cluster is not critical for 9.1 so skip it.
These files were never intended to be compiled on their own so the .c extension was a bit misleading. In particular Meson does not like .c files that are not intended to be compiled independently.
Leave header files as is since they are already protected against being included more than once and are never expected to be compiled.
Remove VM_OS_REPO since it is no longer required.
Rebalance PostgreSQL versions for more efficient test times.
Always print version of PostgreSQL when testing. This helps verify that new minor releases are being used.
Integration expect log testing was originally used as a rough-and-ready way to make sure that certain code paths were being executed before the unit tests existed. Now that we have 100% unit test coverage (with expect log testing) the value of the integration expect tests seems minimal at best.
But they do cause numerous issues:
- Maintenance of the expect code and replacements that are required to keep logs reproducible.
- Even a trivial change can cause massive churn in the expect logs, e.g. d9088b2. These changes should be minutely audited but since the expect logs have little value now it is seldom worth the effort.
- The OS version used to do expect testing (RHEL7) can only be used to test one version of PostgreSQL. This makes it hard to balance the PostgreSQL version testing between OS versions.
- When a commit affects expect logs it is not clear (especially for new developers) how to regenerate them and our contributing guide is silent on the issue.
The goal is to migrate the integration tests to C and expect testing is not part of that plan. It seems best to get rid of them now.
This helps rebalance some of the tests that are running long, i.e. d9 and u20.
I would be better to move more PostgreSQL versions to d9, but the base VM does not contain more versions. New minor versions will be out later in the week so that seems a better time to be rebuilding containers.
PostgreSQL 15 drops support for exclusive backup and renames the start/stop backup commands.
This is based on the pgdg-testing repo since beta1 has not been released yet, but it seems unlikely that breaking changes will be made at this point. beta1 should be tagged just before our next release so we'll retest before the release.
Only set -DDEBUG_MEM for the modules currently being tested rather than globally.
Also run tests in a temp mem context. Running in the top context can confuse memory accounting when a new context is created in the top context.
In offline mode the pg_wal directory is copied, but that is not the same as archive-copy, which copies the exact set of WAL required from the archive.
This flag is purely for informational purposes so there is no live bug here, but the prior behavior was certainly misleading.
The unit tests were ignoring stderr but nothing being output there was important. Now a test will fail if there is anything on stderr.
This makes it easier to work with -fsanitize, which outputs to stderr.
The manifest test module was setting a blank value here and causing a stack overflow because memcpy() is used instead of strcpy().
This was really just a test issue but add an assert just in case the same were to happen in production code.
Also update a bogus checksum in the integration tests to the correct length to avoid running afoul of the assert.
Found with -fsanitize=address.
This rule was added because there were not sufficient tests to demonstrate that the repo-hardlink option could be changed in a backup set.
Remove the restriction and add/update tests to show that it works.
This is necessary now because bundling requires that hardlinking be disabled. Rather than add code complexity, it seems better just to address this limitation.
It seems best for these to be repo options so they can be configured per repo, rather than globally.
All clarify usage for repo-bundle-size and repo-bundle-limit.