A few bugs have been reported over the years on Alpine with musl libc and generally it has been possible to figure them out without testing on that platform but a few newer ones cannot be reproduced elsewhere.
Also, testing with additional libc implementations helps portability so it makes sense to add support.
For the most part musl libc works as expected with a few caveats:
1) The FD_*() macros won't accept an int file descriptor without warning when -Wsign-conversion is enabled. I opened https://www.openwall.com/lists/musl/2025/05/23/4 to discuss this and I was referred to https://www.openwall.com/lists/musl/2024/07/16/1 which explains why this happens. It was not a very satisfying answer but clearly it is not going to be addressed so a meson probe was added to detect the issue and disable the warning.
2) Floating point numbers are rounded differently than in GNU/Mac libc when formatted with printf() and friends. This is fine for the core code but causes issues in the unit tests that expect log entries to match exactly. This was solved in ad7ba46b by adding our own rough and ready formatting routines.
3) Some error messages are different from GNU/Mac libc. This was solved with a new error macro that accepts multiple messages in b5fbb16c.
4) For some reason ninja on Alpine outputs "nothing to do" messages to stderr whereas they go to stdout on other distributions. Redirecting stderr to stdout is our standard fix for this issue so do that. A non-zero result code will let us know that something has gone wrong.
5) It appears that profiling is not supported on Alpine, which is pretty surprising. For now fix this by only unit testing the profiling code when coverage is enabled. This is not a great solution since we would rather test profiling on any system that supports it but for now this will do.
Since the addition of libcurl4-openssl-dev requires a rebuild of the Debian containers go ahead and rebuild all containers to include new PostgreSQL minor release versions.
This aligns with what is reported by 'uname -m' and required by Docker naming conventions.
Also convert a 'aarch64' string to the VM_ARCH_AARCH64 constant.
The Github action we were using for multi-architecture testing stopped working. The project does not seem to be getting regular maintenance so it seems better to roll multi-architecture testing into our existing container builds.
Introduce multi-architecture builds and testing into our test framework. For now this only works for unit tests -- integration tests will still only run on x86_64. That could be updated in the future but since emulation is so slow it is not clear if it would be useful.
Also fix an invalid 32-bit checksum. The d11 test had not been running as 32-bit since d8ff89a so the checksum was not updated when it should have been in 48f511d.
Per our policy to support five EOL versions of PostgreSQL, 9.4 is no longer supported by pgBackRest. Remove all logic associated with 9.4 and update the tests.
This includes a small fix in infoPg.c to allow backup.info files with old versions to be saved. This allows expire to function when old versions are present. Even though those older versions cannot be used, they can be expired.
Tests for 9.4 are left in the expire/info tests to demonstrate that these commands work with old versions present.
Ubuntu 20.04 has been having consistent errors starting PostgreSQL 10 so move 9.5 to this container instead. An older version makes sense with an older distro.
Also move PostgreSQL 12 from RHEL 8 since this version will be EOL soon.
Typically we use the oldest Debian/Ubuntu to run 32-bit unit and integration tests. However, 32-bit is no longer fully supported by Ubuntu (multiple packages we need are missing) and apt.postgresql.org no longer packages for any 32-bit version.
To address these changes, do 64-bit integration testing on the oldest Debian/Ubuntu (currently Ubuntu 20.04) and 32-bit unit/integration testing on the oldest Debian (currently 11) using the included version for integration testing.
These were broken while code was being migrated to C and went unnoticed because the options are generally only used when doing performance testing.
The C code can only take one --run param so add a check for that in test.pl.
This means valgrind is no longer built from source, which caused image builds to run for a very long time.
Valgrind is only required in a few images for testing.
lcov does not seem to be very well maintained and is often not compatible with the version of gcc it ships with until a few months after a new distro is released. In any case, lcov is that not useful for us because it generates reports on all coverage while we are mainly interested in missing coverage during development.
Instead use the JSON output generated by gcov to generate our minimal coverage report and metrics for the documentation.
There are some slight differences in the metrics. The difference in the common module was due to a bug in the old code -- build/common was being added into common as well as being reported separately. The source of the two additional branches in the backup module is unknown but almost certainly down to how exclusions are processed with regular expressions. Since there is additional coverage rather than coverage missing this seems fine.
Since this was pretty much a rewrite it was also a good time to migrate to C.
Update the catalog version for beta 1 so pgbackrest will not work with any prior development versions.
Also improve the integration/all test so the catalog version does not need to be updated again during the beta period.
Coverage of the documentation code is not important enough to report to users. If it were reported it should be in a separate section (along with test code coverage).
This is better than requiring a python3 binary to be on the path because some installations might have, e.g. python3.9.
Also add the python3-distutils package to Debian builds to make this work.
This should have been done in 434938e3 but somehow it didn't happen.
Fedora 38 requires 2048 bit keys so update the VM builds to use them. Update the documentation to use 2048 bit keys. This is not technically required by this commit but it makes sense to do it now.
Also update the key location for the yum.p.o repository.
Lastly, shuffle test PostgreSQL versions since PostgreSQL 11 is not longer available in the yum.p.o repository.
Meson has a lot of advantages over autoconf/make, primarily in ease-of-use and performance. Make meson the only build system used for testing and building the Debian documentation, but leave the RHEL documentation using autoconf/make for now so it gets some testing.
The Perl integration tests were migrated as faithfully as possible, but there was some cruft and a few unit tests that it did not make sense to migrate.
Also remove all Perl code made obsolete by this migration.
All unit, performance, and integration tests are now written in C but significant parts of the test harness remain to be migrated.
This has never been a problem for performance tests since they do not call functions that log at info level or above, but the upcoming integration tests may do so. In any case it is better to disable this functionality outside of unit tests.
This allows analysis of coverage failures that only happen in CI. It is not ideal since the report needs to be copied from the log output into an HTML file where it can be viewed, but better than nothing.
Per our policy to support five EOL versions of PostgreSQL, 9.3 is no longer supported by pgBackRest.
Remove all logic associated with 9.3 and update the tests.
Migrate generation of these files from help.xml to the intermediate documentation format. This allows us to share a lot of code that is already in C and remove duplicated code in Perl. More duplicate code can be removed in Perl once man generation is migrated.
Also update the unit test harness to allow testing of modules in the doc directory.
The tests expect the group name/id to match between the host system and the container. If there is a conflict rename the group with the required id to the expected name.
This could have unintended consequences but it seems reasonably safe since we control everything that runs in the container and there should never be any system processes running.
This was missed in the C unit test migration and since then a new test was added that was not setting its timezone correctly.
This feature exists to make sure the tests will run on systems with different timezones and has no impact on the core code.
The --no-log-timestamp option was missed when unit test building was migrated to C, which caused test timings to show up in the contributing guide. This caused no harm but did create churn in this file during releases.
Also improve the formatting when test timing is disabled.
Double spaces have fallen out of favor in recent years because they no longer contribute to readability.
We have been using single spaces and editing related paragraphs for some time, but now it seems best to update the remaining instances to avoid churn in unrelated commits and to make it clearer what spacing contributors should use.
Ubuntu 18.04 will be EOL before the next release, so update to the oldest available Debian version.
Also fix one incorrect return value type, a test cast, and adjust some test timeouts.
Bug Fixes:
* Skip writing recovery.signal by default for restores of offline backups. (Reviewed by Stefan Fercot. Reported by Marcel Borger.)
Features:
* Block incremental backup (BETA). (Reviewed by John Morris, Stephen Frost, Stefan Fercot.)
Improvements:
* Keep only one all-default group index. (Reviewed by Stefan Fercot.)
Documentation Improvements:
* Add explicit instructions for upgrading between 2.x versions. (Contributed by Christophe Courtois. Reviewed by David Steele.)
* Remove references to SSH made obsolete when TLS was introduced.
Running valgrind and backtrace together has been causing tests to timeout in CI, mostly likely due to limited resources. This has not been a problem in normal development environments.
Since it is still important to run backtraces for debugging, split the u22 test that was doing all this work to run coverage and backtrace together and valgrind-only as a separate test. As a bonus these tests run faster separately and since they run in parallel the total execution time is faster.
The libbacktrace feature has not been working since the move to meson because libbacktrace detection was not added to the meson build. Add libbacktrace to meson and improve the feature so that it can be compiled into release builds.
The prior implementation fetched line numbers with each stack trace push. Not only was this slow but it missed any functions that were not being tracked on our stack.
Instead just examine the backtrace when an error happens and merge it with the info we have on our stack. If the backtrace is not available then the output remains as before.
Also remove --backtrace from test.pl since the library is now auto-detected.
Leave this library out of the production build for now to give it a little time to shake out in testing.
When this code was migrated to C the unit tests were not included because there were more important priorities at the time.
This also requires some adjustments to coverage because of the new code location.