Parse enough of config.yaml to auto-generate config.auto.h and config.auto.c.
This commit implements most of the infrastructure needed to migrate the rest of the build code to C, but each set of auto-generated files will present its own challenges.
The build is now dependent on libyaml. At this point there is no need for a hard requirement, but that will come soon so it seems better to add the dependency now.
Update Ubuntu 12.04 to 16.04. Version 16.04 is recently EOL but testing on an old version is beneficial.
Update Ubuntu 18.04 to 20.04.
Update Fedora 32 to 33. Version 34 would have been preferred but there were some build issues, i.e. the default shell did not work with configure, and after ksh was installed configure locked up.
Add --no-install-recommends to apt-get commands to save a bit of time and space.
Update test Dockerfile to run in multiple steps. This makes the container larger but also makes rebuilding after changes faster. The --squash option may be used to keep the container small.
Remove obsolete casts in protocol/parallel module. These casts were included in the original migration because Ubuntu 12.04 32-bit gcc required them, but Ubuntu 16.04 32-bit gcc complains. There is no production issue here since at this point in the code the file descriptors are guaranteed to be >= 0.
There are no code changes from PostgreSQL 13 so simply add the new version.
Add CATALOG_VERSION_NO_MAX to allow the catalog version to "float" during the PostgreSQL beta/rc period so new pgBackRest versions are not required when the catalog version changes.
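As a rough illustration of how a "floating" catalog version check could work, the sketch below accepts any catalog version from the known one up to the end of the same year when CATALOG_VERSION_NO_MAX is defined. The version value, the function name catalogVersionMatch(), and the year-based upper bound are assumptions made for this example, not necessarily how the check is implemented.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

// Hypothetical catalog version for illustration. Catalog versions have the form YYYYMMDDN.
#define CATALOG_VERSION_NO                                          202107181U

// Assumed to be defined only during the PostgreSQL beta/rc period
#define CATALOG_VERSION_NO_MAX

// Does the catalog version read from a cluster match the expected version?
static bool
catalogVersionMatch(uint32_t catalogVersion)
{
    assert(catalogVersion > 0);

#ifdef CATALOG_VERSION_NO_MAX
    // Allow the version to float from the known version up to, but not including, the
    // first possible version of the next year (an assumed upper bound)
    return
        catalogVersion >= CATALOG_VERSION_NO &&
        catalogVersion < (CATALOG_VERSION_NO / 100000U + 1U) * 100000U;
#else
    // After release the version must match exactly
    return catalogVersion == CATALOG_VERSION_NO;
#endif
}
```

With the define removed at release time, the same function falls back to an exact match.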
Update the integration tests to handle new PostgreSQL startup messages.
A define was already added for TEST_PATH but it was not widely used. Replace all occurrences of testPath() with TEST_PATH in the tests.
Replace testUser() with TEST_USER, testGroup() with TEST_GROUP, testRepoPath() with HRN_PATH_REPO, testDataPath() with HRN_PATH, testProjectExe() with TEST_PROJECT_EXE, and testScale() with TEST_SCALE.
Replace {[path]}, {[user]}, {[group]}, etc. with defines and remove hrnReplaceKey(). This is better than having two ways to deal with replacements.
In some cases the original test*() getters were kept because they are used by the harness, which does not have access to the new defines. Move them to harnessTest.intern.h to indicate that the tests should no longer use them.
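A minimal sketch of why a define can be more convenient than a getter: a string define can be joined with other string literals at compile time, while a getter requires formatting at run time. The TEST_PATH value, TEST_CONFIG_FILE, and testConfigFileFormat() are hypothetical names used only for this example, and testPath() here is a stand-in for the old getter.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

// Hypothetical value. A real test build would presumably supply this define.
#define TEST_PATH                                                   "/tmp/test-0"

// Define style: adjacent string literals are concatenated at compile time
#define TEST_CONFIG_FILE                                            TEST_PATH "/pgbackrest.conf"

// Stand-in for the old getter
static const char *
testPath(void)
{
    return TEST_PATH;
}

// Getter style: the same path must be formatted into a buffer at run time
static void
testConfigFileFormat(char *buffer, size_t bufferSize)
{
    assert(buffer != NULL);

    int size = snprintf(buffer, bufferSize, "%s/pgbackrest.conf", testPath());

    // Fail loudly if the path was truncated
    assert(size >= 0 && (size_t)size < bufferSize);
    (void)size;
}
```

Both styles produce the same string, but the define needs no buffer, no format call, and no cleanup.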
A shim allows a test harness to access static functions and variables in a C module, and also allows functions to be shimmed (i.e. overridden) for the purposes of testing.
For instance, coverage testing works when a process that is normally exec'd is run as a forked child process instead.
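The general idea can be sketched as follows, assuming the test build renames the original function (here with a _SHIMMED suffix) so the harness can supply a replacement under the original name. All names below are hypothetical, and the arithmetic only stands in for real behavior such as forking instead of exec'ing.

```c
#include <assert.h>
#include <stdbool.h>

// Module function as the test build might see it after renaming. The _SHIMMED suffix is
// an assumed convention for this sketch.
static int
processRun_SHIMMED(int value)
{
    return value * 2;
}

// Harness state that controls whether the shim is active
static bool hrnProcessRunShim = false;

// Enable the shim for subsequent calls
static void
hrnProcessRunShimInstall(void)
{
    hrnProcessRunShim = true;
}

// Harness function that takes over the original name
static int
processRun(int value)
{
    assert(value >= 0);

    // Test-only behavior when the shim is installed
    if (hrnProcessRunShim)
        return value + 1;

    // Otherwise fall through to the original implementation
    return processRun_SHIMMED(value);
}
```

Callers in the module keep calling processRun() and never need to know whether the shim is installed.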
Some version interface test functions were integrated into the core code because they relied on the PostgreSQL versioned interface. Even though they were compiled out for production builds they cluttered the core code and made it harder to determine what was required by core.
Create a PostgreSQL version interface in a test harness to contain these functions. This does require some duplication but the cleaner core code seems a good tradeoff. It is possible for some of this code to be auto-generated but since it is only updated once per year the matter is not pressing.
The tests worked fine on multiple architectures, but would only run "bare metal", i.e. tests that required containers could not be run.
Enable basic multi-architecture support by allowing containers to be built using whatever architecture the host supports. Also allow cached containers to be defined for multiple architectures in container.yaml.
Add a Dockerfile which can be used as a container for other containers to provide a consistent development environment.
The primary goal is to allow development on Mac M1 but other architectures should find these improvements useful.
Bug Fixes:
* Fix option warnings breaking async archive-get/archive-push. (Reviewed by Cynthia Shang. Reported by Lev Kokotov.)
* Fix memory leak in backup during archive copy. (Reviewed by Cynthia Shang. Reported by Christian ROUX, Efremov Egor.)
* Fix stack overflow in cipher passphrase generation. (Reviewed by Cynthia Shang. Reported by bsiara.)
* Fix repo-ls / on S3 repositories. (Reviewed by Cynthia Shang. Reported by Lesovsky Alexey.)
Features:
* Multiple repository support. (Contributed by Cynthia Shang, David Steele. Reviewed by Stefan Fercot, Stephen Frost.)
* GCS support for repository storage. (Reviewed by Cynthia Shang.)
* Add archive-header-check option. (Reviewed by Stephen Frost, Cynthia Shang. Suggested by Hans-Jürgen Schönig.)
Improvements:
* Include recreated system databases during selective restore. (Contributed by Stefan Fercot. Reviewed by Cynthia Shang.)
* Exclude content-length from S3 signed headers. (Reviewed by Cynthia Shang. Suggested by Brian P Bockelman.)
* Consolidate less commonly used repository storage options. (Reviewed by Cynthia Shang.)
* Allow custom config-path default with ./configure --with-configdir. (Contributed by Michael Schout. Reviewed by David Steele.)
* Log archive copy during backup. (Reviewed by Cynthia Shang, Stefan Fercot.)
Documentation Improvements:
* Update reference to include links to user guide examples. (Contributed by Cynthia Shang. Reviewed by David Steele.)
* Update selective restore documentation with caveats. (Reviewed by Cynthia Shang, Stefan Fercot.)
* Add compress-type clarification to archive-copy documentation. (Reviewed by Cynthia Shang, Stefan Fercot.)
* Add compress-level defaults per compress-type value. (Contributed by Cynthia Shang. Reviewed by David Steele.)
* Add note about required NFS settings being the same as PostgreSQL. (Contributed by Cynthia Shang. Reviewed by David Steele.)
Moving to YAML allows the configuration data to be read by C programs.
Also go back to using YAML::XS since it is the only implementation that has proper boolean support.
This is useful for initialization that needs to be done for the test and all subsequent tests.
Use the new defines to implement initialization for sockets and statistics.
When building tests only include files covered by the current test or by prior tests. This increases performance (less compilation and linking) and also helps detect cross-dependencies in the code. Since there are currently cross-dependencies the depend option is used to document them and allow compilation. The idea is to resolve them incrementally over time.
Add the harness option to include harness modules when the minimum requirements for compilation are met.
Add the feature option to indicate which features are now available in the harness (based on source modules already tested). This allows conditional compilation in harness modules when some features are not yet available.
The unit test Makefile generation was a hodge-podge of constants and rules based on distros/versions that easily got out of date and did not work on an unknown system. All of this dates from the mixed Perl/C unit test implementation.
Instead use configure to generate most of the important Makefile variables, which allows the unit tests to run on multiple platforms, e.g. macOS and FreeBSD.
There is plenty of work to be done here and not all the unit tests work on macOS and FreeBSD for various reasons.
As a proof of concept, update the macOS and FreeBSD tests on Cirrus-CI to run a few command unit tests.
macOS does not allow files to be removed recursively unless the owner has write and execute permissions on all the directories.
Some tests leave the permissions in a bad state so fix them up before trying to delete.
YAML::XS requires libyaml so it is not as portable as pure Perl versions of YAML.
Instead of using YAML::PP, just use the general YAML::Any module, which uses whatever implementation is installed. We are not concerned about YAML performance so whatever works is fine.
Messages on stderr were being lost due to the error suppression used to customize the error message.
Also update the formatting to be more informative and concise.
Multi-repository implementations for the archive-push, check, info, stanza-create, stanza-upgrade, and stanza-delete commands.
Multi-repo configuration is disabled so there should be no behavioral changes between these commands and their current single-repo implementations.
Multi-repo documentation and integration tests are still in the multi-repo development branch. All unit tests work as multi-repo since they are able to bypass the configuration restrictions.
All unit tests now require full coverage so the "full" keyword is obsolete and has been removed.
The covered code modules are simply listed, with only "no code" modules annotated.
Testing on Travis-CI has been getting slower (from ~18 minutes to 3-6 hours) and the travis-ci.org service will be terminated at the end of the year. Moving to travis-ci.com is an option but the quotas are too low for our purposes.
Instead use GitHub Actions, which does not currently have quotas and runs our current tests with just a few tweaks.
This still leaves multi-architecture tests on Travis-CI but we may be able to run those and stay within the new quotas.
Also fix a minor bug in restoreTest.c exposed by GitHub Actions using a different name for the user and group.
Bug Fixes:
* Allow [, #, and space as the first character in database names. (Reviewed by Stefan Fercot, Cynthia Shang. Reported by Jefferson Alexandre.)
* Create standby.signal only on PostgreSQL 12 when restore type is standby. (Fixed by Stefan Fercot. Reviewed by David Steele. Reported by Keith Fiske.)
Features:
* Expire history files. (Contributed by Stefan Fercot. Reviewed by David Steele.)
* Report page checksum errors in info command text output. (Contributed by Stefan Fercot. Reviewed by Cynthia Shang.)
* Add repo-azure-endpoint option. (Reviewed by Cynthia Shang, Brian Peterson. Suggested by Brian Peterson.)
* Add pg-database option. (Reviewed by Cynthia Shang.)
Improvements:
* Improve info command output when a stanza is specified but missing. (Contributed by Stefan Fercot. Reviewed by Cynthia Shang, David Steele. Suggested by uspen.)
* Improve performance of large file lists in backup/restore commands. (Reviewed by Cynthia Shang, Oscar.)
* Add retries to PostgreSQL sleep when starting a backup. (Reviewed by Cynthia Shang. Suggested by Vitaliy Kukharik.)
Documentation Improvements:
* Replace RHEL/CentOS 6 documentation with RHEL/CentOS 8.
Update RHEL/CentOS 7 to cover the versions that were previously covered by RHEL/CentOS 6.
Since RHEL/CentOS 7/8 work the same, update the documentation logic and labels to reflect this compatibility.
CentOS 6 reached EOL and the mirrors were swiftly deleted, leading to failures in tests and documentation.
Remove CentOS 6 for now to get builds going again with the intention to replace it in the near future with CentOS 8.
Improve locking on remote processes by introducing an exec-id that is unique to the main process and passed to all remote processes. This allows the remote processes to determine if a lock is held by a remote from the same main process. If so, the lock is allowed.
The exec-id is also useful for associating remote logs with main logs for debugging purposes.
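A minimal sketch of the lock rule described above, assuming a lock simply records the exec-id of its holder. The LockRecord type, lockAcquire(), and the exec-id strings are hypothetical and only illustrate the comparison. Real locking would involve lock files and processes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

// Hypothetical lock record that remembers the exec-id of the holder
typedef struct LockRecord
{
    bool held;                                                      // Is the lock held?
    char execId[64];                                                // Exec-id of the holder
} LockRecord;

// Acquire the lock unless it is held by a process with a different exec-id
static bool
lockAcquire(LockRecord *lock, const char *execId)
{
    assert(lock != NULL);
    assert(execId != NULL && strlen(execId) < sizeof(lock->execId));

    // A lock held under a different exec-id belongs to another main process
    if (lock->held && strcmp(lock->execId, execId) != 0)
        return false;

    // Free, or held by a remote started from the same main process, so allow it
    lock->held = true;
    snprintf(lock->execId, sizeof(lock->execId), "%s", execId);

    return true;
}
```

A second remote started by the same main process passes the same exec-id and is allowed, while an unrelated process is refused.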
Add older PostgreSQL versions to the u18 container that were not available before.
This also updates all minor versions for prior versions of PostgreSQL.
Currently each module that needs to collect statistics implements custom code to do so. This is cumbersome.
Create a general purpose module for collecting and reporting statistics. Statistics are output in the log at detail level, but there are other uses they could be put to eventually.
No new functionality is added. This is just a drop-in replacement for the current statistics, with the advantage of being more flexible.
The new stats are slower because they involve a list lookup, but performance testing shows stats can be updated at about 40,000/ms which seems fast enough for our purposes.
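To make the list lookup concrete, here is a small sketch of a general-purpose counter keyed by name. The fixed-size list, the function names statInc() and statGet(), and the keys are assumptions for this example and are not taken from the actual module.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

// Maximum number of distinct stats in this sketch
#define STAT_MAX                                                    32

typedef struct Stat
{
    const char *key;                                                // Name (must outlive the list)
    uint64_t total;                                                 // Number of increments
} Stat;

static Stat statList[STAT_MAX];
static unsigned int statTotal = 0;

// Increment a stat, adding it to the list on first use
static void
statInc(const char *key)
{
    assert(key != NULL);

    // Linear lookup, which is the cost paid for not hard-coding each counter
    for (unsigned int statIdx = 0; statIdx < statTotal; statIdx++)
    {
        if (strcmp(statList[statIdx].key, key) == 0)
        {
            statList[statIdx].total++;
            return;
        }
    }

    assert(statTotal < STAT_MAX);
    statList[statTotal++] = (Stat){.key = key, .total = 1};
}

// Get the current total for a stat, or 0 if it has never been incremented
static uint64_t
statGet(const char *key)
{
    assert(key != NULL);

    for (unsigned int statIdx = 0; statIdx < statTotal; statIdx++)
    {
        if (strcmp(statList[statIdx].key, key) == 0)
            return statList[statIdx].total;
    }

    return 0;
}
```

Any module can then record a stat with a single call such as statInc("http.request") without defining its own counters.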
This loop was using a lot of memory without freeing it at intervals.
Rewrite to use char arrays when possible to reduce memory that needs to be allocated and freed.
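The technique can be sketched as below, assuming names built in the loop have a known upper bound on length. The function and its name format are hypothetical. The point is only that a stack buffer is reused on every iteration so nothing accumulates or needs to be freed.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

// Build a name for each file in a loop and return the total length of all names
static size_t
fileNameLenTotal(const char *path, unsigned int fileTotal)
{
    assert(path != NULL);

    size_t result = 0;

    for (unsigned int fileIdx = 0; fileIdx < fileTotal; fileIdx++)
    {
        // Stack buffer reused on each iteration, so there is nothing to allocate or free
        char name[256];
        int size = snprintf(name, sizeof(name), "%s/file-%u", path, fileIdx);

        // Fail loudly rather than silently truncating
        assert(size >= 0 && (size_t)size < sizeof(name));
        (void)size;

        result += strlen(name);
    }

    return result;
}
```

When the length cannot be bounded, a dynamically allocated string is still needed, which is why the approach applies only "when possible".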
There is no sense in generating detailed coverage reports in CI environments where they will never be seen. Generating them takes time, and format differences in some older versions can cause problems in the report generation code.
Note that missing coverage will still be reported on stdout and the test will fail.
This aligns better with general PostgreSQL usage and our own documentation (updated in 4bcef702).
Usage in the backup.manifest tests has not been updated since it might break the file format.
There don't appear to be any behavioral changes since PostgreSQL 12 and all the tests pass.
Changes to the control/catalog/WAL versions in subsequent betas may break compatibility but pgBackRest will be updated with each release to keep pace.
Vendorized code is copied from another project when a library is not available and a git subproject won't work. Currently all the vendorized code is copied from PostgreSQL but it makes sense to have a more general mechanism for indicating vendorized code.
The .vendor extension will be used to denote vendorized code in the same way that .auto is used to denote auto-generated code.
These tests required sudo to achieve complete coverage.
Add a new coverage exception, vm_covered, that applies to code that can only be covered in a container. When the test is run outside of a container code sections that require a container will be excluded with TEST_CONTAINER_REQUIRED and the coverage exception will be added to prevent a coverage error.
This does require marking up the core code with vm_covered, which in some modules (e.g. common/io/tls/client) can be extensive. It's possible that some of these tests can be rewritten to be less dependent on sudo but no attempt was made to do that here.
Only allow coverage summaries in a vm since coverage summaries outside a vm will not be complete, which was true even before this commit.
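The interaction between the two markers might look like the sketch below. The core function, the test function, and the exact comment syntax of the vm_covered annotation are assumptions for illustration. Only the names vm_covered and TEST_CONTAINER_REQUIRED come from the text above.

```c
#include <assert.h>
#include <stdbool.h>

// Hypothetical core function. The privileged branch can only be exercised in a container,
// so it carries the coverage exception (annotation syntax assumed).
static int
ownerSet(bool privileged)
{
    if (privileged)
        return 1;                                                   // {vm_covered}

    return 0;
}

// Hypothetical test that returns the number of checks performed
static unsigned int
ownerSetTest(void)
{
    unsigned int checkTotal = 0;

    // Always runs, inside or outside a container
    assert(ownerSet(false) == 0);
    checkTotal++;

#ifdef TEST_CONTAINER_REQUIRED
    // Only compiled when the test runs in a container
    assert(ownerSet(true) == 1);
    checkTotal++;
#endif

    return checkTotal;
}
```

Outside a container the privileged check is compiled out, and the annotation keeps the unexercised branch from being reported as missing coverage.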
Newer versions of sudo output this message to stderr when run in a container:
sudo: setrlimit(RLIMIT_CORE): Operation not permitted
See https://github.com/sudo-project/sudo/issues/42 for details.
A simple workaround is to prevent sudo from disabling core dumps. This seems safe enough because if sudo is segfaulting then core files are the least of our worries.
There are a number of Valgrind errors on Ubuntu 12.04 which do not happen on newer distro versions. However, suppressions for these errors have masked legitimate issues in subsequent code.
Instead, make suppressions VM specific so errors in other VMs are not masked.
bzip2 is a widely available, high-quality data compressor. It typically compresses files to within 10% to 15% of the best available techniques (the PPM family of statistical compressors), while being around twice as fast at compression and six times faster at decompression.
bzip2 is currently available on all supported platforms.
Zstandard is a fast lossless compression algorithm targeting real-time compression scenarios at zlib-level and better compression ratios. It's backed by a very fast entropy stage, provided by the Huff0 and FSE libraries.
Zstandard version >= 1.0 is required, which is generally only available on newer distributions.