PostgreSQL 11 introduces configurable WAL segment sizes, from 1MB to 1GB.
There are two areas that needed to be updated to support this: building the archive-get queue and checking that WAL has been archived after a backup. Both operations require the WAL segment size to properly build a list.
Checking the archive after a backup is still implemented in Perl and has an active database connection, so just get the WAL segment size from the database.
The archive-get command does not have a connection to the database, so get the WAL segment size from pg_control instead. This requires a deeper inspection of pg_control than has been done in the past, so it seemed best to copy the relevant data structures from each version of PostgreSQL and build a generic interface layer to address them. While this approach is a bit verbose, it has the advantage of being relatively simple, and can easily be updated for new versions of PostgreSQL.
Since the integration tests generate pg_control files for testing, teach Perl how to generate files with the correct offsets for both 32-bit and 64-bit architectures.
Use checksums rather than timestamps to determine if files have changed. This is useful in cases where the timestamps may not be trustworthy, e.g. when performing an incremental after failing over to a standby.
If checksum delta is enabled then checksums will be used for verification of resumed backups, even if they are full. Resumes have always used checksums to verify the files in the repository, enabling delta performs checksums on the database files as well.
Note that the user must manually enable this feature in cases were it would be useful or just keep in enabled all the time. A future commit will address automatically enabling the feature in cases where it seems likely to be useful.
Contributed by Cynthia Shang.
Apparently we never needed to run this function remotely.
It will be needed by the backup checksum delta feature, so implement it now.
Contributed by Cynthia Shang.
This is a workaround for inefficient handling of many setjmps in gcc >= 4.9. Setjmp is used in all error handling, but in the unit tests each test macro contains an error handling block so they add up pretty quickly for large unit tests.
Enabling -ftree-coalesce-vars in affected versions reduces build time and memory requirements by nearly an order of magnitude. Even so, compiles are much slower than gcc <= 4.8.
We submitted a bug for this at: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87316
Which was marked as a duplicate of: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63155
For read-only repositories the Posix and CIFS drivers behave exactly the same. Since that's all we support in C right now it's valid to treat them as the same thing. An assertion has been added to remind us to add the CIFS driver before allowing the repository to be writable.
Mostly we want to make sure that the C code does not blow up when the repository type is CIFS.
Storing the expect log (created by common/harnessLog) in the regular test directory was not ideal. It showed up in tests and made it difficult to clear the test directory between each run.
Move the expect log to a purpose-built directory one level up so it does not interfere with regular testing.
C or Perl coverage tests can now be run on any VM provided a recent enough version of Devel::Cover or lcov is available.
For now, leave u18 as the only VM to run coverage tests due to some issues with older versions of lcov.
% characters caused issues in backup/restore due to filenames being appended directly into a format string.
Reserved XML characters (<>&') caused issues in the S3 driver due to improper escaping.
Add a file with all common special characters to regression testing.
By default Valgrind does not exit with an error code when a non-fatal error is detected, e.g. unfreed memory. Use the --error-exitcode option to enabled this behavior.
Update some minor issues discovered in the tests as a result. Luckily, no issues were missed in the core code.
The new archive-get C code can't run (yet) when encryption is enabled. Therefore move the encryption tests so we can test the new C code. We'll move it back when encryption is enabled in C.
Also, push one WAL segment with compression to test decompression in the C code.
A return code of 1 from the archive-get was being logged as an error message at info level but otherwise worked correctly.
Also improve info messages when an archive segment is or is not found.
Previously an error would be generated if other files were present and not owned by the PostgreSQL user. This hasn't been a big deal in practice but it could cause issues.
Also add tests to make sure the same logic applies with links to files, i.e. all other files in the directory should be ignored. This was actually working correctly, but there were no tests for it before.
The log-subprocess feature added in 22765670 failed to take into account the naming for remote processes spawned by local processes. Not only was the local command used for the naming of log files but the process id was not pass through. This meant every remote log was named "[stanza]-local-remote-000" which is confusing and meant multiple processes were writing to the same log.
Instead, pass the real command and process id to the remote. This required a minor change in locking to ignore locks if process id is greater than 0 since remotes started by locals never lock.
Relative link paths were being combined with the paths of previous links (relative or absolute) due to the $strPath variable being modified in the current iteration rather than simply being passed to the next level of recursion.
This issue did not affect absolute links and relative tablespace links were caught by other checks, though the error was confusing.
Reported by Cynthia Shang.
Offline operation runs counter to the purpose of this command, which is to check if archiving and backups are working correctly.
Reported by Jason O'Donnell.
Implemented using the same logic as the patches adding this feature to PostgreSQL, 8694cc96 and 920a5e50. Temporary relation exclusion is enabled in PostgreSQL ≥ 9.0. Unlogged relation exclusion is enabled in PostgreSQL ≥ 9.1, where the feature was introduced.
Contributed by Cynthia Shang.
This allows setting the test log level independently from the general test harness setting, but current only works for the C tests. It is useful for seeing log output from functions on the console while a test is running.
This is more efficient overall and allows the caller to specify how many bytes will be read on each call. Reads are appended if the buffer already contains data but the buffer size will never increase.
Allow Buffer object "used size" to be different than "allocated size". Add functions to manage used size and remaining size and update automatically when possible.
A regression in v0.82 removed the timestamp comparison when deciding which files from the aborted backup to keep on resume. All resumed backups should be considered inconsistent. A resumed backup can be identified by checking the log for the message "aborted backup of same type exists, will be cleaned to remove invalid files and resumed".
Reported by David Youatt, Yogesh Sharma, Stephen Frost.
S3 (and gateways) always set content-length or transfer-encoding but HTTP 1.1 does not require it and proxies (e.g. HAProxy) may not include either.
Suggested by Adam K. Sumner.
* Build containers from scratch for more accurate testing.
* Allow environment load to be skipped.
* Allow bash wrapping to be skipped.
* Allow forcing a command to run as a user without sudo.
Bug Fixes:
* Fix potential buffer overrun in error message handling. (Reported by Lætitia.)
* Fix archive write lock being taken for the synchronous archive-get command. (Reported by Uspen.)
Improvements:
* Embed exported C functions and Perl modules directly into the pgBackRest executable.
* Use time_t instead of __time_t for better portability. (Suggested by Nick Floersch.)
* Print total runtime in milliseconds at command end.
Low-level functions only include stack trace in test builds while higher-level functions ship with stack trace built-in. Stack traces include all parameters passed to the function but production builds only create the parameter list when the log level is set high enough, i.e. debug or trace depending on the function.
* Allow more than one test to provide coverage for the same module.
* Add option to disable valgrind.
* Add option to disabled coverage.
* Add option to disable debug build.
* Add option to disable compiler optimization.
* Add --dev-test mode.
pgBackRest currently has no way to request new credentials so the entire command (e.g. backup, restore) must complete before the credentials expire.
Contributed by Yogesh Sharma.
Many options that were set per test can instead be inferred from the types, i.e. container, c, expect, and individual.
Also finish renaming Perl unit tests with the -perl suffix.
Configuration files are loaded from the directory specified by the --config-include-path option.
Add --config-path option for overriding the default base path of the --config and --config-include-path option.
Contributed by Cynthia Shang.
Mainly this helps with unit tests that need to do log expect testing. Add harnessCfgLoad() test function, which allows a new config to be loaded for unit testing without resetting log functions, opening a log file, or taking locks.
The Perl process was exiting directly when called but that interfered with proper locking for the forked async process. Now Perl returns results to the C process which handles all errors, including signals.
Now only two types of locks can be taken: archive and backup. Most commands use one or the other but the stanza-* commands acquire both locks. This provides better protection than the old command-based locking scheme.
This makes it easier to create objects and then copy them to another context when they are complete without having to worry about freeing them on error. Update List, StringList, and Buffer to allow moves. Update Ini and Storage to take advantage of moves.
Switch from Devel::Cover because it would not report on branch coverage for reports converted from gcov.
Branch coverage is not complete, so for the time being errors will only be generated when statement coverage is not complete. Coverage of unit tests is not displayed in the report unless they are incomplete for either statement or branch coverage.
* Replace remaining NDEBUG blocks with the more granular DEBUG_UNIT.
* Remove some debug memset() calls in MemContext since valgrind is more useful for these checks.
Move command begin to C except when it must be called after another command in Perl (e.g. expire after backup). Command begin logs correctly for complex data types like hash and list. Specify which commands will log to file immediately and set the default log level for log messages that are common to all commands. File logging is initiated from C.
Buffering now takes the pending bytes on the socket into account (when present) rather than relying entirely on select(). In some instances the final bytes would not be flushed until the connection was closed.
The coverage report shows some code as never being run -- but that makes no sense because the tests pass. This may be due to trying to combine the C and Perl coverage reports and overwriting some runs.
Suppress for now with a plan to implement LCOV for the C unit tests.
This provides correct matching in the event there are system-id and db-version duplicates (e.g. after reverting a pg_upgrade).
Fixed by Cynthia Shang.
Reported by Adam K. Sumner.
* Add strCmp*() and strFirst*() to String.
* Add strLstSort() and strLstNewSplitSize() to StringList.
* Add strLstNewSplitZ() to StringList a update calls to strLstNewSplit() as needed.
* Add lstSort to List.
* Add strBeginsWith(), strEndsWith(), strEq(), and strBase().
* Enable compiler type checking for strNewFmt() and strCatFmt().
* Rename strNewSzN() to strNewN().
This allows specific options in pgbackrest.conf to be ignored (and set to default) which reduces the need to write new configuration files for specific needs.
Note that boolean, non-command-line options are already negatable.
When a backup host is present, backups should only be allowed on the backup host and restores should only be allowed on the database host unless an alternate configuration is created that ignores the remote host.
Reported by Lardière Sébastien.
Required to test restores on the backup server, a fairly common scenario.
Improve the restore function to accept optional parameters rather than a long list of parameters. In passing, clean up extraneous use of strType and strComment variables.