Most of these tests are just checking that errors are thrown when required. These are well covered in various unit tests.
The "cannot resume" tests are also well covered in the backup unit tests.
Finally, config warnings are well covered in the config unit tests.
There is more to be done here, but this accounts for the low-hanging fruit.
Set log-level-file=off when more than one test will run. In this case it is impossible to see the logs anyway since they will be automatically cleaned up after the test. This improves performance pretty dramatically since trace-level logging is expensive. If a single integration test is run then log-level-file is trace by default but can be changed with the --log-level-test-file option.
Reduce buffer-size to 64k to save memory during testing and allow more processes to run in parallel.
Update log replacement rules so that these options can change without affecting expect logs.
The co6 tests were occasionally running out of space so bump up the size of the ramdisk a bit to hopefully prevent this.
A longer term solution would be to disable the trace-level file logs when running on Travis CI since they seem to be using most of the space.
PostgreSQL >= 9.6 uses non-exclusive backup which has implicit stop-auto since the backup will stop when the connection is terminated.
The warning was made more verbose in 1f2ce45e but this now seems like a bad idea since there are likely users with mixed version environments where stop-auto is enabled globally. There's no reason to fill their logs with warnings over a harmless option. If anything we should warn when stop-auto is explicitly set to false but this doesn't seem very important either.
Revert to the prior behavior, which is to warn and reset when stop-auto is enabled on PostgreSQL < 9.3.
\ was not being properly escaped when calculating the manifest checksum which prevented the manifest from loading.
Use jsonFromStr() to properly quote and escape \.
Since instances of \ in cluster filenames should be rare to nonexistent this does not seem likely to be a serious problem in the field.
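As an illustration of why an unescaped \ breaks loading, here is a minimal sketch of JSON string quoting in C. jsonQuoteSketch() is a hypothetical helper written for this example, not the jsonFromStr() implementation, and it only handles quotes, backslashes, and a few common control characters.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

// Hypothetical helper (not the pgBackRest API): write src into dst as a quoted JSON string.
// Returns the number of bytes written (excluding the terminator), or 0 if dst is too small.
static size_t
jsonQuoteSketch(const char *src, char *dst, size_t dstSize)
{
    assert(src != NULL && dst != NULL);

    size_t out = 0;

    // Need room for at least the two quotes and the terminator
    if (dstSize < 3)
        return 0;

    dst[out++] = '"';

    for (; *src != '\0'; src++)
    {
        const char *escape = NULL;

        switch (*src)
        {
            case '"': escape = "\\\""; break;
            case '\\': escape = "\\\\"; break;
            case '\n': escape = "\\n"; break;
            case '\r': escape = "\\r"; break;
            case '\t': escape = "\\t"; break;
        }

        // Room for this character plus the closing quote and terminator
        size_t need = escape != NULL ? 2 : 1;

        if (out + need + 2 > dstSize)
            return 0;

        if (escape != NULL)
        {
            dst[out++] = escape[0];
            dst[out++] = escape[1];
        }
        else
            dst[out++] = *src;
    }

    dst[out++] = '"';
    dst[out] = '\0';

    return out;
}
```

If the checksum is calculated over the unescaped form but the file is saved with the escaped form (or the reverse), the two no longer agree, which is the kind of mismatch that would prevent a load.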
Remove embedded Perl from the distributed binary. This includes code, configure, Makefile, and packages. The distributed binary is now pure C.
Remove storagePathEnforceSet() from the C Storage object which allowed Perl to write outside of the storage base directory. Update mock/all and real/all integration tests to use storageLocal() where they were violating this rule.
Remove "c" option that allowed the remote to tell if it was being called from C or Perl.
Code to convert options to JSON for passing to Perl (perl/config.c) has been moved to LibC since it is still required for Perl integration tests.
Update build and installation instructions in the user guide.
Remove all Perl unit tests.
Remove obsolete Perl code. In particular this included all the Perl protocol code which required modifications to the Perl storage, manifest, and db objects that are still required for integration testing but only run locally. Any remaining Perl code is required for testing, documentation, or code generation.
Rename perlReq to binReq in define.yaml to indicate that the binary is required for a test. This had been the actual meaning for quite some time but the key was never renamed.
For the most part this is a direct migration of the Perl code into C except as noted below.
A backup can now be initiated from a linked directory. The link will not be stored in the manifest or recreated on restore. If a link or directory does not already exist in the restore location then a directory will be created.
The logic for creating backup labels has been improved and it should no longer be possible to get a backup label earlier than the latest backup even with timezone changes or clock skew. This has never been an issue in the field that we know of, but we found it in testing.
For online backups all times are fetched from the PostgreSQL primary host (previously only the copy start time was). This doesn't affect backup integrity but it does prevent clock skew between hosts affecting backup duration reporting.
Archive copy now works as expected when the archive and backup have different compression settings, i.e. when one is compressed and the other is not. This was a long-standing bug in the Perl code.
Resume will now work even if hardlink settings have been changed.
Reviewed by Cynthia Shang.
/ takes precedence over & but the appropriate parens were not provided.
By some bad luck the tests worked either way, so add a new test that only works the correct way to prevent a regression.
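The expression that was fixed is not shown here, so the following is a hypothetical illustration of the pitfall only. The function names and values are made up for this example. It shows one input where both groupings agree and one where only the parenthesized form gives the intended result.

```c
#include <assert.h>

// Without parens this parses as value & (mask / divisor) because / binds tighter than &
static unsigned int
maskThenDivideWrong(unsigned int value, unsigned int mask, unsigned int divisor)
{
    return value & mask / divisor;
}

// Intended grouping: apply the mask first, then divide
static unsigned int
maskThenDivide(unsigned int value, unsigned int mask, unsigned int divisor)
{
    return (value & mask) / divisor;
}
```

With the inputs (12, 8, 2) both functions return 4, so a test using only those values passes either way. With (8, 8, 2) the intended grouping returns 4 and the unparenthesized form returns 0, which is the kind of case a regression test needs.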
Bug Fixes:
* Fix archive-push/archive-get when PGDATA is symlinked. These commands tried to use cwd() as PGDATA but this would disagree with the path configured in pgBackRest if PGDATA was symlinked. If cwd() does not match the pgBackRest path then chdir() to the path and make sure the next cwd() matches the result from the first call. (Reported by Stephen Frost, Milosz Suchy.)
* Fix reference list when backup.info is reconstructed in expire command. Since the backup command is still using the Perl version of reconstruct this issue will not occur unless 1) there is a backup missing from backup.info and 2) the expire command is run directly instead of running after backup as usual. This unlikely combination of events means this is probably not a problem in the field.
* Fix segfault on unexpected EOF in gzip decompression. (Reported by Stephen Frost.)
The TZ environment variable was not reliably pushed down to the test processes.
Instead pass TZ via a command line parameter and set explicitly in the test process.
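A minimal sketch of setting the timezone explicitly in a test process, assuming a POSIX system. testTimezoneSet() is a hypothetical name and the way the value is parsed from the command line is not shown.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

// Hypothetical helper: apply a timezone value that was received as a command line parameter.
// Returns 0 on success, -1 on failure.
static int
testTimezoneSet(const char *tz)
{
    assert(tz != NULL);

    // Overwrite any inherited TZ so the test does not depend on the parent environment
    if (setenv("TZ", tz, 1) != 0)
        return -1;

    // Make the C library reload its timezone data from TZ
    tzset();

    return 0;
}
```

Calling tzset() after setenv() matters because localtime() and related functions may otherwise keep using the timezone that was in effect earlier in the process.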
Using gmtime() produced output skewed by the local timezone.
Since this function is currently only used for debug logging this is not a live bug in the field.
Commit 7168e074 tried to use cwd() as PGDATA but this would disagree with the path configured in pgBackRest if PGDATA was symlinked.
If cwd() does not match the pgBackRest path then chdir() to the path and make sure the next cwd() matches the result from the first call.
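The check described above can be sketched with POSIX calls as follows. pgPathCheckSketch() is a hypothetical helper for illustration and is not the actual pgBackRest code.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <string.h>
#include <unistd.h>

// Hypothetical helper: verify that the process is running in pgPath even if pgPath is a symlink.
// Returns true when cwd already equals pgPath, or when chdir(pgPath) lands in the same directory.
// Note that on a mismatch the process is left in the directory pgPath points to.
static bool
pgPathCheckSketch(const char *pgPath)
{
    char cwdBefore[PATH_MAX];
    char cwdAfter[PATH_MAX];

    assert(pgPath != NULL);

    if (getcwd(cwdBefore, sizeof(cwdBefore)) == NULL)
        return false;

    // Simple case: cwd and the configured path agree
    if (strcmp(cwdBefore, pgPath) == 0)
        return true;

    // Otherwise follow the configured path and compare the resolved directories
    if (chdir(pgPath) != 0)
        return false;

    if (getcwd(cwdAfter, sizeof(cwdAfter)) == NULL)
        return false;

    return strcmp(cwdBefore, cwdAfter) == 0;
}
```

Because getcwd() returns the resolved path, the second call yields the same value as the first when the configured path is a symlink to the directory the process started in.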
If the compressed stream terminated early then the decompression process would get a flush request (NULL input buffer) since the filter was not marked as done. This could happen on a zero-length or truncated (i.e. invalid) compressed file.
Change the existing assertion to an error to catch this condition in production gracefully.
82df7e6f and 9856fef5 updated tests that used test points in preparation for the feature not being available in the C code.
Since test points are no longer used, remove the infrastructure.
Also remove one stray --test option in mock/all that was essentially a noop but no longer works now that the option has been removed.
This module will eventually contain various useful zero-terminated string functions.
For now, using NULL_Z instead of strPtr(NULL_STR) avoids a strict aliasing warning on RHEL 6. This is likely a compiler issue, but adding these constants seems like a good idea anyway and we are not going to get a fix in a gcc that old.
Pq script errors are now printed in test output in case they are being masked by a later error.
Once a script error occurs, the same error will be thrown forever rather than throwing a new error on the next item in the script.
HRNPQ_MACRO_CLOSE() is not required in scripts unless harnessPqScriptStrictSet(true) is called. Most higher-level tests should not need to run in strict mode.
The command/check test seems to require strict mode but there's no apparent reason why it should. This would be a good thing to look into at some point.
Some log output (e.g. time) is hard to test because the values can change between tests.
Add expressions to replace substrings in the log with predictable values to simplify testing.
This is similar to the log replacement facility available for Perl expect log testing.
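A minimal sketch of this kind of replacement using POSIX regular expressions. logReplaceSketch() is a hypothetical helper and the log line in the usage note is made up. This is not the harness implementation.

```c
#include <assert.h>
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

// Hypothetical helper: copy log into out, replacing each non-empty match of expression with
// replacement. Returns false if the expression is invalid or out is too small.
static bool
logReplaceSketch(const char *log, const char *expression, const char *replacement, char *out, size_t outSize)
{
    regex_t regExp;
    regmatch_t match;
    size_t outIdx = 0;
    size_t replacementSize = strlen(replacement);
    int execFlags = 0;
    bool result = true;

    if (regcomp(&regExp, expression, REG_EXTENDED) != 0)
        return false;

    while (result && *log != '\0')
    {
        // If nothing matches (or the first match is empty) copy the rest of the log verbatim
        bool found = regexec(&regExp, log, 1, &match, execFlags) == 0 && match.rm_eo > match.rm_so;
        size_t prefixSize = found ? (size_t)match.rm_so : strlen(log);
        size_t addSize = prefixSize + (found ? replacementSize : 0);

        if (outIdx + addSize + 1 > outSize)
            result = false;
        else
        {
            memcpy(out + outIdx, log, prefixSize);

            if (found)
                memcpy(out + outIdx + prefixSize, replacement, replacementSize);

            outIdx += addSize;
            log += found ? (size_t)match.rm_eo : prefixSize;

            // Later searches do not start at the beginning of the line
            execFlags = REG_NOTBOL;
        }
    }

    regfree(&regExp);

    if (result && outIdx < outSize)
        out[outIdx] = '\0';
    else
        result = false;

    return result;
}
```

For example, applying the expression `[0-9]+ms` with the replacement `[TIME]` to a line such as `backup took 123ms, copy took 45ms` gives `backup took [TIME], copy took [TIME]`, which can be compared against a fixed expected string.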
A recopy would occur if the size or checksum was invalid but on error the backup would terminate.
Instead, recopy the resumed file on any error. If the error is systemic (e.g. network failure) then it should show up again during the recopy.
Since there is only one driver that supports (or is likely to support) links (Posix), require the path feature to make logic in the code simpler.
The checks are added just in case another driver supports links.
These were not getting updated to match the directory name when the manifests were copied.
The Perl code didn't care but the C code expects labels to be set correctly.
For now this is only used in testing but there are places where it could be useful in the core code.
Even if that turns out not to be true, it doesn't seem worth implementing a new version in testing just to capture a few values that we already have.
This is to maintain compatibility with the older Perl code that returned the lowest sorted order item in a tie.
For other datatypes the C code returns the same value often enough to avoid churn in the expect tests.