1
0
mirror of https://github.com/pgbackrest/pgbackrest.git synced 2024-12-14 10:13:05 +02:00
Commit Graph

109 Commits

Author SHA1 Message Date
David Steele
47aa765375 Add Zstandard compression support.
Zstandard is a fast lossless compression algorithm targeting real-time compression scenarios at zlib-level and better compression ratios. It's backed by a very fast entropy stage, provided by Huff0 and FSE library.

Zstandard version >= 1.0 is required, which is generally only available on newer distributions.
2020-05-04 15:25:27 -04:00
Cynthia Shang
c5241e5007 Expire WAL archive only when repo-retention-archive threshold is met.
Previously when retention-archive was set (either by the user or by default), archives prior to the archive-start of the oldest remaining full backup (after backup expiration occurred) would be expired even though the retention-archive threshold had not been met. For example, if there were 1 full backup remaining after backup expiration and the retention-archive was set to 2 and retention-archive-type=full, then archives prior to the archive-start of the remaining full backup would still be removed even though retention-archive required 2 full backups remaining before archives should be expired.

The thought was to keep the archive directory clean and since the full backup did not require prior archives, it was safe to delete them. However, this has caused problems for some users in the past (because they needed the WAL for other purposes) and with the new adhoc and time-based retention features, it was decided that the archives should remain until the threshold was met. The archives will eventually be removed and if having them causes space issues, the expire command and the retention-archive can always be run and adjusted.
2020-04-29 08:06:49 -04:00
David Steele
4a5bd002c0 Move pgBackRest::Version module to pgBackRestDoc::ProjectInfo.
The primary source for project info is now src/version.h.

The pgBackRestDoc::ProjectInfo module loads the project info from src/version.h at runtime so there is no need to update it.
2020-03-10 17:57:02 -04:00
David Steele
731b862e6f Rename BackRestDoc Perl module to pgBackRestDoc.
This is consistent with the way BackRest and BackRest test were renamed way back in 18fd2523.

More modules will be moving to pgBackRestDoc soon so renaming now reduces churn later.
2020-03-10 15:41:56 -04:00
David Steele
36d4ab9bff Move Perl modules out of lib directory.
This directory was once the home of the production Perl code but since f0ef73db this is no longer true.

Move the modules to test in most cases, except where the module is expected to be useful for the doc engine beyond the expected lifetime of the Perl test code (about a year if all goes well).

The exception is pgBackRest::Version which requires more work to migrate since it is used to track pgBackRest versions.
2020-03-10 15:12:44 -04:00
David Steele
c279a00279 Add lz4 compression support.
LZ4 compresses data faster than gzip but at a lower ratio.  This can be a good tradeoff in certain scenarios.

Note that setting compress-type=lz4 will make new backups and archive incompatible (unrestorable) with prior versions of pgBackRest.
2020-03-10 14:45:27 -04:00
David Steele
79cfd3aebf Remove LibC.
This was the interface between Perl and C introduced in 36a5349b but since f0ef73db has only been used by the Perl integration tests.  This is expensive code to maintain just for testing.

The main dependency was the interface to storage, no matter where it was located, e.g. S3.  Replace this with the new-introduced repo commands (d3c83453) that allow access to repo storage via the command line.

The other dependency was on various cfgOption* functions and CFGOPT_ constants that were convenient but not necessary.  Replace these with hard-coded strings in most places and create new constants for commonly used values.

Remove all auto-generated Perl code.  This means that the error list will no longer be maintained automatically so copy used errors to Common::Exception.pm.  This file will need to be maintained manually going forward but there is not likely to be much churn as the Perl integration tests are being retired.

Update test.pl and related code to remove LibC builds.

Ding, dong, LibC is dead.
2020-03-09 17:41:59 -04:00
David Steele
3c4f91b319 Remove Perl unit tests made obsolete in 434cd832.
These were replaced by C unit tests but not all the unit test setup code was removed in the Perl module.
2020-03-09 13:35:26 -04:00
David Steele
438b957f9c Add infrastructure for multiple compression type support.
Add compress-type option and deprecate compress option. Since the compress option is boolean it won't work with multiple compression types. Add logic to cfgLoadUpdateOption() to update compress-type if it is not set directly. The compress option should no longer be referenced outside the cfgLoadUpdateOption() function.

Add common/compress/helper module to contain interface functions that work with multiple compression types. Code outside this module should no longer call specific compression drivers, though it may be OK to reference a specific compression type using the new interface (e.g., saving backup history files in gz format).

Unit tests only test compression using the gz format because other formats may not be available in all builds. It is the job of integration tests to exercise all compression types.

Additional compression types will be added in future commits.
2020-03-06 14:41:03 -05:00
David Steele
02aa03d1a2 Remove obsolete methods in pgBackRest::Storage::Storage module.
All the methods in this module will need to be implemented via the command-line in order to get rid of LibC, so the first step is to reduce the code in the module as much as possible.

First remove storageDb() and use storageTest() instead.  Then create storageTest() using pgBackRestTest::Common::Storage which has no dependencies on LibC.  Now the only storage using the LibC interface is storageRepo().

Remove all link functions since those operations cannot be performed on a repo unless it is Posix, in which case the LibC interface is not needed.  Same for owner().

Remove pathSync() because syncs are not required in the tests.  No test data is reused after a crash.

Path create/exists functions should never be explicitly performed on a repo so remove those.  File exists can be implemented by calling info() instead.

Remove encryption detection functions which were only used by Backup/Archive::Info reconstruct() which are now obsolete.

Remove all filters except pgBackRest::Storage::Filter::CipherBlock since they are not being used.  That also means there are no filters returning results so remove all the result code.

Move hashSize() and pathAbsolute() into pgBackRest::Storage::Base where they can be shared between pgBackRest::Storage::Storage and pgBackRestTest::Common::Storage.
2020-03-06 14:10:09 -05:00
David Steele
eb4347f20b Use static checksums in mock/all integration tests.
Using static values serves as a better cross-check against the page checksum code. The downside is that these checksums may not work with some big endian systems but in that case neither will the unit tests.

We can also remove the page checksum interface from LibC which brings us one step closer to eliminating it.
2020-03-05 13:56:20 -05:00
David Steele
4ab8943ca8 Use PG_PAGE_SIZE_DEFAULT constant instead of pageSize variable.
Page size is passed around a lot but in fact it can only have one value, PG_PAGE_SIZE_DEFAULT, which is checked when pg_control is loaded. There may be an argument for supporting multiple page sizes in the future but for now just use the constant to simplify the code.

There is also a significant performance benefit.  Because pageSize was being used in pageChecksumBlock() the main loop was neither unrolled nor vectorized (-funroll-loops -ftree-vectorize) as it is now with a constant loop boundary.
2020-03-05 09:14:27 -05:00
David Steele
91f321fb86 Rename old page*() functions to conform to new conventions.
The general convention now is to prefix PostgreSQL functions with "pg".
2020-03-04 14:24:40 -05:00
David Steele
620386f034 Remove integration tests that are now covered in the unit tests.
Most of these tests are just checking that errors are thrown when required.  These are well covered in various unit tests.

The "cannot resume" tests are also well covered in the backup unit tests.

Finally, config warnings are well covered in the config unit tests.

There is more to be done here, but this accounts for the low-hanging fruit.
2019-12-17 20:14:45 -05:00
David Steele
977ec2e307 Integration test improvements for disk and memory efficiency.
Set log-level-file=off when more that one test will run.  In this case is it impossible to see the logs anyway since they will be automatically cleaned up after the test.  This improves performance pretty dramatically since trace-level logging is expensive.  If a singe integration test is run then log-level-file is trace by default but can be changed with the --log-level-test-file option.

Reduce buffer-size to 64k to save memory during testing and allow more processes to run in parallel.

Update log replacement rules so that these options can change without affecting expect logs.
2019-12-17 15:23:07 -05:00
David Steele
f0ef73db70 pgBackRest is now pure C.
Remove embedded Perl from the distributed binary.  This includes code, configure, Makefile, and packages.  The distributed binary is now pure C.

Remove storagePathEnforceSet() from the C Storage object which allowed Perl to write outside of the storage base directory.  Update mock/all and real/all integration tests to use storageLocal() where they were violating this rule.

Remove "c" option that allowed the remote to tell if it was being called from C or Perl.

Code to convert options to JSON for passing to Perl (perl/config.c) has been moved to LibC since it is still required for Perl integration tests.

Update build and installation instructions in the user guide.

Remove all Perl unit tests.

Remove obsolete Perl code.  In particular this included all the Perl protocol code which required modifications to the Perl storage, manifest, and db objects that are still required for integration testing but only run locally.  Any remaining Perl code is required for testing, documentation, or code generation.

Rename perlReq to binReq in define.yaml to indicate that the binary is required for a test.  This had been the actual meaning for quite some time but the key was never renamed.
2019-12-13 17:55:41 -05:00
David Steele
1f2ce45e6b The backup command is implemented entirely in C.
For the most part this is a direct migration of the Perl code into C except as noted below.

A backup can now be initiated from a linked directory.  The link will not be stored in the manifest or recreated on restore.  If a link or directory does not already exist in the restore location then a directory will be created.

The logic for creating backup labels has been improved and it should no longer be possible to get a backup label earlier than the latest backup even with timezone changes or clock skew.  This has never been an issue in the field that we know of, but we found it in testing.

For online backups all times are fetched from the PostgreSQL primary host (before only copy start was).  This doesn't affect backup integrity but it does prevent clock skew between hosts affecting backup duration reporting.

Archive copy now works as expected when the archive and backup have different compression settings, i.e. when one is compressed and the other is not.  This was a long-standing bug in the Perl code.

Resume will now work even if hardlink settings have been changed.

Reviewed by Cynthia Shang.
2019-12-13 17:14:26 -05:00
David Steele
d0ba8ff58c Remove test point infrastructure.
82df7e6f and 9856fef5 updated tests that used test points in preparation for the feature not being available in the C code.

Since tests points are no longer used remove the infrastructure.

Also remove one stray --test option in mock/all that was essentially a noop but no longer works now that the option has been removed.
2019-12-10 13:16:47 -05:00
David Steele
e632c60525 Fix backup labels in mock/all resume integration tests.
These were not getting updated to match the directory name when the manifests were copied.

The Perl code didn't care but the C code expects labels to be set correctly.
2019-12-06 11:48:41 -05:00
David Steele
8dfe0e48e2 Use more general error code when tablespace linked into PGDATA.
The specific error code was not that useful since we also test the error message which contains details of the link error.
2019-12-02 10:49:25 -05:00
David Steele
fc291b6f28 Reduce the scope of mock/all exclusion tests.
Run exclusions only on the tests where they will have an effect to reduce churn in the expect logs when they change.
2019-12-01 17:47:47 -05:00
David Steele
686b6f91da Set archive-check option in manifest correctly when offline.
Archive check does not run when in offline backup mode but the option was set to true in the manifest.  It's harmless since these options are informational only but it could cause confusion when debugging.
2019-11-28 08:27:21 -05:00
David Steele
8800f32ad9 Remove exclusions once they have been tested in mock/all.
The exclusions no longer have any effect after a restore and just add noise to the expect log.
2019-11-25 08:35:26 -05:00
David Steele
9856fef586 Update integration tests in mock/all that use test points.
Test points will not be available in the C code so update these tests as best as possible without using them.

This represents a loss of coverage for the Perl code (soon to be removed) which will be made up in the C code with unit tests.
2019-11-25 07:48:52 -05:00
David Steele
3cd45a7411 Remove start/stop --force integration tests in mock/all.
These tests require test points which are not being implemented in the C code.

This functionality is fully tested in the command/control unit tests so integration tests are no longer required.
2019-11-25 07:45:58 -05:00
David Steele
01aefc563d Update Perl page checksum expression.
This expression determines which files contain page checksums but it was also including the directory above the relation directories.  In a real PostgreSQL installation this not a problem because these directories don't contain any files.

However, our tests place a file in `base` which the Perl code thought should have page checksums while the new C code says no.

Update the expression to document the change and avoid churn in the expect logs later.
2019-11-25 07:37:09 -05:00
David Steele
c524ec4f95 Remove obsolete integration tests from mock/all.
The protocol timeout tests have been superceded by unit tests.

The TEST_BACKUP_RESUME test point was incorrectly included into a number of tests, probably a copy pasto.  It didn't hurt anything but it did add 200ms to each test where it appeared.

Catalog and control version tests were redundant.  The database version and system id tests covered the important code paths and the C code gets these values from a lookup table.

Finally, fix an incomplete update to the backup.info file while munging for tests.
2019-11-21 16:06:27 -05:00
David Steele
8b682b75d2 Allow mock integration tests for all VM types.
Previously the mock integration tests would be skipped for VMs other than the standard four used in CI.  Now VMs outside the standard four will run the same tests as VM4 (currently U18).
2019-11-02 10:35:48 +01:00
David Steele
fa6a54bb45 Update last tests that required sudo.
All tests should now run in a sudo-less environment.
2019-10-16 17:05:24 +02:00
David Steele
11c7c8fabb Remove pgbackrest test user.
This user was created before we tested in containers to ensure isolation between the pg and repo hosts which were then just directories.  The downside is that this resulted in a lot of sudos to set the pgbackrest user and to remove files which did not belong to the main test user.

Containers provide isolation without needing separate users so we can now safely remove the pgbackrest user.  This allows us to remove most sudos, except where they are explicitly needed in tests.

While we're at it, remove the code that installed the Perl C library (which also required sudo) and simply add the build path to @INC instead.
2019-10-12 09:45:18 -04:00
Cynthia Shang
2972580566 Remove info expect tests from mock/all and mock/stanza.
These tests are redundant now that we have full coverage in the unit tests are are not worth maintaining anymore.
2019-10-11 12:38:03 -04:00
David Steele
528f4c4347 Remove dependency on aws cli for testing.
This tool was only being used it a few places but was a pretty large dependency.

Rework the forceStorageMove() code using our storage layer and replace one aws cli cp with a storage put.

Also, remove the Dockerfile that was once used to build the Scality S3 test container.
2019-10-09 14:38:24 -04:00
David Steele
451ae397be The restore command is implemented entirely in C.
For the most part this is a direct migration of the Perl code into C.

There is one important behavioral change with regard to how file permissions are handled.  The Perl code tried to set ownership as it was in the manifest even when running as an unprivileged user.  This usually just led to errors and frustration.

The C code works like this:

If a restore is run as a non-root user (the typical scenario) then all files restored will belong to the user/group executing pgBackRest. If existing files are not owned by the executing user/group then an error will result if the ownership cannot be updated to the executing user/group. In that case the file ownership will need to be updated by a privileged user before the restore can be retried.

If a restore is run as the root user then pgBackRest will attempt to recreate the ownership recorded in the manifest when the backup was made. Only user/group names are stored in the manifest so the same names must exist on the restore host for this to work. If the user/group name cannot be found locally then the user/group of the PostgreSQL data directory will be used and finally root if the data directory user/group cannot be mapped to a name.

Reviewed by Cynthia Shang.
2019-09-26 07:52:02 -04:00
David Steele
3a28b68b8b Disable S3 and encryption on u18 integration tests for mock/all/1.
This test is commonly used for sanity checking but the combination of S3 and encryption makes it hard to use and encourages temporary changes to make it usable.

Acknowledge this and disable S3 and encryption for this test and move them to mock/all/2.
2019-09-02 19:06:12 -04:00
Josh Soref
c2771e5469 Fix comment typos.
This includes some variable names in tests which don't seem important enough for their own commits.

Contributed by Josh Soref.
2019-08-26 12:05:36 -04:00
David Steele
01c2669b97 Fix exclusions for special files.
Prior to 2.16 the Perl manifest code would skip any file that began with a dot.  This was not intentional but it allowed PostgreSQL socket files to be located in the data directory.  The new C code in 2.16 did not have this unintentional exclusion so socket files in the data directory caused errors.

Worse, the file type error was being thrown before the exclusion check so there was really no way around the issue except to move the socket files out of the data directory.

Special file types (e.g. socket, pipe) will now be automatically skipped and a warning logged to notify the user of the exclusion.  The warning can be suppressed with an explicit --exclude.

Reported by CluelessTechnologist, Janis Puris, Rachid Broum.
2019-08-23 07:47:54 -04:00
Cynthia Shang
c733319063 The stanza-create/update/delete commands are implemented entirely in C.
Contributed by Cynthia Shang.
2019-08-21 16:26:28 -04:00
David Steele
3df075bf40 Fix test writing "null" into manifest files.
"null" is not allowed in the manifest format (null values should be missing instead) but Perl was treating the invalid values written by this test as if they were missing.

Update the test code to remove the values rather than setting them to "null".
2019-08-18 15:29:18 -04:00
David Steele
e10577d0b0 Fix incorrect offline upper bound for ignoring page checksum errors.
For offline backups the upper bound was being set to 0x0000FFFF0000FFFF rather than UINT64_MAX.  This meant that page checksum errors might be ignored for databases with a lot of past WAL in offline mode.

Online mode is not affected since the upper bound is retrieved from pg_start_backup().
2019-07-11 09:13:56 -04:00
David Steele
1708f1d151 Use minio for integration testing.
ScalityS3 has not received any maintenance in years and is slow to start which is bad for testing.  Replace it with minio which starts quickly and ships as a single executable or a tiny container.

Minio has stricter limits on allowable characters but should still provide enough coverage to show that our encoding is working correctly.

This commit also includes the upgrade to openssl 1.1.1 in the Ubuntu 18.04 container.
2019-07-02 22:20:35 -04:00
David Steele
4815752ccc Add Perl interface to C storage layer.
Maintaining the storage layer/drivers in two languages is burdensome.  Since the integration tests require the Perl storage layer/drivers we'll need them even after the core code is migrated to C.  Create an interface layer so the Perl code can be removed and new storage drivers/features introduced without adding Perl equivalents.

The goal is to move the integration tests to C so this interface will eventually be removed.  That being the case, the interface was designed for maximum compatibility to ease the transition.  The result looks a bit hacky but we'll improve it as needed until it can be retired.
2019-06-26 08:24:58 -04:00
Cynthia Shang
b498188f01 Error on db history mismatch when expiring.
Amend commit 434cd832 to error when the db history in archive.info and backup.info do not match.

The Perl code would attempt to reconcile the history by matching on system id and version but we are not planning to migrate that code to C.  It's possible that there are users with mismatches but if so they should have been getting errors from info for the last six months.  It's easy enough to manually fix these files if there are any mismatches in the field.

Contributed by Cynthia Shang.
2019-06-24 11:59:44 -04:00
David Steele
434cd83285 The expire command is implemented entirely in C.
This implementation duplicates the functionality of the Perl code but does so with different logic and includes full unit tests.

Along the way at least one bug was fixed, see issue #748.

Contributed by Cynthia Shang.
2019-06-18 15:19:20 -04:00
David Steele
86482c7db9 Reduce log level for all expect tests to detail.
The C code is designed to be efficient rather than deterministic at the debug log level.  As we move more testing from integration to unit tests it makes less sense to try and maintain the expect logs at this log level.

Most of the expect logs have already been moved to detail level but mock/all still had tests at debug level.  Change the logging defaults in the config file and remove as many references to log-level-console as possible.
2019-05-22 18:23:44 -04:00
David Steele
e3fe3434b4 Rename repo-s3-verify-ssl option to repo-s3-verify-tls.
The new name is preferred because pgBackRest does not support any SSL protocol versions (they are all considered to be insecure).

The old name will continue to be accepted.
2019-05-21 10:14:41 -04:00
Cynthia Shang
19d8358cba Update mock/expire module test matrix so expect tests output.
Also add an error message to prevent regression.

Contributed by Cynthia Shang.
2019-05-16 09:53:55 -04:00
David Steele
1b48684713 The archive-push command is implemented entirely in C.
This new implementation should behave exactly like the old Perl code with the exception of updated log messages.

Remove as much of the Perl code as possible without breaking other commands.
2019-03-29 13:26:33 +00:00
David Steele
9382283586 Fix issues when a path option is / terminated.
This condition was not being properly checked for in the C code and it caused problems in the info command, at the very least.

Instead of applying a local fix, introduce a new path option type that will rigorously check the format of any incoming paths.

Reported by Marc Cousin.
2019-03-14 13:48:33 +04:00
blogh
e4e2606fce Add additional options to backup.manifest for debugging purposes.
Add the buffer-size, compress-level, compress-level-network, and process-max options to the backup:option section in backup.manifest to aid in debugging.

It may also make sense to propagate these options up to backup.info so they can be displayed in the info command, but for now this is deemed sufficient.

Contributed by blogh.
2019-03-10 11:03:52 +02:00
David Steele
d441061168 Create test matrix for mock/all to increase coverage and reduce tests.
The same test configurations are run on all four test VMs, which seems a real waste of resources.

Vary the tests per VM to increase coverage while reducing the total number of tests. Be sure to include each major feature (remote, s3, encryption) in each VM at least once.
2019-03-02 15:01:02 +02:00