1
0
mirror of https://github.com/pgbackrest/pgbackrest.git synced 2024-12-14 10:13:05 +02:00
Commit Graph

69 Commits

Author SHA1 Message Date
David Steele
01b8e2258f
Improve archive-push command fault tolerance.
3b8f0ef missed some cases that could cause archive-push to fail:

* Checking archive info.
* Checking to see if a WAL segment already exists.

These cases are now handled so archive-push can succeed on any valid repos.
2021-03-25 12:54:49 -04:00
David Steele
d1aa765a9d
Consolidate less commonly used repository storage options.
The following options are renamed as specified:

repo1-azure-ca-file -> repo1-storage-ca-file
repo1-azure-ca-path -> repo1-storage-ca-path
repo1-azure-host -> repo1-storage-host
repo1-azure-port -> repo1-storage-port
repo1-azure-verify-tls -> repo1-storage-verify-tls
repo1-s3-ca-file -> repo1-storage-ca-file
repo1-s3-ca-path -> repo1-storage-ca-path
repo1-s3-host -> repo1-storage-host
repo1-s3-port -> repo1-storage-port
repo1-s3-verify-tls -> repo1-storage-verify-tls

The old option names (e.g. repo1-s3-port) will continue to work for repo1, but repo2, etc. will require the new names.
2021-03-02 13:51:40 -05:00
David Steele
a1341b4af0 Make S3/Azure file missing error messages match Posix.
The S3 driver was missed when the constants were added and then Azure was copied from S3.
2021-02-28 17:00:41 -05:00
David Steele
bec3e20b2c Add archive-get command multi-repo support.
Repositories will be searched in order for the requested archive file.

Errors will be reported as warnings as long as a valid copy of the archive file is found.
2021-02-23 15:34:28 -05:00
Cynthia Shang
f32eb9b94e
Partial multi-repository implementation.
Multi-repository implementations for the archive-push, check, info, stanza-create, stanza-upgrade, and stanza-delete commands.

Multi-repo configuration is disabled so there should be no behavioral changes between these commands and their current single-repo implementations.

Multi-repo documentation and integration tests are still in the multi-repo development branch. All unit tests work as multi-repo since they are able to bypass the configuration restrictions.
2021-01-21 15:21:50 -05:00
David Steele
a8fb285756
Improve archive-get performance.
Check that archive files exist in the main process instead of the local process. This means that the archive.info file only needs to be loaded once per execution rather than once per file to get.

Stop looking when a file is missing or in error. PostgreSQL will never request anything past the missing file so there is no point in getting them. This also reduces "unable to find" logging in the async process.

Cache results of storageList() when looking for multiple files to reduce storage I/O.

Look for all requested archive files in the archive-id where the first file is found. They may not all be there, but this reduces the number of list calls. If subsequent files are in another archive id they will be found on the next archive-get call.
2021-01-15 10:15:52 -05:00
David Steele
22fd223fc3 Improve logging in archive-get command.
Append "asynchronously" to messages when the async process fetched the file (not in the actual async process log, though).

Add "repo1" to make it clear what archive we are talking about. This is not very useful by itself but soon we'll be able to add the archive id, which is very useful.

Add constants for messages that are used multiple times to ensure they stay consistent.
2021-01-13 10:24:47 -05:00
David Steele
96fd678662
Add job-retry and job-retry-interval options.
These options specify the number of local worker job retries and the retry interval after one immediate retry.

There is some value in allowing retries to be specified by the user but for the most part these options are for suppressing retries during testing, which can save a lot of time. The bug introduced in d1d25c7 and fixed in 8b86d5e also suggests it is better not to use retries in tests.

Remove the default delayed retries for archive-get/archive-push, leaving only the immediate retry. These commands are retried by PostgreSQL so it doesn't make sense to do too many retries internally.

These options are currently internal.
2021-01-11 15:15:25 -05:00
David Steele
6e7a3eb383 Remove archive-timeout from test in mock/archive.
No timeout is expected here but the small timeout prevents errors from being thrown.

This is not a bug since the error would be thrown on the next archive-get call but it does make the tests harder to debug when there is an error.

It is not clear why there was a timeout here at all. It is likely cruft from a prior test or a copy/paste error.
2021-01-05 18:11:28 -05:00
David Steele
0acfcb669e Audit options valid for start/stop commands. 2020-12-31 11:10:48 -05:00
David Steele
7fda83b31e
Allow multiple remote locks from the same main process.
Improve locking on remote processes by introducing an exec-id that is unique to the main process and passed to all remote processes. This allows the remote processes to determine if a lock is held by a remote from the same main process. If so, the lock is allowed.

The exec-id is also useful for associating remote logs with main logs for debugging purposes.
2020-11-23 12:41:54 -05:00
David Steele
959f77cd6a
Add general-purpose statistics collector.
Currently each module that needs to collect statistics implements custom code to do so. This is cumbersome.

Create a general purpose module for collecting and reporting statistics. Statistics are output in the log at detail level, but there are other uses they could be put to eventually.

No new functionality is added. This is just a drop-in replacement for the current statistics, with the advantage of being more flexible.

The new stats are slower because they involve a list lookup, but performance testing shows stats can be updated at about 40,000/ms which seems fast enough for our purposes.
2020-08-20 14:04:26 -04:00
David Steele
0680cfc8dc Rename most instances of master to primary in tests.
This aligns better with general PostgreSQL usage and our own documentation (updated in 4bcef702).

Usage in the backup.manifest tests has not been updated since it might break the file format.
2020-06-16 14:06:38 -04:00
David Steele
b5dd14e6f3 Make storage type more generic in the integration tests.
Rather than bS3 use strStorage which can indicate more than two storage types.

For the moment there are still only two storage types but this change is required before more can be added.
2020-05-12 18:55:20 -04:00
David Steele
47aa765375 Add Zstandard compression support.
Zstandard is a fast lossless compression algorithm targeting real-time compression scenarios at zlib-level and better compression ratios. It's backed by a very fast entropy stage, provided by Huff0 and FSE library.

Zstandard version >= 1.0 is required, which is generally only available on newer distributions.
2020-05-04 15:25:27 -04:00
David Steele
8989118cc6 Add SocketClient object.
This functionality was embedded into TlsClient but that was starting to get unwieldy.

Add SocketClient to contain all socket-related client functionality.
2020-03-31 12:43:29 -04:00
David Steele
438b957f9c Add infrastructure for multiple compression type support.
Add compress-type option and deprecate compress option. Since the compress option is boolean it won't work with multiple compression types. Add logic to cfgLoadUpdateOption() to update compress-type if it is not set directly. The compress option should no longer be referenced outside the cfgLoadUpdateOption() function.

Add common/compress/helper module to contain interface functions that work with multiple compression types. Code outside this module should no longer call specific compression drivers, though it may be OK to reference a specific compression type using the new interface (e.g., saving backup history files in gz format).

Unit tests only test compression using the gz format because other formats may not be available in all builds. It is the job of integration tests to exercise all compression types.

Additional compression types will be added in future commits.
2020-03-06 14:41:03 -05:00
David Steele
dbf6255ab8 Remove compress/compress-level options from commands where unused.
These commands (e.g. restore, archive-get) never used the compress options but allowed them to be passed on the command line. Now they will error when these options are passed on the command line. If these errors occur then remove the unused options.
2020-02-27 12:25:32 -05:00
David Steele
977ec2e307 Integration test improvements for disk and memory efficiency.
Set log-level-file=off when more that one test will run.  In this case is it impossible to see the logs anyway since they will be automatically cleaned up after the test.  This improves performance pretty dramatically since trace-level logging is expensive.  If a singe integration test is run then log-level-file is trace by default but can be changed with the --log-level-test-file option.

Reduce buffer-size to 64k to save memory during testing and allow more processes to run in parallel.

Update log replacement rules so that these options can change without affecting expect logs.
2019-12-17 15:23:07 -05:00
David Steele
8d6a8c3bf0 Store base path for remote storage locally.
It wasn't practical for the main process to be ignorant of the remote path, and in any case knowing the path makes debugging easier.

Pull the remote path when connecting and pass the result of local storagePath() to the remote when making calls.
2019-11-16 17:12:16 -05:00
David Steele
11c7c8fabb Remove pgbackrest test user.
This user was created before we tested in containers to ensure isolation between the pg and repo hosts which were then just directories.  The downside is that this resulted in a lot of sudos to set the pgbackrest user and to remove files which did not belong to the main test user.

Containers provide isolation without needing separate users so we can now safely remove the pgbackrest user.  This allows us to remove most sudos, except where they are explicitly needed in tests.

While we're at it, remove the code that installed the Perl C library (which also required sudo) and simply add the build path to @INC instead.
2019-10-12 09:45:18 -04:00
David Steele
4d84820021 Improve performance of info file load/save.
Info files required three copies in memory to be loaded (the original string, an ini representation, and the final info object). Not only was this memory inefficient but the Ini object does sequential scans when searching for keys making large files very slow to load.

This has not been an issue since archive.info and backup.info are very small, but it becomes a big deal when loading manifests with hundreds of thousands of files.

Instead of holding copies of the data in memory, use a callback to deliver the ini data directly to the object when loading. Use a similar method for save to avoid having an intermediate copy. Save is a bit complex because sections/keys must be written in alpha order or older versions of pgBackRest will not calculate the correct checksum.

Also move the load retry logic to helper functions rather than embedding it in the Info object. This allows for more flexibility in loading and ensures that stack traces will be available when developing unit tests.

Reviewed by Cynthia Shang.
2019-09-06 13:48:28 -04:00
Cynthia Shang
c733319063 The stanza-create/update/delete commands are implemented entirely in C.
Contributed by Cynthia Shang.
2019-08-21 16:26:28 -04:00
David Steele
4815752ccc Add Perl interface to C storage layer.
Maintaining the storage layer/drivers in two languages is burdensome.  Since the integration tests require the Perl storage layer/drivers we'll need them even after the core code is migrated to C.  Create an interface layer so the Perl code can be removed and new storage drivers/features introduced without adding Perl equivalents.

The goal is to move the integration tests to C so this interface will eventually be removed.  That being the case, the interface was designed for maximum compatibility to ease the transition.  The result looks a bit hacky but we'll improve it as needed until it can be retired.
2019-06-26 08:24:58 -04:00
David Steele
86482c7db9 Reduce log level for all expect tests to detail.
The C code is designed to be efficient rather than deterministic at the debug log level.  As we move more testing from integration to unit tests it makes less sense to try and maintain the expect logs at this log level.

Most of the expect logs have already been moved to detail level but mock/all still had tests at debug level.  Change the logging defaults in the config file and remove as many references to log-level-console as possible.
2019-05-22 18:23:44 -04:00
David Steele
e3fe3434b4 Rename repo-s3-verify-ssl option to repo-s3-verify-tls.
The new name is preferred because pgBackRest does not support any SSL protocol versions (they are all considered to be insecure).

The old name will continue to be accepted.
2019-05-21 10:14:41 -04:00
David Steele
1b48684713 The archive-push command is implemented entirely in C.
This new implementation should behave exactly like the old Perl code with the exception of updated log messages.

Remove as much of the Perl code as possible without breaking other commands.
2019-03-29 13:26:33 +00:00
David Steele
0913523096 Cleanup local/remote protocol interaction from 9367cc46.
The command option was not being set correctly when a remote was started from a local.  It was being set as 'local' rather than the command that the local was running as.

Also automatically select the remote protocol id based on whether it is started from a local (use the local protocol id) or from the main process (use 0).

These were not live issues but could cause strange behaviors as new features are added that might be hard to diagnose.
2019-02-28 09:51:19 +02:00
David Steele
d489eb87f7 Create test matrix for mock/archive to increase coverage and reduce tests.
The same test configurations are run on all four test VMs, which seems a real waste of resources.

Vary the tests per VM to increase coverage while reducing the total number of tests.  Be sure to include each major feature (remote, s3, encryption) in each VM at least once.
2019-02-23 15:59:39 +02:00
David Steele
59d7958914 Reduce expect log level in mock/archive tests.
The expect tests were originally a rough-and-ready type of unit test so monitoring changes in the expect log helped us detect changes in behavior.

Now the archive code is heavily unit-tested so the detailed logs mainly cause churn and don't have any measurable benefit.

Reduce the log level to DETAIL to make the logs less verbose and volatile, yet still check user-facing log messages.
2019-02-23 15:05:06 +02:00
David Steele
b1eb8af7d5 Resolve storage path expressions before passing to remote.
Expressions such as <REPO:ARCHIVE> require a stanza name in order to be resolved correctly.  However, if the stanza name is passed to the remote then that remote will only work correctly for that one stanza.

Instead, resolved the expressions locally but still pass a relative path to the remote.  That way, a storage path that is only configured on the remote does not need to be known locally.
2019-02-21 15:40:21 +02:00
David Steele
b0b5989aca Migrate remote archive-get command to C.
All required protocol commands are implemented so this is mostly a matter of enabling the feature and updating expect logs.
2019-02-20 22:57:18 +02:00
David Steele
73be64ce49 Add separate archive-get-async command.
This command was previously forked off from the archive-get command which required a bit of artificial option and log manipulation.

A separate command is easier to test and will work on platforms that don't have fork(), e.g. Windows.
2019-02-20 15:52:07 +02:00
David Steele
057e2e2782 Add unimplemented S3 driver method required for archive-get.
This was not being caught because the integration tests for S3 were running remotely and going through the Perl code rather than the new C code.

Implement the exists method for the S3 driver and add tests to prevent a regression.

Reported by mibiio.
2019-02-09 18:57:30 +02:00
David Steele
7c2fcb63e4 Enable encryption for archive-get command in C.
The decryption filter was added in archiveGetFile() and archiveGetCheck() was modified to return the WAL decryption key stored in archive.info.  The rest was plumbing.

The mock/archive/1 integration test added encryption to provide coverage for the new code paths while mock/archive/2 dropped encryption to provide coverage for the existing code paths. This caused some churn in the expect logs but there was no change in behavior.
2018-11-28 14:56:26 -05:00
Cynthia Shang
b6b2c915b2 Allow hashSize() to run on remote storage.
Apparently we never needed to run this function remotely.

It will be needed by the backup checksum delta feature, so implement it now.

Contributed by Cynthia Shang.
2018-09-18 11:39:48 -04:00
David Steele
b5f749b21c Add CIFS driver to storage helper for read-only repositories.
For read-only repositories the Posix and CIFS drivers behave exactly the same.  Since that's all we support in C right now it's valid to treat them as the same thing.  An assertion has been added to remind us to add the CIFS driver before allowing the repository to be writable.

Mostly we want to make sure that the C code does not blow up when the repository type is CIFS.
2018-09-16 18:41:30 -04:00
David Steele
9e574a37dc Make archive-get info messages consistent between C and Perl implementations.
The info messages were spread around and logged differently based on the execution path and in some cases logged nothing at all.

Temporarily track the async server status with a flag so that info messages are not output in the async process.  The async process will be refactored as a separate command to be exec'd in a future commit.
2018-09-11 12:30:48 -04:00
David Steele
de1b74da0c Move encryption in mock/archive tests to remote tests.
The new archive-get C code can't run (yet) when encryption is enabled.  Therefore move the encryption tests so we can test the new C code.  We'll move it back when encryption is enabled in C.

Also, push one WAL segment with compression to test decompression in the C code.
2018-09-06 09:35:34 -07:00
David Steele
6361a06181 Fix incorrectly reported error return in info logging.
A return code of 1 from the archive-get was being logged as an error message at info level but otherwise worked correctly.

Also improve info messages when an archive segment is or is not found.
2018-09-04 21:46:41 -04:00
David Steele
d41570c37a Improve log file names for remote processes started by locals.
The log-subprocess feature added in 22765670 failed to take into account the naming for remote processes spawned by local processes.  Not only was the local command used for the naming of log files but the process id was not pass through.  This meant every remote log was named "[stanza]-local-remote-000" which is confusing and meant multiple processes were writing to the same log.

Instead, pass the real command and process id to the remote.  This required a minor change in locking to ignore locks if process id is greater than 0 since remotes started by locals never lock.
2018-08-31 11:31:13 -04:00
David Steele
0ed37ab9e7 Update Archive::Info->archiveIdList() to return a valid error code instead of unknown. 2018-08-24 12:13:10 -04:00
David Steele
2276567027 Add log-subprocess option to allow file logging for local and remote subprocesses. 2018-08-22 20:05:49 -04:00
David Steele
52bc073234 Add stack trace macros to all functions.
Low-level functions only include stack trace in test builds while higher-level functions ship with stack trace built-in. Stack traces include all parameters passed to the function but production builds only create the parameter list when the log level is set high enough, i.e. debug or trace depending on the function.
2018-05-18 11:57:32 -04:00
David Steele
91be372e6a Set log-timestamp=n for integration tests.
This means less filtering of logs needs to be done and new timestamps can be added without adding new filters.
2018-05-11 11:24:38 -04:00
David Steele
54dd6f3ed4 Add asynchronous, parallel archive-get.
This feature maintains a queue of WAL segments to help reduce latency when PostgreSQL requests a WAL segment with restore_command.
2018-04-30 17:27:39 -04:00
David Steele
89d3476e32 Refactor archive common functions in preparation for parallel async archive-get. 2018-04-29 10:16:59 -04:00
David Steele
0381945caa Show command parameters as well as command options in initial info message. 2018-04-17 18:47:14 -04:00
David Steele
f0250dab4b Move async forking and more error handling to C.
The Perl process was exiting directly when called but that interfered with proper locking for the forked async process. Now Perl returns results to the C process which handles all errors, including signals.
2018-04-12 20:42:26 -04:00
David Steele
6fd0c3dcaa Improved lock implementation written in C.
Now only two types of locks can be taken: archive and backup. Most commands use one or the other but the stanza-* commands acquire both locks. This provides better protection than the old command-based locking scheme.
2018-04-11 09:36:12 -04:00