With a single repository configured, the restore command defaults to selecting the latest backup. With multiple repositories configured, it now defaults to selecting the latest backup from the first repository where backups exist, checking the repositories in the order they appear in pgbackrest.conf.
To select from a specific repository, the --repo option can be passed (e.g. --repo=1). The --set option can be passed if a backup other than the latest is desired.
Repositories will be searched in order for the requested archive file.
Errors will be reported as warnings as long as a valid copy of the archive file is found.
Errors are logged to the log file rather than thrown. If, after processing all repos, one or more errors occurred, then a single error is thrown to indicate that there were errors and that the log file should be inspected.
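As an illustrative sketch of this pattern (the helper names below are hypothetical, not the actual functions):

    #include <stdbool.h>
    #include <stdio.h>

    bool repoArchiveGet(unsigned int repoIdx);      // hypothetical per-repo fetch

    // Search the repos in order. A failure in one repo is logged and the
    // search continues; a single error is raised at the end only if the
    // file was never found and at least one error occurred.
    bool
    archiveGetAnyRepo(unsigned int repoTotal)
    {
        bool errorLogged = false;

        for (unsigned int repoIdx = 0; repoIdx < repoTotal; repoIdx++)
        {
            if (repoArchiveGet(repoIdx))
                return true;                        // found, so prior errors remain warnings

            fprintf(stderr, "WARN: repo%u: unable to get archive file\n", repoIdx + 1);
            errorLogged = true;
        }

        if (errorLogged)
            fprintf(stderr, "ERROR: errors occurred while getting archive file, check the log file\n");

        return false;
    }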
Also update log messages to be more consistent with new patterns.
These constructors wrap encodeToStr() and decodeToBin(), making them convenient and safe by eliminating the need to create intermediate buffers. Encoding/decoding is performed directly into the target String/Buffer. Sizing of the destination buffer is handled by the new functions so it doesn't have to be done at each call site.
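A minimal sketch of the pattern (names and signatures simplified; strNewEncodeBase64() is hypothetical and the real encodeToStr() also takes an encoding type):

    #include <stdlib.h>

    // assumed, simplified signature of the wrapped low-level encoder
    void encodeToStr(const unsigned char *source, size_t sourceSize, char *destination);

    // Constructor-style wrapper: the destination is sized here, once, and
    // encoding happens directly into the new allocation, so callers no
    // longer create or size intermediate buffers.
    char *
    strNewEncodeBase64(const unsigned char *source, size_t sourceSize)
    {
        size_t destSize = ((sourceSize + 2) / 3) * 4;   // exact base64 output size
        char *result = malloc(destSize + 1);            // +1 for the zero terminator

        encodeToStr(source, sourceSize, result);
        return result;
    }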
If the second letter is a capital or a digit then the word is likely an acronym, so don't lower-case the first letter.
For now only the digit case is checked since there are no summaries with a capital as the second letter.
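A sketch of the heuristic (the function name is hypothetical):

    #include <ctype.h>

    // Lower-case the first letter of a summary unless the second character
    // suggests an acronym. Only the digit case (e.g. "S3 support added")
    // is handled for now since no summary has a capital as its second letter.
    void
    summaryLowerFirst(char *summary)
    {
        if (summary[0] != '\0' && !isdigit((unsigned char)summary[1]))
            summary[0] = (char)tolower((unsigned char)summary[0]);
    }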
The destination buffer on the stack was not large enough to contain the zero-terminating character.
Increase the buffer size and add an assertion to prevent regressions.
Found on arm64 running musl libc. Other architectures and glibc do not seem to be affected though it is clearly a bug.
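An illustrative sketch of the fix, not the actual code:

    #include <assert.h>
    #include <stdio.h>

    // Size the stack buffer to include the zero terminator and assert
    // that no truncation occurred, preventing regressions.
    void
    nameFormat(const char *source)
    {
        char buffer[64 + 1];                        // +1 for the zero terminator

        int written = snprintf(buffer, sizeof(buffer), "%s", source);
        assert(written >= 0 && (size_t)written < sizeof(buffer));
    }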
The expire command has been enhanced to expire backups and archives from all configured repositories by default.
In addition, it will accept the --repo option to expire backups and archives from only the specified repository. When --repo is provided, the --set option is likewise restricted to the specified repository. If --set is provided without --repo, then all repositories will be searched for the backup set and retention settings will be applied on each repository whether or not the set is found there.
In preparation for multi-repo support, a repo tag is added in this commit to the expire command log and error messages. This change also affects the expect logs and the user-guide. The format of the tag is "repoX:" where X is the repo key used in the configuration.
Until multi-repo support has been completed, this tag will always be "repo1:".
The original intention was to enclose complex code in braces but somehow braces got propagated almost everywhere.
Document the standard for braces in switch statements and update the code to reflect the standard.
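A sketch of the kind of standard documented (the helper names are hypothetical):

    typedef enum {typeSimple, typeComplex} Type;

    void doSimple(void);            // hypothetical
    int computeTotal(void);         // hypothetical
    void consume(int total);        // hypothetical

    void
    example(Type type)
    {
        switch (type)
        {
            // simple case: no braces needed
            case typeSimple:
                doSimple();
                break;

            // complex case: braces create a scope for local variables
            case typeComplex:
            {
                int total = computeTotal();
                consume(total);
                break;
            }
        }
    }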
This is phase 2 of verify command development (phase 1 was processing the archives and phase 3 will be reconciling the archives and backups). In this phase the backups are verified by verifying each file listed in the manifest for the backup and creating a result set with the list of invalid files, if any. A summary is then rendered.
Unit tests have been added and duplicate tests have been removed.
The info command provides total sizes for the files in the backup, both as they exist in the database and as they are stored in the repository. The text output and associated user documentation have been updated to make it clearer which sizes are being displayed.
In addition, the info command is updated to allow the user to optionally specify the repository when requesting a specific backup set. When the repo option is not specified, the text output reflects the stanza status, the cipher types, and the archive min/max across all the repositories rather than a single repository.
This is more accurate since we don't really want lf/cr characters anyway, though the lines have already been split at this point, so an lf is not actually possible in this code.
Found on macOS M1. FreeBSD also seems to be fine with the new expression.
Multi-repository implementations for the archive-push, check, info, stanza-create, stanza-upgrade, and stanza-delete commands.
Multi-repo configuration is disabled so there should be no behavioral changes between these commands and their current single-repo implementations.
Multi-repo documentation and integration tests are still in the multi-repo development branch. All unit tests work as multi-repo since they are able to bypass the configuration restrictions.
Check that archive files exist in the main process instead of the local process. This means that the archive.info file only needs to be loaded once per execution rather than once for each file fetched.
Stop looking when a file is missing or in error. PostgreSQL will never request anything past the missing file so there is no point in getting them. This also reduces "unable to find" logging in the async process.
Cache results of storageList() when looking for multiple files to reduce storage I/O.
Look for all requested archive files in the archive-id where the first file is found. They may not all be there, but this reduces the number of list calls. If subsequent files are in another archive id they will be found on the next archive-get call.
Append "asynchronously" to messages when the async process fetched the file (not in the actual async process log, though).
Add "repo1" to make it clear what archive we are talking about. This is not very useful by itself but soon we'll be able to add the archive id, which is very useful.
Add constants for messages that are used multiple times to ensure they stay consistent.
The FUNCTION_LOG_RETURN() macro requires logging macros (e.g. FUNCTION_LOG_*_TYPE and FUNCTION_LOG_*_FORMAT) when returning a struct but these macros don't deliver much value since they only output the name of the struct rather than the contents. A copy of the struct is also made during this operation, which is wasteful.
FUNCTION_LOG_RETURN_STRUCT() does not make a copy of the struct and does not require any logging macros. Returned structures are logged as "struct" but this could be made more accurate using __typeof in the future.
Structures as parameters are not addressed here and work as before, i.e. they require logging macros.
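A usage sketch (not compilable on its own; MY_RESULT stands in for a real struct type):

    // before: type-specific logging macros were required and a copy was made
    FUNCTION_LOG_RETURN(MY_RESULT, result);

    // after: no logging macros are needed, no copy is made, and the value
    // is logged simply as "struct"
    FUNCTION_LOG_RETURN_STRUCT(result);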
Missing files would indicate that another process is running on the same spool path, which would be a very bad thing.
This check doesn't cost any additional I/O so it seems like a good idea.
If files other than backup.manifest.copy were left in a backup path by a prior resume then the next resume would skip the backup rather than removing it. Since the backup path still existed, it would be found during backup label generation and cause an error if it appeared to be later than the new backup label. This occurred if the skipped backup was full.
The error was only likely on object stores such as S3 because of the order of file deletion. Posix file systems delete from the bottom up because directories containing files cannot be deleted. Object stores do not have directories so files are deleted in whatever order they are provided by the list command. However, the issue can be reproduced on a Posix file system by manually deleting backup.manifest.copy from a resumable backup path.
Fix the issue by removing the resumable backup if it has no manifest files. Also add a new warning message for this condition.
Note that this issue could be resolved by running expire or a new full backup.
These options specify the number of local worker job retries and the retry interval after one immediate retry.
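A minimal sketch of the retry behavior these options control (the function is hypothetical):

    #include <stdbool.h>
    #include <unistd.h>

    bool jobRun(void);              // hypothetical worker job

    // One immediate retry, then retries spaced by the configured interval.
    bool
    jobRunWithRetry(unsigned int retryTotal, unsigned int retryIntervalSec)
    {
        for (unsigned int tryIdx = 0; tryIdx <= retryTotal; tryIdx++)
        {
            if (tryIdx >= 2)                // only the first retry is immediate
                sleep(retryIntervalSec);

            if (jobRun())
                return true;
        }

        return false;
    }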
There is some value in allowing retries to be specified by the user but for the most part these options are for suppressing retries during testing, which can save a lot of time. The bug introduced in d1d25c7 and fixed in 8b86d5e also suggests it is better not to use retries in tests.
Remove the default delayed retries for archive-get/archive-push, leaving only the immediate retry. These commands are retried by PostgreSQL so it doesn't make sense to do too many retries internally.
These options are currently internal.
This call was removed by d1d25c71, which worked for archivePushProtocol() and verifyProtocol() since the encryption options are passed from the main process.
archiveGetProtocol() still retrieves these options in the local process so the repo storage must be loaded first.
This option was added in advance of the multi-repo functionality but it has no purpose and it is not clear what the validity rules should be.
The option will be added back when multi-repo functionality is committed.
There was an inconsistency in the JSON output for the case when a stanza is requested but does not exist in the repo. This was the only case where the archive array was not added to the JSON. Adding it will simplify the upcoming multi-repo support code.
Also, a redundant test was removed rather than updating it for this case.
Validity by command was not granular enough, so numerous options needed to be marked internal so users would not stumble across them. Options were also needlessly being passed to roles that had no use for them.
Introduce per-role validity lists that depend on what roles are valid per command. Also add a check to ensure that only valid roles are used with a command.
This commit adds the functionality but does not introduce any new behavior, i.e. all options are valid for all roles that the command is valid for. A subsequent commit will introduce the new role restrictions to make the changes easier to audit.
Data required for parsing was spread between the config and defined modules, mostly for historical reasons because the same data was used by Perl.
Requiring all the parse rules to be accessed with function interfaces makes the code more complicated and new rules harder to implement.
Instead, move the data to the parse module so in the most complex cases no interface functions are needed. This reduces the total amount of code and paves the way for more complex parse rules.
The help data can be represented more compactly in a pack and this separates data needed for help from data needed for parsing, freeing each to have a more appropriate representation.
The C code does not use doubles to represent seconds as the Perl code did, so time can be represented as an integer, which reduces the number of data types that config has to understand.
Also remove Variant doubles since they are no longer used.
Note that not all double code was removed since we still need to display times to the user in seconds and it is possible for the times to be fractional. In the future this will likely be simplified by storing the original user input and using that value when the time needs to be displayed.
Inaccuracies in sleep time or clock skew might make a single sleep insufficient to reach the next second.
Add a few retries to make the process more reliable but still avoid an infinite loop if something is seriously wrong.
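A minimal sketch of the bounded retry (the real code sleeps the computed remainder of the second; the function name is hypothetical):

    #include <stdbool.h>
    #include <time.h>
    #include <unistd.h>

    // Retry the sleep a few times instead of assuming a single sleep
    // reaches the next second, but stay bounded so a broken clock
    // cannot cause an infinite loop.
    bool
    waitForNextSecond(time_t start)
    {
        for (unsigned int retry = 0; retry < 3; retry++)
        {
            if (time(NULL) > start)
                return true;

            sleep(1);       // may fall short due to inaccuracy or clock skew
        }

        return time(NULL) > start;
    }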
These calls are not required since cipher info is passed explicitly. They are probably a copy-pasto from some past time when one of these functions required it.
Refactor the code to allow a dynamic number of indexes for indexed options, e.g. pg-path. Our reliance on getopt_long() still limits the number of indexes we can have per group, but once this limitation is removed the rest of the code should be happy with dynamic numbers of indexes (with a reasonable maximum).
Add an option to set a default in each group. This was previously handled by the host-id option but now there is a specific option for each group, pg and repo. These remain internal until they can be fully tested with multi-repo support. They are fully tested for internal usage.
Remove the ConfigDefineOption enum and use the ConfigOption enum instead. They are now equal since the indexed options (e.g. cfgOptRepoHost2) have been removed from ConfigOption.
Remove the config/config test module and add required tests to the config/parse test module. Parsing is now the only way to load a config so this removes some redundancy.
Split new internal config structures and functions into a new header file, config.intern.h. More functions will need to be moved over from config.h but that will need to be done in a future commit to reduce churn.
Add repoIdx to repoIsLocal() and storageRepo*(). Multi-repository support requires that repo locality and storage be accessible by index. This allows, for example, multiple repos to be iterated in a loop. This could be done in a separate commit but doesn't seem worth it since the code is related.
Remove the type parameter from storageRepoGet(). This parameter existed solely to provide coverage for the case where the storage type was invalid. A better pattern is to check that the type is S3 once all other types have been ruled out.
Improve locking on remote processes by introducing an exec-id that is unique to the main process and passed to all remote processes. This allows the remote processes to determine if a lock is held by a remote from the same main process. If so, the lock is allowed.
The exec-id is also useful for associating remote logs with main logs for debugging purposes.
When restore type standby is provided, the recovery.signal file isn't needed and may lead to some confusion (see #1236).
In recent PostgreSQL versions, pg_basebackup --write-recovery-conf creates only the standby.signal file, so this change aligns with that behaviour.
The result structure for the archive id being processed only needs to be retrieved once so moving it outside of the WAL path list processing loop is more efficient.
This call to storageRepo() was used to fetch cipher options from a remote to determine if a repo cipher was enabled.
Now the main process does this work and passes the cipher options directly to the local so there is no need to pre-load the repo storage here.
If the push queue limit has been exceeded then nothing will be pushed to the repo so there is no point in checking it. Worse, a failure in the check would cause drop not to run and potentially fill up the disk, exactly the case this feature was designed to prevent.
The async version already checks the push queue limit before checking the repository so now both versions have the same behavior.
These warnings were only being reported to PostgreSQL on the console. Now they are also recorded in the async log, increasing the chance that they will be seen.
This also improves coverage by requiring a warning during async processing to have a test case, which has been added.
Checking the default here was fragile. If the default were to change the code would break.
This also removes the only dependency on cfgOptionDefault() outside of the help command.
Return a path missing error when a stanza is specified for the info command but the stanza does not exist in the repository.
Previously [] was returned, which is still the case if no stanza is specified and the repository does not exist.
There were a number of places in the code where "hostId" was used, but hostId is just the option group index + 1 so this led to a lot of +1 and -1 to convert the id to an index and vice versa.
Instead just use the zero based index wherever possible. This is pretty much everywhere except when the host-id option is read or set, or where a message is being formatted for the user.
Also fix a bug in protocolRemoteParam() where remotes spawned from the main process could get process ids that were not 0. Only the locals should spawn remotes with process id > 0. This seems to have been harmless since the process id is only a label, but it could be confusing when debugging.
The defines for FUNCTION_LOG_VERIFY_WAL_RANGE* are not used in verify.c and are not planned for the continuing development of the verify command, so they are dead code and have been removed.
The tests were originally written by loading values directly into the configuration before the parser was available.
Update to use harnessCfgLoadRaw() to simplify the tests and make them compatible with upcoming config changes.
Note that some unreachable conditions were removed since they could not be reached via a parsed config, only by munging values directly into the config. cfgOptionTest(optionId) was removed because a non-default value must always be set. cfgOptionValid(cfgOptLogTimestamp) was removed because it is true for all commands except for cfgCmdNone, which is checked with an assert.
cfgOptionId() did not recognize deprecated options which made the help command throw errors when they were specified on the command line. cfgParseOption() will correctly identify deprecated options.
cfgParseOption() can also be used in cfgParse() to reduce code duplication when parsing info out of the option value returned by optionFind().
Finally, encode the option key index separately in parse.auto.c. For now the two values are simply added back together, but future code will need them separated.
This has always been equivalent to the ConfigCommand enum so it just adds complexity.
It was created for symmetry with ConfigDefineOption, which will also be removed soon.
These constants don't scale well as the index total is increased for an option.
The core code rarely uses these options and they are easily replaced with cfgOptionName().
The tests had started to make use of the constants, so provide functions that build the option name from the optionId and, optionally, the optionKey.
WAL timeline history files were not being expired because they were small and generally not very plentiful.
However, in some cases large numbers of history files may be generated so it makes sense to remove useless history files to keep things tidy.
The history file for the oldest retained timeline is kept for debugging purposes even though it is not used for recovery.
Group related options together so operations (e.g. valid, test, index total) can be performed on all options in the group.
Previously, options at the top of the hierarchy of the related options were used to do these tests. This was prone to error as option relationships changed and it was not always clear which option (or options) should be used.
Scan the WAL archive for missing or invalid files and build up ranges of WAL that will be used to verify backup integrity. A number of errors and warnings are currently emitted but they should not be considered authoritative (yet).
The command is incomplete so is marked internal.
Previously, catalog versions were fixed for all versions which made maintaining the catalog versions during PostgreSQL beta and release candidate cycles very painful. A version of pgBackRest which was functionally compatible was rendered useless by a catalog version bump in PostgreSQL.
Instead use only the control version to identify a PostgreSQL version when possible. Some older versions require a catalog version to positively identify a PostgreSQL version, so include them when required.
Since the catalog number is required to work with tablespaces it will need to be stored. There's already a copy of it in backup.info so use that (even though we have been ignoring it in the C versions).
Improve the wording of the error message and add a hint to make it clearer what is wrong and how the user can fix it.
Also change the assert to a regular error since this is not an internal error.
If a stop command has been issued, the check command fails because archiving times out.
Provide a hint to document this situation and point the user in the proper direction.
When restoring a cluster that will be promoted but is not intended to be the new primary, it is important to disable archiving to avoid polluting the repository with useless WAL. This option makes disabling archiving a bit easier.
Currently each module that needs to collect statistics implements custom code to do so. This is cumbersome.
Create a general purpose module for collecting and reporting statistics. Statistics are output in the log at detail level, but there are other uses they could be put to eventually.
No new functionality is added. This is just a drop-in replacement for the current statistics, with the advantage of being more flexible.
The new stats are slower because they involve a list lookup, but performance testing shows stats can be updated at about 40,000/ms which seems fast enough for our purposes.
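A minimal sketch of the list-lookup approach (hypothetical names, not the actual module API):

    #include <stdint.h>
    #include <string.h>

    typedef struct
    {
        const char *key;
        uint64_t total;
    } Stat;

    static Stat statList[256];
    static unsigned int statListSize;

    // Each update finds the stat by a list lookup on its key, which is
    // what makes updates slower than per-module counters but still fast
    // enough for our purposes.
    static void
    statInc(const char *key)
    {
        for (unsigned int statIdx = 0; statIdx < statListSize; statIdx++)
        {
            if (strcmp(statList[statIdx].key, key) == 0)
            {
                statList[statIdx].total++;
                return;
            }
        }

        if (statListSize < sizeof(statList) / sizeof(statList[0]))
            statList[statListSize++] = (Stat){.key = key, .total = 1};
    }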
Following up on 111d33c, implement the new interfaces for socket client/session. Now HTTP objects can be used over TLS or plain sockets.
This required adding ioSessionFd() and ioSessionRole() to provide the functionality of sckSessionFd() and sckSessionType(). sckClientHost() and sckClientPort() don't make sense in a generic interface, so they were replaced with ioSessionName().
Only close the remote connection after verifying that the WAL files have been received. This is necessary if the archive_command on the PostgreSQL host is conditional, i.e. archiving only happens while a backup lock is held, to ensure all WAL segments are archived.
Move sckSessionReadyRead()/Write() into the IoRead/IoWrite interfaces. This is a more logical place for them and the alternative would be to add them to the IoSession interface, which does not seem like a good idea.
This is mostly a refactor, but a big change is the select() logic in fdRead.c has been replaced by ioReadReady(). This was duplicated code that was being used by our protocol but not TLS. Since we have not had any problems with requiring poll() in the field this seems like a good time to remove our dependence on select().
Also, IoFdWrite now requires a timeout so update where required, mostly in the tests.
Pretty much everywhere "handle" is used what is really meant is "file descriptor" (fd). This terminology was migrated over from Perl and is just not quite correct, or at least not as correct as fd.
There were also plenty of places where fd was already used, so now all uses are consistent.
The Perl code was not updated but might be in a future commit.
PostgreSQL may be using most of the available file descriptors when it executes the archive-get/archive-push commands (especially archive-get). This can lead to problems depending on how many file descriptors are needed for parallelism in the async process.
Proactively free file descriptors between 3 and 1023 to help ensure there are enough available for reasonable values of process-max, i.e. <= 300.
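A minimal sketch of the idea:

    #include <unistd.h>

    // Close any inherited descriptors above stderr so they are available
    // for the process's own use (close() on an unopened fd simply fails
    // with EBADF, which is harmless here).
    static void
    fdFreeInherited(void)
    {
        for (int fd = 3; fd < 1024; fd++)
            close(fd);
    }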
We use the Z suffix in many functions to indicate that we are expecting a zero-terminated string so make this function conform to the pattern.
As a bonus the new name is a bit shorter, which is a good quality in a commonly-used function.
The old constructor was left around to reduce code churn during the migration but it just makes the code harder to read and search.
Remove the old constructor and rename all remaining instances to lstNewP(), which by default has the same semantics.
The prior code was only able to use the main passphrase automatically and expected sub passphrases to be specified for each operation. This was fine for testing but hardly sufficient for a user-facing feature.
Update the code to determine which passphrase to use for any file in the repository and error when an invalid file or location is selected.
The repo-get command is still internal for now, but with this improvement it should be ready to be made public.
If a local command, e.g. backupFile(), fails it will stop the entire process. Instead, retry local commands to deal with transient errors.
Remove special logic in the S3 storage driver to retry RequestTimeTooSkewed errors since this is now handled by the general retry mechanism in the places where it is most likely to happen, i.e. file read/write. Also, this error should have been entirely eliminated by the asynchronous TLS implementation.
The Azure storage driver exposes secrets in the query when using SAS authorization. These secrets can show up during logging or when an error occurs.
Allow redaction of queries to prevent secrets from being exposed in logs and errors.
The restore --force option was acting like --force --delta. This caused restore to replace files based on timestamp and size rather than overwriting them, which meant some files that should have been updated were left unchanged. Normal restore and restore --delta were not affected by this issue.
Azure and Azure-compatible object stores can now be used for repository storage.
Currently only shared key authentication is supported but SAS will be added soon.
When uploading large files the upload is split into multiple parts which are assembled at the end to create the final file. Previously we waited until each part was acknowledged before starting on the processing (i.e. compression, etc.) of the next part.
Now, the request for each part is sent while processing continues and the response is read just before sending the request for the next part. This asynchronous method allows us to continue processing while the S3 server formulates a response.
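A sketch of the pipelining with hypothetical helper names (not the actual driver code):

    #include <stddef.h>

    typedef struct HttpRequest HttpRequest;

    HttpRequest *partRequestSend(const void *part, size_t partSize);    // hypothetical async send
    void partResponseCheck(HttpRequest *request);                       // hypothetical response read
    const void *partNext(size_t *partSize);                             // hypothetical: prepare/compress next part

    // The response for part N is read just before part N+1 is sent, so
    // local processing overlaps with the server's work on the prior part.
    void
    uploadParts(void)
    {
        HttpRequest *pending = NULL;
        const void *part;
        size_t partSize;

        while ((part = partNext(&partSize)) != NULL)
        {
            if (pending != NULL)
                partResponseCheck(pending);         // complete the prior part

            pending = partRequestSend(part, partSize);
        }

        if (pending != NULL)
            partResponseCheck(pending);             // the final part completes synchronously
    }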
Testing from outside AWS in a high-bandwidth, low-latency environment showed a 35% improvement in the upload time of 1GB files. The time spent waiting for multipart notifications was reduced roughly fourfold (this measurement included the final part, which is not uploaded asynchronously).
There are still some possible improvements: 1) the creation of the multipart id could be made asynchronous when it looks like the upload will need to be multipart (this may incur cost if the upload turns out not to be multipart). 2) allow more than one async request (this will use more memory).
A fair amount of refactoring was required to make the HTTP responses asynchronous. This may seem like overkill but having well-defined request, response, and session objects will also be advantageous for the upcoming HTTP server functionality.
Another advantage is that the lifecycle of an HttpSession is better defined. We only want to reuse sessions that complete the request/response cycle successfully, otherwise we consider the session to be in a bad state and would prefer to start clean with a new one. Previously, this required complex notifications to mark a session as "successfully done". Now, ownership of the session is passed to the request and then the response and only returned to the client after a successful response. If an error occurs anywhere along the way the session will be automatically closed by the object destructor when the request/response object is freed (depending on which one currently owns the session).
strCat() did not follow our convention of appending Z to functions that accept zero-terminated strings rather than String objects.
Add strCatZ() to accept zero-terminated strings and update strCat() to accept String objects.
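Usage after the change (a sketch; result and suffix are String objects):

    strCat(result, suffix);         // String object
    strCatZ(result, ".partial");    // zero-terminated string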
Use LF_STR where appropriate but don't use other String constants because they do not improve readability.
Vendorized code is copied from another project when a library is not available and a git subproject won't work. Currently all the vendorized code is copied from PostgreSQL but it makes sense to have a more general mechanism for indicating vendorized code.
The .vendor extension will be used to denote vendorized code in the same way that .auto is used to denote auto-generated code.
These tests required sudo to achieve complete coverage.
Add a new coverage exception, vm_covered, that applies to code that can only be covered in a container. When the test is run outside of a container code sections that require a container will be excluded with TEST_CONTAINER_REQUIRED and the coverage exception will be added to prevent a coverage error.
This does require marking up the core code with vm_covered, which in some modules (e.g. common/io/tls/client) can be extensive. It's possible that some of these tests can be rewritten to be less dependent on sudo but no attempt was made to do that here.
Only allow coverage summaries in a vm since coverage summaries outside a vm will not be complete, which was true even before this commit.
The --repo-retention-full-type option allows retention of full backups based on a time period, specified in days.
The new option defaults to 'count' and therefore will not affect current installations. Setting repo-retention-full-type to 'time' allows the user to use a time period, in days, for full backup retention. With this method, a full backup can be expired only if its end time is older than the number of days set with repo-retention-full (calculated from the moment the expire command is run) and at least one more recent full backup meets the retention period. If archive retention has not been configured, then the default settings will expire archives that are prior to the oldest retained full backup. For example, given three full backups with end times 25 days old (F1), 20 days old (F2), and 10 days old (F3), and a full retention period of 15 days, only F1 will be expired; F2 will be retained because F3 is not yet at least 15 days old, so F2 is needed to satisfy the retention period.
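A hypothetical pgbackrest.conf fragment for this mode:

    [global]
    repo1-retention-full-type=time
    repo1-retention-full=15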