With a single repository configured, the restore command defaults to selecting the latest backup. With multiple repositories configured, it now defaults to selecting the latest backup from the first repository where backups exist, checking the repositories in the order they appear in pgbackrest.conf.
To select from a specific repository, the --repo option can be passed (e.g. --repo=1). The --set option can be passed if a backup other than the latest is desired.
Repositories will be searched in order for the requested archive file.
Errors will be reported as warnings as long as a valid copy of the archive file is found.
Errors are logged to the log file rather than thrown. If, after processing all repos, one or more errors occurred, then a single error is thrown to indicate that there were errors and that the log file should be inspected.
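As an illustrative sketch of this pattern (the helper names below are hypothetical, not the actual functions):

    #include <stdbool.h>
    #include <stdio.h>

    bool repoArchiveGet(unsigned int repoIdx);      // hypothetical per-repo fetch

    // Search the repos in order. A failure in one repo is logged and the
    // search continues; a single error is raised at the end only if the
    // file was never found and at least one error occurred.
    bool
    archiveGetAnyRepo(unsigned int repoTotal)
    {
        bool errorLogged = false;

        for (unsigned int repoIdx = 0; repoIdx < repoTotal; repoIdx++)
        {
            if (repoArchiveGet(repoIdx))
                return true;                        // found, so prior errors remain warnings

            fprintf(stderr, "WARN: repo%u: unable to get archive file\n", repoIdx + 1);
            errorLogged = true;
        }

        if (errorLogged)
            fprintf(stderr, "ERROR: errors occurred while getting archive file, check the log file\n");

        return false;
    }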
Also update log messages to be more consistent with new patterns.
These constructors wrap encodeToStr() and decodeToBin(), making them convenient and safe by eliminating the need to create intermediate buffers. Encoding/decoding is performed directly into the target String/Buffer. Sizing of the destination buffer is handled by the new functions so it doesn't have to be done at each call site.
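A minimal sketch of the pattern (names and signatures simplified; strNewEncodeBase64() is hypothetical and the real encodeToStr() also takes an encoding type):

    #include <stdlib.h>

    // assumed, simplified signature of the wrapped low-level encoder
    void encodeToStr(const unsigned char *source, size_t sourceSize, char *destination);

    // Constructor-style wrapper: the destination is sized here, once, and
    // encoding happens directly into the new allocation, so callers no
    // longer create or size intermediate buffers.
    char *
    strNewEncodeBase64(const unsigned char *source, size_t sourceSize)
    {
        size_t destSize = ((sourceSize + 2) / 3) * 4;   // exact base64 output size
        char *result = malloc(destSize + 1);            // +1 for the zero terminator

        encodeToStr(source, sourceSize, result);
        return result;
    }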
If the second letter is a capital or a digit then the word is likely an acronym, so don't lower-case the first letter.
For now only the digit case is checked since there are no summaries with a capital as the second letter.
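A sketch of the heuristic (the function name is hypothetical):

    #include <ctype.h>

    // Lower-case the first letter of a summary unless the second character
    // suggests an acronym. Only the digit case (e.g. "S3 support added")
    // is handled for now since no summary has a capital as its second letter.
    void
    summaryLowerFirst(char *summary)
    {
        if (summary[0] != '\0' && !isdigit((unsigned char)summary[1]))
            summary[0] = (char)tolower((unsigned char)summary[0]);
    }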
The destination buffer on the stack was not large enough to contain the zero-terminating character.
Increase the buffer size and add an assertion to prevent regressions.
Found on arm64 running musl libc. Other architectures and glibc do not seem to be affected though it is clearly a bug.
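An illustrative sketch of the fix, not the actual code:

    #include <assert.h>
    #include <stdio.h>

    // Size the stack buffer to include the zero terminator and assert
    // that no truncation occurred, preventing regressions.
    void
    nameFormat(const char *source)
    {
        char buffer[64 + 1];                        // +1 for the zero terminator

        int written = snprintf(buffer, sizeof(buffer), "%s", source);
        assert(written >= 0 && (size_t)written < sizeof(buffer));
    }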
The expire command has been enhanced to expire backups and archives from all configured repositories by default.
In addition, it will accept the --repo option to expire backups and archives from only the specified repository. When --repo is provided, the --set option is likewise restricted to the specified repository. If --set is provided without --repo, then all repositories will be searched for the backup set and retention settings will be applied on each repository whether or not the set is found there.
In preparation for multi-repo support, a repo tag is added in this commit to the expire command log and error messages. This change also affects the expect logs and the user-guide. The format of the tag is "repoX:" where X is the repo key used in the configuration.
Until multi-repo support has been completed, this tag will always be "repo1:".
The original intention was to enclose complex code in braces but somehow braces got propagated almost everywhere.
Document the standard for braces in switch statements and update the code to reflect the standard.
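A sketch of the kind of standard documented (the helper names are hypothetical):

    typedef enum {typeSimple, typeComplex} Type;

    void doSimple(void);            // hypothetical
    int computeTotal(void);         // hypothetical
    void consume(int total);        // hypothetical

    void
    example(Type type)
    {
        switch (type)
        {
            // simple case: no braces needed
            case typeSimple:
                doSimple();
                break;

            // complex case: braces create a scope for local variables
            case typeComplex:
            {
                int total = computeTotal();
                consume(total);
                break;
            }
        }
    }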
This is phase 2 of verify command development (phase 1 was processing the archives and phase 3 will be reconciling the archives and backups). In this phase the backups are verified by verifying each file listed in the manifest for the backup and creating a result set with the list of invalid files, if any. A summary is then rendered.
Unit tests have been added and duplicate tests have been removed.
The info command provides total sizes for the files in the backup, both as they exist in the database and as they are stored in the repository. The text output and associated user documentation have been updated to make it clearer which sizes are being displayed.
In addition, the info command is updated to allow the user to optionally specify the repository when requesting a specific backup set. When the repo option is not specified, the text output reflects the stanza status, the cipher types, and the archive min/max across all the repositories rather than a single repository.
This is more accurate since we don't really want lf/cr characters anyway, though the lines have already been split at this point, so an lf is not actually possible in this code.
Found on macOS M1. FreeBSD also seems to be fine with the new expression.
Multi-repository implementations for the archive-push, check, info, stanza-create, stanza-upgrade, and stanza-delete commands.
Multi-repo configuration is disabled so there should be no behavioral changes between these commands and their current single-repo implementations.
Multi-repo documentation and integration tests are still in the multi-repo development branch. All unit tests work as multi-repo since they are able to bypass the configuration restrictions.
Check that archive files exist in the main process instead of the local process. This means that the archive.info file only needs to be loaded once per execution rather than once for each file fetched.
Stop looking when a file is missing or in error. PostgreSQL will never request anything past the missing file so there is no point in getting them. This also reduces "unable to find" logging in the async process.
Cache results of storageList() when looking for multiple files to reduce storage I/O.
Look for all requested archive files in the archive-id where the first file is found. They may not all be there, but this reduces the number of list calls. If subsequent files are in another archive id they will be found on the next archive-get call.
Append "asynchronously" to messages when the async process fetched the file (not in the actual async process log, though).
Add "repo1" to make it clear what archive we are talking about. This is not very useful by itself but soon we'll be able to add the archive id, which is very useful.
Add constants for messages that are used multiple times to ensure they stay consistent.
The FUNCTION_LOG_RETURN() macro requires logging macros (e.g. FUNCTION_LOG_*_TYPE and FUNCTION_LOG_*_FORMAT) when returning a struct but these macros don't deliver much value since they only output the name of the struct rather than the contents. A copy of the struct is also made during this operation, which is wasteful.
FUNCTION_LOG_RETURN_STRUCT() does not make a copy of the struct and does not require any logging macros. Returned structures are logged as "struct" but this could be made more accurate using __typeof in the future.
Structures as parameters are not addressed here and work as before, i.e. they require logging macros.
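A usage sketch (not compilable on its own; MY_RESULT stands in for a real struct type):

    // before: type-specific logging macros were required and a copy was made
    FUNCTION_LOG_RETURN(MY_RESULT, result);

    // after: no logging macros are needed, no copy is made, and the value
    // is logged simply as "struct"
    FUNCTION_LOG_RETURN_STRUCT(result);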
Missing files would indicate that another process is running on the same spool path, which would be a very bad thing.
This check doesn't cost any additional I/O so it seems like a good idea.
If files other than backup.manifest.copy were left in a backup path by a prior resume then the next resume would skip the backup rather than removing it. Since the backup path still existed, it would be found during backup label generation and cause an error if it appeared to be later than the new backup label. This occurred if the skipped backup was full.
The error was only likely on object stores such as S3 because of the order of file deletion. Posix file systems delete from the bottom up because directories containing files cannot be deleted. Object stores do not have directories so files are deleted in whatever order they are provided by the list command. However, the issue can be reproduced on a Posix file system by manually deleting backup.manifest.copy from a resumable backup path.
Fix the issue by removing the resumable backup if it has no manifest files. Also add a new warning message for this condition.
Note that this issue could be resolved by running expire or a new full backup.
These options specify the number of local worker job retries and the retry interval after one immediate retry.
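A minimal sketch of the retry behavior these options control (the function is hypothetical):

    #include <stdbool.h>
    #include <unistd.h>

    bool jobRun(void);              // hypothetical worker job

    // One immediate retry, then retries spaced by the configured interval.
    bool
    jobRunWithRetry(unsigned int retryTotal, unsigned int retryIntervalSec)
    {
        for (unsigned int tryIdx = 0; tryIdx <= retryTotal; tryIdx++)
        {
            if (tryIdx >= 2)                // only the first retry is immediate
                sleep(retryIntervalSec);

            if (jobRun())
                return true;
        }

        return false;
    }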
There is some value in allowing retries to be specified by the user but for the most part these options are for suppressing retries during testing, which can save a lot of time. The bug introduced in d1d25c7 and fixed in 8b86d5e also suggests it is better not to use retries in tests.
Remove the default delayed retries for archive-get/archive-push, leaving only the immediate retry. These commands are retried by PostgreSQL so it doesn't make sense to do too many retries internally.
These options are currently internal.
This call was removed by d1d25c71, which worked for archivePushProtocol() and verifyProtocol() since the encryption options are passed from the main process.
archiveGetProtocol() still retrieves these options in the local process so the repo storage must be loaded first.
This option was added in advance of the multi-repo functionality but it has no purpose and it is not clear what the validity rules should be.
The option will be added back when multi-repo functionality is committed.
There was an inconsistency in the JSON output for the case when a stanza is requested but does not exist in the repo. This was the only case where the archive array was not added to the JSON. Adding it will simplify the upcoming multi-repo support code.
Also, a redundant test was removed rather than updating it for this case.
Validity by command was not granular enough, so numerous options needed to be marked internal so users would not stumble across them. Options were also needlessly being passed to roles that had no use for them.
Introduce per-role validity lists that depend on what roles are valid per command. Also add a check to ensure that only valid roles are used with a command.
This commit adds the functionality but does not introduce any new behavior, i.e. all options are valid for all roles that the command is valid for. A subsequent commit will introduce the new role restrictions to make the changes easier to audit.
Data required for parsing was spread between the config and defined modules, mostly for historical reasons because the same data was used by Perl.
Requiring all the parse rules to be accessed with function interfaces makes the code more complicated and new rules harder to implement.
Instead, move the data to the parse module so in the most complex cases no interface functions are needed. This reduces the total amount of code and paves the way for more complex parse rules.
The help data can be represented more compactly in a pack and this separates data needed for help from data needed for parsing, freeing each to have a more appropriate representation.
The C code does not use doubles to represent seconds as the Perl code did, so time can be represented as an integer, which reduces the number of data types that config has to understand.
Also remove Variant doubles since they are no longer used.
Note that not all double code was removed since we still need to display times to the user in seconds and it is possible for the times to be fractional. In the future this will likely be simplified by storing the original user input and using that value when the time needs to be displayed.
Inaccuracies in sleep time or clock skew might make a single sleep insufficient to reach the next second.
Add a few retries to make the process more reliable but still avoid an infinite loop if something is seriously wrong.
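A minimal sketch of the bounded retry (the real code sleeps the computed remainder of the second; the function name is hypothetical):

    #include <stdbool.h>
    #include <time.h>
    #include <unistd.h>

    // Retry the sleep a few times instead of assuming a single sleep
    // reaches the next second, but stay bounded so a broken clock
    // cannot cause an infinite loop.
    bool
    waitForNextSecond(time_t start)
    {
        for (unsigned int retry = 0; retry < 3; retry++)
        {
            if (time(NULL) > start)
                return true;

            sleep(1);       // may fall short due to inaccuracy or clock skew
        }

        return time(NULL) > start;
    }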
These calls are not required since cipher info is passed explicitly. They are probably a copy-pasto from some past time when one of these functions required it.
Refactor the code to allow a dynamic number of indexes for indexed options, e.g. pg-path. Our reliance on getopt_long() still limits the number of indexes we can have per group, but once this limitation is removed the rest of the code should be happy with dynamic numbers of indexes (with a reasonable maximum).
Add an option to set a default in each group. This was previously handled by the host-id option but now there is a specific option for each group, pg and repo. These remain internal until they can be fully tested with multi-repo support. They are fully tested for internal usage.
Remove the ConfigDefineOption enum and use the ConfigOption enum instead. They are now equal since the indexed options (e.g. cfgOptRepoHost2) have been removed from ConfigOption.
Remove the config/config test module and add required tests to the config/parse test module. Parsing is now the only way to load a config so this removes some redundancy.
Split new internal config structures and functions into a new header file, config.intern.h. More functions will need to be moved over from config.h but that will need to be done in a future commit to reduce churn.
Add repoIdx to repoIsLocal() and storageRepo*(). Multi-repository support requires that repo locality and storage be accessible by index. This allows, for example, multiple repos to be iterated in a loop. This could be done in a separate commit but doesn't seem worth it since the code is related.
Remove the type parameter from storageRepoGet(). This parameter existed solely to provide coverage for the case where the storage type was invalid. A better pattern is to check that the type is S3 once all other types have been ruled out.
Improve locking on remote processes by introducing an exec-id that is unique to the main process and passed to all remote processes. This allows the remote processes to determine if a lock is held by a remote from the same main process. If so, the lock is allowed.
The exec-id is also useful for associating remote logs with main logs for debugging purposes.
When restore type standby is provided, the recovery.signal file isn't needed and may lead to some confusion (see #1236).
In recent PostgreSQL versions, pg_basebackup --write-recovery-conf creates only the standby.signal file, so this change aligns with that behaviour.
The result structure for the archive id being processed only needs to be retrieved once so moving it outside of the WAL path list processing loop is more efficient.
This call to storageRepo() was used to fetch cipher options from a remote to determine if a repo cipher was enabled.
Now the main process does this work and passes the cipher options directly to the local so there is no need to pre-load the repo storage here.
If the push queue limit has been exceeded then nothing will be pushed to the repo so there is no point in checking it. Worse, a failure in the check would cause drop not to run and potentially fill up the disk, exactly the case this feature was designed to prevent.
The async version already checks the push queue limit before checking the repository so now both versions have the same behavior.
These warnings were only being reported to PostgreSQL on the console. Now they are also recorded in the async log, increasing the chance that they will be seen.
This also improves coverage by requiring a warning during async processing to have a test case, which has been added.
Checking the default here was fragile. If the default were to change the code would break.
This also removes the only dependency on cfgOptionDefault() outside of the help command.
Return a path missing error when a stanza is specified for the info command but the stanza does not exist in the repository.
Previously [] was returned, which is still the case if no stanza is specified and the repository does not exist.
There were a number of places in the code where "hostId" was used, but hostId is just the option group index + 1 so this led to a lot of +1 and -1 to convert the id to an index and vice versa.
Instead just use the zero based index wherever possible. This is pretty much everywhere except when the host-id option is read or set, or where a message is being formatted for the user.
Also fix a bug in protocolRemoteParam() where remotes spawned from the main process could get process ids that were not 0. Only the locals should spawn remotes with process id > 0. This seems to have been harmless since the process id is only a label, but it could be confusing when debugging.
The defines for FUNCTION_LOG_VERIFY_WAL_RANGE* are not used in verify.c and are not planned for the continuing development of the verify command, so they are dead code and have been removed.
The tests were originally written by loading values directly into the configuration before the parser was available.
Update to use harnessCfgLoadRaw() to simplify the tests and make them compatible with upcoming config changes.
Note that some unreachable conditions were removed since they could not be reached via a parsed config, only by munging values directly into the config. cfgOptionTest(optionId) was removed because a non-default value must always be set. cfgOptionValid(cfgOptLogTimestamp) was removed because it is true for all commands except for cfgCmdNone, which is checked with an assert.
cfgOptionId() did not recognize deprecated options which made the help command throw errors when they were specified on the command line. cfgParseOption() will correctly identify deprecated options.
cfgParseOption() can also be used in cfgParse() to reduce code duplication when parsing info out of the option value returned by optionFind().
Finally, encode the option key index separately in parse.auto.c. For now the two values are simply added back together, but future code will need them separated.
This has always been equivalent to the ConfigCommand enum so it just adds complexity.
It was created for symmetry with ConfigDefineOption, which will also be removed soon.
These constants don't scale well as the index total is increased for an option.
The core code rarely uses these options and they are easily replaced with cfgOptionName().
The tests had started to make use of the constants, so provide functions that build the option name from the optionId and, optionally, the optionKey.
WAL timeline history files were not being expired because they were small and generally not very plentiful.
However, in some cases large numbers of history files may be generated so it makes sense to remove useless history files to keep things tidy.
The history file for the oldest retained timeline is kept for debugging purposes even though it is not used for recovery.
Group related options together so operations (e.g. valid, test, index total) can be performed on all options in the group.
Previously, options at the top of the hierarchy of the related options were used to do these tests. This was prone to error as option relationships changed and it was not always clear which option (or options) should be used.
Scan the WAL archive for missing or invalid files and build up ranges of WAL that will be used to verify backup integrity. A number of errors and warnings are currently emitted but they should not be considered authoritative (yet).
The command is incomplete so is marked internal.
Previously, catalog versions were fixed for all versions which made maintaining the catalog versions during PostgreSQL beta and release candidate cycles very painful. A version of pgBackRest which was functionally compatible was rendered useless by a catalog version bump in PostgreSQL.
Instead use only the control version to identify a PostgreSQL version when possible. Some older versions require a catalog version to positively identify a PostgreSQL version, so include them when required.
Since the catalog number is required to work with tablespaces it will need to be stored. There's already a copy of it in backup.info so use that (even though we have been ignoring it in the C versions).
Improve the wording of the error message and add a hint to make it clearer what is wrong and how the user can fix it.
Also change the assert to a regular error since this is not an internal error.
If a stop command has been issued, the check command fails because archiving times out.
Provide a hint to document this situation and point the user in the proper direction.
When restoring a cluster that will be promoted but is not intended to be the new primary, it is important to disable archiving to avoid polluting the repository with useless WAL. This option makes disabling archiving a bit easier.
Currently each module that needs to collect statistics implements custom code to do so. This is cumbersome.
Create a general purpose module for collecting and reporting statistics. Statistics are output in the log at detail level, but there are other uses they could be put to eventually.
No new functionality is added. This is just a drop-in replacement for the current statistics, with the advantage of being more flexible.
The new stats are slower because they involve a list lookup, but performance testing shows stats can be updated at about 40,000/ms which seems fast enough for our purposes.
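A minimal sketch of the list-lookup approach (hypothetical names, not the actual module API):

    #include <stdint.h>
    #include <string.h>

    typedef struct
    {
        const char *key;
        uint64_t total;
    } Stat;

    static Stat statList[256];
    static unsigned int statListSize;

    // Each update finds the stat by a list lookup on its key, which is
    // what makes updates slower than per-module counters but still fast
    // enough for our purposes.
    static void
    statInc(const char *key)
    {
        for (unsigned int statIdx = 0; statIdx < statListSize; statIdx++)
        {
            if (strcmp(statList[statIdx].key, key) == 0)
            {
                statList[statIdx].total++;
                return;
            }
        }

        if (statListSize < sizeof(statList) / sizeof(statList[0]))
            statList[statListSize++] = (Stat){.key = key, .total = 1};
    }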
Following up on 111d33c, implement the new interfaces for socket client/session. Now HTTP objects can be used over TLS or plain sockets.
This required adding ioSessionFd() and ioSessionRole() to provide the functionality of sckSessionFd() and sckSessionType(). sckClientHost() and sckClientPort() don't make sense in a generic interface, so they were replaced with ioSessionName().
Only close the remote connection after verifying that the WAL files have been received. This is necessary if the archive_command on the PostgreSQL host is conditional, i.e. archiving only happens while a backup lock is held, to ensure all WAL segments are archived.
Move sckSessionReadyRead()/Write() into the IoRead/IoWrite interfaces. This is a more logical place for them and the alternative would be to add them to the IoSession interface, which does not seem like a good idea.
This is mostly a refactor, but a big change is the select() logic in fdRead.c has been replaced by ioReadReady(). This was duplicated code that was being used by our protocol but not TLS. Since we have not had any problems with requiring poll() in the field this seems like a good time to remove our dependence on select().
Also, IoFdWrite now requires a timeout so update where required, mostly in the tests.
Pretty much everywhere "handle" is used what is really meant is "file descriptor" (fd). This terminology was migrated over from Perl and is just not quite correct, or at least not as correct as fd.
There were also plenty of places where fd was already used, so now all uses are consistent.
The Perl code was not updated but might be in a future commit.
PostgreSQL may be using most of the available file descriptors when it executes the archive-get/archive-push commands (especially archive-get). This can lead to problems depending on how many file descriptors are needed for parallelism in the async process.
Proactively free file descriptors between 3 and 1023 to help ensure there are enough available for reasonable values of process-max, i.e. <= 300.
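A minimal sketch of the idea:

    #include <unistd.h>

    // Close any inherited descriptors above stderr so they are available
    // for the process's own use (close() on an unopened fd simply fails
    // with EBADF, which is harmless here).
    static void
    fdFreeInherited(void)
    {
        for (int fd = 3; fd < 1024; fd++)
            close(fd);
    }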
We use the Z suffix in many functions to indicate that we are expecting a zero-terminated string so make this function conform to the pattern.
As a bonus the new name is a bit shorter, which is a good quality in a commonly-used function.
The old constructor was left around to reduce code churn during the migration but it just makes the code harder to read and search.
Remove the old constructor and rename all remaining instances to lstNewP(), which by default has the same semantics.
The prior code was only able to use the main passphrase automatically and expected sub passphrases to be specified for each operation. This was fine for testing but hardly sufficient for a user-facing feature.
Update the code to determine which passphrase to use for any file in the repository and error when an invalid file or location is selected.
The repo-get command is still internal for now, but with this improvement it should be ready to be made public.
If a local command, e.g. backupFile(), fails it will stop the entire process. Instead, retry local commands to deal with transient errors.
Remove special logic in the S3 storage driver to retry RequestTimeTooSkewed errors since this is now handled by the general retry mechanism in the places where it is most likely to happen, i.e. file read/write. Also, this error should have been entirely eliminated by the asynchronous TLS implementation.
The Azure storage driver exposes secrets in the query when using SAS authorization. These secrets can show up during logging or when an error occurs.
Allow redaction of queries to prevent secrets from being exposed in logs and errors.
The restore --force option was acting like --force --delta. This caused restore to replace files based on timestamp and size rather than overwriting them, which meant some files that should have been updated were left unchanged. Normal restore and restore --delta were not affected by this issue.
Azure and Azure-compatible object stores can now be used for repository storage.
Currently only shared key authentication is supported but SAS will be added soon.
When uploading large files the upload is split into multiple parts which are assembled at the end to create the final file. Previously we waited until each part was acknowledged before starting on the processing (i.e. compression, etc.) of the next part.
Now, the request for each part is sent while processing continues and the response is read just before sending the request for the next part. This asynchronous method allows us to continue processing while the S3 server formulates a response.
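A sketch of the pipelining with hypothetical helper names (not the actual driver code):

    #include <stddef.h>

    typedef struct HttpRequest HttpRequest;

    HttpRequest *partRequestSend(const void *part, size_t partSize);    // hypothetical async send
    void partResponseCheck(HttpRequest *request);                       // hypothetical response read
    const void *partNext(size_t *partSize);                             // hypothetical: prepare/compress next part

    // The response for part N is read just before part N+1 is sent, so
    // local processing overlaps with the server's work on the prior part.
    void
    uploadParts(void)
    {
        HttpRequest *pending = NULL;
        const void *part;
        size_t partSize;

        while ((part = partNext(&partSize)) != NULL)
        {
            if (pending != NULL)
                partResponseCheck(pending);         // complete the prior part

            pending = partRequestSend(part, partSize);
        }

        if (pending != NULL)
            partResponseCheck(pending);             // the final part completes synchronously
    }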
Testing from outside AWS in a high-bandwidth, low-latency environment showed a 35% improvement in the upload time of 1GB files. The time spent waiting for multipart notifications was reduced roughly fourfold (this measurement included the final part, which is not uploaded asynchronously).
There are still some possible improvements: 1) the creation of the multipart id could be made asynchronous when it looks like the upload will need to be multipart (this may incur cost if the upload turns out not to be multipart). 2) allow more than one async request (this will use more memory).
A fair amount of refactoring was required to make the HTTP responses asynchronous. This may seem like overkill but having well-defined request, response, and session objects will also be advantageous for the upcoming HTTP server functionality.
Another advantage is that the lifecycle of an HttpSession is better defined. We only want to reuse sessions that complete the request/response cycle successfully, otherwise we consider the session to be in a bad state and would prefer to start clean with a new one. Previously, this required complex notifications to mark a session as "successfully done". Now, ownership of the session is passed to the request and then the response and only returned to the client after a successful response. If an error occurs anywhere along the way the session will be automatically closed by the object destructor when the request/response object is freed (depending on which one currently owns the session).
strCat() did not follow our convention of appending Z to functions that accept zero-terminated strings rather than String objects.
Add strCatZ() to accept zero-terminated strings and update strCat() to accept String objects.
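Usage after the change (a sketch; result and suffix are String objects):

    strCat(result, suffix);         // String object
    strCatZ(result, ".partial");    // zero-terminated string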
Use LF_STR where appropriate but don't use other String constants because they do not improve readability.
Vendorized code is copied from another project when a library is not available and a git subproject won't work. Currently all the vendorized code is copied from PostgreSQL but it makes sense to have a more general mechanism for indicating vendorized code.
The .vendor extension will be used to denote vendorized code in the same way that .auto is used to denote auto-generated code.
These tests required sudo to achieve complete coverage.
Add a new coverage exception, vm_covered, that applies to code that can only be covered in a container. When the test is run outside of a container code sections that require a container will be excluded with TEST_CONTAINER_REQUIRED and the coverage exception will be added to prevent a coverage error.
This does require marking up the core code with vm_covered, which in some modules (e.g. common/io/tls/client) can be extensive. It's possible that some of these tests can be rewritten to be less dependent on sudo but no attempt was made to do that here.
Only allow coverage summaries in a vm since coverage summaries outside a vm will not be complete, which was true even before this commit.
The --repo-retention-full-type option allows retention of full backups based on a time period, specified in days.
The new option defaults to 'count' and therefore will not affect current installations. Setting repo-retention-full-type to 'time' allows the user to use a time period, in days, for full backup retention. With this method, a full backup can be expired only if its end time is older than the number of days set with repo-retention-full (calculated from the moment the expire command is run) and at least one more recent full backup meets the retention period. If archive retention has not been configured, then the default settings will expire archives that are prior to the oldest retained full backup. For example, given three full backups with end times 25 days old (F1), 20 days old (F2), and 10 days old (F3), and a full retention period of 15 days, only F1 will be expired; F2 will be retained because F3 is not yet at least 15 days old, so F2 is needed to satisfy the retention period.
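A hypothetical pgbackrest.conf fragment for this mode:

    [global]
    repo1-retention-full-type=time
    repo1-retention-full=15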