There were a number of places in the code where "hostId" was used, but hostId is just the option group index + 1, which led to a lot of +1 and -1 adjustments to convert the id to an index and vice versa.
Instead just use the zero based index wherever possible. This is pretty much everywhere except when the host-id option is read or set, or where a message is being formatted for the user.
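A minimal sketch of the conversion described above, assuming hypothetical helper names (sketchHostIdToIdx(), sketchHostIdxToId(), and sketchHostLabel() are illustrative and not taken from the code base). The zero-based index is used internally and the one-based id appears only at the option and message boundaries:

```c
#include <assert.h>
#include <stdio.h>

// Hypothetical helper: convert the user-facing host id (1-based) to the internal group index (0-based)
static unsigned int
sketchHostIdToIdx(unsigned int hostId)
{
    assert(hostId >= 1);
    return hostId - 1;
}

// Hypothetical helper: convert the internal group index back to the user-facing host id
static unsigned int
sketchHostIdxToId(unsigned int hostIdx)
{
    return hostIdx + 1;
}

// Format a user-facing label. This is the only place the 1-based id is needed
static int
sketchHostLabel(char *label, size_t labelSize, unsigned int hostIdx)
{
    return snprintf(label, labelSize, "host %u", sketchHostIdxToId(hostIdx));
}
```

Keeping the conversion in one or two small functions means the rest of the code never has to remember which of the two numbering schemes a variable holds.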
Also fix a bug in protocolRemoteParam() where remotes spawned from the main process could get process ids that were not 0. Only the locals should spawn remotes with process id > 0. This seems to have been harmless since the process id is only a label, but it could be confusing when debugging.
The defines for FUNCTION_LOG_VERIFY_WAL_RANGE* are not used in the current verify.c and there are no plans to use them as development of the verify command continues, so remove them as dead code.
bufSize() should only be used when checking the total size of the buffer, not how much of it is currently used.
In these cases bufUsed() and bufSize() are returning the same value but benign-looking code changes could break this assumption.
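The difference can be sketched with a simplified stand-in type. SketchBuffer and its functions are assumptions made for illustration and are not the actual Buffer implementation:

```c
#include <assert.h>
#include <stddef.h>

// Simplified stand-in for a buffer type (hypothetical, for illustration only)
typedef struct SketchBuffer
{
    unsigned char data[64];                                         // Backing storage
    size_t size;                                                    // Total capacity
    size_t used;                                                    // Bytes that currently hold valid data
} SketchBuffer;

static size_t
sketchBufSize(const SketchBuffer *buffer)
{
    return buffer->size;
}

static size_t
sketchBufUsed(const SketchBuffer *buffer)
{
    return buffer->used;
}

// Sum only the valid bytes. Looping to sketchBufSize() would include stale bytes whenever used < size
static unsigned int
sketchBufSum(const SketchBuffer *buffer)
{
    assert(sketchBufUsed(buffer) <= sketchBufSize(buffer));

    unsigned int result = 0;

    for (size_t idx = 0; idx < sketchBufUsed(buffer); idx++)
        result += buffer->data[idx];

    return result;
}
```

When the buffer happens to be full both functions return the same value, which is why using the wrong one can go unnoticed until a later change leaves the buffer partially filled.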
iniLoad() was trimming lines which meant that a leading space would not pass checksum validation when a manifest was reloaded. Remove the trims since files we write should never contain extraneous spaces. This further diverges the format for the functions that read conf files (e.g. pgbackrest.conf) and those that read info (e.g. manifest) files.
While we are at it also allow [ and # as initial characters. # was reserved for comments but we never put comments into info files. [ denotes a section but we can get around this by never allowing arrays as values in info files, so if a line ends in ] it must be a section. This is currently the case but enforce it by adding an assert to info/info.c.
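The section rule above can be sketched as follows. The names are hypothetical and this is not the iniLoad() implementation; it only illustrates that, once arrays are disallowed as values, a line that starts with [ and ends with ] can be treated as a section while any other line is a key/value pair, even if it begins with [ or #:

```c
#include <assert.h>
#include <string.h>

// Hypothetical line types for an info file
typedef enum
{
    sketchIniLineSection,
    sketchIniLineKeyValue,
} SketchIniLineType;

// Classify a non-empty, untrimmed line from an info file
static SketchIniLineType
sketchIniLineClassify(const char *line)
{
    size_t length = strlen(line);
    assert(length > 0);

    // Values are never arrays, so only a section can both start with [ and end with ]
    if (line[0] == '[' && line[length - 1] == ']')
        return sketchIniLineSection;

    return sketchIniLineKeyValue;
}
```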
The tests were originally written by loading values directly into the configuration before the parser was available.
Update to use harnessCfgLoadRaw() to simplify the tests and make them compatible with upcoming config changes.
Note that some unreachable conditions were removed since they could not be reached via a parsed config, only by munging values directly into the config. cfgOptionTest(optionId) was removed because a non-default value must always be set. cfgOptionValid(cfgOptLogTimestamp) was removed because it is true for all commands except for cfgCmdNone, which is checked with an assert.
cfgOptionId() did not recognize deprecated options which made the help command throw errors when they were specified on the command line. cfgParseOption() will correctly identify deprecated options.
cfgParseOption() can also be used in cfgParse() to reduce code duplication when parsing info out of the option value returned by optionFind().
Finally, code the option key index separately in parse.auto.c. For now they are simply added back together but future code will need them separated.
This has always been equivalent to the ConfigCommand enum so it just adds complexity.
It was created for symmetry with ConfigDefineOption, which will also be removed soon.
Currently indexes above 1 do not have dependencies checked, so this doesn't error.
In a future commit we will enable those checks and this will error if it is not fixed.
These constants don't scale well as the index total is increased for an option.
The core code rarely uses these options and they are easily replaced with cfgOptionName().
The tests had started to make use of the constants, so provide functions that build the option name from the optionId and, optionally, the optionKey.
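A rough sketch of building an option name from an option id and key, assuming a hypothetical name table and function (SketchOption, sketchOptionNameList, and sketchOptName() are illustrative and are not the functions added for the tests):

```c
#include <assert.h>
#include <stdio.h>

// Hypothetical option ids for illustration
typedef enum {sketchOptPgPath, sketchOptRepoPath, sketchOptStanza} SketchOption;

typedef struct SketchOptionName
{
    const char *prefix;                                             // Group prefix, or NULL when the option is not indexed
    const char *name;                                               // Remainder of the option name
} SketchOptionName;

static const SketchOptionName sketchOptionNameList[] =
{
    {"pg", "path"},
    {"repo", "path"},
    {NULL, "stanza"},
};

// Build the option name. The key is only used for indexed options and is 1-based
static int
sketchOptName(char *buffer, size_t bufferSize, SketchOption optionId, unsigned int optionKey)
{
    const SketchOptionName *option = &sketchOptionNameList[optionId];

    if (option->prefix == NULL)
        return snprintf(buffer, bufferSize, "%s", option->name);

    assert(optionKey >= 1);
    return snprintf(buffer, bufferSize, "%s%u-%s", option->prefix, optionKey, option->name);
}
```

Generating names this way avoids one constant per option per index, which is the part that does not scale as the index total grows.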
WAL timeline history files were not being expired because they were small and generally not very plentiful.
However, in some cases large numbers of history files may be generated so it makes sense to remove useless history files to keep things tidy.
The history file for the oldest retained timeline is kept for debugging purposes even though it is not used for recovery.
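The retention rule above might be sketched like this, assuming history files are named with an eight-digit hexadecimal timeline followed by .history. The function name is hypothetical and this is not the expire command's code:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

// Return true when fileName is a timeline history file older than the oldest retained timeline.
// The history file for the oldest retained timeline itself is kept
static bool
sketchHistoryExpire(const char *fileName, unsigned long oldestTimeline)
{
    assert(fileName != NULL);

    // Parse the leading hexadecimal timeline
    char *end = NULL;
    unsigned long timeline = strtoul(fileName, &end, 16);

    // Anything that is not exactly eight hex digits plus the extension is not a history file
    if (end != fileName + 8 || strcmp(end, ".history") != 0)
        return false;

    return timeline < oldestTimeline;
}
```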
Instead of using memmove() to manage the internal output buffer for every small read, track the current buffer position and only move data when the small read cannot be satisfied and more data is needed.
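The idea can be sketched with a small reader over an in-memory source. All names here are hypothetical and the tiny buffer size is chosen only to make the behavior easy to follow. Small reads advance a position in the buffer, and memmove() is called only when a read cannot be satisfied from what is already buffered:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_BUFFER_SIZE                                          8

typedef struct SketchReader
{
    const char *source;                                             // Remaining input (stand-in for a file or socket)
    size_t sourceRemains;                                           // Bytes left in the source
    char buffer[SKETCH_BUFFER_SIZE];                                // Internal output buffer
    size_t pos;                                                     // Start of unread data in the buffer
    size_t used;                                                    // End of valid data in the buffer
    unsigned int moveTotal;                                         // Number of memmove() calls, for illustration
} SketchReader;

// Read exactly size bytes into out. Returns false when the source cannot supply them
static bool
sketchRead(SketchReader *reader, char *out, size_t size)
{
    assert(size <= SKETCH_BUFFER_SIZE);

    if (reader->used - reader->pos < size)
    {
        // Not enough buffered data, so move the unread bytes to the front once and refill
        size_t remains = reader->used - reader->pos;

        if (remains > 0 && reader->pos > 0)
        {
            memmove(reader->buffer, reader->buffer + reader->pos, remains);
            reader->moveTotal++;
        }

        reader->pos = 0;
        reader->used = remains;

        size_t fill = SKETCH_BUFFER_SIZE - reader->used;

        if (fill > reader->sourceRemains)
            fill = reader->sourceRemains;

        memcpy(reader->buffer + reader->used, reader->source, fill);
        reader->source += fill;
        reader->sourceRemains -= fill;
        reader->used += fill;

        if (reader->used < size)
            return false;
    }

    // Satisfy the read by advancing the position, with no data movement
    memcpy(out, reader->buffer + reader->pos, size);
    reader->pos += size;

    return true;
}
```

With a 12-byte source and 3-byte reads, this sketch moves data once rather than on every read.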
Group related options together so operations (e.g. valid, test, index total) can be performed on all options in the group.
Previously, options at the top of the hierarchy of the related options were used to do these tests. This was prone to error as option relationships changed and it was not always clear which option (or options) should be used.
Bug Fixes:
* Error with hints when backup user cannot read pg_settings. (Reviewed by Stefan Fercot, Cynthia Shang. Reported by Mohamed Insaf K.)
Features:
* PostgreSQL 13 support. (Reviewed by Cynthia Shang.)
Improvements:
* Improve PostgreSQL version identification. (Reviewed by Cynthia Shang, Stephen Frost.)
* Improve working directory error message. (Reviewed by Stefan Fercot.)
* Add hint about starting the stanza when WAL segment not found. (Contributed by David Christensen. Reviewed by David Steele.)
* Add hint for protocol version mismatch. (Reviewed by Cynthia Shang. Suggested by loop-evgeny.)
Documentation Improvements:
* Add note that pgBackRest versions must match when running remotely. (Reviewed by Cynthia Shang. Suggested by loop-evgeny.)
* Move info command text to the reference and link to user guide. (Reviewed by Cynthia Shang. Suggested by Christophe Courtois.)
* Update yum repository path for CentOS/RHEL user guide. (Contributed by Heath Lord. Reviewed by David Steele.)
Since links are not possible in the command line help just display the name of the linked section.
Also, during reference text rendering there is no out key so make sure it is defined before trying to use it.
This means the same text will appear in both places, which should make it easier to find.
Also update the link code to allow both page and section to be specified rather than only one or the other.
Update the documentation to explicitly state that versions must match across hosts when running remotely.
Add a hint to the protocol version mismatch error to help the user identify the problem.
Add older PostgreSQL versions to the u18 container that were not available before.
This also updates all minor versions for prior versions of PostgreSQL.
Scan the WAL archive for missing or invalid files and build up ranges of WAL that will be used to verify backup integrity. A number of errors and warnings are currently emitted but they should not be considered authoritative (yet).
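Building ranges from the archive might look roughly like the sketch below. It is a simplification under stated assumptions: WAL segments are represented as plain sorted integers rather than real segment names, invalid files are assumed to have been filtered out already, and the type and function names are hypothetical:

```c
#include <assert.h>
#include <stddef.h>

// A contiguous run of WAL segments (hypothetical representation)
typedef struct SketchWalRange
{
    unsigned int start;
    unsigned int stop;
} SketchWalRange;

// Build contiguous ranges from a sorted list of segment numbers. A gap (missing segment) starts a new range
static size_t
sketchWalRangeBuild(const unsigned int *segment, size_t segmentTotal, SketchWalRange *range, size_t rangeMax)
{
    size_t rangeTotal = 0;

    for (size_t idx = 0; idx < segmentTotal; idx++)
    {
        if (rangeTotal > 0 && segment[idx] == range[rangeTotal - 1].stop + 1)
        {
            // Segment directly follows the current range, so extend it
            range[rangeTotal - 1].stop = segment[idx];
        }
        else
        {
            // First segment or a gap was found, so start a new range
            assert(rangeTotal < rangeMax);
            range[rangeTotal++] = (SketchWalRange){.start = segment[idx], .stop = segment[idx]};
        }
    }

    return rangeTotal;
}
```

A backup can then be checked against the ranges: if the WAL it needs does not fall entirely inside a single range, some required segment is missing.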
The command is incomplete so it is marked internal.
Previously, catalog versions were fixed for all versions which made maintaining the catalog versions during PostgreSQL beta and release candidate cycles very painful. A version of pgBackRest which was functionally compatible was rendered useless by a catalog version bump in PostgreSQL.
Instead use only the control version to identify a PostgreSQL version when possible. Some older versions require a catalog version to positively identify a PostgreSQL version, so include them when required.
Since the catalog number is required to work with tablespaces it will need to be stored. There's already a copy of it in backup.info so use that (even though we have been ignoring it in the C versions).
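The identification rule (control version first, catalog version only where the control version is ambiguous) can be sketched as a table lookup. The table values below are made up for illustration and are not real PostgreSQL control or catalog versions, and the names are hypothetical:

```c
#include <assert.h>
#include <stddef.h>

typedef struct SketchPgVersion
{
    unsigned int version;                                           // Version identifier returned to the caller
    unsigned int controlVersion;                                    // Control version that must match
    unsigned int catalogVersion;                                    // Catalog version that must match, or 0 when not required
} SketchPgVersion;

// Illustrative values only. Versions 1 and 2 share a control version so they also need a catalog version
static const SketchPgVersion sketchPgVersionList[] =
{
    {3, 30, 0},
    {2, 20, 2002},
    {1, 20, 2001},
};

// Return the matching version or 0 when the version cannot be identified
static unsigned int
sketchPgVersionFind(unsigned int controlVersion, unsigned int catalogVersion)
{
    assert(controlVersion != 0);

    for (size_t idx = 0; idx < sizeof(sketchPgVersionList) / sizeof(sketchPgVersionList[0]); idx++)
    {
        const SketchPgVersion *candidate = &sketchPgVersionList[idx];

        if (candidate->controlVersion == controlVersion &&
            (candidate->catalogVersion == 0 || candidate->catalogVersion == catalogVersion))
        {
            return candidate->version;
        }
    }

    return 0;
}
```

In this sketch a catalog version bump does not affect version 3 because its catalog version is never compared.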
Apparently backtrace has not been used for debugging since it was broken in 7fba1f0b.
Even though this is test code it might be good to find a way to test it to prevent regressions.
These values are not used by the Perl integration tests so maybe it would be better to remove them, but for now just update since they should not be changing again for PG13.
This condition used to give a not-very-clear error which we have been intending to improve. But in the meantime the changes in fbff299 resulted in a segfault for this condition instead because the data_directory was assumed to be non-NULL.
Fix this by explicitly throwing an error with hints when any row in pg_settings cannot be selected.
This file is created by pg_basebackup so might be in the data directory if the cluster was restored from a pg_basebackup backup. Also exclude backup_manifest.tmp since it is possible to find that in the backup directory.
Improve the wording of the error message and add a hint to make it clearer what is wrong and how the user can fix it.
Also change the assert to a regular error since this is not an internal error.
If a stop command has been issued the check command fails due to archiving timing out.
Provide a hint to document this situation and point the user in the proper direction.