Now that the StringIds are not stored in parse.auto.c.inc it may not be obvious if an invalid StringId is generated. By moving the encoding to after the value has been validated we can error if the StringId is invalid rather than throwing an error that the value does not exist in the allow list.
Mostly this just makes the code cleaner but there are also a few function calls that were replaced with the macro.
Also tighten up the logic in cfgParseOptionValuePack() a bit even though it is not related to this macro change.
These StringIds use space and increase churn when new ids are added. It is easy and efficient to use strings for comparison and convert to StringId in code.
Fetch credentials automatically using EKS pod identity, which removes the need for static configuration. Credentials are automatically updated before they expire to support long-running commands.
Since the server exits on error in this test there is no need to shut it down when the script completes. This usually worked because the message would arrive before the server had shutdown completely but sometimes it would error on a broken pipe.
Configuration of specific cipher suites may be required for compliance or to use preferred ciphers for security.
Cipher suites are applied to the entire process and cannot be configured on a per-connection basis, except that for object store clients (e.g. S3) verification can be disabled.
TLS 1.0 and 1.1 have been deprecated since 2021 (see RFC-8996) and TLS 1.2 has been available since 2008. As such it makes sense to require TLS >= 1.2 when verification is enabled. Verification is always enabled for TLS protocol sessions within pgBackRest but can be disabled for object stores (e.g. S3) to support self-signed certificates on internal servers.
There is a slight change in behavior when verification is disabled. In prior versions SSL 2/3 would be disabled but now they are allowed (as well as TLS 1.0/1.1). With verification disabled it doesn't seem useful to be picky about protocol versions and disabling TLS 1.0/1.1 could easily cause breakage on older TLS servers.
Fetch credentials automatically using managed identities, which removes the need for static configuration. Credentials are automatically updated before they expire to support long-running commands.
In practice StringIds that could not be output as the original string were not useful. Remove this functionality to simplify the code and reclaim the bit for other purposes.
Functions for creating String and StringZ objects should be in their respective modules so move them there. This simplifies the dependencies for StringId and is more modular.
Moving these constants to stringStatic.h reduces the dependency on stringZ.h for low-level debug logging. This makes it possible to add new capabilities to stringZ.c.
After JsonRead was introduced the performance-sensitive areas of info were migrated from Variant. However, other areas were left using Variant because they were not important enough to update at that time.
Migrate remaining Variant usage to JsonRead wherever possible for consistency. Also improve memory management to avoid some cruft that would end up in the object mem context and avoid switching to the object mem context when possible.
Allow users to specify HTTP in the endpoint but default to using HTTPS in all other scenarios to preserve the existing behavior.
Extend HttpUrl with a `defaultType` parameter to support either:
- Explicitly specifying a protocol via `.type` and enforcing that protocol is used in the URL.
- Allowing protocol to be parsed from URL, but providing default via `.defaultType` if no protocol is found in the URL.
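The two modes above can be sketched roughly as follows. This is a minimal illustration, not the HttpUrl implementation: the `Protocol` enum, `urlProtocol()`, and its parameters are hypothetical names, and it assumes that a URL with no protocol is accepted when an explicit type is given.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

// Hypothetical protocol type for illustration only
typedef enum {protocolNone, protocolHttp, protocolHttps} Protocol;

// Resolve the protocol for url. When type is set the URL must not name a different protocol. When type is not set the
// protocol is parsed from the URL and defaultType is used if the URL does not name one. Returns false on a mismatch.
static bool
urlProtocol(const char *url, Protocol type, Protocol defaultType, Protocol *result)
{
    assert(url != NULL && result != NULL);

    // Parse the protocol from the URL, if any
    Protocol found = protocolNone;

    if (strncmp(url, "http://", 7) == 0)
        found = protocolHttp;
    else if (strncmp(url, "https://", 8) == 0)
        found = protocolHttps;

    // Explicit type: enforce that the URL does not conflict with it
    if (type != protocolNone)
    {
        if (found != protocolNone && found != type)
            return false;

        *result = type;
        return true;
    }

    // No explicit type: use the parsed protocol or fall back to the default
    *result = found != protocolNone ? found : defaultType;
    return true;
}
```

With `defaultType` set to `protocolHttps`, an endpoint such as `http://host:9000` resolves to HTTP while a bare `host:9000` keeps the existing HTTPS behavior.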
Add partial write handling in fdWrite() to support non-blocking socket operations. The write loop now handles EAGAIN errors by waiting for the file descriptor to become writable, and continues writing the remaining bytes when write() returns fewer bytes than requested. This is required for HTTP, which may use non-blocking sockets but does not have built-in handling like the TLS client we are using for HTTPS. Also wrap the write() call and add a shim and additional logging for easier unit testing.
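As a rough sketch of the write loop described above (not the actual fdWrite() code), the following uses a hypothetical `writeAll()` helper with an assumed per-wait timeout in milliseconds. Retrying on EINTR is an addition in this sketch that the description above does not mention.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <stddef.h>
#include <unistd.h>

// Hypothetical helper: write all bytes to fd, waiting up to timeoutMs whenever the descriptor is not ready. Returns 0 on
// success and -1 on error or timeout.
static int
writeAll(int fd, const unsigned char *buffer, size_t size, int timeoutMs)
{
    assert(buffer != NULL || size == 0);

    size_t total = 0;

    while (total < size)
    {
        ssize_t result = write(fd, buffer + total, size - total);

        if (result >= 0)
        {
            // Possibly a partial write: advance and loop to write the remaining bytes
            total += (size_t)result;
        }
        else if (errno == EAGAIN || errno == EWOULDBLOCK)
        {
            // Non-blocking descriptor cannot accept data yet: wait until it becomes writable
            struct pollfd waitFd = {.fd = fd, .events = POLLOUT};
            int ready;

            do
            {
                ready = poll(&waitFd, 1, timeoutMs);
            }
            while (ready == -1 && errno == EINTR);

            if (ready <= 0)
                return -1;
        }
        // Retry if interrupted by a signal, else fail
        else if (errno != EINTR)
            return -1;
    }

    return 0;
}
```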
Previously it was possible to deadlock in a signal handler, for example when SIGTERM (e.g. sent by `pgbackrest stop --force`) arrives while a lock used in `gmtime_r` is held. Then the next time logging is done, it will deadlock on `gmtime_r`.
In general, most stdlib functions are not safe to call in signal handlers; only so-called async-signal-safe functions are. In particular, `snprintf` isn't safe since it is allowed to internally call `malloc`. The `exitSafe` function isn't safe due to extensive use of allocations. Because of this, we need to use a simpler logging format in signal handlers, one that only uses async-signal-safe functions.
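A minimal sketch of what such a simpler format could look like, assuming the message is built by hand and emitted with `write()`. The names `signalMessage()`, `signalHandler()`, and `signalHandlerInstall()` and the message text are hypothetical and do not reflect the actual pgBackRest handler.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <unistd.h>

// Hypothetical helper: build "<prefix><signal number>\n" in buffer using only plain loops and arithmetic, i.e. no snprintf()
// and no allocation. Returns the number of bytes placed in buffer.
static size_t
signalMessage(char *buffer, size_t bufferSize, const char *prefix, int signalNo)
{
    assert(buffer != NULL && prefix != NULL && bufferSize >= 2);

    size_t size = 0;

    // Copy the prefix, leaving room for the trailing linefeed
    while (*prefix != '\0' && size < bufferSize - 1)
        buffer[size++] = *prefix++;

    // Convert the signal number to decimal digits, least significant digit first
    char digit[16];
    size_t digitSize = 0;
    unsigned int value = signalNo < 0 ? 0 : (unsigned int)signalNo;

    do
    {
        digit[digitSize++] = (char)('0' + value % 10);
        value /= 10;
    }
    while (value > 0);

    // Append the digits in the correct order
    while (digitSize > 0 && size < bufferSize - 1)
        buffer[size++] = digit[--digitSize];

    buffer[size++] = '\n';

    return size;
}

static void
signalHandler(int signalNo)
{
    char buffer[64];
    size_t size = signalMessage(buffer, sizeof(buffer), "terminated on signal ", signalNo);

    // write() and _exit() are both on the POSIX list of async-signal-safe functions
    ssize_t ignored = write(STDERR_FILENO, buffer, size);
    (void)ignored;

    _exit(1);
}

static void
signalHandlerInstall(void)
{
    struct sigaction action = {0};

    action.sa_handler = signalHandler;
    sigemptyset(&action.sa_mask);
    sigaction(SIGTERM, &action, NULL);
}
```

The trade-off is that the message carries far less context than normal logging (no timestamp, for instance), which is what avoids the `gmtime_r` lock.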
Prior to this commit it was difficult to expire just the oldest full backup while ignoring current retention settings. The user had to manually update retention or script something to automate it. Expiring the oldest full backup is useful when disk space is running low.
Add the --oldest option to allow expiration of the oldest full backup and any dependent backups regardless of the current retention settings. Archive retention is also adjusted to expire WAL before the oldest retained full backup.
The reasoning in the FAQ and code about RFC-2818 is only valid when using host-style URIs. According to the AWS S3 bucket naming standard the allowed characters are lowercase alphanumerics, dash, and dot.
Most self-hosted S3 services utilize path-based URIs where dots are valid in a bucket name, so this check should only apply to host-style buckets. Even though path-based access is not recommended and is being phased out of AWS S3, it is still valid and should function for other providers.
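The intent can be sketched as below. `bucketNameValid()` is a hypothetical helper rather than the actual check, and it covers only the character rules mentioned above (not length or other naming rules).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

// Hypothetical helper: allow lowercase alphanumerics and dash in any bucket name but allow dot only when the bucket is
// addressed with a path-style URI, i.e. when the bucket name is not part of the host name.
static bool
bucketNameValid(const char *bucket, bool hostStyle)
{
    assert(bucket != NULL);

    for (const char *c = bucket; *c != '\0'; c++)
    {
        bool lowerAlnum = (*c >= 'a' && *c <= 'z') || (*c >= '0' && *c <= '9');

        if (lowerAlnum || *c == '-')
            continue;

        // Dots are only a problem when the bucket becomes part of the host name
        if (*c == '.' && !hostStyle)
            continue;

        return false;
    }

    // An empty name is never valid
    return *bucket != '\0';
}
```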
Options with unresolved dependencies can have an implied default specified. This makes the code a bit simpler since we don't need to check for option validity.
However, there was an edge case where if an option was specified in the config file and ultimately the dependency was not resolved then the option would not be marked as default and therefore show up in the option logging at the beginning of a command. The default value was correct so everything operated as expected but the logging was confusing.
In the case of an implied default, reinitialize the option struct so that any leftover settings will be reset.
The prior code allocated the entire chunk buffer when the file was opened. However, in practice many files are smaller than the chunk buffer, especially in the main process.
Instead grow the chunk buffer as data comes in to save memory when smaller files are being processed. This adds some overhead for reallocations but modern processors do this very efficiently so it should not be significant compared to the cost of compressing, encrypting, and transferring files. Even so, the growth is fairly aggressive when the input buffers are full so only one or two reallocations are required to get to the default chunk size.
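A minimal sketch of growing a buffer on demand, for illustration only. The `ChunkBuffer` struct and `chunkBufferAppend()` are hypothetical, and the doubling policy is an assumption that stands in for whatever growth policy the real code uses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

// Hypothetical buffer that starts empty and grows up to sizeMax (the full chunk size)
typedef struct
{
    unsigned char *data;                                            // Allocated memory, NULL until first append
    size_t used;                                                    // Bytes currently stored
    size_t size;                                                    // Bytes currently allocated
    size_t sizeMax;                                                 // Upper limit for allocation
} ChunkBuffer;

// Append input, growing the allocation only when needed. Returns false if the data will not fit in sizeMax or memory cannot
// be allocated.
static bool
chunkBufferAppend(ChunkBuffer *buffer, const unsigned char *input, size_t inputSize)
{
    assert(buffer != NULL && buffer->used <= buffer->size && buffer->size <= buffer->sizeMax);

    if (inputSize > buffer->sizeMax - buffer->used)
        return false;

    size_t required = buffer->used + inputSize;

    if (required > buffer->size)
    {
        // Assumed policy: double the allocation, but never allocate less than required or more than sizeMax
        size_t sizeNew = buffer->size * 2;

        if (sizeNew < required)
            sizeNew = required;

        if (sizeNew > buffer->sizeMax)
            sizeNew = buffer->sizeMax;

        unsigned char *dataNew = realloc(buffer->data, sizeNew);

        if (dataNew == NULL)
            return false;

        buffer->data = dataNew;
        buffer->size = sizeNew;
    }

    if (inputSize > 0)
        memcpy(buffer->data + buffer->used, input, inputSize);

    buffer->used = required;

    return true;
}
```

A file smaller than the chunk size never causes the full chunk to be allocated, which is where the memory saving comes from.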
Previously an S3 upload with default repo-storage-upload-chunk-size would only work for files <= 50GiB because of the limited number of chunks allowed. GCS has a smaller chunk size default so it topped out at 40GiB. Azure allows 50,000 chunks so it allowed up to 200GiB.
These are all far larger than files PostgreSQL will create but these days a data directory might also contain files created by plugins that can be much larger.
Since the eventual file size is not known in advance (due to compression) it is hard to pick an appropriate chunk size in advance. Instead, dynamically grow the chunk size over time to reach 5TiB for S3 and GCS (their upper limit). Azure has more parts so it will reach 45TiB, which is smaller than the upper limit of 190TiB, but seems sufficient for now.
The default buffer size is used for the first GiB (plus some) to provide compatibility with any clones that do not support variable block sizes. There is no evidence that this is a problem but better to be safe.
The minimum values for repo-storage-upload-chunk-size have been increased to match vendor minimums and simplify the chunk size algorithm.
Since PostgreSQL 10 these settings have been defaulted to values required by the user guide so there is no need to explicitly set them.
PostgreSQL 9.5/9.6 are still supported by pgBackRest but are not represented in the user guide since they are EOL.
32-bit testing was broken by 24802a08, which was attempting to fix multi-architecture builds by using docker to set the architecture.
i386 is not a special case but the prior alternate architectures did not run integration tests. This requires passing the architecture around since the integration test main process runs on the host system, which may be a different architecture.
This makes maintenance easier. Also fix the command list for db-timeout so it matches pg-database, i.e. all the commands that can connect to the database.
Add +inherit, +role, and -command to help with command maintenance. These allow command lists to automatically add new commands without them needing to be added manually. They should also be easier to read than long command lists.
In many cases the valid commands are based on the commands valid for roles. In these cases derive the commands from a role list rather than an explicit command list.
Not only is this notation more compact but it helps prevent new commands from being missed.
This exposed a few issues:
1) The cmd option should only be valid when a command supports the local role since it is used to execute the local process. A number of commands were included before that did not have the local role.
2) cmd-ssh should be valid for any command that allows remotes. The annotate command was missing from this list.
3) compress-level-network should be valid for any command that allows remotes. The repo-rm command was missing from this list.
Restoring to a remote pg-host is not supported but the options were a bit untidy. Many options were marked as internal but should be invalid. repo-host-type and repo-host are required to let restore know if a pg-host is configured and remain internal but the rest of the pg-host-* options are now invalid for restore.
The same applies to the archive-get and archive-push commands although these were less likely to cause confusion.
Also reverse the dependency of pg-host and pg-host-type, i.e. make pg-host-type depend on pg-host, and alter pg-host-cmd and pg-host-user to depend on pg-host-type=ssh.
931435c0 added the ability to backup to a remote repo but did not quite get the option updates right. It worked, but a number of options were marked as internal so would not be visible to the user in command-line help.
Also reverse the dependency of repo-host and repo-host-type, i.e. make repo-host-type depend on repo-host, and alter repo-host-cmd and repo-host-user to depend on repo-host-type=ssh.
Previously internal state was not included because it does not affect how commands/options are used -- only whether they are shown in help. However, this makes it hard to know when the internal state changes because help is generated at build time and in any case is just a binary blob.
Internal state is not stored in the config structures since the macros resolve to nothing but it is handy for debugging to see when internal state has changed.
The Perl processing of config.yaml put hard limits on the format of that file. To allow flexibility in the file format remove all Perl processing on config.yaml.
This is just the beginning of migrating the preprocessor to C but even this small bit allows the removal of a lot of Perl code.
55e9969 updated the meson version but only reverted one of the changes implemented in 0eccbc8 where the version was lowered to >= 0.45.
Use get_option() as allowed by the updated version for clearer and more robust build code.
Per our policy to support five EOL versions of PostgreSQL, 9.5 is no longer supported by pgBackRest. Remove all logic associated with 9.5 and update the tests.
An effort was made to advance versions as much as possible in the tests while still providing coverage. Hopefully this will reduce churn when future versions expire, though it has created a bit more here.
Tests for 9.4/9.5 are left in the expire/info tests to demonstrate that these commands work with old versions present.
The 9.6 pg_control struct was being used for 9.5. This was not detected by testing because the new field introduced for 9.6 fit into an alignment hole in the 9.5 struct so the size of the struct and offset of all other members did not change. Since the new member was not used there was no impact on functionality.
9.5 is being removed in the next release so the only reason to fix this is to make the diff for that change more sensible, and to document that this happened.
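To illustrate how a new field can hide in an alignment hole, here is a small sketch using hypothetical structs. These are not the real pg_control layouts, and the result assumes an ABI where `uint64_t` is 8-byte aligned (it does not hold on ABIs that align `uint64_t` to 4 bytes).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

// Hypothetical older layout: with 8-byte alignment for uint64_t there are 4 bytes of padding after version
typedef struct
{
    uint64_t systemId;
    uint32_t version;
    uint64_t checkpoint;
} ControlOld;

// Hypothetical newer layout: the added field occupies what used to be padding
typedef struct
{
    uint64_t systemId;
    uint32_t version;
    uint32_t added;
    uint64_t checkpoint;
} ControlNew;

// True when the size of the struct and the offset of the member after the new field are the same in both layouts, which
// is why tests that only depend on size and existing members cannot tell the layouts apart
static bool
controlLayoutMatches(void)
{
    return
        sizeof(ControlOld) == sizeof(ControlNew) &&
        offsetof(ControlOld, checkpoint) == offsetof(ControlNew, checkpoint);
}
```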
Some of these types were versioned at one time. Others were not but it seemed better to version all of them for consistency. In fact this just creates churn when PostgreSQL versions are expired.
Also move the uint64 type to version.vendor.h since it is only used by versioned types.