The prior code would only connect to the first address provided by getaddrinfo().
Instead try each address in the list. If all connections fail then wait and try them all again until timeout.
Currently a round robin approach is used where each connection attempt must fail before the next connection is attempted. This works fine, for example, when an ipv6 address has no route to the host, but will work less well when a host answers but doesn't respond in a timely fashion.
We may consider a Happy Eyeballs approach in the future, but since pgBackRest is primarily a background process, it is not clear that slightly improved response time (in the case of failed connections) is worth the extra complexity.
The prior code did one list command against the storage for each WAL segment. This led to a lot of lists and was especially inefficient when the WAL (or the majority of it) was already present.
Optimize to keep the contents of a WAL directory and use them on a subsequent search. Leave the optimizations for a single WAL segment since other places still use that mode.
On certain file systems (e.g. ext4) pg_control may appear torn if there is a concurrent write while reading the file. To prevent an invalid read, retry until the checksum matches the control data.
Special handling is required for the pg-version-force feature since the offset of the checksum is not known. In this case, scan from the default position to the end of the data looking for a checksum match. This is a bit imprecise, but better than nothing, and the chance of a random collision in the control data seems very remote considering the ratio of data size (< 512 bytes) to checksum size (4 bytes).
This was discovered and a possible solution proposed for PostgreSQL in [1]. The proposed solution may work for backup, but pgBackRest needs to be able to read pg_control reliably outside of backup. So no matter what fix is adopted for PostgreSQL, pgBackRest need retries. Further adjustment may be required as the PostgreSQL fix evolves.
[1] https://www.postgresql.org/message-id/20221123014224.xisi44byq3cf5psi%40awork3.anarazel.de
The prior example (::*) was not valid and would result in the following error:
ERROR: [049]: unable to get address for '::*': [-2] Name or service not known
Correct values are either * for IPv4 or :: for IPv6. The IPv4 value is used as the example since only one example is allowed.
The release.xml file was getting pretty unwieldy so break release notes into separate files. Also break contributors into a separate file.
In theory most of release.xml could now be generated automatically but adding a new release does not represent a serious maintenance burden, so for the time being it does not seem worth it.