mirror of https://github.com/pgbackrest/pgbackrest.git synced 2025-07-17 01:12:23 +02:00
Commit Graph

4597 Commits

Author SHA1 Message Date
922e9f0775 Verify recovery target timeline.
If the user picks an invalid timeline (or the default is invalid) they will not discover it until after the restore is complete and recovery starts. In that case they'll receive a message like this:

FATAL:  requested timeline 2 is not a child of this server's history
DETAIL:  Latest checkpoint is at 0/7000028 on timeline 1, but in the history of the requested timeline, the server forked off from that timeline at 0/600AA20.

This message generally causes confusion unless one is familiar with it. In this case: 1) a standby was promoted, creating a new timeline; 2) a new backup was made from the primary; 3) the new backup was restored but could not follow the new timeline because the backup was made after the new timeline forked off. Since PostgreSQL 12, following the latest timeline has been the default, so this error has become common in split-brain situations.

Improve pgBackRest to read the history files and provide better error messages. Now this error is thrown before the restore starts:

ERROR: [058]: target timeline 2 forked from backup timeline 1 at 0/600aa20 which is before backup lsn of 0/7000028
       HINT: was the target timeline created by accidentally promoting a standby?
       HINT: was the target timeline created by testing a restore without --archive-mode=off?
       HINT: was the backup made after the target timeline was created?

This saves time since it happens before the restore and gives more information about what has gone wrong.

If the backup timeline is not an ancestor of the target timeline the error message is:

ERROR: [058]: backup timeline 6, lsn 0/4ffffff is not in the history of target timeline B
       HINT: was the target timeline created by promoting from a timeline < latest?

This situation should be rare but can happen during complex recovery scenarios where the user is explicitly setting the target time.
2025-02-04 10:06:17 -05:00
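The timeline check described in the commit above can be sketched roughly as follows. This is an illustration only, not pgBackRest's implementation: the function names (`parse_lsn`, `parse_history`, `verify_timeline`) are hypothetical, the history text is assumed to follow the PostgreSQL timeline history layout (parent timeline, switch LSN, reason, separated by whitespace), and the messages are modeled on the ones quoted in the commit.

```python
def parse_lsn(text):
    """Convert an LSN such as '0/600AA20' to an integer."""
    high, low = text.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def format_lsn(lsn):
    """Format an integer LSN as lowercase 'high/low' hex."""
    return "%x/%x" % (lsn >> 32, lsn & 0xFFFFFFFF)


def parse_history(content):
    """Parse timeline history text into (parent_timeline, switch_lsn) pairs."""
    result = []
    for line in content.splitlines():
        line = line.strip()
        # Skip blank lines and comments
        if not line or line.startswith("#"):
            continue
        fields = line.split(None, 2)
        result.append((int(fields[0]), parse_lsn(fields[1])))
    return result


def verify_timeline(backup_timeline, backup_lsn, target_timeline, history):
    """Return None if the backup can follow the target timeline, else a message."""
    if backup_timeline == target_timeline:
        return None
    for parent, switch_lsn in history:
        if parent == backup_timeline:
            # The target forked off before the backup was made (assumed strict <)
            if switch_lsn < backup_lsn:
                return (
                    "target timeline %X forked from backup timeline %X at %s"
                    " which is before backup lsn of %s"
                    % (target_timeline, backup_timeline,
                       format_lsn(switch_lsn), format_lsn(backup_lsn))
                )
            return None
    # The backup timeline is not an ancestor of the target timeline
    return "backup timeline %X, lsn %s is not in the history of target timeline %X" % (
        backup_timeline, format_lsn(backup_lsn), target_timeline)
```

With a history for timeline 2 containing the single entry `1  0/600AA20`, a backup taken on timeline 1 at `0/7000028` is rejected, while one taken at `0/5000000` is accepted.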
322e764f29 Add Coverity build to release instructions. 2025-01-30 21:43:48 -05:00
6e437defa9 Refactor backupBlockIncrMapSize() range handling to satisfy Coverity.
Coverity complained about a possible overflow of result in the prior implementation.

It appears that Coverity was not able to follow the logic through the try block, but refactor and add an assertion to silence the complaint.
2025-01-30 14:28:28 -05:00
89615eee65 Refactor loop in restoreManifestMap() to satisfy Coverity.
Coverity complained that decrementing targetIdx would result in it equaling UINT_MAX. While this is true, it had no impact overall (at least in the current code) since targetIdx was immediately incremented in the loop.

However, Coverity's suggestion is better and safer for future code updates so it makes sense to change it.
2025-01-30 13:59:42 -05:00
5421ef3e92 Add cast to suppress Coverity complaint about volatile used in assert().
Coverity had this complaint:

assert_side_effect: Argument openData of ASSERT() has a side effect because the variable is volatile. The containing function might work differently in a non-debug build.

It appears this can also be fixed by assigning the volatile variable to an automatic but the cast seems to work just as well.
2025-01-30 13:48:59 -05:00
d5cefb7290 Fix error reporting for queries with no results.
If a query that expected no results returned an error then it would incorrectly report that no results were expected because the error was interpreted as a result.

Switch the order of the checks so that an error is reported instead and add a test to prevent regression.
2025-01-29 13:48:26 -05:00
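The effect of the check order in the commit above can be illustrated with a small sketch. The status strings and the function name `check_no_result` are hypothetical stand-ins, not the names used in pgBackRest or libpq.

```python
def check_no_result(status, message=""):
    """Validate the outcome of a query that must not return a result."""
    # Check for an error first. If the second check ran first, a failed query
    # would be misreported as having returned an unexpected result.
    if status == "fatal_error":
        raise RuntimeError("query failed: " + message)
    # Anything other than a plain command completion is an unexpected result
    if status != "command_ok":
        raise ValueError("no result expected from query")
```

Swapping the two `if` statements reproduces the misleading report, because a failed query also has a status other than `command_ok`.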
d50b01b485 Add assertions to satisfy Coverity about possible underflows.
Coverity complained about possible underflows so add assertions to demonstrate that the values in question are greater than zero.
2025-01-28 18:48:11 -05:00
e46374dc7d Lower log level of backupDbPing()/dbPing() to trace.
These functions get called very frequently even though they generally result in a noop at the protocol level.

Lower the log level to reduce noise in the log at debug level.
2025-01-28 15:30:23 -05:00
e625ed8be2 Caveat --tablespace-map-all regarding tablespace creation.
If a tablespace is created after the backup starts then it cannot be mapped using --tablespace-map-all since there is no record of it in the manifest.

This would be extremely complex to fix but it can be documented.
2025-01-28 09:14:30 -05:00
dde1b04772 Add StringId linter.
Verify that all StringIds in the project have been generated correctly.

This also makes it easy to generate new StringIds by copying an existing StringId and modifying the string. The error message will provide the required value.
2025-01-27 17:14:34 -05:00
d582739d82 Convert 5-bit test StringId to 6-bit.
The original string was valid as either 5-bit or 6-bit but since we're trying to test 6-bit update the string to something only valid for 6-bit.
2025-01-27 15:51:57 -05:00
6df96f505f Separate version into component parts.
This guarantees a consistent version representation and allows the version to be easily represented in other ways.
2025-01-23 17:12:05 -05:00
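As a rough illustration of why separate components help, the sketch below builds two representations from the same parts. The function names and the numeric layout (major * 1000000 + minor * 1000 + patch) are assumptions made for the example, not taken from the commit.

```python
def version_string(major, minor, patch, suffix=""):
    """Render the components as a three-part version string."""
    return "%d.%d.%d%s" % (major, minor, patch, suffix)


def version_number(major, minor, patch):
    """Render the components as a single comparable integer (assumed layout)."""
    return major * 1000000 + minor * 1000 + patch
```

Because both forms are derived from the same components, they cannot drift apart the way two hand-maintained constants can.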
6776940c3b Use three part version in development builds.
This makes the versioning more consistent and is required by a subsequent commit that will separate the version components.
2025-01-23 14:55:44 -05:00
e59385718c Update CI containers to include newest PostgreSQL patch releases. 2025-01-23 08:10:37 -05:00
6fbb28fa2d Do not set recovery_target_timeline=current for PostgreSQL < 12.
PostgreSQL < 12 defaults recovery_target_timeline to current but if current is explicitly set it behaves as if latest was set. Since current is not handled in the PostgreSQL code it looks as if there should be an error during the integer conversion but that doesn't happen due to incorrect strtoul() usage (not checking endptr).

Handle this by omitting recovery_target_timeline from recovery.conf when it is explicitly set by the user to current.
2025-01-23 07:58:41 -05:00
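A minimal sketch of the rule described in the commit above, assuming a hypothetical helper `recovery_timeline_option` that returns the line to write to the recovery configuration, or None when the setting should be omitted:

```python
def recovery_timeline_option(pg_major_version, target_timeline):
    """Return the recovery_target_timeline line to write, or None to omit it."""
    # Nothing was requested, so rely on the PostgreSQL default
    if target_timeline is None:
        return None
    # PostgreSQL < 12 already defaults to the current timeline, and setting
    # 'current' explicitly behaves like 'latest', so leave the option out
    if pg_major_version < 12 and target_timeline == "current":
        return None
    return "recovery_target_timeline = '%s'" % target_timeline
```

For example, `recovery_timeline_option(11, "current")` returns None, while the same request for PostgreSQL 12 produces an explicit setting.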
e58d468e27 Fix typo. 2025-01-21 18:39:51 -05:00
931435c017 Allow backup command to operate on remote repositories.
The backup command has always been limited to working only when the repository is local. This was due to some limitations in storage (addressed in 01b81f9) and the protocol helper (addressed in 4a94b6be).

Now that there are no limitations preventing this feature it makes sense to enable it. This allows for more flexibility in where backups are run.
2025-01-21 11:45:50 -05:00
844f91fe3f Specify length of encoding strings.
This saves a byte per string but more importantly makes them match the declaration of encodeHexLookup.
2025-01-20 15:12:27 -05:00
4bc9376d6f Remove "Additional Notes" header from release notes.
This was intended to separate the code changes from documentation and test suite changes but it arguably does not add any clarity.

Since documentation and test suite changes are explicitly marked as such that should be clear enough.
2025-01-20 14:19:25 -05:00
23bd392bdc Improve hex encode performance with bytewise lookup.
Previously, hex encode looked up each nibble of the input separately. Instead use a larger lookup table containing the two-byte encoding of every possible input byte, resulting in a 1/3 reduction in encoding time.

Inspired by and mostly cribbed from PostgreSQL commit e24d7708.
2025-01-20 14:09:54 -05:00
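The difference between the two lookup strategies in the commit above can be sketched as follows. This only illustrates the table layout; it is not the C code from pgBackRest or PostgreSQL, the names are hypothetical, and Python timings will not reflect the reduction reported for the C implementation.

```python
HEX_DIGITS = "0123456789abcdef"


def hex_encode_nibble(data):
    """Encode by looking up each 4-bit nibble separately (two lookups per byte)."""
    out = []
    for byte in data:
        out.append(HEX_DIGITS[byte >> 4])
        out.append(HEX_DIGITS[byte & 0x0F])
    return "".join(out)


# 256-entry table holding the two-character encoding of every possible byte
BYTE_LOOKUP = [HEX_DIGITS[value >> 4] + HEX_DIGITS[value & 0x0F] for value in range(256)]


def hex_encode_byte(data):
    """Encode with a single table lookup per input byte."""
    return "".join(BYTE_LOOKUP[byte] for byte in data)
```

Both functions produce the same output; the bytewise version trades a larger table (256 entries instead of 16) for half as many lookups.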
713f6657d3 Merge v2.54.2 release. 2025-01-20 10:57:27 -05:00
7a33d6168b Replace constant version with macro in backup test module. 2025-01-14 13:10:32 -05:00
6244f02bb3 Update runner versions on Github actions.
Ubuntu 20.04 will be EOL soon so update all actions that are using it. Update other actions as far as possible without making too many changes.
2025-01-14 10:50:48 -05:00
fd23257c6a Remove extraneous const qualifier. 2025-01-06 13:50:14 -05:00
b5bb1aa72c Remove makefile formatting from editor config.
This is no longer required since the makefile has been removed.
2025-01-05 13:32:09 -05:00
5fac1b4058 Update LICENSE.txt and PostgreSQL copyright for 2025. 2025-01-02 09:11:19 -05:00
4a94b6bef9 Refactor protocol helper.
Simplify and improve data structures that track protocol client connections. The prior code could store either pg or repo clients but not both. We don't have a need for that yet, but tracking clients only by hostIdx was not flexible enough for some upcoming improvements. It is important to be able to identify and free clients very precisely.

In general this code should be easier to understand and removes duplicated code for local/remote clients.
2024-12-27 13:51:50 -05:00
13f23f2168 Fix issue after disabling bundling with block incremental enabled.
When bundling and block incremental are both enabled the bundleRaw flag is set to indicate that headers are omitted (whenever possible) for encryption and compression. This is intended to save space, especially when there are very large numbers of small files.

If bundling is disabled this flag needs to be preserved so that existing bundles from prior backups are read correctly. However, the prior code was only saving the flag when bundling was enabled, which caused prior backups to be unreadable if bundling was disabled.

Fix so that the flag is preserved and backups are not broken.
2024-12-26 12:01:59 -05:00
9ee3b2c593 Fix compression type in integration tests.
Due to this bug the compression type in integration tests was always set to none. There are sufficient other tests for compression that this was not masking any bugs, but it was obviously not ideal.
2024-12-26 10:45:11 -05:00
8b9e03d618 Move linkCreate interface function to alphabetical order. 2024-12-23 10:30:41 -05:00
48ecbe422d Clarify behavior of multiple configuration files. 2024-12-19 13:52:59 -05:00
3210c9283f Clarify that unhandled errors may occur in edge cases. 2024-12-16 14:55:44 -05:00
690c9803c3 Add missing const qualifier. 2024-12-16 12:56:03 -05:00
005c7e974f Merge v2.54.1 release. 2024-12-16 12:04:21 -05:00
4d4d23131c Rephrase invitation to star on Github. 2024-12-15 11:11:04 -05:00
fbb31eefca Change "find" to "visit" in introduction. 2024-12-11 10:03:52 -05:00
5c8296df06 Remove reference to disabling network compression in the documentation.
Previously setting compress-level-network=0 would disable compression. This worked because gzip disables compression at this level but still transmits the data in gz format.

lz4 does not provide similar functionality so we would need to disable the compression filter entirely. This does not seem worth it however since lz4 compression is very efficient and 0 is the default fast mode.
2024-12-10 11:22:45 -05:00
d96966065b Add missing const qualifier. 2024-12-09 13:19:55 -05:00
0e143ba7e7 Remove --min-gen option from test.pl.
This option was useful for the Perl code generation and autoconf generation, which were both slow. These are both gone now and the C code generation is fast enough that there is no need to exclude it.

--dry-run will still prevent certain code generation from running. This may not be necessary any more but removing it should be the subject of a separate commit.
2024-11-27 17:05:31 -05:00
cad595f9f8 Full/incremental backup method.
This backup method does a preliminary copy of all files that were last modified prior to a defined interval before calling pg_backup_start(). Then the backup is started as usual and the remainder of the files are copied. The advantage is that generally a smaller set of WAL will be required to make the backup consistent, provided there are some files that have not been recently modified.

The length of the prior full backup is used to determine the interval used for the preliminary copy since any files modified within this interval will likely be modified again during the backup. If no prior full backup exists then the interval is set to one day.

This feature is being committed as internal-only for the time being.
2024-11-26 11:23:43 -05:00
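The file selection for the preliminary copy described above might look roughly like this. It is a sketch under stated assumptions: the function names are hypothetical, times are plain seconds, and files are given as a mapping of name to last-modified time.

```python
ONE_DAY = 24 * 60 * 60


def preliminary_interval(prior_full_duration):
    """Use the prior full backup's duration, or one day when there is none."""
    return prior_full_duration if prior_full_duration is not None else ONE_DAY


def split_files(files, now, prior_full_duration=None):
    """Split files into (preliminary copy, copy after the backup is started)."""
    limit = now - preliminary_interval(prior_full_duration)
    # Files not modified within the interval are unlikely to change during the
    # backup, so they can be copied before the backup is started
    preliminary = sorted(name for name, mtime in files.items() if mtime < limit)
    remainder = sorted(name for name, mtime in files.items() if mtime >= limit)
    return preliminary, remainder
```

For example, with a prior full backup that took one hour, only files last modified more than an hour before `now` are placed in the preliminary set.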
0577b03016 Use lz4 for protocol compression.
lz4 provides much better compression speed and gives similar compression ratios to gz when used at low levels (the gz default was 3).
2024-11-26 11:03:27 -05:00
4af42d93b2 Update release notes for PostgreSQL 17 support.
Accurately reflect when different versions of PostgreSQL were supported since an update was required for beta3.
2024-11-25 10:38:37 -05:00
c351263a1d Fix typos.
Found using `codespell -S *.eps,*.cache,*.xml -L inout,te,fo,bload,fase,collet,hilight,debians,keep-alives` and `typos --hidden --format brief`.
2024-11-22 15:25:43 -05:00
7f2dfc021c Update Fedora test image to Fedora 41. 2024-11-18 13:33:03 -05:00
33d7681347 Enable missing-variable-declarations compiler warning.
Warn if a global variable is defined without a previous declaration. Use this option to detect global variables that do not have a matching extern declaration in a header file.
2024-11-18 10:58:00 -05:00
4ae160aa34 Add wait for async archive log exists check in integration test.
There may be a small delay before the log exists, especially on slower platforms. Add a wait so the test does not fail in this case.
2024-11-15 09:44:15 -05:00
12fe139315 Allow negative values for integer options.
This mostly worked but there was a rendering issue that prevented compilation.
2024-11-13 17:48:14 -05:00
d7c2d2ba1b Move compression driver param list management to a common module.
This code was duplicated in each driver so this means less duplication.

In addition, some drivers were not creating a parameter list for decompression, which meant they could not be used remotely. This is not currently a bug since none of them were being used remotely, but it was a blocker for using lz4 for protocol compression.
2024-11-13 17:28:21 -05:00
274bb24a5a Stabilize async archiving in integration tests.
The integration tests could fail if:

1. After restoring the PostgreSQL instance the recovery process starts, which calls asynchronous archive-get.
2. After archive-get checks the existence of the queue directory, but before it writes the WAL file, a restore runs at the start of the next test and deletes the queue directory.
3. Since the directory no longer exists, writing the WAL file will fail, and archive-get will write the error file to the queue.
4. A new PostgreSQL instance will start and the recovery process will begin, which requests the WAL file.
5. The new archive-get looks into the queue directory, finds the error file, and throws the error, after which the PostgreSQL recovery fails because the previous archive-get background process has not finished yet.

This patch fixes the problem by using a separate spool directory for each test.
2024-11-13 09:56:42 -05:00
db912c049c Exclude function void return logging macros from coverage reporting.
As in 355e27d6, it makes sense to exclude FUNCTION_(LOG|TEST)_RETURN_VOID() macros when they are on the last line of a function because in this case they are a noop (but are still used for debugging).
2024-11-08 10:21:25 -05:00