mirror of https://github.com/pgbackrest/pgbackrest.git synced 2026-05-22 10:15:16 +02:00
Commit Graph

2532 Commits

Author SHA1 Message Date
Cynthia Shang ceb050e950 Fix flapping test in real/all module.
The restore test function was passing strBackup to the restoreCompare function but when the restore is expected to pick a backup based on a timestamp, then strBackup may not be the one chosen.

Modified the code so that strBackupExpected is set based on the parameters passed to the function and this is then passed to restoreCompare.
2020-02-28 14:50:50 -05:00
Cynthia Shang 089049ec56 Add sleep before/after retrieving timestamp in the user guide.
Adding a sleep before was necessary since only adding a sleep after did not always work. This helps to ensure the backup stop time for the previous backup does not equal time-recovery-timestamp. The sleep after allows enough time between the time retrieval and dropping important_table so PostgreSQL can consistently recover to before the table drop.

Note that these issues were caused by picking a timestamp too close to the restore command or a database operation, not due to any problem in backup selection of the restore command.
2020-02-28 14:30:39 -05:00
David Steele 7d8c0d29fb Remove compress option from config tests.
This option was used for boolean testing but it will soon be deprecated and the semantics changed.  To reduce churn it seems easiest to just use other options for testing.  This will also be helpful when the option is eventually removed.
2020-02-27 14:51:40 -05:00
David Steele dbf6255ab8 Remove compress/compress-level options from commands where unused.
These commands (e.g. restore, archive-get) never used the compress options but allowed them to be passed on the command line. Now they will error when these options are passed on the command line. If these errors occur then remove the unused options.
2020-02-27 12:25:32 -05:00
David Steele 8f5337a136 Add missing static keywords.
Interface functions should be marked static since they can only be called through the IoFilter interface.
2020-02-27 12:21:53 -05:00
David Steele 3f77a83e73 Remove raw option for gz compression.
This was a minor optimization used in protocol layer compression.  Even though it was slightly faster, it omitted the crc-32 that is generated during normal compression which could lead to corrupt data after a bad network transmission.  This would be caught on restore by our checksum but it seems better to catch an issue like this early.

The raw option also made the function signature different than future compression formats which may not support raw, or require different code to support raw.

In general, it doesn't seem worth the extra testing to support a format that has minimal benefit and is seldom used, since protocol compression is only enabled when the transmitted data is uncompressed.
2020-02-27 12:19:40 -05:00
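The trade-off described above can be illustrated with a minimal Python sketch. It is not the project's C code: it uses Python's `zlib` module, and the helper names are hypothetical. It shows that a raw deflate stream carries no checksum, so a flipped payload byte can decompress without any error, while the gz format's CRC-32 trailer catches the same damage. Compression level 0 is used only so the payload appears verbatim in the stream and the corruption is deterministic.

```python
import zlib

RAW = -15  # wbits for raw deflate: no header, no trailer, no checksum
GZ = 31    # wbits for the gzip wrapper: header plus CRC-32 and length trailer


def compress(data: bytes, wbits: int) -> bytes:
    # Level 0 emits stored (uncompressed) deflate blocks, so the payload
    # bytes appear unchanged inside the compressed stream.
    comp = zlib.compressobj(level=0, wbits=wbits)
    return comp.compress(data) + comp.flush()


def corrupt_payload(stream: bytes, payload: bytes) -> bytes:
    # Simulate a bad network transmission by flipping one payload byte.
    pos = stream.index(payload)
    damaged = bytearray(stream)
    damaged[pos] ^= 0xFF
    return bytes(damaged)


def corruption_goes_undetected(payload: bytes, wbits: int) -> bool:
    damaged = corrupt_payload(compress(payload, wbits), payload)
    try:
        zlib.decompress(damaged, wbits=wbits)
    except zlib.error:
        return False  # the format noticed the damage
    return True  # decompression succeeded with wrong data
```

With this sketch, `corruption_goes_undetected(data, RAW)` returns `True` and `corruption_goes_undetected(data, GZ)` returns `False` for a short payload.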
David Steele ee351682da Rename "gzip" to "gz".
"gz" was used as the extension but "gzip" was generally used for function and type naming.

With a new compression format on the way, it makes sense to standardize on a single abbreviation to represent a compression format in the code.  Since the extension is standard and we must use it, also use the extension for all naming.
2020-02-27 12:09:05 -05:00
David Steele 5afd950ed9 Improve performance of MEM_CONTEXT*() macros.
The prior code used TRY...CATCH blocks to clean up mem contexts when an error occurred. This included freeing new mem contexts that were still being initialized when the error occurred and ensuring that the prior memory context was restored.

This worked fine in production but it involved a lot of setjmp()/longjmp() calls that resulted in longer compilation times and sluggish performance under valgrind, profiling, and coverage testing.

Instead maintain a stack of new contexts and context switches that can be used to do cleanup after an error. Normally, the stack is not used for this purpose and pushing/popping is a cheap operation. In the prior implementation most of the TRY...CATCH logic needed to be run even on success.

One bonus is that the binary is about 8% smaller after this change.  Another benefit is that new contexts *must* be explicitly freed/discarded or an error will occur.  See info/manifest.c for an example of where this is useful outside the standard macros.
2020-02-26 21:15:39 -05:00
Cynthia Shang d68771a4a5 Fix incorrect lcov version in contributing guide. 2020-02-26 20:40:24 -05:00
Cynthia Shang 99b052a38a Update enum formatting and NULL test to project style. 2020-02-25 17:25:12 -05:00
David Steele 9e0dc83e87 Begin v2.25 development. 2020-02-25 17:18:25 -05:00
David Steele 495dec44f0 v2.24: Auto-Select Backup Set for Time Target
Bug Fixes:

* Prevent defunct processes in asynchronous archive commands. (Reviewed by Stephen Frost. Reported by Adam Brusselback, ejberdecia.)
* Error when archive-get/archive-push/restore are not run on a PostgreSQL host. (Reviewed by Stephen Frost. Reported by Jesper St John.)
* Read HTTP content to eof when size/encoding not specified. (Reviewed by Cynthia Shang. Reported by Christian ROUX.)
* Fix resume when the resumable backup was created by Perl. In this case the resumable backup should be ignored, but the C code was not able to load the partial manifest written by Perl since the format differs slightly. Add validations to catch this case and continue gracefully. (Reported by Kacey Holston.)

Features:

* Auto-select backup set on restore when time target is specified. Auto-selection is performed only when --set is not specified. If a backup set for the given target time cannot be found, the latest (default) backup set will be used. (Contributed by Cynthia Shang.)

Improvements:

* Skip pg_internal.init temp file during backup. (Reviewed by Cynthia Shang. Suggested by Michael Paquier.)
* Add more validations to the manifest on backup. (Reviewed by Cynthia Shang.)

Documentation Improvements:

* Prevent lock-bot from adding comments to locked issues. (Suggested by Christoph Berg.)
release/2.24
2020-02-25 17:05:45 -05:00
David Steele ace41d57d1 Clarify that gzip is always used to compress history files. 2020-02-25 09:34:27 -05:00
David Steele cc743f2e04 Skip pg_internal.init temp file during backup.
If PostgreSQL crashes it can leave behind a pg_internal.init temp file with the pid as the extension, as discussed in https://www.postgresql.org/message-id/flat/20200131045352.GB2631%40paquier.xyz#7700b9481ef5b0dd5f09cc410b4750f6.  On restart this file is not cleaned up so it can persist for the lifetime of the cluster or until another process with the same id happens to write pg_internal.init.

This is arguably a bug in PostgreSQL, but in any case it makes sense not to back up this file.
2020-02-21 11:51:39 -05:00
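A minimal Python sketch of the kind of name filter this commit describes. It assumes the temp file is named `pg_internal.init.<pid>` with a numeric pid, as stated in the message. The function name is hypothetical, and the sketch makes no claim about how `pg_internal.init` itself is handled.

```python
import re

# Temp relcache init file left behind after a crash: the pid is the extension.
TEMP_INIT_FILE = re.compile(r"pg_internal\.init\.[0-9]+")


def skip_in_backup(file_name: str) -> bool:
    """Return True when the file name looks like a leftover temp init file."""
    return TEMP_INIT_FILE.fullmatch(file_name) is not None
```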
David Steele 48d0f77fe3 Remove dead LibC macros.
These macros were made obsolete when code was removed from LibC after the C migration was completed.
2020-02-21 11:31:31 -05:00
David Steele dfc5f67233 Fix typo. 2020-02-17 17:12:22 -06:00
David Steele ea0af890d8 Reclassify release note to documentation improvement. 2020-02-12 17:27:44 -07:00
David Steele c6b89d74ec Add reviewer. 2020-02-12 17:20:21 -07:00
David Steele 6353e9428d Error when archive-get/archive-push/restore are not run on a PostgreSQL host.
This error was lost during the migration to C.  The error that occurred instead (generally an SSH auth error) was hard to debug.

Restore the original behavior by throwing an error immediately if pg1-host is configured for any of these commands.  reset-pg1-host can be used to suppress the error when required.
2020-02-12 17:18:48 -07:00
David Steele dac8119bf1 Add pgIsLocalVerify().
This functionality is required in commands other than restore, so centralize it.
2020-02-12 15:47:07 -07:00
David Steele e2c304d473 Prevent defunct processes in asynchronous archive commands.
The main improvement is a double-fork to prevent zombie processes if the parent process exits after the (child) async process. This is a real possibility since the parent process sticks around to monitor the results of the async process.

In the first fork, ignore SIGCHLD in the very unlikely case that the async process exits before the first fork. This is probably only possible if the async process exits immediately, perhaps due to a chdir() failure. Set SIGCHLD back to default in the async process so waitpid() will work as expected.

Also update the comment on chdir() to more accurately reflect what is happening.

Finally, add a test in certain debug builds to ensure the first fork exits very quickly. This only works when valgrind is not in use because valgrind makes forking so slow that it is hard to tell if the async process performed work or not (in the case that the second fork goes missing and the async process is a direct child).
2020-02-12 12:17:23 -07:00
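The double-fork idea above can be sketched in Python with `os.fork()`. This is an illustration of the general technique, not the project's C implementation. The function name is hypothetical and the SIGCHLD handling described in the message is left out. The first child exits immediately and is reaped by the parent, so the worker (the grandchild) is re-parented to init and cannot become a zombie of the monitoring process.

```python
import os


def spawn_detached(work) -> int:
    """Run work() in a grandchild process and return the first child's exit code."""
    pid = os.fork()

    if pid == 0:
        # First child: fork the real worker, then exit right away.
        try:
            if os.fork() == 0:
                # Grandchild: orphaned as soon as the first child exits, so it
                # is adopted by init, which reaps it when it finishes.
                try:
                    work()
                finally:
                    os._exit(0)
        finally:
            os._exit(0)

    # Parent: the first child exits almost immediately, so this wait is short
    # and leaves no defunct process behind.
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status)
```

The parent can keep running and monitor the worker's results (for example through files or a pipe) without ever having to call `waitpid()` on the worker.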
David Steele 1fa3ae2fcd Prevent lock-bot from marking locked issues as "resolved".
This is certainly not true in all cases, e.g. an issue may be closed if it is added to the backlog.
2020-02-11 19:54:17 -07:00
David Steele 1be9e6854e Prevent lock-bot from adding comments to locked issues.
This will hopefully prevent users from getting notifications when an issue is locked.
2020-02-11 19:52:23 -07:00
David Steele 43936c58a8 Fix resume when the resumable backup was created by Perl.
In this case the resumable backup should be ignored, but the C code was not able to load the partial manifest written by Perl since the format differs slightly. Add validations to catch this case and continue gracefully.
2020-02-11 19:44:06 -07:00
David Steele 44adf21c83 Consolidate archive async exec code.
Move duplicated code to the common module.  This will reduce copy and paste between the get and push modules when changes are made.
2020-02-10 21:30:43 -07:00
David Steele 0eaedc9a6a Improve async archive error file removal.
2a06df93 removed the error file so an old error would not be reported before the async process had a chance to try again.  However, if the async process was already running this might lead to a timeout error before reporting the correct error.

Instead, remove the error files once we know that the async process will start, i.e. after the archive lock has been acquired.

This effectively reverts 2a06df93.
2020-02-10 19:17:11 -07:00
David Steele 8cfbc294fc Fix incorrect error code. 2020-02-10 18:48:47 -07:00
David Steele 1ce71b1e9b Add missing linefeed. 2020-02-10 17:44:39 -07:00
David Steele 71b4cc56cb Rename confessOnError to throwOnError.
Confess is awfully Perl-ish and was likely copied verbatim during the migration.  Rename to what we do now, i.e. throw.
2020-02-06 21:11:15 -08:00
David Steele 2a06df93f3 Remove async archive error file when not throwing an error.
This ensures that the error will not be thrown before the async process has a chance to retry.
2020-02-06 20:59:04 -08:00
David Steele 3721e57a0e Clarify why some recovery options are not commented out for PG >= 12. 2020-02-06 18:28:54 -08:00
Mike Palmiotto efff54490f Fix release note typo. 2020-02-04 21:19:21 -08:00
David Steele 296aec03be Update contributor name. 2020-01-31 07:50:03 -07:00
David Steele 0f8ec3e478 Read HTTP content to eof when size/encoding not specified.
Generally, the content-size or content-encoding headers will be used to specify how much content should be expected.

There is a special case where the server sends 'Connection:close' without the content headers and the content may be read up until eof.

This appears to be an atypical usage but it is required by the specification.
2020-01-30 14:51:26 -07:00
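A hedged Python sketch of the decision described above. It uses the standard HTTP/1.1 header names `Content-Length` and `Transfer-Encoding` for the size and encoding headers the message refers to. It assumes header names have already been lowercased into a dict and that chunked responses carry no trailers. The function name is hypothetical, and returning an empty body when no rule applies is an assumption of the sketch.

```python
import io


def read_body(headers: dict, stream) -> bytes:
    """Read a response body from a binary stream according to its framing headers."""
    if headers.get("transfer-encoding", "").lower() == "chunked":
        body = b""
        while True:
            # Each chunk starts with its size in hex, optionally followed by
            # extensions after a semicolon.
            size = int(stream.readline().split(b";")[0].strip(), 16)
            if size == 0:
                stream.readline()  # blank line that ends the chunked body
                return body
            body += stream.read(size)
            stream.readline()  # CRLF that terminates the chunk data

    if "content-length" in headers:
        return stream.read(int(headers["content-length"]))

    if headers.get("connection", "").lower() == "close":
        # No framing headers: the body is everything up to eof.
        return stream.read()

    return b""
```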
Cynthia Shang 856980ae99 Auto-select backup set on restore when time target is specified.
Auto-selection is performed only when --set is not specified. If a backup set for the given target time cannot be found, the latest (default) backup set will be used.

Currently a limited number of date formats are recognized and timezone names are not allowed, only timezone offsets.
2020-01-30 14:38:05 -07:00
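The selection rule can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the project's code: backups are given as hypothetical `(label, stop_time)` pairs ordered oldest to newest, times are epoch seconds, and a backup qualifies only if it stopped strictly before the target time.

```python
def select_backup(backups, target_time):
    """Pick the newest backup that stopped before target_time, else the latest."""
    for label, stop_time in reversed(backups):
        if stop_time < target_time:
            return label

    # No backup precedes the target: fall back to the latest (default) set.
    return backups[-1][0]
```

For example, with backups that stopped at 100, 200, and 300, a target of 250 selects the second backup, while a target of 50 falls back to the third.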
Cynthia Shang f46d1fa74c Add timezone calculations to time module.
Add tzPartsValid() and tzOffsetSecond() to calculate timezone offsets from user provided values.

Update epochFromParts() to accept a timezone offset in seconds.
2020-01-30 11:28:30 -07:00
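As a rough Python analogue of the calculation described above, the sketch below converts timezone parts to an offset in seconds and applies it when building an epoch. The function names mirror the commit message but are hypothetical here, and the accepted ranges in the validity check are assumptions, not the project's actual rules.

```python
import calendar


def tz_parts_valid(hours: int, minutes: int) -> bool:
    # Assumed limits for this sketch only.
    return 0 <= hours <= 14 and 0 <= minutes <= 59


def tz_offset_seconds(sign: int, hours: int, minutes: int) -> int:
    """Convert an offset such as -07:00 (sign=-1, hours=7, minutes=0) to seconds."""
    if sign not in (1, -1) or not tz_parts_valid(hours, minutes):
        raise ValueError("invalid timezone offset")
    return sign * (hours * 3600 + minutes * 60)


def epoch_from_parts(year, month, day, hour, minute, second, tz_offset=0) -> int:
    # Treat the parts as UTC, then subtract the offset to get the true UTC epoch.
    return calendar.timegm((year, month, day, hour, minute, second)) - tz_offset
```

The sign is passed separately so that an offset such as -00:30 keeps its direction even though the hour part is zero.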
Cynthia Shang dbaa5e3473 Add linefeeds to function declarations. 2020-01-29 08:21:36 -07:00
David Steele 80687cbe74 Free TLS connection in common/io-http test.
The test that checks for no output from the server was leaving a connection open which valgrind was complaining about.

Wait on the server long enough to cause the error on the client then close the connection to free the memory.
2020-01-28 10:19:58 -07:00
David Steele 846efaa40f Revert 'Add lib path for libpq in case it is in a non-standard location.'
Putting this before AC_CHECK_LIB breaks on many systems because the location of pg_config is not yet known.
2020-01-28 07:36:20 -07:00
Cynthia Shang 324f7cebe0 Designated initializer cleanup.
Clean up designated initializers created in b134175f by moving struct members in or out for clarity.
2020-01-27 17:50:07 -07:00
David Steele 24d2494c82 Fix incomplete comment. 2020-01-27 11:25:24 -07:00
David Steele 0a845214a1 Fix typo. 2020-01-26 23:10:29 -07:00
David Steele 697150eaf8 Add more validations to the manifest on backup.
Validate that checksums exist for zero size files.  This means that the checksums for zero size files are explicitly set by backup even though they'll always be the same.  Also validate that zero length files have the correct checksum.

Validate that repo size is > 0 if size is > 0.  No matter what compression type is used a non-zero amount of data cannot be stored in zero bytes.
2020-01-26 23:07:07 -07:00
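The two validations can be sketched in Python as follows. This is a hedged illustration: the entry layout and function name are hypothetical, and the checksum is assumed to be a SHA-1 hex digest, which the commit message does not state.

```python
import hashlib

# Assumption: checksums are SHA-1 hex digests, so every zero-size file
# shares the digest of the empty byte string.
EMPTY_CHECKSUM = hashlib.sha1(b"").hexdigest()


def validate_file(entry: dict) -> list:
    """Return a list of validation errors for one manifest file entry."""
    errors = []

    if not entry.get("checksum"):
        errors.append("missing checksum")
    elif entry["size"] == 0 and entry["checksum"] != EMPTY_CHECKSUM:
        errors.append("zero-size file has wrong checksum")

    # A non-zero amount of data cannot be stored in zero bytes.
    if entry["size"] > 0 and entry["repo_size"] == 0:
        errors.append("repo size must be > 0 when size is > 0")

    return errors
```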
David Steele bb45a80d46 Begin v2.24 development. 2020-01-26 22:47:53 -07:00
David Steele 2358d34485 v2.23: Bug Fix
Bug Fixes:

* Fix missing files corrupting the manifest. If a file was removed by PostgreSQL during the backup (or was missing from the standby) then the next file might not be copied and updated in the manifest. If this happened then the backup would error when restored. (Reviewed by Cynthia Shang. Reported by Vitaliy Kukharik.)

Improvements:

* Use pkg-config instead of xml2-config for libxml2 build options. (Contributed by David Steele, Adrian Vondendriesch.)
* Validate checksums are set in the manifest on backup/restore. (Reviewed by Cynthia Shang.)
release/2.23
2020-01-26 22:38:21 -07:00
David Steele 7ab07dc580 Validate checksums are set in the manifest on backup/restore.
This is a modest start but it addresses the specific issue that was caused by the bug fixed in 45ec694a.  This validation will produce an immediate error rather than erroring out partway through the restore.

More validations are planned but this is the most important one and seems safest for this release.
2020-01-26 21:58:59 -07:00
David Steele 45ec694af2 Fix missing files corrupting the manifest.
If a file was removed by PostgreSQL during the backup (or was missing from the standby) then the next file might not be copied and updated in the manifest. If this happened then the backup would error when restored.

The issue was that removing files from the manifest invalidated the pointers stored in the processing queues.  When a file was removed, all the pointers shifted to the next file in the list, causing a file to be unprocessed.  Since the unprocessed file was still in the manifest it would be saved with no checksum, causing a failure on restore.

When process-max was > 1 the bug would often not appear since the file had already been pulled from the queue and updates to the manifest are done by name rather than by pointer.
2020-01-26 13:19:13 -07:00
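The failure mode can be reproduced in miniature with a Python sketch in which list positions stand in for the C pointers. The function names and entry layout are hypothetical, and the by-name variant is shown only as one way to avoid the problem, not necessarily the fix that was committed.

```python
def backup_by_position(names, missing):
    """Queue holds positions into the manifest list (stand-ins for pointers)."""
    manifest = [{"name": name, "checksum": None} for name in names]
    queue = list(range(len(manifest)))

    for pos in queue:
        if pos >= len(manifest):
            break  # in C this would be a pointer past the end of the list
        entry = manifest[pos]
        if entry["name"] in missing:
            # Removing the entry shifts every later entry down by one, so the
            # remaining queued positions now refer to the wrong files.
            del manifest[pos]
        else:
            entry["checksum"] = "copied"

    return manifest


def backup_by_name(names, missing):
    """Queue holds names, so removing one entry cannot disturb the others."""
    manifest = {name: {"checksum": None} for name in names}

    for name in list(names):
        if name in missing:
            del manifest[name]
        else:
            manifest[name]["checksum"] = "copied"

    return manifest
```

With files `a`, `b`, `c`, `d` and `b` missing, the positional version leaves `c` in the manifest with no checksum, which matches the restore failure described in the message.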
David Steele 9b47ff2746 Sort last processing queue on backup from standby.
The last queue was not being sorted when a primary queue was added first.

This did not affect the backup or integrity but could lead to slightly lower performance since large files were not always copied first.
2020-01-26 12:29:53 -07:00
David Steele 0444d37414 Remove obsolete include to ../libc. 2020-01-24 10:43:47 -07:00
Marc Cousin b1c5885017 Add lib path for libpq in case it is in a non-standard location. 2020-01-24 10:40:42 -07:00