pgbackrest

mirror of https://github.com/pgbackrest/pgbackrest.git synced 2024-12-12 10:04:14 +02:00

Author	SHA1	Message	Date
David Steele	c1805134b3	Aggregate error retries in ErrorRetry output. If there are a lot of retries then the output might be very large and even be truncated by the error module. Either way, it is not good information for the user. When a message is repeated, aggregate so that total retries and time range are output for the message. This provides helpful information about what happened without overwhelming the user with data.	2023-09-09 12:54:55 -04:00
David Steele	5314dbffc7	Adjust Wait object to be more accurate when nested. The prior code gave a "free" extra iteration at the end of the wait, functionality that was copied directly from the equivalent code in Perl. This works and is mostly negligible except when wait loops are nested, in which case outer loops will always run twice even if an inner loop times out, which has a multiplying effect. For example, three nested wait loops with a timeout of three seconds will result in the inner loop being run four times (for a total of twelve seconds) even if it times out each time. Instead make waitMore() stop exactly when time is up. This makes more sense because a complete failure and timeout of an inner loop means retrying an outer loop is probably a waste of time since that inner loop will likely continue to fail. Also make waitRemaining() recalculate the remaining time rather than depending on the prior result. Some tests needed to be adjusted to take into account there being one less loop. In general this led to a simplification of the tests. Reinit a begin value in the wait unit tests. This is not related to the current change but it does make the time measurements more accurate and less likely to fail on an edge case, which has been observed from time to time. This change appears to have a benefit for test runtime, which seems plausible especially for nested waits, but a larger sample of CI runs are needed to be sure.	2023-09-09 11:22:33 -04:00
David Steele	d57900839a	Improve backup test harness to handle very large quantities of WAL. The current tests only generate small quantities of WAL per backup but sometimes it is useful to generate large quantities for testing. Fix the issues with generating large quantities of WAL and also improve memory management.	2023-09-04 18:51:44 -04:00
Hartmut Goebel	43524a47a4	Fix configuration reference example for the tls-server-address option. The prior example (::) was not valid and would result in the following error: ERROR: [049]: unable to get address for '::': [-2] Name or service not known Correct values are either * for IPv4 or :: for IPv6. The IPv4 value is used as the example since only one example is allowed.	2023-09-04 13:39:11 -04:00
David Steele	c0935b7992	Split release notes into individual files. The release.xml file was getting pretty unwieldy so break release notes into separate files. Also break contributors into a separate file. In theory most of release.xml could now be generated automatically but adding a new release does not represent a serious maintenance burden, so for the time being it does not seem worth it.	2023-09-04 12:00:06 -04:00
David Steele	6cb9c40fb8	Improve GCS multi-part performance. The prior code avoided uploading a chunk if it was not clear whether the write was complete or not. This was primarily due to the GCS documentation being very vague on what to do in the case of a zero-size chunk. Now chunks are uploaded as they are available. This should improve performance and also reduces the diff against a future commit that absolutely requires zero-size chunks.	2023-09-04 10:39:19 -04:00
David Steele	fd9c6b0e9d	Add missing const qualifier to errorTypeList.	2023-09-01 12:03:39 -04:00
David Steele	4f52015b14	New CI container build for PostgreSQL 16 rc1.	2023-08-31 20:22:01 -04:00
Stefan Fercot	e3fcd8cd9e	Add backup size completed/total to info command JSON output. This allows for backup progress to be monitored by external processes, whereas before backup progress was only available via text output.	2023-08-31 16:36:52 -04:00
David Steele	fbd992adc7	Allow archive-get command to run when stanza is stopped. The restore command can run while the stanza is stopped so it makes sense for the archive-get command to follow the same rule. The important thing is to ensure that all commands that write to the repository are stopped when the stanza is stopped.	2023-08-29 13:49:49 -04:00
David Steele	8424737697	Encode key in HTTP query. The key also needs to be encoded (not just the value). This is not currently an issue because none of the keys in use require encoding.	2023-08-29 12:35:29 -04:00
David Steele	d24180e4da	Fix conflicting group in test containers. The tests expect the group name/id to match between the host system and the container. If there is a conflict rename the group with the required id to the expected name. This could have unintended consequences but it seems reasonably safe since we control everything that runs in the container and there should never be any system processes running.	2023-08-29 12:28:02 -04:00
David Steele	1e6efd73e1	Refactor IoRead handling in backupFile(). Store the value of storageReadIo() rather than calling it each time. This is slightly more efficient, but more importantly it will be needed for an upcoming commit.	2023-08-23 12:56:49 -04:00
David Steele	82c12fed12	Adjust range check per Coverity complaint. Coverity complained that: Overrunning array "optionGroupIndexKeep[optionGroupIdx]" of 512 bytes at byte offset 4294967550 using index "keyIdx" (which evaluates to 4294967294). This does not seem possible but adjust the code to make Coverity happy, as usual.	2023-08-21 13:10:48 -04:00
Reid Thompson	eb32d6de5e	Accept leading tilde in paths for SFTP public/private keys. The documentation indicates that leading tilde file paths for public/private keys are valid but the functionality was omitted from the original implementation.	2023-08-08 08:41:34 +01:00
David Steele	1141dc2070	Multi-stanza check command. Check command now checks multiple stanzas when the stanza option is omitted. The stanza list is extracted from the current configuration rather than scanning the repository like the info command. Scanning the repository is a problem because configuration for each stanza may not be present in the current configuration. Since this functionality is new for check there is no regression. Add a new section to the user guide to cover multi-stanza configuration and provide additional coverage for this feature. Also fix a small issue in the parser when an indexed option has a dependency on a non-indexed option. There were no examples of this case in the previous configuration.	2023-08-07 17:03:09 +01:00
David Steele	995a8e9669	Reload GCS credentials before renewing authentication token. This allows the service key to be updated while a command is running. The new key should be written atomically and ideally the old key should remain valid for some period of time to avoid a race condition if the old token happens to expire at the same time that the new key is being generated.	2023-08-07 13:30:50 +01:00
David Steele	cbafcfabf2	Remove UNCONSTIFY() from gz compression module. The ZLIB_CONST macro may be used since 1.2.5.2 (17 Dec 2011) to specify that the input buffers are const. This is sufficiently old to cover all non-EOL distributions, i.e. everything we test on. Compiling on older distributions may generate warnings but will continue to work.	2023-08-07 12:38:33 +01:00
David Steele	6c75c99d0b	Update xxHash to v0.8.2. This release fixes compilation errors on some platforms and contains numerous performance improvements.	2023-07-31 11:09:17 +02:00
David Steele	f55c8f1b81	Rename HRN_PG_CONTROL_OVERRIDE_PUT(). Rename to HRN_PG_CONTROL_OVERRIDE_VERSION_PUT() since other types of overrides are possible (and coming).	2023-07-26 10:45:12 +02:00
David Steele	65fb3b3b2d	Update config.guess and config.sub to latest versions.	2023-07-24 09:17:46 +02:00
David Steele	c585ddaf48	Begin v2.48 development.	2023-07-24 09:15:38 +02:00
David Steele	b5b033cc1c	v2.47: Performance Improvements and Bug Fixes Bug Fixes: * Preserve block incremental info in manifest during delta backup. (Reviewed by Stephen Frost. Reported by Francisco Miguel Biete Banon.) * Fix block incremental file names in verify command. (Reviewed by Reid Thompson. Reported by Francisco Miguel Biete Banon.) * Fix spurious automatic delta backup on backup from standby. (Reviewed by Stephen Frost. Reported by krmozejko, Don Seiler.) * Skip recovery.signal for PostgreSQL >= 12 when recovery type=none. (Reviewed by Stefan Fercot. Reported by T.Anastacio.) * Fix unique label generation for diff/incr backup. (Fixed by Andrey Sokolov. Reviewed by David Steele.) * Fix time-based archive expiration when no backups are expired. (Reviewed by Stefan Fercot.) Improvements: * Improve performance of SFTP storage driver. (Contributed by Stephen Frost, Reid Thompson. Reviewed by David Steele.) * Add timezone offset to info command date/time output. (Reviewed by Stefan Fercot, Philip Hurst. Suggested by Philip Hurst.) * Centralize error handling for unsupported features. (Reviewed by Stefan Fercot.) Documentation Improvements: * Clarify preference to install from packages in the user guide. (Reviewed by Stefan Fercot. Suggested by dr-kd.)	2023-07-24 09:12:30 +02:00
David Steele	5ed6f8df14	Fix spurious automatic delta backup on backup from standby. When performing backup from standby the file sizes on the standby may not be equal to file sizes on the primary. This is because replication continues during the backup and by the time the file is copied from the standby it may have changed. Since we cap the size of all files copied from the standby this practically applies to truncation and in particular truncation of free space maps (at least, every case we have seen so far is an fsm). Free space maps are especially vulnerable since they are only partially replicated, which amplifies the difference between the primary and standby. On an incremental backup it may look like the size has changed on the primary (because of the final size recorded by the standby in the prior backup) but the timestamp may not have changed on the primary and this will trigger a checksum delta for safety. While this has no impact on backup integrity, checksum delta incrementals can run much longer than regular incrementals and backup schedules may be impacted. The solution is to preserve the original size in the manifest and use it to do the time/size check. In the case of backup from standby the original size will always be the size on the primary, which makes comparisons against subsequent file sizes on the primary consistent. Original size is only stored in the manifest when it differs from final size, so there should not be any noticeable manifest bloat.	2023-07-18 07:35:12 +02:00
David Steele	4c27d74bbd	Preserve block incremental info in manifest during delta backup. It was possible for block incremental info to be lost if a file had been modified in such a way that block incremental would be disabled if the file were new, e.g. if the file shrank below the block incremental limit or the file timestamp regressed far enough into the past. In those cases the block incremental info would not be copied in manifestBuildIncr(). Instead always copy the block incremental info in case the file ends up being referenced to a prior backup. The validation tests were not robust enough to catch this issue so they were improved in `1d42aed`. In the particular case that exposed this bug, a file had a timestamp that was almost four weeks in the past at full backup time. A few days later a fail over occurred and the next incremental ran on the new primary (old standby) in delta mode. The same file had a timestamp just a few hours older than in the full backup, but now four weeks older than the current backup. Block incremental was disabled for the file on initial manifest build because of its age, which meant the block incremental info was not copied into the new manifest. The delta then determined the file had not changed and referenced it to the full backup. On restore, the file appeared to be a normal file stored in a bundle but could not be decompressed because it was in fact a block incremental.	2023-07-17 22:37:02 +02:00
David Steele	cdb7e26350	Fix block incremental file names in verify command. The verify command was not appending the .pgbi extension instead of the compression extension when verifying block incremental files stored outside a bundle. Originally the idea was that verify would not need any changes (since it just examines repo-size and checksum) but at some point the new extension was added and broke that assumption. Use backupFileRepoPathP() to generate the correct filename (Just like backup, restore, etc).	2023-07-14 17:53:58 +03:00
David Steele	1d42aed152	Improve validation of referenced files in command/backup unit test. Referenced files were not being checked for validity unless they were hard linked to the current backup (which a lot of the tests did). Newer tests with bundling do not have hard links and so missed these checks. Improve the validation code to check referenced files in the manifest even when they are not hard linked into the current backup. Add a delta test for bundling/block incremental that includes a file old enough to get a block size of zero. These are good tests by themselves but they also reduce the churn in an upcoming bug fix.	2023-07-14 11:18:25 +03:00
David Steele	c9703b3530	Log when file size changes during backup. It is possible for the size of a file to change during the backup. In most cases we won't notice since files sizes are usually capped but it is possible for some files to grow or shrink between when they are recorded in the manifest and when they are actually copied. The new size is recorded in the manifest but the old size may be useful for debugging. The new code has coverage but no test changes because it is covered by the parallel backup testing, which does not have deterministic log output. It doesn't seem worth creating a new test to check the log format as it is not very critical (though the output was manually verified).	2023-07-13 13:47:27 +03:00
David Steele	b6b13bd634	Mention block-level backups in feature list.	2023-07-12 13:23:16 +03:00
David Steele	c8afbed530	Increase protocolServerError() logging to debug level. Errors should be rare enough that it makes sense to log them at debug level. Right now if there is an error if won't be logged at debug level, which makes it harder to tell why the main process may have terminated the local/remote process.	2023-07-12 13:09:34 +03:00
David Steele	06536b5814	Rename macros in command/verify unit test. These macros were redefined, which worked since they were identical to their prior definitions, but this will not always be true.	2023-07-12 10:20:09 +03:00
David Steele	aa229a1dee	Modify time_t casts based on complaints from Coverity. Coverity complained about time_t being cast directly to unsigned int, so instead cast the result of the operation. We are confident in both cases that the time_t values will not be out of unsigned int range but Coverity has no way to know that. One of these is new (introduced by `9efd5cd0`) but the other one (from `a9867cb0`) remained unnoticed for a while, though it has not caused any production impact.	2023-07-09 21:56:05 +03:00
Stephen Frost	28b6b2d465	Improve performance of SFTP storage driver. The initial implementation used simple waits when having to loop due to getting a LIBSSH2_ERROR_EAGAIN, but we don't want to just wait some amount of time, we want to wait until we're able to read or write on the fd that we would have blocked on. This change removes all of the wait code from the SFTP driver and changes the loops to call the newly introduced storageSftpWaitFd(), which in turn checks with libssh2 to determine the appropriate direction to wait on (read, write, or both) and then calls fdReady() to perform the wait using the provided timeout. This also removes the need to pass ioSession or timeout down into the SFTP read/write code.	2023-07-07 10:36:15 +02:00
David Steele	125676ae0e	Cleanup of init/handshake in storageSftpNew(). Rename handshakeStatus to rc to be consistent with the rest of the module. Add comments and do some reformatting.	2023-07-07 09:56:26 +02:00
David Steele	9efd5cd0bb	Add timezone offset to info command date/time output. This makes it easier to use timestamps from the info command directly in PostgreSQL recovery settings, especially the --target option.	2023-07-06 18:46:31 +02:00
David Steele	762498f4cd	Fix time-based archive expiration when no backups are expired. In the case that no backups were expired but time-based retention was met no archive expiration would occur and the following would be logged: INFO: time-based archive retention not met - archive logs will not be expired In most cases this was harmless, but when retention was first met or if retention was increased, it would require one additional backup to expire earlier WAL. After that expiration worked as normal. Even once expiration was working normally the message would continue to be output, which was pretty misleading since retention had been met, even though there was nothing to do. Bring this code in line with count-based retention, i.e. always log what should be expired at detail level (even if nothing will be expired) and then log info about what was expired (even if nothing is expired). For example: DETAIL: repo1: 11-1 archive retention on backup 20181119-152138F, start = 000000010000000000000002 INFO: repo1: 11-1 no archive to remove	2023-07-06 11:27:00 +02:00
David Steele	e280ed9098	Make result code handling in storage/sftp more consistent. Initializers are useless since rc is always set later. Make rc checks consistent with the rest of the module.	2023-07-06 08:58:16 +02:00
David Steele	1fd8845c7f	Remove unresolved todo from `87087fac`. Seems easiest just to make the additional config required since it tests that custom ports are being used correctly. The test for synthetic was a noop since SFTP is not used in synthetic tests.	2023-06-29 11:08:58 +02:00
David Steele	0051d7ca87	Update comments for removal of chunking and block numbers. `dd4e526` should have updated this comment but failed to do so.	2023-06-29 09:42:12 +02:00
Andrey Sokolov	0ac09344dc	Fix unique label generation for diff/incr backup. If there were at least two full backups and the last one was expired, it was impossible to make either a differential or incremental backup without first making a new full backup. The backupLabelCreate() function identified this situation as clock skew because the new backup label was compared with label of the expired full backup. If the new backup is differential or incremental, then its label is now compared with the labels of differential or incremental backups related to the same full backup. Also convert a hard-coded date length to a macro.	2023-06-28 18:19:20 +02:00
David Steele	5cbef3ade2	Fix missed memory auditing in FUNCTION_LOG_RETURN_VOID(). `9ca492c` missed adding auditing to this macro and as a result a few memory leaks have slipped through. Add auditing to the macro to close this hole. Of the leaks found the only possibly serious one is in blockIncrProcess(), which would leak a PackRead of about eight bytes with every superblock. Since superblocks max out at a few thousand per file this was probably not too bad. Also change the ordering of auditing in FUNCTION_TEST_RETURN_VOID(). Even though the order does not matter, having it different from the other macros is confusing and looks like an error.	2023-06-25 17:36:57 +02:00
David Steele	ecae001653	Add test for configuration hash type override behavior. This behavior is different than regular options where a repeated value will result in an error. It appears to be a legacy of the original Perl implementation, which used a hash as the underlying data type in the built-in command-line parser, and the C command-line parser was written to match.	2023-06-24 10:18:29 +02:00
David Steele	434938e32b	Update Fedora test image to Fedora 38. This adds testing for the latest tool versions, e.g. gcc.	2023-06-22 18:23:06 +02:00
David Steele	5531e2662d	Add --tz param missed in C unit test migration. This was missed in the C unit test migration and since then a new test was added that was not setting its timezone correctly. This feature exists to make sure the tests will run on systems with different timezones and has no impact on the core code.	2023-06-18 12:55:29 +03:00
David Steele	a36f480c85	Clarify preference to install from packages in the user guide.	2023-06-14 14:39:40 +02:00
David Steele	9b1bc7d11b	Use new LibXML error handler. The old error handler has been deprecated.	2023-06-14 12:28:00 +02:00
David Steele	4adf6eed09	Clarify comments for the handling of pg_control during restore.	2023-06-12 23:10:14 +02:00
David Steele	396e237d53	Update NEWS.md for new version and features.	2023-06-12 15:41:39 +02:00
David Steele	6875358d11	Fix typo in "Upgrading PostgreSQL" section of user guide.	2023-06-12 11:26:03 +02:00
David Steele	818ef4b71d	Improve comments in protocol/command module.	2023-06-08 16:36:25 +03:00

1 2 3 4 5 ...

4286 Commits