As calculated this size is not correct since it does not include the parts of prior block incrementals that are required to make the current block incremental valid. At best this could be approximated and the resulting values might be very confusing.
For now, at least, exclude this metric for block incremental backups.
xxHash is significantly faster than SHA-1 so this helps reduce the overhead of the feature.
A variable number of bytes are used from the xxHash depending on the block size with a minimum of six bytes for the smallest block size. This keeps the maps smaller while still providing enough bits to detect block changes.
Small blocks sizes can lead to reduced compression efficiency, so allow multiple blocks to be compressed together in a super block. The disadvantage is that the super block must be read sequentially to retrieve blocks. However, different super block sizes can be used for different backup types, so the full backup super block sizes are large for compression efficiency and diff/incr are smaller for retrieval efficiency.
Forks may update pg_control version or WAL magic without affecting the structures that pgBackRest depends on.
This option forces pgBackRest to treat a cluster as the specified version when it cannot be automatically identified.
When restoring an offline backup on PostgreSQL >= 12, skip writing recovery.signal by default since this will error if the backup was made with wal_level=minimal. If the user explicitly sets the type option to something other than none, then write recovery.signal as usual since it is possible to do Point-In-Time-Recovery from an offline backup as long as wal_level was not minimal.
Raw encryption was already being used for block incremental. This commit adds raw compression to block incremental where possible (see da918587).
Raw compression/encryption is also added to bundling for a backup set when block incremental is enabled on the full backup. This prevents a break in backward compatibility since block incremental is not backward compatible.
Raw format saves 12 bytes of header for gzip and 4 bytes of checksum for lz4 (plus CPU overhead). This may not seem like much, but over millions of small files or incremental blocks can really add up. Even though it may be a relatively small percentage of the overall backup size it is still objectively a large amount of data.
Use raw format for protocol compression to exercise the feature.
Raw compression format will be added to bundling and block incremental in a followup commit.
Make the interface object the parent of the driver object rather than the interface being allocated directly in the driver object.
The prior method was more efficient when mem contexts had a much higher cost. Now mem contexts are cheap so it makes more sense to structure the objects in a way that works better with mem context auditing. This also means the mem context does not need to be stored separately since it can be extracted directly from the interface object.
There are other areas that need to get the same improvement before the specialized objMoveContext() and objFreeContext() functions can be removed.
This flag does not currently affect restore behavior but it will in an upcoming commit. Set the flag here to simplify the test diff in the upcoming commit.
It is possible for a group index to be created for an option that is later found to not meet dependencies. In this case all values would be default leading to a phantom group, which can be quite confusing.
Remove group indexes that are all default (except the final one) and make sure the key for the final all default group index is 1.
Add an explicit statement that there is nothing special to do when upgrading between 2.x versions.
Leave the previous paragraph about the default location that changed between 2.00 and 2.02, as it is more a matter of transitioning from 1.x to 2.x.
The code is not completely reflowed yet so there are some cases that uncrustify will not catch. The formatting will be improved over time.
Some block of code require special formatting so have been surrounded with the {uncrustify-off}/{uncrustify-on} markers. These exceptions should be kept to a minimum.
Add --code-format (to reformat code) and --code-format-check (to check formatting) to test.pl.
Add a CI test that will check code formatting. Code must be correctly formatted before it can be merge to integration.
Add documentation to the coding standards for code formatting.
uncrustify has been configured to be as close to the current format as possible but the following changes were required:
* Break long struct initializiers out of function calls.
* Bit fields get extra spacing.
* Strings that continue from the previous line no longer indented.
* Ternary operators that do not fit on a single line moved to the next line first.
* Align under parens for multi-line if statements.
* Macros in header #if blocks are no longer indented.
* Purposeful lack of function indentation in tests has been removed.
Currently uncrustify does not completely reflow the code so there are some edge cases that might not be caught. However, this still represents a huge improvement and the formatting can be refined going forward.
Support code for uncrustify will be in a followup commit.
Bring the format(printf) attribute in line with the FN_NO_RETURN and FN_INLINE_ALWAYS macros.
This is simpler to read and can be customized for different compilers.
stackTraceToZ() was split this way in c8264291 to allow complete coverage. 0becb6da added a shim to improve coveage but missed simplifying the function.
Improvements:
* Remove support for PostgreSQL 9.0/9.1/9.2. (Reviewed by Stefan Fercot.)
* Restore errors when no backup matches the current version of PostgreSQL. (Contributed by Stefan Fercot. Reviewed by David Steele. Suggested by Soulou.)
* Add compress-level range checking for each compress-type. (Reviewed by Stefan Fercot. Suggested by gkleen, ViperRu.)
Documentation Improvements:
* Add warning about enabling "hierarchical namespace" on Azure storage. (Reviewed by Stefan Fercot. Suggested by Vojtech Galda, Pluggi, asjonos.)
* Add replacement for linefeeds in monitoring example. (Reviewed by Stefan Fercot. Suggested by rudonx, gmustdie, Ivan Shelestov.)
* Clarify target-action behavior on various PostgreSQL versions. (Contributed by Chris Bandy. Reviewed by David Steele, Anton Kurochkin, Stefan Fercot. Suggested by Anton Kurochkin, Chris Bandy.)
* Updates and clarifications to index page. (Reviewed by Stefan Fercot.)
* Add dark mode to the website. (Suggested by Stephen Frost.)
Running valgrind and backtrace together has been causing tests to timeout in CI, mostly likely due to limited resources. This has not been a problem in normal development environments.
Since it is still important to run backtraces for debugging, split the u22 test that was doing all this work to run coverage and backtrace together and valgrind-only as a separate test. As a bonus these tests run faster separately and since they run in parallel the total execution time is faster.
If this feature is enabled expire will fail since directories need to be deleted separately.
Ideally we would add support for this feature but for now we'll just document the issue.
The primary goal of the block incremental backup is to save space in the repository by only storing changed parts of a file rather than the entire file. This implementation is focused on restore performance more than saving space in the repository, though there may be substantial savings depending on the workload.
The repo-block option enables the feature (when repo-bundle is already enabled). The block size is determined based on the file size and age. Very old or very small files will not use block incremental.
The callbacks in iniLoad() made the downstream code more complicated than it needed to be so use an iterator model instead.
Combine the two functions that were used to load the ini data to remove code duplication. In theory it would be nice to use iniValueNext() in the config/parse module rather than loading a KeyValue store but this would mean a big change to the parser, which does not seem worthwhile at this time.
It is possible for functions to accidentally leak child contexts into the calling context, which may use a lot of memory depending on the use case and where it happens.
Use the function return type to determine what should be returned and error when something else is returned. Add FUNCTION_AUDIT_*() macros to handle exceptions.
This checking is only performed during unit tests on the code being covered by the specific unit test.
Note that this does not work yet for memory allocations, i.e. memNew(). These are pretty rare so are not as much of an issue and they can be added in the future.
Allocating memory made these functions simpler but it meant that memory was leaking into the calling context when logging was enabled. It is not clear that this was an issue but it seems that trace level logging could result it a lot of memory usage depending on the use case.
This also makes it possible to audit allocations returned to the calling context, which will be done in a followup commit.
Also rename objToLog() to objNameToLog() since it seemed logical to name the new function objToLog().
Maybe this once had a deeper purpose but now at least it is just used to avoid a few trivial allocations. If we really wanted to do that a flag would be better but it does not seem worth the trouble.