The initial implementation used simple waits when having to loop due to getting a LIBSSH2_ERROR_EAGAIN, but we don't want to just wait some amount of time, we want to wait until we're able to read or write on the fd that we would have blocked on.
This change removes all of the wait code from the SFTP driver and changes the loops to call the newly introduced storageSftpWaitFd(), which in turn checks with libssh2 to determine the appropriate direction to wait on (read, write, or both) and then calls fdReady() to perform the wait using the provided timeout.
This also removes the need to pass ioSession or timeout down into the SFTP read/write code.
Double spaces have fallen out of favor in recent years because they no longer contribute to readability.
We have been using single spaces and editing related paragraphs for some time, but now it seems best to update the remaining instances to avoid churn in unrelated commits and to make it clearer what spacing contributors should use.
These were intended to allow the block list to be scanned without reading the map but were never utilized. They were left in "just in case" and because they did not seem to be doing any harm.
In fact, it is better not to have the block numbers because this allows us set the block size at a future time as long as it is a factor of the super block size. One way this could be useful is to store older files without super blocks or a map in the full backup and then build a map for them if the file gets modified in a diff/incr backup. This would require reading the file from the full backup to build the map but it would be more space efficient and we could make more intelligent decisions about block size. It would also be possible to change the block size even if one had already been selected in a prior backup.
Omitting the block numbers makes the chunking unnecessary since there is now no way to make sense of the block list without the map. Also, we might want to build maps for unchunked block lists, i.e. files that were copied normally.
Centralize the code to allow it to be used in more places and update the protocol/server module to use the new code.
Since the time measurements make testing difficult, also add time and errorRetry harnesses to allow specific data to be used for testing. In the case of errorRetry, the production behavior is turned off by default during testing and only enabled for the errorRetry test module.
Each call to lockAcquireP() passed enough information to initialize the lock system. This was somewhat inefficient and as locks become more complicated it will lead to more code duplication. Since a process can only take one type of lock it makes sense to do most of the initialization up front.
Also reduce the log level of lockRelease() since it is only called at exit and the lock will be released in any case.
Output a manifest in text or JSON format. Neither format is complete but they cover the basics.
In particular the manifest command outputs the complete block restore when the filter option is specified and the block delta when the pg option is also specified. This is non-destructive so it is safe to execute against a running cluster.
xxHash is significantly faster than SHA-1 so this helps reduce the overhead of the feature.
A variable number of bytes are used from the xxHash depending on the block size with a minimum of six bytes for the smallest block size. This keeps the maps smaller while still providing enough bits to detect block changes.
Small blocks sizes can lead to reduced compression efficiency, so allow multiple blocks to be compressed together in a super block. The disadvantage is that the super block must be read sequentially to retrieve blocks. However, different super block sizes can be used for different backup types, so the full backup super block sizes are large for compression efficiency and diff/incr are smaller for retrieval efficiency.
The primary goal of the block incremental backup is to save space in the repository by only storing changed parts of a file rather than the entire file. This implementation is focused on restore performance more than saving space in the repository, though there may be substantial savings depending on the workload.
The repo-block option enables the feature (when repo-bundle is already enabled). The block size is determined based on the file size and age. Very old or very small files will not use block incremental.
The callbacks in iniLoad() made the downstream code more complicated than it needed to be so use an iterator model instead.
Combine the two functions that were used to load the ini data to remove code duplication. In theory it would be nice to use iniValueNext() in the config/parse module rather than loading a KeyValue store but this would mean a big change to the parser, which does not seem worthwhile at this time.
It is possible for functions to accidentally leak child contexts into the calling context, which may use a lot of memory depending on the use case and where it happens.
Use the function return type to determine what should be returned and error when something else is returned. Add FUNCTION_AUDIT_*() macros to handle exceptions.
This checking is only performed during unit tests on the code being covered by the specific unit test.
Note that this does not work yet for memory allocations, i.e. memNew(). These are pretty rare so are not as much of an issue and they can be added in the future.
Allocating memory made these functions simpler but it meant that memory was leaking into the calling context when logging was enabled. It is not clear that this was an issue but it seems that trace level logging could result it a lot of memory usage depending on the use case.
This also makes it possible to audit allocations returned to the calling context, which will be done in a followup commit.
Also rename objToLog() to objNameToLog() since it seemed logical to name the new function objToLog().
This fills in backtrace info at the bottom of the call stack when the stack trace is incomplete due to testing. This does not affect release builds, which is why it did not make the first cut, but it turns out to be useful for testing and barely changes the release code (when we do release this).
The recursion test in common/error was simplified because it would now return a very large trace.
When this code was migrated to C the unit tests were not included because there were more important priorities at the time.
This also requires some adjustments to coverage because of the new code location.
Similar to b9be4fa5, these functions are not used by the core code so move them to the build module. The new implementation is a little less efficient but that is much less of a worry in the build/test code.
Also remove regExpMatchSize() since it was not longer needed.
Neither of these functions were used by the core code. strReplace() is only used in the tests but it doesn't hurt to put it in build since the build code is not distributed.
This was done by checking the extension but it is possible to include a module that does not have a vendor or auto extension. Instead make it explicit that the module is included in another module.
Also change the variable from "include" to "included" to make it clearer what it indicates.
This allows test backups to be run in other test modules.
It is likely that more logic will be moved here but for now this suffices to get test backups working in the restore module.
The prior code required coverage in the storage/remote module for all filters that could be used remotely.
Now the filter handlers are set at runtime so any filter list can be used with a remote. This is more flexible and makes coverage testing easier. It also resolves a test dependency.
Move the command/remote unit test near the end so it will have access to all filters without using depends.
This makes it more efficient to read/write (especially read) varint-128 to/from IO.
Update the Pack type to take advantage of the more efficient read and remove some duplicate code.
Direct link creation via Posix functions has been moved to the Posix driver.
This change allows adding SFTP softlink creation in the SFTP driver using the standard interface.
Allow key/value annotations to be added with the backup command and added/modified/removed with the new annotate command.
Annotations can be viewed with the info command in text mode when --set is specified and are always included in JSON output.
Previously a callback was used to list path contents and if no sort was specified then a snapshot was not required. When deleting files from the path some filesystems could omit files that still existed, which meant the path could not be removed.
Filter . out of lists in the Posix driver since this special entry was only used by test code (and filtered everywhere in the core code).
Also remove callbacks from the storage interface and replace with an iterator that should be easier to use and guarantees efficient use of the snapshots.
This module has dependencies on command/command so it does not make sense for it to be in the common module. Also move protocolFree() to main() since this is a very large dependency.
Adjust the tests so command/exit can be tested later. This is a bit messy but will get adjusted as we improve the test harness.
Maintaining the version interfaces was complicated by the fact that each interface needed to be in separate compilation unit to avoid type conflicts. This also meant that various build/test files needed to be updated to add the new interfaces.
Solve these problems by auto-generating all the interfaces into a single file. This is made possible by parsing defines and types out of the header files and creating macros to rename the types. At the end of the version interface everything is undef'd. Another benefit is that the auto-generated interfaces can be static and included directly into postgres/interface.c.
Since some code generation is now always required for tests, change --no-gen to --min-gen in test.pl.
It would also make sense to auto-generate the version defines in postgres/version.h, but that will be left for a future commit.
PostgreSQL 15 drops support for exclusive backup and renames the start/stop backup commands.
This is based on the pgdg-testing repo since beta1 has not been released yet, but it seems unlikely that breaking changes will be made at this point. beta1 should be tagged just before our next release so we'll retest before the release.
This column has been removed in PostgreSQL 15. Rather than add a lot of special handling, it seems better just to update all versions to not depend on this column.
Add centralized functions to identify the type of database (i.e. system or user) by name and use FirstNormalObjectId when a name is not available.
The new query in the db module will still return the prior result for PostgreSQL <= 15, which will be stored in the manifest. This is important to preserve behavior when downgrading pgBackRest. There are no concerns here for PostgreSQL 15 since older versions of pgBackRest won't be able to restore backups for PostgreSQL 15 anyway.
Previously read/writing JSON required parsing/render via a variant, which add many more memory allocations and loops.
Instead allow JSON to be read/written serially to improve performance and simplify the code. This also allows us to get rid of many String and Variant constant which are no longer required.
The goal is to be able to read/write very large (e.g. gigabyte manifest) JSON structures, which would not be practical with the current code.
Note that external JSON (GCS, S3, etc) is still handled using variants. Converting these will require more consideration about key ordering since it cannot be guaranteed as in our own formats.
Sometimes we need to read a lock from another process. This was done two different ways and in the case of cmdStop() was definitely hacky.
Centralize the logic to make it easier to read the locks for another process. This will also make it easier to add new lock data.
Check for invalid path in repo-* commands. Perform path validation and throw an error when appropriate. Path may not contain '//'. Strip trailing '/' from path. Absolute path must fall under repo path.
Improve the stop command, when force and stanza options are specified, to terminate only processes holding lock files for the given stanza. Prior to these changes, termination of all processes holding lock files regardless of stanza occurred.
As much as possible it is better to get coverage with more realistic tests. Merging these modules will allow the page checksum code to be covered with real backups.
Manifests with a very large number of files can use a considerable amount of memory. There are a lot of zeroes in the data so it can be stored more efficiently by using base-128 varint encoding for the integers and storing the strings in the same allocation.
The downside is that the data needs to be unpacked in order to be used, but in most cases this seems fast enough (about 10% slower than before) except for saving the manifest, which is 10% slower up to 10 million files and then gets about 5x slower by 100 million (two minutes on my M1 Mac). Profiling does not show this slowdown so I wonder if this is related to the change in memory layout. Curiously, the function that increased most was jsonFromStrInternal(), which was not modified. That gives more weight to the idea that there is some kind of memory issue going on here and one hopes that servers would be less affected. Either way, they largest use cases we have seen are for about 6 million files so if we can improve that case I believe we will be better off.
Further analysis showed that most of the time was taken up writing the size and timestamp fields, which makes almost no sense. The same amount of time was used if they were hard-coded to 0, which points to some odd memory issue on the M1 architecture.
This change has been planned for a while, but the particular impetus at this time is that small file support requires additional fields that would increase manifest memory usage by about 20%, even if the feature is not used.
Note that the Pack code has been updated to use the new varint encoder, but the decoder remains separate because it needs to fetch one byte at a time.
Manifest defaults for user, group, and mode were previously generated by scanning the data to find the most common values. This was very accurate but slow and complicated. It could also lead to surprising changes in the manifest when a default value suddenly changed.
Instead, use the $PGDATA path to generate defaults. In the vast majority of cases the same user/group should own all the path/files and the default file mode is easily derived from the path mode. There may be some edge cases where this generates larger manifests, but in general it reduces time and complexity when saving the manifest.
Remove the MCV code since it is longer longer used.
There is no evidence that users need 8.3/8.4 anymore but it does cost us in terms of development and testing, especially now that we have a number of new backup/restore features planned.
It seems to make sense to remove this support now. If there are users who need to use/migrate from these versions they can use an older version of pgBackRest.
This leak was caused by the file descriptor variable getting clobbered after a long jump. Mark it as volatile to fix.
Testing this is a bit complex because the issue only happens in optimized builds, if at all. Put the test into the performance suite, which is always optimized, until a better idea presents itself.