1
0
mirror of https://github.com/pgbackrest/pgbackrest.git synced 2025-04-21 11:57:01 +02:00

136 Commits

Author SHA1 Message Date
Reid Thompson
6e635764a6
Match backup log size with size reported by info command.
Properly log the size of files copied during the backup, matching the backup size returned from the info command.

In the reference issue, the incremental backup after switchover logs the size of all files evaluated rather than only the size of the files copied in the backup.
2021-11-09 13:24:56 -05:00
David Steele
bc352fa6a8
Simplify strIdFrom*() functions.
The strIdFrom*() forced the caller to pick an encoding, which led to a number of TRY...CATCH blocks in the code. In practice the caller does not care which encoding is used as long as the string is valid for some encoding.

Update the strIdFrom*() function to try all possible encodings and only throw an error when the string is not valid for any of them.
2021-11-01 10:08:56 -04:00
David Steele
d74fe7a222 Add coverage for empty CATCH() blocks.
Currently empty CATCH() blocks are always marked as covered because of the loop structure of error handling.

A prototype implementation of error handling without looping has shown that these CATCH() blocks are not covered without new tests. Whether or not that prototype gets committed it is worth adding the tests.
2021-10-26 13:53:44 -04:00
David Steele
90f7f11a9f Add missing static keywords in test modules. 2021-10-18 12:22:48 -04:00
David Steele
01b20724da Rename PostgreSQL pid file constants and tests. 2021-10-13 19:36:59 -04:00
David Steele
5701620408 Rename manifest file primary flag in tests. 2021-10-13 19:02:58 -04:00
David Steele
c8ea17c68f Convert page checksum filter result to a pack.
The pack is both more compact and more efficient than a variant.

Also aggregate the page error info in the main process rather than in the filter to allow additional LSN filtering, to be added in a future commit.
2021-09-24 17:40:31 -04:00
David Steele
15e7ff10d3 Add Pack pseudo-type.
Rather than working directly with Buffer types, define a new Pack pseudo-type that represents a Buffer containing a pack. This makes it clearer that a pack is being stored and allows stronger typing.
2021-09-23 08:31:32 -04:00
David Steele
0e76ccb5b7 Convert filter param/result to Pack type.
The Pack type is more compact and flexible than the Variant type. The Pack type also allows binary data to be stored, which is useful for transferring the passphrase in the CipherBlock filter.

The primary purpose is to allow more (and more complex) result data to be returned efficiently from the PageChecksum filter. For now the PageChecksum filter still returns the original Variant. Converting the result data will be the subject of a future commit.

Also convert filter types to StringId.
2021-09-22 10:48:21 -04:00
David Steele
475b57c89b Allow additional memory to be allocated with a mem context.
The primary benefit is that objects can allocate memory for their struct with the context, which saves an additional allocation and makes it easier to read context/allocation dumps. Also, the memory context does not need to be stored with the object since it can be determined using the object pointer.

Object pointers cannot be moved, so this means whatever additional memory is allocated cannot be resized. That makes the additional memory ideal for object structs, but not so much for allocating a list that might change size.

Mem contexts can no longer be reused since they will probably be the wrong size so their memory is freed on memContextFree(). This still means fewer allocations and frees overall.

Interfaces still need to be freed by mem context so the old objMove() and objFree() have been preserved as objMoveContext() and objFreeContext(). This will be addressed in a future commit.
2021-09-01 11:10:35 -04:00
David Steele
a0bdfa436c
Log backup file total and restore size/file total.
The backup size was a bit off because it did not include any files (e.g. backup_label, WAL files) that were added to the manifest after the main copy. To fix this move the log message to the very end of the backup.

Add size/file total log message to restore since it did not exist before.
2021-08-11 13:39:36 -04:00
Cynthia Shang
1522cb4ed2 Allow NULL path in HRN_STORAGE_PATH_REMOVE() macro.
Update command/backup test to pass NULL where appropriate.
2021-07-19 15:21:40 -04:00
Cynthia Shang
4ad0bbda53
Update command/backup tests to use standard patterns.
Includes backup and backupCommon tests.

Some tests in backupTest were split out where they were originally combined into a single boolean check - which made it difficult to determine which part of the conditional failed.

String values were also removed where they were no longer needed.
2021-07-15 17:00:20 -04:00
David Steele
4a6ca54b47 Remove last vestiges of key replacement missed in b270253a. 2021-07-15 13:19:49 -04:00
David Steele
849ab343aa Change level of backup/restore copied file logging to detail.
The log level for copied files in the backup/restore commands has been changed to detail. This makes the info log level less noisy but if these messages are required then set the log level for the backup/restore commands to detail.
2021-07-09 13:50:35 -04:00
David Steele
6839335633 Increase harness log level to detail in command/backup test.
Also clean up some unneeded calls to harnessLogLevelReset().
2021-07-08 11:17:13 -04:00
David Steele
6a1c0337dd
Binary protocol.
Switch from JSON-based to binary protocol for communicating with local and remote process. The pack type is used to implement the binary protocol.

There are a number advantages:

* The pack type is more compact than JSON and are more efficient to render/parse.
* Packs are more strictly typed than JSON.
* Each protocol message is written entirely within ProtocolServer/ProtocolClient so is less likely to get interrupted by an error and leave the protocol in a bad state.
* There is no limit on message size. Previously this was limited by buffer size without a custom implementation, as was done for read/writing files.

Some cruft from the Perl days was removed, specifically allowing NULL messages and stack traces. This is no longer possible in C.

There is room for improvement here, in particular locking down the allowed sequence of protocol messages and building a state machine to enforce it. This will be useful for resetting the protocol when it gets in a bad state.
2021-06-24 13:31:16 -04:00
David Steele
c6a8528e31 Add optional remove to TEST_STORAGE_EXISTS().
This allows TEST_STORAGE_EXISTS() to be used in most cases where TEST_STORAGE_REMOVE() was used before.

Rename TEST_STORAGE_REMOVE() to HRN_STORAGE_REMOVE() now that is is no longer used as a test. Still allow an error when the file is missing just to help keep tests tidy.
2021-06-08 14:51:23 -04:00
David Steele
8250990afb Replace harnessCfgLoad*() functions with HRN_CFG_LOAD() macro.
HRN_CFG_LOAD() handles the majority of test configuration loads and has various options for special cases.

It was not clear when to use harnessCfgLoadRaw() vs harnessCfgLoad(). Now "raw" functionality is granular and enabled by parameters, e.g. noStd.
2021-06-01 09:03:44 -04:00
David Steele
bd40156c22 Rename TEST_SYSTEM*() to HRN_SYSTEM*().
These calls are not tests, rather they setup data for tests.
2021-05-22 14:22:51 -04:00
David Steele
73885f8c2e Replace hrnLogResult() with TEST_RESULT_LOG/_FMT().
The macros provide more information when there is an error and may be updated in the future without changing the test code.
2021-05-22 14:09:45 -04:00
David Steele
b270253a69 Add defines for many test*() getter functions.
A define was already added for TEST_PATH but it was not widely used. Replace all occurrences of testPath() with TEST_PATH in the tests.

Replace testUser() with TEST_USER, testGroup() with TEST_GROUP, testRepoPath() with HRN_PATH_REPO, testDataPath() with HRN_PATH, testProjectExe() with TEST_PROJECT_EXE, and testScale() with TEST_SCALE.

Replace {[path]}, {[user]}, {[group]}, etc. with defines and remove hrnReplaceKey(). This is better than having two ways to deal with replacements.

In some cases the original test*() getters were kept because they are used by the harness, which does not have access to the new defines. Move them to harnessTest.intern.h to indicate that the tests should no longer use them.
2021-05-22 09:30:54 -04:00
David Steele
aed3d468a1 Rename strNew() to strNewZ() and add parameter-less strNew().
Replace all instances of strNew("") with strNew() and use strNewZ() for non-empty zero-terminated strings. Besides saving a useless parameter, this will allow smarter memory allocation in a future commit by signaling intent, in general, to append or not.

In the tests use STRDEF() or VARSTRDEF() where more appropriate rather than blindly replacing with strNewZ(). Also replace strLstAdd() with strLstAddZ() where appropriate for the same reason.
2021-05-21 17:36:43 -04:00
David Steele
ef63750e0b Add local process shim.
Run the local process inside a forked child process instead of exec'ing it. This allows coverage to accumulate in the local process rather than needing to test the local protocol functions directly, resulting in better end-to-end testing and less test duplication. Another advantage is that the pgbackrest binary does not need to be built for the test.

The backup, restore, and verify command tests have been updated to use the new shim for coverage.
2021-05-21 12:45:00 -04:00
David Steele
ae7f0af202 Move PostgreSQL version interface test functions to a test harness.
Some version interface test functions were integrated into the core code because they relied on the PostgreSQL versioned interface. Even though they were compiled out for production builds they cluttered the core code and made it harder to determine what was required by core.

Create a PostgreSQL version interface in a test harness to contain these functions. This does require some duplication but the cleaner core code seems a good tradeoff. It is possible for some of this code to be auto-generated but since it is only updated once per year the matter is not pressing.
2021-05-17 07:20:28 -04:00
David Steele
87df6d7a58
Convert BackupType enum to StringId.
Allows removal of backupType()/backupTypeStr() and improves debug logging of the enum.

Move BackupType enum and string constants to info/infoBackup.h so they are available to more modules. Also convert InfoBackup to use BackupType instead of a String.
2021-05-03 12:15:39 -04:00
David Steele
85fc3da4c3
Update CipherType/CipherMode to StringId.
As in 6cc521b, this allows option values and enums to be easily mapped together.
2021-04-28 11:36:20 -04:00
David Steele
aa72c19a83
Do not write files atomically or sync paths during backup copy.
There is no need to write the file atomically (e.g. via a temp file on Posix) because checksums are tested on resume after a failed backup. The path does not need be synced for each file because all paths are synced at the end of the backup.

This functionality was not lost during the migration -- it never existed in the Perl code, though these settings are used in restore. See 59f1353 where backupFile() was migrated to C.
2021-04-23 12:33:25 -04:00
David Steele
ed0d48f52c Add StringId type.
It is often useful to represent identifiers as strings when they cannot easily be represented as an enum/integer, e.g. because they are distributed among a number of unrelated modules or need to be passed to remote processes. Strings are also more helpful in debugging since they can be recognized without cross-referencing the source. However, strings are awkward to work with in C since they cannot be directly used in switch statements leading to less efficient if-else structures.

A StringId encodes a short string into an integer so it can be used in switch statements but may also be readily converted back into a string for debugging purposes. StringIds may also be suitable for matching user input providing the strings are short enough.

This patch includes a sample of StringId usage by converting protocol commands to StringIds. There are many other possible use cases. To list a few:

* All "types" in storage, filters. IO , etc. These types are primarily for identification and debugging so they fit well with this model.

* MemContext names would work well as StringIds since these are entirely for debugging.

* Option values could be represented as StringIds which would mean we could remove the functions that convert strings to enums, e.g. CipherType.

* There are a number of places where enums need to be converted back to strings for logging/debugging purposes. An example is protocolParallelJobToConstZ. If ProtocolParallelJobState were defined as:

typedef enum
{
    protocolParallelJobStatePending = STRID5("pend", ...),
    protocolParallelJobStateRunning = STRID5("run", ...),
    protocolParallelJobStateDone = STRID5("done", ...),
} ProtocolParallelJobState;

then protocolParallelJobToConstZ() could be replaced with strIdToZ(). This also applies to many enums that we don't covert to strings for logging, such as CipherMode.

As an example of usage, convert all protocol commands from strings to StringIds.
2021-04-20 15:22:42 -04:00
David Steele
442b2e41b1 Refactor info modules with inline getters/setters.
Extend the pattern introduced in 79a2d02c to the info modules.
2021-04-09 13:48:40 -04:00
David Steele
b715c70b46 Refactor storage modules with inline getters/setters.
Extended the pattern introduced in 79a2d02c to the storage modules: Storage, StorageRead, StorageWrite.
2021-04-07 14:04:38 -04:00
David Steele
2016fac0d9
Improve protocol handlers.
Make protocol handlers have one function per command. This allows the logic of finding the handler to be in ProtocolServer, isolates each command to a function, and removes the need to test the "not found" condition for each handler.
2021-03-16 13:09:34 -04:00
David Steele
ec347847e5 Pass cipher type directly to backupFileProtocol().
Cipher type was inferred from the presence of cipherSubPass rather than being passed explicitly in order to maintain compatibility with Perl backupFile().

Now that Perl is gone it makes sense to pass it explicitly, as we do elsewhere.
2021-03-12 15:19:32 -05:00
David Steele
e07040c2e4 Add HRN_STORAGE_TIME() harness macro.
Makes updating the time of a path/file more streamlined in tests.

Also update all tests where utime() was being used directly.
2021-03-12 12:54:34 -05:00
David Steele
3c85a497a6 Add backup delta unit test.
This test was added to take the place of another test, which turned out not to be workable.

Even so, it adds coverages at little cost so it seems worth keeping.
2021-03-11 14:40:14 -05:00
David Steele
dc1052f1da Remove extra spaces before macro continuation. 2021-03-11 14:11:21 -05:00
David Steele
c862e9654a
Log archive copy during backup.
Copying can be a fairly expensive operation so it makes sense to log it so the user gets some status during long copy operations.
2021-03-11 08:22:44 -05:00
David Steele
28301199eb Rename FUNCTION_HARNESS_RESULT*() macros to FUNCTION_HARNESS_RETURN*().
When the FUNCTION_*_RESULT*() macros were renamed to FUNCTION_*_RETURN_*() in the core code the test harness macros were missed.

Update them to make the naming consistent.
2021-03-10 18:42:22 -05:00
Cynthia Shang
13dc8e68d7 Make --repo optional for backup command.
If there are multiple repos and the --repo option is not specified then backup will automatically select the highest priority repo.
2021-02-26 14:49:50 -05:00
David Steele
456a300bb7 Remove too-verbose braces in switch statements.
The original intention was to enclose complex code in braces but somehow braces got propagated almost everywhere.

Document the standard for braces in switch statements and update the code to reflect the standard.
2021-01-26 12:10:24 -05:00
David Steele
f95850c546 Fix logical -> bitwise boolean operator in backup unit test.
This unset more than the storageFeatureCompress flag but the test was not affected.

Found on MacOS M1.
2021-01-24 08:12:31 -05:00
Cynthia Shang
f32eb9b94e
Partial multi-repository implementation.
Multi-repository implementations for the archive-push, check, info, stanza-create, stanza-upgrade, and stanza-delete commands.

Multi-repo configuration is disabled so there should be no behavioral changes between these commands and their current single-repo implementations.

Multi-repo documentation and integration tests are still in the multi-repo development branch. All unit tests work as multi-repo since they are able to bypass the configuration restrictions.
2021-01-21 15:21:50 -05:00
David Steele
aeee83044d
Fix resume after partial delete of backup by prior resume.
If files other than backup.manifest.copy were left in a backup path by a prior resume then the next resume would skip the backup rather than removing it. Since the backup path still existed, it would be found during backup label generation and cause an error if it appeared to be later than the new backup label. This occurred if the skipped backup was full.

The error was only likely on object stores such as S3 because of the order of file deletion. Posix file systems delete from the bottom up because directories containing files cannot be deleted. Object stores do not have directories so files are deleted in whatever order they are provided by the list command. However, the issue can be reproduced on a Posix file system by manually deleting backup.manifest.copy from a resumable backup path.

Fix the issue by removing the resumable backup if it has no manifest files. Also add a new warning message for this condition.

Note that this issue could be resolved by running expire or a new full backup.
2021-01-12 12:38:32 -05:00
David Steele
87996558d2
Replace double type with time in config module.
The C code does not use doubles to represent seconds like the Perl code did so time can be represented as an integer which reduces the number of data types that config has to understand.

Also remove Variant doubles since they are no longer used.

Note that not all double code was removed since we still need to display times to the user in seconds and it is possible for the times to be fractional. In the future this will likely be simplified by storing the original user input and using that value when the time needs to be displayed.
2020-12-09 08:59:51 -05:00
David Steele
d4211d3aaf Add retries to PostgreSQL sleep when starting a backup.
Inaccuracies in sleep time or clock skew might make a single sleep insufficient to reach the next second.

Add a few retries to make the process more reliable but still avoid an infinite loop if something is seriously wrong.
2020-12-02 22:41:14 -05:00
David Steele
63c12d2541 Fix spacing and typos in backup/manifest modules. 2020-11-09 16:26:43 -05:00
David Steele
d452e9cc38
Use zero-based indexes when referring to option indexes.
There were a number of places in the code where "hostId" was used, but hostId is just the option group index + 1 so this led to a lot of +1 and -1 to convert the id to an index and vice versa.

Instead just use the zero based index wherever possible. This is pretty much everywhere except when the host-id option is read or set, or where a message is being formatted for the user.

Also fix a bug in protocolRemoteParam() where remotes spawned from the main process could get process ids that were not 0. Only the locals should spawn remotes with process id > 0. This seems to have been harmless since the process id is only a label, but it could be confusing when debugging.
2020-10-26 10:25:16 -04:00
David Steele
76cfd8ca70
Allow [, #, and space as the first character in database names.
iniLoad() was trimming lines which meant that a leading space would not pass checksum validation when a manifest was reloaded. Remove the trims since files we write should never contain extraneous spaces. This further diverges the format for the functions that read conf files (e.g. pgbackrest.conf) and those that read info (e.g. manifest) files.

While we are at it also allow [ and # as initial characters. # was reserved for comments but we never put comments into info files. [ denotes a section but we can get around this by never allowing arrays as values in info files, so if a line ends in ] it must be a section. This is currently the case but enforce it by adding an assert to info/info.c.
2020-10-24 11:07:07 -04:00
David Steele
7d069a2b91 Remove indexed option constants.
These constants don't scale well as the index total is increased for an option.

The core code rarely uses these options and they are easily replaced with cfgOptionName().

The tests had started to make use of the constants, so provide functions that build the option name from the optionId and, optionally, the optionKey.
2020-10-19 14:03:48 -04:00
David Steele
927d9adbee
Improve PostgreSQL version identification.
Previously, catalog versions were fixed for all versions which made maintaining the catalog versions during PostgreSQL beta and release candidate cycles very painful. A version of pgBackRest which was functionally compatible was rendered useless by a catalog version bump in PostgreSQL.

Instead use only the control version to identify a PostgreSQL version when possible. Some older versions require a catalog version to positively identify a PostgreSQL version, so include them when required.

Since the catalog number is required to work with tablespaces it will need to be stored. There's already a copy of it in backup.info so use that (even though we have been ignoring it in the C versions).
2020-09-18 16:55:26 -04:00